Creating Smart Classification Regex Pattern Groups

You can group regex patterns into pattern groups, which lets you add them to Content Classification Rules together. This is useful if you frequently add the same regex patterns to rules. For example, if you have multiple rules that search for personally identifiable information (PII), you could create a regex pattern group that includes patterns for national ID numbers, passport numbers, and driver's license numbers, and then add the group each time you create a new PII rule.

When you create a regex pattern group, you can either create new patterns and add them to the group as you create it or you can add existing patterns to the group.

Creating a regex pattern group

The steps below use the example of a company that is creating a PII pattern group that includes:

  • The predefined patterns France Driver's License Number, France National ID Card, and France Passport Number
  • The new pattern Company ID, a 6-digit numeric pattern.

To add a regex pattern group:

  1. In the Smart Classification screen, click Patterns in the menu bar, and then click Add pattern group.

    The New Pattern Group dialog box opens.
  2. Enter a Group name.
  3. Click Add pattern group.
    The new pattern group name is added above the list of patterns.
  4. Click the pattern group name.
    At this point, the pattern group is empty. 
  5. Click Add pattern to group.

    A drop-down list opens. It lists the existing patterns that you can choose from, and also lets you create new patterns and add them to the group.
  6. Click France Driver's License Number.
    The drop-down list closes and France Driver's License Number is added to the list for the group. Another Add pattern to group link appears below it.
  7. Click the Add pattern to group link and click France National ID Card.
    France National ID Card is added to the group.
  8. Click the Add pattern to group link below it, and click France Passport Number.
    Now all three of the predefined patterns are added to the PII pattern group.
  9. Below the three patterns, click Add pattern to group.
  10. To create the new Company ID pattern, click + Pattern at the bottom of the drop-down list.

    The New Content Classification Pattern dialog box opens. The Add to PII Group checkbox is selected automatically, so that you can create the pattern and add it to the group at the same time.
  11. In Pattern name, enter Company ID.
  12. In Regular Expression (RegEx), enter [0-9]{6}.
  13. To open a test box, click Test the pattern.
    The test box has two tabs, Default and PatternMatch, which correspond to the Default and Solr Pattern Match classifiers. Test the pattern in the tab that corresponds to the classifier you plan to use, or test it in both. For descriptions of the classifiers, see Guide to Classifiers.
  14. Type text that includes a 6-digit number into the Test the pattern box.
  15. To test the pattern, click Check.
    If the test is successful, the 6-digit number is highlighted and a message below it indicates the number of matches.
  16. If the test is successful, click Add pattern.
    The pattern is added to the pattern group. It is also added to the All Patterns list, so you can add it to a rule individually or as part of the pattern group.
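
Before you enter the expression from step 12, it can help to check how it behaves on sample text. The following is a minimal sketch that uses Python's re module as a stand-in for the test box. It assumes that the Default and Solr Pattern Match classifiers interpret this simple expression the same way Python does, which may not hold for every expression. COMPANY_ID_STRICT and find_matches are hypothetical names introduced only for this illustration.

```python
import re

# Expression from step 12: any run of six consecutive digits.
COMPANY_ID = re.compile(r"[0-9]{6}")

# Hypothetical stricter variant (not part of the procedure above):
# six digits that are not directly next to another digit.
COMPANY_ID_STRICT = re.compile(r"(?<![0-9])[0-9]{6}(?![0-9])")


def find_matches(pattern, text):
    """Return every non-overlapping match of pattern in text."""
    return pattern.findall(text)


sample = "Employee 482913 placed order 20240115 today."

loose = find_matches(COMPANY_ID, sample)
strict = find_matches(COMPANY_ID_STRICT, sample)

print(loose)   # ['482913', '202401']
print(strict)  # ['482913']
```

The plain expression also matches the first six digits of the longer number 20240115. If that kind of match would be a false positive for your data, consider a stricter expression and test it in both tabs, because the two classifiers might not support the same regex syntax.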

Adding a regex pattern group to a rule

The following procedure shows you how to add a regex pattern group to a rule on the second page of the rule wizard. For full instructions on adding a rule, see Creating a Smart Classification Rule.

To add a regex pattern group to a rule:

  1. On the second page of the Add a Content Classification Rule wizard, choose either of the regex classifiers, Default or Solr Pattern Match.
  2. Click Add pattern, and choose Match pattern by group.

    Match pattern by group is now listed in Classifier patterns.
  3. Click the drop-down list next to it.
    Any groups you have added are listed under the search box.
  4. Click the group you want to use for the rule.

    Once the group is selected, you can test the different patterns in the group together.
  5. To test the pattern group, click the comment check icon next to the group name.
  6. Enter a sentence or phrase that includes any number of patterns from the group, and click Check.
    The test verifies whether the patterns work and shows how many times matching patterns appear.
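
Conceptually, testing a pattern group means running each pattern in the group against the same text and counting the matches. The following is a minimal sketch of that idea in Python, not the product's implementation. Only the Company ID expression comes from this article. The "Example Passport Number" expression is a simplified placeholder and is not the predefined France Passport Number pattern. The names PII_GROUP and match_group are hypothetical.

```python
import re

# Hypothetical pattern group for illustration only.
PII_GROUP = {
    # Expression created in the procedure above.
    "Company ID": r"[0-9]{6}",
    # Simplified placeholder: 2 digits, 2 uppercase letters, 5 digits.
    "Example Passport Number": r"\b[0-9]{2}[A-Z]{2}[0-9]{5}\b",
}


def match_group(group, text):
    """Map each pattern name in the group to the list of its matches in text."""
    return {name: re.findall(expr, text) for name, expr in group.items()}


sentence = "Badge 482913 was issued for passport 12AB34567."

results = match_group(PII_GROUP, sentence)
total = sum(len(found) for found in results.values())

for name, found in results.items():
    print(f"{name}: {len(found)} match(es) {found}")
print(f"Total matches: {total}")
```

In this sketch, each pattern contributes its own matches and the total is their sum, which mirrors how the test reports the number of matching patterns for the group.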