Adding Smart Classification Regex Patterns

Smart Classification takes a pattern that you have added and checks whether the content of uploaded and modified files includes that pattern. Patterns can be expressed in different ways: for example, in regex format, as text in a query, or as an AI match term.

In many cases, the patterns to be matched are expressed in regex format and locate common identification information such as national ID numbers. To make it easier for you to set up rules, Smart Classification includes a number of these commonly used regex patterns as predefined patterns.

Your organization may commonly use regex patterns that are not included in the predefined patterns. Smart Classification therefore enables you to define and save any number of your own regex patterns, so you don't have to type them manually each time you use them in a rule.

You also have the option of entering a regex pattern manually into a rule, which may be efficient for patterns that you are planning to use only once or infrequently.

Selecting from predefined regex patterns

The predefined patterns in Smart Classification are all in regex format. You can view all of them by clicking the Patterns link in the Smart Classification page menu bar. If you add your own patterns, they will also appear here. 

Next to the name of each pattern is the regex for the pattern. For example, if you choose to match on the EU Debit Card Number, Smart Classification looks for files with content that matches the regex [0-9]{16} (that is, any 16 consecutive digits).
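To make the behavior of [0-9]{16} concrete, the following is a minimal sketch that uses Python's re module as a stand-in. The regex engine that Smart Classification uses is not stated here, so results in the product may differ; the sample strings and the contains_match helper are hypothetical and exist only for illustration.

```python
import re

# Regex shown in the Patterns list for the EU Debit Card Number pattern.
EU_DEBIT_CARD = re.compile(r"[0-9]{16}")

# Hypothetical sample content, for illustration only.
samples = [
    "Card: 1234567890123456",     # exactly 16 digits in a row
    "Card: 1234 5678 9012 3456",  # spaces break the run of digits
    "Ref: 12345678901234567890",  # 20 digits, which contain a run of 16
    "Card: 123456789012345",      # only 15 digits
]

def contains_match(text: str) -> bool:
    """Return True if the text contains 16 consecutive digits anywhere."""
    return EU_DEBIT_CARD.search(text) is not None

results = [contains_match(s) for s in samples]
for sample, matched in zip(samples, results):
    print(f"{matched!s:5}  {sample}")
```

In this sketch, the first and third samples match and the other two do not. The third sample shows that the regex also matches inside a longer run of digits, because it asks only for 16 digits in a row and places no limit on what comes before or after them.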

If one of the predefined patterns meets your needs, select it in the Add Content Classification Rule wizard as the Classifier pattern: select Match pattern by name, and then choose the pattern name. Note: You can choose regex patterns only with the Default and Solr Pattern Match classifiers, which use regex patterns.

Creating your own regex pattern before adding it to a rule

The following procedure for creating and saving your own regex pattern uses the example of medical record numbers that consist of six digits in three pairs separated by hyphens, such as 12-34-56.

To create your own pattern:

  1. In the menu bar of the Smart Classification screen, click Patterns to open the Patterns screen, and then click Add Pattern.

    The New Content Classification Pattern dialog box opens.
  2. In Pattern name, enter a name for the pattern.
  3. In Regular Expression (RegEx), enter the regex for the pattern.
    If you're not familiar with writing regular expressions, you can find a number of sites with information online, such as those at https://www.geeksforgeeks.org/write-regular-expressions/ and https://learn.microsoft.com/en-us/dotnet/standard/base-types/regular-expression-language-quick-reference.
  4. To see if your regex matches the correct patterns, click Test the pattern.
    The field expands.
  5. In the box, type some text that contains content matching your regex, and then click Check.

    If your regex is working as expected, the matching text is highlighted and a count of the matches appears below the box.
  6. If the test for your pattern was successful, click Add pattern.
    The pattern appears in the list of patterns and in the drop-down list of the Add Content Classification Rule wizard when you choose a Classifier pattern for the Default or Solr Pattern Match classifier and choose Match pattern by name.

    The new Medical Record Number pattern appears in the list of patterns on the Patterns tab.


    The new Medical Record Number pattern appears in the Choose a pattern drop-down list in the Add Content Classification Rule wizard.
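Steps 3 through 5 can be approximated outside the product with a short script. The sketch below is not the product's own implementation: the regex is one possible way to express the 12-34-56 format, the sample text is hypothetical, and Python's re module stands in for whatever regex engine Smart Classification uses.

```python
import re

# One possible regex for six digits in three hyphen-separated pairs.
# This is an assumption for illustration, not a pattern supplied by the product.
MEDICAL_RECORD_NUMBER = re.compile(r"[0-9]{2}-[0-9]{2}-[0-9]{2}")

# Hypothetical test text, similar to what you might type before clicking Check.
sample_text = "Patient A: 12-34-56. Patient B: 98-76-54. Invalid: 123-45-6."

# Collect every match and count them, like the highlight and count in the test box.
matches = MEDICAL_RECORD_NUMBER.findall(sample_text)
match_count = len(matches)

print(matches)
print(match_count)
```

In this sketch, 12-34-56 and 98-76-54 match and 123-45-6 does not, giving a count of 2. Note that this regex would also match part of a longer string such as 112-34-567. If that is not what you want, many regex engines support word boundaries (for example, \b[0-9]{2}-[0-9]{2}-[0-9]{2}\b); use the Test the pattern box to confirm how the pattern behaves in Smart Classification.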

Adding a pattern as you create a rule

You are not required to give regex patterns names and add them to the list of saved patterns to use them in a content classification rule; instead, you can enter the regex manually when you create a rule that uses the Default or Solr Pattern Match classifier.
For instructions on adding a regex pattern manually to a rule, see the Match with Regex videos for the Default and Solr Pattern Match classifiers in Guide to Classifiers.