Filter Profiles

Describes filter profiles and how they work.

Filter profiles control which types of sensitive information Philter identifies and how that information is manipulated. A filter profile is a JSON file stored in Philter’s profiles directory, which by default is /opt/philter/profiles/.
Sample filter profiles are available for immediate use or for customization to fit your use cases.
Each filter profile has a name that tells Philter which filter profile to apply when filtering sensitive information from text. The name is passed to Philter’s API along with the text to be filtered. This lets a single instance of Philter process different types of documents in different ways.
We recommend using Philter Studio to create and modify filter profiles. Philter Studio is a Microsoft Windows application that provides a graphical interface for creating, modifying, and managing filter profiles, which is more user-friendly than editing the JSON files manually as described on this page.

Structure of a Filter Profile

A filter profile:
  • Must have a name that uniquely identifies it.
  • Must have a list of identifiers that are filters for sensitive information.
    • Each identifier, or filter, can have zero or more filter strategies. A filter strategy tells Philter how to manipulate that type of sensitive information when it is identified.
  • Can have an optional list of terms to be ignored.
  • Can have encryption keys to support encryption of sensitive information.
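The structural rules above can be checked before a profile is deployed. The following is a minimal sketch, not part of Philter: the function name check_profile is hypothetical, and the key layout it expects ("name", "identifiers", and keys ending in "FilterStrategies") is inferred from the sample profile on this page. It does not check ignored terms or encryption keys, because their key names are not shown here.

```python
def check_profile(profile: dict) -> list[str]:
    """Return a list of problems; an empty list means the basic shape looks right.

    Hypothetical helper. The expected keys are taken from the sample profile
    on this page and are not an official schema.
    """
    problems = []

    # A profile must have a name that identifies it.
    name = profile.get("name")
    if not isinstance(name, str) or not name:
        problems.append("missing or empty 'name'")

    # A profile must have identifiers; the sample profile uses a JSON object.
    identifiers = profile.get("identifiers")
    if not isinstance(identifiers, dict):
        problems.append("missing 'identifiers' object")
        return problems

    for filter_name, config in identifiers.items():
        if not isinstance(config, dict):
            problems.append(f"identifier '{filter_name}' is not an object")
            continue
        # Each identifier may carry zero or more strategies, stored as a list.
        for key, value in config.items():
            if key.endswith("FilterStrategies") and not isinstance(value, list):
                problems.append(f"'{key}' should be a list")

    return problems


sample = {
    "name": "email-and-phone-numbers",
    "identifiers": {
        "emailAddress": {
            "emailAddressFilterStrategies": [
                {"strategy": "REDACT", "redactionFormat": "{{{REDACTED-%t}}}"}
            ]
        }
    },
}

print(check_profile(sample))  # prints []
```

A check like this only catches shape errors; whether Philter accepts a given profile is decided by Philter itself.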

An Example Filter Profile

The following is a sample filter profile. In this sample you can see the types of sensitive information that are enabled and the strategy for manipulating each type when found. This filter profile identifies email addresses and phone numbers.
{
  "name": "email-and-phone-numbers",
  "identifiers": {
    "emailAddress": {
      "emailAddressFilterStrategies": [
        {
          "strategy": "REDACT",
          "redactionFormat": "{{{REDACTED-%t}}}"
        }
      ]
    },
    "phoneNumber": {
      "phoneNumberFilterStrategies": [
        {
          "strategy": "REDACT",
          "redactionFormat": "{{{REDACTED-%t}}}"
        }
      ]
    }
  }
}
When an email address is identified, it is replaced with the text {{{REDACTED-email-address}}}. The %t placeholder is replaced by the type of the filter. Likewise, when a phone number is found, it is replaced with the text {{{REDACTED-phone-number}}}. You are free to change the redaction formats to whatever fits your use case. See Filter Strategies for all replacement options.
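To make the %t substitution concrete, here is a small sketch that imitates the documented behavior. It is an illustration only, not Philter's implementation: the function name render_redaction is hypothetical, and only the %t placeholder described above is handled.

```python
def render_redaction(redaction_format: str, filter_type: str) -> str:
    """Imitate the documented substitution: %t becomes the filter type."""
    return redaction_format.replace("%t", filter_type)


print(render_redaction("{{{REDACTED-%t}}}", "email-address"))
# prints {{{REDACTED-email-address}}}
```

Changing the format string in the profile, for example to "[%t removed]", changes the replacement text in the same way.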
The name of this filter profile is email-and-phone-numbers. Filter profiles can be named anything you like, but each name must be unique among all filter profiles. As a best practice, save the filter profile as [name].json, e.g. email-and-phone-numbers.json.
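The [name].json convention is easy to enforce when profiles are written by a script. The sketch below assumes a hypothetical helper, save_profile, that derives the file name from the profile's "name" field and refuses to overwrite an existing file. A file name clash is only a rough proxy for a duplicate profile name, since a differently named file could still contain the same "name". The example writes to a temporary directory; in practice the target would be Philter's profiles directory.

```python
import json
import tempfile
from pathlib import Path


def save_profile(profile: dict, profiles_dir: Path) -> Path:
    """Write the profile to <profiles_dir>/<name>.json without overwriting."""
    target = profiles_dir / f"{profile['name']}.json"
    if target.exists():
        # Same file name implies the same profile name is already in use.
        raise FileExistsError(f"{target} already exists; profile names must be unique")
    target.write_text(json.dumps(profile, indent=2), encoding="utf-8")
    return target


profile = {"name": "email-and-phone-numbers", "identifiers": {}}

# A temporary directory stands in for the real profiles directory here.
with tempfile.TemporaryDirectory() as tmp:
    saved = save_profile(profile, Path(tmp))
    print(saved.name)  # prints email-and-phone-numbers.json
```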

Applying a Filter Profile to Text

To use this filter profile we will save it as /opt/philter/profiles/email-and-phone-numbers.json. We must restart Philter for the new profile to be available for use. To apply the filter profile we will pass the filter profile's name to Philter when making a filter request, as shown in the example request below.
curl -k -X POST "https://localhost:8080/api/filter?c=context&p=email-and-phone-numbers" \
  -d @file.txt -H "Content-Type: text/plain"
In this command, we have provided the parameter p along with a value that is the name of the filter profile we want to use for this request. If we had multiple filter profiles in Philter we could choose a different filter profile for this request simply by changing the name given to the parameter p. For more details see Philter’s API.
Philter will process the contents of file.txt by applying the filter profile named email-and-phone-numbers. As we saw in the filter profile above, this filter profile redacts email addresses and phone numbers. Philter will return the redacted text in response to the API call.
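The same request can be built from a program. The following sketch uses only Python's standard library and mirrors the curl command above: the endpoint, the c and p parameters, and the text/plain content type come from that command, while the function name build_filter_request and the sample text are illustrative assumptions. The request is only constructed here; the lines that send it are commented out because they need a running Philter instance.

```python
import urllib.parse
import urllib.request


def build_filter_request(base_url: str, profile_name: str,
                         context: str, text: str) -> urllib.request.Request:
    """Build a POST to /api/filter with the profile name passed as p."""
    query = urllib.parse.urlencode({"c": context, "p": profile_name})
    return urllib.request.Request(
        url=f"{base_url}/api/filter?{query}",
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


request = build_filter_request(
    "https://localhost:8080",
    "email-and-phone-numbers",
    "context",
    "Call 555-0100 or write to jo@example.com.",
)
print(request.full_url)
# prints https://localhost:8080/api/filter?c=context&p=email-and-phone-numbers

# To send the request against a local instance with a self-signed certificate
# (the equivalent of curl's -k; for local testing only):
# import ssl
# insecure = ssl._create_unverified_context()
# with urllib.request.urlopen(request, context=insecure) as response:
#     print(response.read().decode("utf-8"))
```

Switching to another filter profile only requires passing a different profile_name.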
To manipulate sensitive information by methods other than redaction, see Filter Strategies.