Quick Start

This guide describes how to quickly get Philter running on AWS, Google Cloud, or Microsoft Azure.
If you have any questions as you go through this guide, please don’t hesitate to reach out.

Step 1: Run Philter

Follow the instructions below for your cloud provider to launch Philter in that cloud. Philter is currently available on the AWS, Azure, and GCP marketplaces.
Philter on AWS is a virtual machine-based product. It runs on its own EC2 instance. A free trial period is available during which there is no charge for the Philter software, but there may be charges for the underlying AWS infrastructure.
Launch Philter in AWS
  1. Go to Philter in the AWS Marketplace. On this page you can see the Philter overview, the pricing, and the supported EC2 instance types.
  2. Select an instance type. We recommend m5.large. The smaller instance types are intended only for testing and are not well suited for production use.
  3. Click the “Continue to Subscribe” button.
  4. View and accept Philter’s license agreement. Then click “Accept Terms.”
  5. The subscription will now be created and you will be notified when it is ready. This usually takes less than a minute.
  6. Click the “Continue to Configuration” button to select the AMI, the version, and the region. We recommend using the newest version if multiple are available.
  7. Click the “Continue to Launch” button to launch Philter in your AWS account!
AWS will automatically open ports 22 (SSH) and 8080 (Philter API) in the Philter instance's security group. These ports must be open, but you may want to modify the security group to restrict access to specific CIDR ranges.
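One way to narrow those rules is with the AWS CLI. The following is a minimal sketch, not an official procedure: the function name, security group ID, and CIDR range are placeholders, and it assumes the existing rule allows traffic from 0.0.0.0/0, which may not match your deployment.

```shell
# Sketch: replace a world-open ingress rule with one limited to a CIDR range.
# Assumes the AWS CLI is installed and configured, and that the current rule
# allows 0.0.0.0/0 (an assumption; inspect your security group first).
restrict_port() {
  local group_id="$1"   # e.g. sg-0123456789abcdef0 (placeholder)
  local port="$2"       # 22 or 8080
  local cidr="$3"       # e.g. 203.0.113.0/24 (placeholder)

  # Add the narrower rule first so access is never fully cut off.
  aws ec2 authorize-security-group-ingress \
    --group-id "$group_id" --protocol tcp --port "$port" --cidr "$cidr" &&
  # Then remove the rule that allows traffic from anywhere.
  aws ec2 revoke-security-group-ingress \
    --group-id "$group_id" --protocol tcp --port "$port" --cidr 0.0.0.0/0
}

# Usage (placeholder values):
# restrict_port sg-0123456789abcdef0 8080 203.0.113.0/24
```

Adding the narrower rule before revoking the open one avoids a window in which the port is unreachable.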
Congratulations! You have deployed Philter in AWS. You are now ready to filter text!
Launch Philter in Google Cloud
  2. Click the "Launch on Compute Engine" button.
Virtual Machine Recommendations
The general purpose machine type is n2-standard-2 and this machine type should be adequate for most use-cases. We recommend 8 vCPUs and 8-16 GB of RAM for a production deployment.
Google Cloud will automatically open ports 22 (SSH) and 8080 (Philter API). These ports must be open, but you may want to modify the firewall rules to restrict access to specific CIDR ranges.
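With the gcloud CLI, restricting access might look like the sketch below. The function name and the rule name are hypothetical; the rule actually created for your deployment will have a different name, so list the rules first to find it.

```shell
# Sketch: limit an existing firewall rule to a single CIDR range.
# Assumes the gcloud CLI is installed and authenticated. The rule name is
# hypothetical; list the rules first to find the one created for Philter.
limit_firewall_rule() {
  local rule_name="$1"   # e.g. philter-tcp-8080 (hypothetical name)
  local cidr="$2"        # e.g. 203.0.113.0/24 (placeholder)

  gcloud compute firewall-rules update "$rule_name" --source-ranges "$cidr"
}

# Usage (placeholder values):
# gcloud compute firewall-rules list
# limit_firewall_rule philter-tcp-8080 203.0.113.0/24
```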
Congratulations! You have deployed Philter in Google Cloud. You are now ready to filter text!
Philter on Microsoft Azure is a virtual machine-based product. A free trial period is available during which there is no charge for the Philter software but there may be charges for the underlying Azure infrastructure.
  2. Click the “Get It Now” button.
  3. Review the information that is shown on the popup and click “Continue” when ready.
  4. You will now be asked to log in to your Microsoft Azure account if you were not already logged in.
  5. Click the “Create” button to begin making a Philter virtual machine.
  6. Enter the required details of the virtual machine and click the “Review + create” button.
  7. Review the virtual machine details and click “Create” when ready!
Your Philter virtual machine will now launch.
Microsoft Azure will automatically open ports 22 (SSH) and 8080 (Philter API). These ports must be open, but you may want to modify the network security group to restrict access to specific CIDR ranges.
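With the Azure CLI, restricting access might look like the sketch below. The function name and all of the resource names in the usage comment are hypothetical; use the names from your own deployment.

```shell
# Sketch: limit an existing inbound rule to a single CIDR range.
# Assumes the Azure CLI is installed and logged in. All names passed in
# are placeholders for the values in your own deployment.
limit_nsg_rule() {
  local resource_group="$1"
  local nsg_name="$2"
  local rule_name="$3"
  local cidr="$4"          # e.g. 203.0.113.0/24 (placeholder)

  az network nsg rule update \
    --resource-group "$resource_group" \
    --nsg-name "$nsg_name" \
    --name "$rule_name" \
    --source-address-prefixes "$cidr"
}

# Usage (hypothetical names):
# az network nsg rule list --resource-group philter-rg --nsg-name philter-nsg --output table
# limit_nsg_rule philter-rg philter-nsg allow-8080 203.0.113.0/24
```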
Congratulations! You have deployed Philter in Azure. You are now ready to filter text!
Philter on Google Cloud is a virtual machine-based product. A free trial period is available during which there is no charge for the Philter software but there may be charges for the underlying Google Cloud infrastructure.
Cloud virtual machines launched from a cloud marketplace may not be immediately suitable for a HIPAA environment. Refer to your compliance officer for your organization’s requirements to ensure compliance with all relevant regulations.

Step 2: Try it out!

With Philter now running, we can take it for a spin. We will send some text to Philter and inspect the response we get back. The Philter virtual machine running in your cloud account should have a public IP address (unless you customized the deployment). We will use that public IP address to interact with Philter.
Philter, by default, will be configured with an HTTPS listener on port 8080 using a self-signed certificate. It is recommended that prior to use in a production environment the self-signed certificate is replaced by a valid certificate owned by your organization.
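To see which certificate the listener is currently presenting, for example to confirm that a replacement certificate has taken effect, a sketch using the openssl command-line tool follows. The function name is an illustrative choice, and the sketch assumes the default port of 8080.

```shell
# Sketch: print a summary of the certificate served on port 8080.
# A self-signed certificate shows the same value for subject and issuer.
show_certificate() {
  local host="$1"   # public IP or host name of the Philter VM

  # s_client fetches the server certificate; x509 summarizes it.
  openssl s_client -connect "${host}:8080" </dev/null 2>/dev/null |
    openssl x509 -noout -subject -issuer -enddate
}

# Usage (placeholder address):
# show_certificate 203.0.113.10
```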
In the command below, replace <PUBLIC_IP> with the virtual machine’s public IP address or the host name or IP address of the Docker host.
curl -k -X POST https://<PUBLIC_IP>:8080/api/filter --data "George Washington was a patient and his SSN is 123-45-6789." -H "Content-type: text/plain"
This command sends the quoted text to Philter for filtering. Philter will identify the patient name (George Washington) and the SSN (123-45-6789) and redact those values in the response. You can always use curl to send text to Philter as in these examples, but there are also SDKs you can use to integrate Philter with your applications.
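For use in scripts, the same request can be wrapped in a small function that reports failures. This is only a sketch: the function name is illustrative, and it relies on curl's --fail option to return a non-zero exit code when the server responds with an HTTP error.

```shell
# Sketch: send text to Philter and return non-zero if the request fails.
philter_filter() {
  local host="$1"   # public IP or host name of the Philter VM
  local text="$2"   # text to filter

  # -k accepts the default self-signed certificate; remove it once a valid
  # certificate is installed. --data-raw sends the text as-is, so input
  # beginning with "@" is not treated as a file name.
  curl -k -sS --fail -X POST "https://${host}:8080/api/filter" \
    -H "Content-Type: text/plain" \
    --data-raw "$text"
}

# Usage (placeholder address):
# philter_filter 203.0.113.10 "George Washington was a patient." || echo "request failed" >&2
```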

Identifying and Removing Sensitive Information from Text

The types of sensitive information that Philter identifies and removes are controlled by filter profiles. By default, Philter includes a filter profile that covers many types of sensitive information, such as names and social security numbers. We can send text to Philter for filtering with this default filter profile using the following command:
curl -k -X POST https://localhost:8080/api/filter --data-binary @file.txt -H "Content-Type: text/plain"
This command sends the contents of file.txt to Philter. (The --data-binary option is used because curl's -d option strips line breaks when it reads from a file.) Philter will apply the enabled filters and return a plain-text response consisting of the filtered text. (Replace localhost with the IP address or host name of Philter if you are not running the command on the machine where Philter is running.) You can also send text directly in the request instead of sending it as a file:
curl -k -X POST https://localhost:8080/api/filter --data "Your text goes here..." -H "Content-type: text/plain"
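To filter many files at once, the file-based request can be run in a loop. The sketch below is one possible approach: the function name and directory layout are hypothetical, and it processes only files ending in .txt.

```shell
# Sketch: filter every .txt file in a directory and write each result
# to a file of the same name in an output directory.
filter_directory() {
  local host="$1" in_dir="$2" out_dir="$3"
  local file

  mkdir -p "$out_dir" || return 1
  for file in "$in_dir"/*.txt; do
    [ -e "$file" ] || continue   # skip if no files matched
    # --data-binary preserves line breaks; plain -d would strip them.
    curl -k -sS --fail -X POST "https://${host}:8080/api/filter" \
      -H "Content-Type: text/plain" \
      --data-binary "@${file}" \
      -o "${out_dir}/$(basename "$file")" || return 1
  done
}

# Usage (hypothetical paths):
# filter_directory localhost ./notes ./notes-filtered
```

The loop stops at the first failed request so that a partial run is easy to notice.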
Depending on the version of Philter and the cloud provider, Philter images available from the cloud marketplaces are built on Amazon Linux 2, CentOS 7, or CentOS 8.

Next Steps

Philter is ready to go!
Now that you have Philter running and know how to send text to it, you are ready to integrate Philter into your existing workflow and systems. Philter’s API details how to send files to Philter. Clients for Philter’s API are available on GitHub for some languages.
Be sure to check out Filter Profiles to see how you can customize the types of sensitive information Philter finds!

Example Uses

Here are a few examples showing how to use Philter with some common big-data and streaming applications.
| Description | Technologies | Link |
| --- | --- | --- |
| Remove sensitive information from text in an Apache NiFi dataflow. | Apache NiFi | Blog Post |
| Remove sensitive information from text using AWS Lambda in an Amazon Kinesis Firehose pipeline. | Amazon Kinesis, AWS Lambda | Blog Post |
| Remove sensitive information from text using a custom Apache Flink MapFunction. | Apache Flink | Blog Post |
| Remove sensitive information from text using a custom Apache Pulsar Function. | Apache Pulsar | Blog Post |