Searching Confidence Scores

When data is classified for PCI, PHI, or PII, a Classification Score is applied to the data. The Classification Score is visible in the Results View. In the case of the first result shown below, the model is 80% sure that this item contains PCI (see Viewing Classified Data).

You can perform a Message, Attachment, Document, Federated, or Advanced Search to look for results. Navigate to the Search menu, make your Search selection, and then open the Tags tab. There, set the Confidence Score range to match the range of Classification Scores you want to search. The search then returns only items whose applied Classification Score falls within that range.
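To illustrate what a Confidence Score range does, here is a minimal Python sketch of a range filter. It is not the product's implementation: the Result class, its field names, and the in_confidence_range function are hypothetical, scores are written as fractions (0.80 for 80%), and treating both ends of the range as inclusive is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Result:
    # Hypothetical record for illustration; not the product's data model.
    item_id: str
    label: str    # e.g. "PCI", "PHI", or "PII"
    score: float  # Classification Score as a fraction, 0.0 to 1.0


def in_confidence_range(results, low, high):
    """Keep results whose score lies in [low, high] (inclusive bounds assumed)."""
    return [r for r in results if low <= r.score <= high]


results = [
    Result("msg-1", "PCI", 0.80),
    Result("msg-2", "PII", 0.55),
    Result("doc-3", "PHI", 0.92),
]

# A range of 75% to 100% keeps msg-1 and doc-3 and drops msg-2.
selected = in_confidence_range(results, 0.75, 1.0)
```

Narrowing the range from the bottom removes low-scoring items first, which is why the lower bound is the setting to tune.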

Until you know what results turn up, leave the Confidence Score range at its default setting, which casts the widest net. Then review the results and note how many false positives turn up.

Fine-tune the Confidence Score range to find what you are looking for. Remember that a range that starts too low returns more false positives, while a range that starts too high risks missing results.

Suggested method for fine-tuning the range

  • Bring the bottom slider to 80%, rerun classification, and review 2 results.
  • Bring the bottom slider to 75%, rerun classification, and review 2 results.
  • Keep lowering the bottom slider in increments of 5% until the results are mostly not what you are looking for.
  • Return the bottom slider to the last 5% increment where most of the results are classified correctly.

This method will give you a good indication of the Confidence Score range you should use.
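The fine-tuning steps above can be sketched in Python as a simple loop. This is only an illustration under stated assumptions: the function name tune_lower_bound and the reviewed list are hypothetical, each entry pairs a score (in percent) with your own manual judgment of whether the item was classified correctly, and "mostly correct" is taken to mean more than half of the results in the current range.

```python
def tune_lower_bound(reviewed, start=80, step=5, floor=0):
    """Step the lower bound down from `start` in `step`-point increments.

    reviewed: list of (score_percent, is_correct) pairs from manual review.
    Returns the last lower bound at which more than half of the results
    scoring at or above it were correct (an assumed reading of "mostly
    correct"), or None if no bound qualified.
    """
    last_good = None
    bound = start
    while bound >= floor:
        in_range = [ok for score, ok in reviewed if score >= bound]
        if in_range:
            if sum(in_range) * 2 <= len(in_range):
                break  # results are now mostly not what we are looking for
            last_good = bound
        bound -= step
    return last_good


# Hypothetical review outcomes: (score in percent, classified correctly?)
reviewed = [
    (95, True), (88, True), (82, True), (78, True), (76, False),
    (72, False), (71, False), (68, False), (66, False), (63, False),
]

best = tune_lower_bound(reviewed)  # 70 for this sample data
```

With this sample, the results are still mostly correct at 80%, 75%, and 70%, but not at 65%, so the sketch returns to 70%, the last increment that passed.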

Once you have decided on a new Confidence Score range, return to the Search and make the adjustment.

For full details on searching, see About Searching.