Streaming Discovery De-Duplication Detailed Report

The De-Duplication Detailed report is available for Streaming Discovery and Streaming Imaging jobs.

This report contains detailed de-duplication information, similar to the reports for Data Extract and Processing jobs, and identifies the original file of which each document is a duplicate.

The De-duplication setting (Custodian, Project, or None) in the Streaming Discovery Options dialog determines the report output.
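To illustrate how a de-duplication scope can change which items count as duplicates, the following Python sketch applies a scoped comparison key to a few made-up records. It shows only the general idea: the functions `dedup_key` and `mark_duplicates`, the record layout, and the sample values are hypothetical, and the application's actual logic may differ.

```python
def dedup_key(record, scope):
    """Return the key under which a record is compared for duplicates.

    Assumption: Custodian scope compares hashes within one custodian,
    while Project scope compares hashes across the whole project.
    """
    if scope == "Custodian":
        return (record["custodian"], record["hash"])
    if scope == "Project":
        return (record["hash"],)
    raise ValueError(f"unsupported scope: {scope}")


def mark_duplicates(records, scope):
    """Label each record 'Original' or 'Duplicate' in input order."""
    if scope == "None":
        # De-duplication disabled: every item is treated as an original.
        return ["Original"] * len(records)
    seen = set()
    labels = []
    for record in records:
        key = dedup_key(record, scope)
        labels.append("Duplicate" if key in seen else "Original")
        seen.add(key)
    return labels


# Hypothetical sample data: the same hash appears under two custodians.
records = [
    {"custodian": "Smith", "hash": "aaa"},
    {"custodian": "Smith", "hash": "aaa"},
    {"custodian": "Jones", "hash": "aaa"},
]
print(mark_duplicates(records, "Custodian"))  # ['Original', 'Duplicate', 'Original']
print(mark_duplicates(records, "Project"))    # ['Original', 'Duplicate', 'Duplicate']
print(mark_duplicates(records, "None"))       # ['Original', 'Original', 'Original']
```

In this sketch, the third record is a duplicate only at the Project scope, because its custodian differs from that of the first two records.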

The report, in the form of a .CSV file, contains the following information:

  • Relationship (Original and its Duplicates)

  • ItemID

  • DiscoveryPath (location of the file at the time of discovery)

  • ItemFileName (file name of the original document or of its duplicate copy)

  • FileSizeKB

  • FamilyHash (for Streaming, this hash supports the additional layer of de-duplication that identifies duplicate families)

  • ItemHashValue (the equivalent of the MD5 hash value; Streaming, however, uses SHA-1 for de-duplication)

  • ProjectName

  • ProjectDescription

  • CustodianName

  • CustodianDescription

  • JobName

  • OriginalJobName
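When the report is processed by a script, it can be useful to confirm first that the expected fields are present. The following Python sketch checks a header row against the field list above. It assumes that the .CSV file begins with a header row whose column names match the field names exactly; the helper `missing_columns` and the sample text are hypothetical.

```python
import csv
import io

# Field names taken from the list above.
EXPECTED_COLUMNS = [
    "Relationship", "ItemID", "DiscoveryPath", "ItemFileName", "FileSizeKB",
    "FamilyHash", "ItemHashValue", "ProjectName", "ProjectDescription",
    "CustodianName", "CustodianDescription", "JobName", "OriginalJobName",
]


def missing_columns(csv_text):
    """Return the expected column names that are absent from the header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # empty list if the text has no rows
    present = {name.strip() for name in header}
    return [name for name in EXPECTED_COLUMNS if name not in present]


# Hypothetical sample: a header row only, with every expected column.
sample = ",".join(EXPECTED_COLUMNS) + "\n"
print(missing_columns(sample))  # []
```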

To access the report for a Streaming Discovery job or a Streaming Imaging job:

  1. Right-click the desired job in the Client Management TreeView to display the context menu.

  2. Choose Reporting > Deduplication Detailed. Windows Explorer opens with a default .CSV file name that contains the job type and job name.

  3. Accept or change the directory location.

  4. Click Save. The report is generated, and Windows Explorer displays the saved .CSV file.

    Note: If there is no data, a prompt appears stating: Report contains no data. Click OK to close the prompt. No report is generated.

  5. Open the .CSV file to view the report data.
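Besides viewing the saved file in a spreadsheet application, the report data can be read with a short script. The following Python sketch groups report rows by ItemHashValue so that each original is listed with its duplicates. It assumes that the file has a header row with the field names listed earlier and that an original and its duplicates share the same ItemHashValue; the function `group_by_hash` and the sample rows (including the Relationship values) are hypothetical.

```python
import csv
import io
from collections import defaultdict


def group_by_hash(csv_file):
    """Group (Relationship, ItemFileName) pairs by ItemHashValue."""
    groups = defaultdict(list)
    for row in csv.DictReader(csv_file):
        groups[row["ItemHashValue"]].append(
            (row["Relationship"], row["ItemFileName"])
        )
    return dict(groups)


# Hypothetical sample rows; a real report contains more columns.
sample = io.StringIO(
    "Relationship,ItemID,ItemFileName,ItemHashValue\n"
    "Original,1,memo.docx,abc123\n"
    "Duplicate,2,memo copy.docx,abc123\n"
    "Original,3,budget.xlsx,def456\n"
)
for hash_value, items in group_by_hash(sample).items():
    print(hash_value, items)
```

To run the sketch against a saved report, pass a file object opened with `open(path, newline="")` in place of the sample. The file encoding is not stated here, so it may need to be specified explicitly.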


