This glossary contains terms and acronyms related to eDiscovery.


alert

A compliance audit report that displays results that match a compliance audit. Alerts are typically sent by email.

attachment search

A search that allows you to apply search criteria to attachments found in archived messages.

audit manager

Case managers can perform the following:

  • Create, track, manage, edit, export, and save cases.

  • Perform all eDiscovery tasks, including placing custodians on Legal Hold.

  • Assign Full Rights to Case Members.

audit search

A search that allows you to search for archived messages based on the auditing actions performed on the data.


Boolean operator

Modifiers that are used to connect and define the relationship between your search terms. When performing a search, Boolean operators help to either narrow or broaden your search criteria. Three common Boolean operators are: AND, OR, and NOT.

Boolean search

A search that lets you combine keywords with operators (or modifiers) such as AND, NOT, and OR to produce more relevant results. For example, a Boolean search could be "hotel" AND "New York". This would limit the search results to only those documents containing both keywords. Boolean searching is based on an algebraic system of logic formulated by George Boole, a 19th-century English mathematician.
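The "hotel" AND "New York" example can be sketched in a few lines of Python. This is only an illustration of how AND, OR, and NOT combine; the `matches` helper and the sample documents are hypothetical, and IPRO Search's own query engine is not shown here.

```python
def matches(text, all_of=(), any_of=(), none_of=()):
    """Hypothetical helper: apply AND / OR / NOT to case-insensitive keywords."""
    lowered = text.lower()
    # AND: every keyword must be present.
    has_all = all(term.lower() in lowered for term in all_of)
    # OR: at least one keyword must be present (skipped when no OR terms are given).
    has_any = not any_of or any(term.lower() in lowered for term in any_of)
    # NOT: none of these keywords may be present.
    has_none = not any(term.lower() in lowered for term in none_of)
    return has_all and has_any and has_none


documents = [
    "Hotel booking confirmed for New York",
    "Hotel booking confirmed for Boston",
    "New York office lease",
]

# "hotel" AND "New York" keeps only documents containing both keywords.
and_hits = [d for d in documents if matches(d, all_of=("hotel", "New York"))]

# "hotel" NOT "Boston" drops documents that mention Boston.
not_hits = [d for d in documents if matches(d, all_of=("hotel",), none_of=("Boston",))]
```

AND narrows the result set, while adding OR terms broadens it.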

case manager

Case managers can perform the following:

  • Create, track, manage, edit, export, and save cases.

  • Perform all eDiscovery tasks, including placing custodians on Legal Hold.

  • Assign Full Rights to Case Members.

confidence threshold

The percentage cut-off that you set for the classification. The higher you set the threshold, the more accurate the results. However, a higher threshold also means that fewer items will be labeled by the classification.
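The trade-off can be illustrated with a small Python sketch. The item names, scores, and the `label_items` function are made up for illustration and do not reflect how IPRO Search computes classification confidence.

```python
def label_items(scores, threshold):
    """Return the items whose confidence score meets the threshold (hypothetical helper)."""
    return sorted(item for item, score in scores.items() if score >= threshold)


# Hypothetical classifier confidence scores, expressed as fractions of 1.
scores = {"msg-1": 0.95, "msg-2": 0.72, "msg-3": 0.40}

low_threshold_labels = label_items(scores, 0.50)   # more items labeled
high_threshold_labels = label_items(scores, 0.90)  # fewer items labeled
```

With the lower cut-off two items are labeled; with the higher cut-off only the most confident one is.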

dark data

Data that is hidden and unstructured, and expensive to secure and store. Most companies keep it anyway because of compliance regulations, following the credo 'store everything just in case'. Examples of data often left dark include server log files, which can give clues to website visitor behavior, and customer call detail records, which can indicate consumer sentiment.

data controller

An entity, person, or group that determines the purposes and means of processing personal data.

data processing

Any operation performed on personal data, from collection to destruction, including recording, organization, structuring, storage, adaptation or alteration, retrieval, consultation, use, disclosure by transmission, dissemination or otherwise making available, alignment or combination, and restriction.

data processor

An entity, person, or group that processes data on behalf of the data controller.

data subject

A person whose personal data is being collected, held, or processed.


DLP

Data Loss Prevention

A business strategy for ensuring users do not send sensitive or critical information outside the corporate network. Information is classified and protected so unauthorized users cannot accidentally or maliciously share data whose disclosure could put the organization at risk.


A type of semi-structured content contained in, for instance, an email or an attachment.

end user

A person who can only access their own mail archive in IPRO Search.


ESI

Electronically Stored Information

Data that is created, altered, communicated and stored in digital form, including writings, graphs, charts, photographs, sound recordings, images, and other data compilations.

federated search

A federated search lets you simultaneously search for data across multiple searchable repositories. You can easily make a single query on both mailbox items and archived items, and find data quickly, navigating from thousands of search results to a handful.

Federated search supports multiple formats, and search results are returned in their original formats, such as Word docs, PDFs, Excel spreadsheets, and items from file shares, in a consolidated view that identifies the knowledge type. Approximately 600 file types are supported.


A container for storing content and/or data.


FINRA

Financial Industry Regulatory Authority

An organization dedicated to investor protection and market integrity, which it pursues by establishing rules and enforcing compliance with them.

FINRA 3110

A supervision rule that states each FINRA member must establish and maintain a system to supervise the activities of each associated person. The system must be reasonably designed to achieve compliance with securities laws and regulations, and with applicable FINRA rules.

FINRA 4511

A general requirements rule that states each FINRA member must make books and records and preserve them for a period of at least six years in an appropriate format and media.


GDPR

General Data Protection Regulation

A regulation by which the European Parliament, the Council of the European Union and the European Commission intend to strengthen and unify data protection for all individuals within the European Union (EU).


group

A part of query building. Groups are used to contain rules. For each group added, you can place as many rules inside as needed.


HIPAA

Health Insurance Portability and Accountability Act

Industry-wide standards for health care information on electronic billing and other processes, and the protection and confidential handling of protected health information.


IP

Intellectual Property

Creations of the mind, including inventions, literary and artistic works, and symbols, names, and images used in commerce.


keyword

A word or phrase, typically a phrase of two or three words, that has been identified as one that potential users would use when searching archives.

keyword search

A search that looks for specific words (keywords) in an archive. Keyword searches are helpful when you have incomplete information.

legal hold

A process that an organization uses to preserve all forms of relevant information when litigation is reasonably anticipated. The legal hold is initiated by a notice or communication from legal counsel to an organization that suspends the normal disposition or processing of records, such as archived media and other storage and management of email, electronic documents and information. A legal hold will be issued as a result of current or anticipated litigation, audit, government investigation or other such matter to avoid evidence spoliation.


member

Members have a subset of the rights of a case manager; they can view, search, comment, tag, save, and export cases–everything internal reviewers or external counsel may need to perform eDiscovery.

Members cannot open new cases or close existing cases, or place cases on Legal Hold.

Natural Language Processing (NLP)

A field of computer science that uses artificial intelligence and machine learning to process natural language data and situate search results within context. NLP is comprised of several syntax algorithms, including lemmatization, stemming, and other aspects of morphology – the study of how words are formed and their relationship to other words in the target language.

nested query

A search query that contains different rules and operators nested inside it, which together make up a complex query.


NLP

See Natural Language Processing.


PCI

Payment card information is information related to transactions involving credit cards, prepaid cards, point-of-sale cards, e-purse, bank debit, and ATM cards. Sensitive information may include credit card account numbers, PIN and CVD (card verification data) numbers, and expiration dates, as well as information stored on magnetic stripes and chips (data and RFID).

personal data

Any information that relates to an identified or identifiable natural person such as name, address, localization, online identifier, health information, income, cultural profile, and more.


PHI

Protected health information, also referred to as personal health information, refers to the past, present, or future physical or mental health condition of an individual, including their medical histories, test and laboratory results, insurance and payment information, and other data that a healthcare professional collects to identify an individual and determine appropriate care.


phishing

An attempt to acquire sensitive information, such as usernames, passwords, and credit card details, often for malicious reasons, by masquerading as a trustworthy entity in an electronic communication.


PII

Personally identifiable information is any information about an individual, including (1) any information that can be used to distinguish or trace an individual's identity, such as their name, social security number, date and place of birth, mother's maiden name, or biometric records; and (2) other information that is linked to an individual, such as medical, educational, financial, and employment records.


PIPEDA

Personal Information Protection and Electronic Documents Act

The Canadian federal privacy law for private-sector organizations. It sets out the ground rules for how businesses must handle personal information in the course of commercial activity.

random sampling

Random sampling of the data in an organization is a FINRA requirement. The compliance officer must send, on a daily basis, a percentage of sample data for compliance verification. This same process may repeat on a monthly basis to include all the data from the previous month.
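As a rough illustration only, drawing a percentage-based sample could look like the Python sketch below. The `draw_sample` function, the 10 percent figure, and the message IDs are assumptions made for this example; the actual sampling percentage and process depend on your compliance policy and on how IPRO Search is configured.

```python
import math
import random


def draw_sample(message_ids, percent, seed=None):
    """Pick a random percentage of messages for compliance review (hypothetical helper)."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible for audits
    size = math.ceil(len(message_ids) * percent / 100)
    return rng.sample(message_ids, size)


# Hypothetical message IDs standing in for one day's archived messages.
todays_messages = [f"msg-{i}" for i in range(50)]
daily_sample = draw_sample(todays_messages, percent=10, seed=1)
```

Rounding up with `math.ceil` ensures that a small non-empty batch still yields at least one item to review.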


record

A collection of content and/or data, providing intelligence or evidence pertaining to the information preserved. See Records Management for more information.

Records Management

Encompasses Electronic Records Management and Traditional Records Management, their overlap, and their divergence. Records Management concerns the organizational strategy of the lifecycle of records; their conception, usage, archival, and disposition.


reviewer

A member of the eDiscovery or data audit team who is assigned to work on the search results of cases or audits assigned to them by the case manager, audit manager, member, or auditor. They can view, comment, tag, and save cases or audits.


role

The role that a person plays in an eDiscovery case or data audit determines their rights in IPRO Search. That is, they can access certain features in IPRO Search and not others. This alleviates system administration and security concerns around unauthorized data access. Roles include:

eDiscovery: case manager, member, and reviewer.

Data Audit: audit manager, auditor, and reviewer.

There is also an end user role. End users can only access their own mail archive.

root stemming

In linguistic morphology and information retrieval, root stemming is the process of reducing inflected (or sometimes derived) words to their word stem, base or root form.

proximity search

A way to search for two or more words that occur within a certain number of words from each other. The proximity operators are composed of a letter (N or W) and a number (to specify the number of words).
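The idea can be sketched in Python as follows. This is an illustration under stated assumptions, not IPRO Search's implementation: the `within` helper is hypothetical, distance is measured as the difference between word positions, and the `ordered` flag stands in for the common convention that one operator ignores word order while the other requires it. Check the product documentation for the exact meaning of N and W.

```python
import re


def within(text, first, second, max_distance, ordered=False):
    """Return True if the two words occur within max_distance word positions."""
    words = re.findall(r"\w+", text.lower())
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [i for i, w in enumerate(words) if w == second.lower()]
    for i in first_positions:
        for j in second_positions:
            distance = j - i
            if ordered:
                # Order matters: `second` must follow `first`.
                if 0 < distance <= max_distance:
                    return True
            elif 0 < abs(distance) <= max_distance:
                # Order ignored: either word may come first.
                return True
    return False


text = "The merger agreement was signed after the audit"
near_any_order = within(text, "signed", "merger", 3)
near_in_order = within(text, "signed", "merger", 3, ordered=True)
```

Here "signed" and "merger" are three positions apart, so the unordered check succeeds, while the ordered check fails because "merger" comes before "signed".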

reserved characters

Characters that are reserved for use by IPRO Search and must be escaped in order to include them in a search. These include the following:

Single Characters: \ + - ! ( ) { } [ ] ^ \ ~ :

Double Characters: && and ||
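A minimal Python sketch of escaping these characters before building a query is shown below. It assumes that a reserved character is escaped by placing a backslash in front of it, as in many Lucene-style query syntaxes; the glossary does not state the escape character, so confirm this against the IPRO Search documentation. The `escape_query` function is hypothetical.

```python
# Single reserved characters taken from the list above.
RESERVED_SINGLE = set("\\+-!(){}[]^~:")
# Two-character reserved sequences.
RESERVED_DOUBLE = ("&&", "||")


def escape_query(term):
    """Prefix reserved characters with a backslash (assumed escape character)."""
    escaped = []
    i = 0
    while i < len(term):
        pair = term[i:i + 2]
        if pair in RESERVED_DOUBLE:
            escaped.append("\\" + pair)  # escape the pair as one unit
            i += 2
        elif term[i] in RESERVED_SINGLE:
            escaped.append("\\" + term[i])
            i += 1
        else:
            escaped.append(term[i])
            i += 1
    return "".join(escaped)


escaped_subject = escape_query("re: Q1 (draft)")
```

The colon and both parentheses in the sample subject line are escaped, and all other characters pass through unchanged.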


ROT

Redundant, Obsolete, Trivial

Information that is repetitive, outdated and not strategically important for an organization, and can be deleted.


rule

Rules define a search query. A rule is basically an "if-then" statement that determines how criteria are treated. A few basic parts make up a rule: the search parameter (called field), the operator, and the text or calendar value.

SEC Rule 17a-3

Books and records requirements for brokers and dealers under the Securities Exchange Act of 1934.

SEC Rule 17a-4

Records to be preserved by certain exchange members, brokers, and dealers.

stop word

A word that is very common and determined to be of little value in search results. These include: a, an, and, are, as, at, be, by, for, from, if, it, me, that, they, was, were, with, you, and so on. Also called noise words.
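The effect of stop words on a query can be sketched in Python. The stop word set below copies only the words listed above (the full list used by IPRO Search is not given here), and the `remove_stop_words` helper is hypothetical.

```python
# Stop words copied from the glossary entry; the real list is longer.
STOP_WORDS = {
    "a", "an", "and", "are", "as", "at", "be", "by", "for", "from",
    "if", "it", "me", "that", "they", "was", "were", "with", "you",
}


def remove_stop_words(query):
    """Drop stop words from a whitespace-separated query (hypothetical helper)."""
    return [word for word in query.lower().split() if word not in STOP_WORDS]


meaningful_terms = remove_stop_words("Were invoices sent by you")
```

Only "invoices" and "sent" remain, which are the terms that actually help distinguish one message from another.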


supervision

According to FINRA Rule 3110 (Supervision), organizations must establish and maintain a system to supervise all written business communications in order to achieve compliance with the applicable securities laws and regulations and FINRA rules.


tagging

When reviewing a case or audit on an item-by-item basis, tagging refers to designating items that are or are not relevant for further investigation.

Predefined tags in IPRO Search include:

eDiscovery: Relevant, Privileged, Flagged, and Work Product.

Data Audit: Marked as Reviewed, Marked as Unreviewed, Quarantined, and Delete.


template

A set of predefined rules for some of the most commonly sought types of information. Using a template, you can perform these kinds of searches: credit card, email, phone number, postal code, social security numbers (Canada, France, USA), and URLs.
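Predefined rules of this kind are often expressed as patterns. The Python sketch below shows two simplified regular expressions, one for email addresses and one for the common 3-2-4 digit layout of US social security numbers. These patterns and the `find_sensitive` helper are illustrative assumptions only; they are not the rules that ship with IPRO Search, and real templates typically add further validation.

```python
import re

# Simplified, hypothetical patterns for two of the information types listed above.
PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def find_sensitive(text):
    """Return every match for each pattern, keyed by information type."""
    return {name: pattern.findall(text) for name, pattern in PATTERNS.items()}


found = find_sensitive("Contact jane.doe@example.com, SSN 123-45-6789.")
```

The sample sentence yields one email address and one number in the social security format.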


tokenization

A basic feature of Natural Language Processing is that character sequences are segmented into pieces or tokens. During this process, which makes other more complex NLP functions possible, punctuation is removed.

"Tokens" are usually individual words (at least in languages like English), and "tokenization" is taking a text or set of texts and breaking it up into individual words.
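A very small Python sketch of word-level tokenization is shown below. It simply keeps runs of letters, digits, and underscores and discards punctuation; production NLP tokenizers handle many more cases (contractions, hyphenation, languages without spaces), and the `tokenize` helper is hypothetical.

```python
import re


def tokenize(text):
    """Split text into word tokens, dropping punctuation (simplified sketch)."""
    return re.findall(r"\w+", text)


tokens = tokenize("Please review the Q3 report, then reply.")
```

The comma and the final period are removed, leaving seven word tokens.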

wild card (*)

A symbol that uses root stemming (root words) to find all variants and partial endings to words. Each asterisk (*) represents a partial word and is treated as a placeholder for it.

Example: org*

Finds items that include: organ, organized, organization, and so on.

Example: c*c*

Finds items that include: calculation, check
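The two examples can be reproduced with a short Python sketch that treats each asterisk as "zero or more word characters". Note that this uses a regular expression rather than root stemming, so it only approximates the behavior described above; the function names and the word list are hypothetical.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a wildcard pattern into a regex; * becomes zero or more word characters."""
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile(r"\w*".join(literal_parts), re.IGNORECASE)


def find_matches(pattern, words):
    """Return the words that match the whole wildcard pattern."""
    regex = wildcard_to_regex(pattern)
    return [word for word in words if regex.fullmatch(word)]


vocabulary = ["organ", "organized", "organization", "order", "calculation", "check", "cat"]
org_matches = find_matches("org*", vocabulary)
cc_matches = find_matches("c*c*", vocabulary)
```

In this sketch, org* matches the three words that begin with "org", and c*c* matches "calculation" and "check" because each contains a second "c" after the first.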

word list

A lexicon of terms for a particular job or industry. In IPRO Search, you can build a list of terms and keep it updated as you encounter more terms during a search.