Databergs Explained

Before beginning a data audit, you need to know about dark data. Simply put, this is the operational data in an organization that is not being used. Not surprisingly, every organization has a databerg: like an iceberg, most of its stored data sits out of sight, and much of that is dark data.

Consulting and market research company Gartner Inc. describes dark data as "information assets that organizations collect, process and store in the course of their regular business activity, but generally fail to use for other purposes."

With the growing accumulation of structured, unstructured, and semi-structured data in organizations—increasingly through the adoption of big data applications—dark data can be seen as the operational data that is left unanalyzed or underused, and often lost altogether.

Why is dark data important? It is an opportunity for organizations that can use it to drive new revenue or reduce internal costs. Here is how the data in a databerg breaks down:

15% Business Critical Data: This data is strategically important to an organization's daily operations and success, and it is usually proactively managed. Examples include product roadmaps, business plans, and customer lists.

33% Redundant, Obsolete, Trivial (ROT) Data: This data has little or no business value and should be eliminated regularly to avoid unnecessary storage costs.

52% Dark Data: This data is hidden and unstructured, and it is expensive to secure and store. Most companies keep it anyway because of compliance regulations, following the credo "store everything just in case." Examples of data often left dark include server log files, which can give clues to website visitor behavior, and customer call detail records, which can indicate consumer sentiment.

There are risks with dark data:

  • Regulatory: Leaking or losing sensitive, dormant data and PII.
  • Intellectual Property (IP): Failing to protect IP.
  • Opportunity: Missing out on chances to drive new revenue or reduce costs.

The costs of dark data include loading, updating, storing, and managing unused data, all of which consume personnel time and storage space. This time and infrastructure could be better spent on higher-value work.
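
To make the cost point concrete, the percentage breakdown above can be applied to a storage bill. The following is a minimal sketch, not a costing method from the source: the 1,000 TB total and the $20 per TB per month rate are hypothetical placeholders, and the function name `storage_cost_by_category` is made up for illustration. Only the three percentage shares come from the breakdown in this article.

```python
# Shares of stored data by category, taken from the breakdown in this article.
DATABERG_SHARES = {
    "business_critical": 0.15,
    "rot": 0.33,
    "dark": 0.52,
}


def storage_cost_by_category(total_tb, cost_per_tb_month):
    """Split a monthly storage bill across the three databerg categories."""
    return {
        category: round(total_tb * share * cost_per_tb_month, 2)
        for category, share in DATABERG_SHARES.items()
    }


# Hypothetical inputs: 1,000 TB stored at $20 per TB per month.
costs = storage_cost_by_category(total_tb=1000, cost_per_tb_month=20.0)
for category, cost in costs.items():
    print(f"{category}: ${cost:,.2f} per month")
```

Whatever inputs are used, the shares imply that ROT and dark data together account for 85% of the storage bill, which is why an audit that identifies them can free up both budget and staff time.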