Dark data: everything you need to know

Nadeem Khan

In the age of technology-driven consumption, gathering insight from data is a competitive advantage that few businesses have yet got to grips with. The word ‘data’ comes from the Latin noun datum, meaning ‘something given’, and is used in English today to denote ‘facts or pieces of information’. When we send messages, fill out an online survey, or even click on a product online, our devices share data with the host. Some of this data is stored in very small text files known as ‘cookies’. These cookies are generally safe; nonetheless, if analysed properly, they can help us understand our potential customers.

Although data and information are often used interchangeably, data becomes information only when it is viewed in context or analysed to generate insight. Today, with the rise of the Internet of Things (IoT), almost any device we use can have chips embedded in it that collect such data. From smartphones to smart watches, Gartner anticipates explosive growth to 20.8 billion connected devices by 2020, each one relaying data over the internet.

Deloitte predicts that by 2020 the data gathered by such devices in the digital universe will reach 44 zettabytes; one zettabyte is equal to one billion terabytes. Up to 90 percent of the data gathered from the IoT and other non-traditional sources will be unstructured, and is often termed dark data. In the broadest sense, analysing much of this untapped data is now possible, and doing so can help businesses predict their customers’ buying patterns and behaviour.
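To put those figures in perspective, the unit arithmetic can be checked in a few lines of Python. This is only a rough sketch: it assumes decimal (SI) units, and the function and variable names are ours, chosen for illustration.

```python
# Decimal (SI) storage units expressed in bytes. This is an assumption:
# binary units (2**40, 2**70) would give slightly different figures.
TERABYTE = 10**12
ZETTABYTE = 10**21


def zettabytes_to_terabytes(zettabytes):
    """Convert a whole number of zettabytes to terabytes."""
    return zettabytes * ZETTABYTE // TERABYTE


# One zettabyte expressed in terabytes (one billion).
terabytes_per_zettabyte = zettabytes_to_terabytes(1)

# The 44 zettabyte forecast expressed in terabytes.
forecast_terabytes = zettabytes_to_terabytes(44)

# Upper bound on unstructured data if "up to 90 percent" applied to
# the full 44 zettabytes (a simplification for illustration).
unstructured_zettabytes = 44 * 90 / 100

print(f"{terabytes_per_zettabyte:,} TB per ZB")
print(f"{forecast_terabytes:,} TB forecast")
print(f"up to {unstructured_zettabytes} ZB unstructured")
```

In other words, the forecast amounts to 44 billion terabytes, of which up to roughly 39.6 zettabytes could be unstructured under this simplified reading.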

What is dark data?

Even with such advancements, most businesses collect data without a clear purpose and are overwhelmed by it. This brings us to the notion of dark data. The term borrows from the concept of dark matter in physics: scientists consider dark matter to be everywhere around us, yet are still working to understand it. Correspondingly, dark data makes up a large part of an organisation’s universe of information, and consists of both structured and unstructured data.

Organisations often retain dark data for compliance purposes, after which it sits idle in storage archives. Storing and securing such data typically incurs more expense, and sometimes greater risk, than value. Gartner defines dark data as the information assets organisations collect, process and store during regular business activities, but generally fail to use for other purposes such as business analytics, nurturing business relationships and direct monetising.

Dark analytics, therefore, focuses primarily on raw, text-based data that has not been analysed, with an emphasis on unstructured data, which may include text messages, documents, emails, video and audio files, and images. For businesses, this dark data is either untapped, hidden or undigested, often because no potential value is expected from it. However, the International Data Corporation (IDC) predicts that by 2020 businesses that analyse all relevant data and deliver actionable results from it will achieve an extra $430 billion in productivity gains over their less analytically oriented peers.

Let there be light – exploring dark data to unlock business value

When we talk about unstructured data that is readily available within organisations but sits idle, we may think of emails, notes, messages, documents, logs and notifications. These data sources remain largely untapped because, until relatively recently, we did not have the tools and techniques to use them efficiently. However, we now have software that uses algorithms to analyse text-based data and draw out insights such as customer behaviour and information about competitors.
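As a very small illustration of what analysing text-based data can mean, the sketch below counts the most frequent meaningful words in a handful of customer messages. The messages, the stop-word list and the function name are all invented for this example; real text analytics software uses far richer techniques.

```python
import re
from collections import Counter

# A tiny, hand-picked stop-word list (an assumption for this sketch;
# real tools ship with much longer lists).
STOPWORDS = {"the", "was", "and", "but", "can", "after", "again", "nobody"}


def top_terms(messages, n=3):
    """Return the n most common words across messages, ignoring stop words."""
    counts = Counter()
    for message in messages:
        words = re.findall(r"[a-z']+", message.lower())
        # Skip stop words and very short words such as "my" or "i".
        counts.update(w for w in words if w not in STOPWORDS and len(w) > 2)
    return counts.most_common(n)


# Hypothetical customer emails sitting unused in a shared inbox.
messages = [
    "The delivery was late again and nobody answered my email.",
    "Late delivery, but the refund was quick.",
    "Can I change my delivery address after ordering?",
]

print(top_terms(messages, 2))
```

Even this crude count suggests that delivery, and late delivery in particular, is a recurring theme worth investigating.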

Among other unstructured data sources, organisations also hold audio and video files and still images that can be used for similar purposes. For example, a fast-food restaurant may be able to gain a deeper understanding of when customer footfall peaks during the day and adjust its operations accordingly.

This can be accomplished by analysing both video footage of buyers and their purchase order summaries. Similarly, people who visit your website bring with them geolocation and profile data that is stored safely. Today, software companies like Hotjar enable businesses to uncover this untapped data and gain insights into how users interact with their websites. There is, however, a caveat: using such data requires businesses to comply with GDPR.
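To make the fast-food example a little more concrete, here is a minimal sketch of finding the busiest hour from purchase order timestamps alone. The timestamps and the function name are hypothetical, and a real analysis would cover many days and combine several data sources.

```python
from collections import Counter
from datetime import datetime


def busiest_hour(order_timestamps):
    """Return (hour, number_of_orders) for the hour with the most orders."""
    hours = Counter(datetime.fromisoformat(ts).hour for ts in order_timestamps)
    return hours.most_common(1)[0]


# Hypothetical order times taken from one day's purchase summaries.
orders = [
    "2024-03-01T12:05:00",
    "2024-03-01T12:40:00",
    "2024-03-01T12:55:00",
    "2024-03-01T13:10:00",
    "2024-03-01T18:30:00",
]

hour, count = busiest_hour(orders)
print(f"Busiest hour: {hour}:00 with {count} orders")
```

With this sample data the lunchtime hour stands out, which is the kind of signal a restaurant could use when planning staffing.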

Start small – key questions to explore with dark data

The purpose of this article is not to overwhelm, but to inform small businesses about the ways they can use the power of data to gather insights that would be impossible without it. To be clear, using dark data does not mean gathering the entire volume of data available within a business and trying to analyse it. Doing so without purpose will lead to nothing but failure. What businesses need is a specific purpose in mind before deciding what data to look for and where to look. Like every analytics journey, a successful effort requires asking the right questions, such as:

  • What problems are we trying to solve?
  • What are the key data sources that we can tap into?
  • What can we do differently by looking at/analysing this data?

For small businesses, answering these key questions will make it possible for them to not only gather insights that are relevant and valuable but also improve performance along the way. You can learn more about data and GDPR here.

Nadeem Khan

Nadeem Khan is the Managing Director at Optimizhr – a human capital consultancy that offers services around People Analytics, Talent Strategy and Leadership Development. A fellow of the CIPD, Nadeem specialises in People Analytics and its business impact. While at Lancaster University Management School, he interviewed top C-level executives in FTSE 100 companies and management consultancies in the UK to explore his research question ‘Why HR Cannot Embrace HR Analytics’. Nadeem is also the official trainer for the North & Western Lancashire Chamber of Commerce, where he works with SMEs to improve their performance, and is an official contributor for Tucana – Europe’s leading community of People Analytics thought leaders.