India-First-Global-Insights-Analysis -Sharing-Platform

BIG Information – Knowledge outside the ‘cells’ of spreadsheet

November 27, 2014

Among the most sought-after characters aboard the starship Enterprise of ‘Star Trek’ fame were the Science Officers Spock and Data. Technology in the 23rd century has evolved to a stage where it allows the human race to “Boldly go where no man has gone before…”. Such an evolved scientific state demands huge data-crunching capability and the ability to turn the results into actionable insights.

Even with highly powerful and smart computers on board, the captain of the ship relies on an asset like Spock or Data to reach the final decision. Spock is known for his logical thinking and quick decision-making, even when tough calls have to be made instantly. He rapidly prioritizes the options churned out by the data processors, listing the costs and benefits of each. This enables the captain to pick the most favourable option.

Stepping back a couple of centuries to the present day, human beings have acquired advanced data processing capabilities, but these are still a fraction of what is envisioned for the 23rd century. They also require an army of ‘data managers’ (programmers, database administrators, visualizers, data scientists) before the end user can generate actionable insights. The database administrator, who converts existing data into rows and columns, is the indispensable link in the chain that connects the data with the end user. The data residing in these tables is classified as structured data: mainly numbers, which lend themselves to calculation.

Rich content data

Files that contain written information are called unstructured data files, a misnomer, as we will see later in this article. One example is the Portable Document Format (PDF): PDF files hold vast amounts of the world’s professional and business data, yet they are commonly described as an unstructured format. PDF is an international standard, adopted worldwide as a common medium for electronic documents. A significant proportion of the world’s knowledge and information resides in these files.

The world is full of information trapped inside digital resources. According to IDC research, so-called ‘structured’ data makes up only 15% of all the data in existence. The rest is unstructured, mainly “free format” text files in different languages. Unstructured data is rich in content: it can capture, for example, the reasons behind an event, the experiences of industry experts, and anecdotal evidence. This goes beyond the information contained in structured data.

Despite such a vast amount of contextual knowledge residing outside structured data, businesses remain trapped in the cells formed by rows and columns, trying to extract information from these sources and often ignoring the other 85% of their information in the process. The entire organization, from business analysts to the CXOs, relies on structured data sets to make decisions. The databases are updated at regular intervals, and so is the information trickling out of them. Billions are spent on building business intelligence solutions that sit atop this data and churn out dashboards.

There is vast legacy content sitting in repositories around the world, whether in business systems, storage farms, or individual PCs. Most of this data simply gets ‘tagged’, ‘stored’ and at most ‘linked’. It mostly stays ‘in storage’, unused, while still consuming resources and adding costs.

Reading the digital patterns

As the Knowledge Economy develops, enterprises will need to process and share ever more data and information. This emerging global knowledge economy will create new enterprises, businesses and activities. One significant challenge, however, will be how to handle the enormous volume of individual files and how to let so many people reach the valuable content easily, cost-effectively, accurately and simultaneously.

Every digital file, including so-called ‘unstructured’ data, has an inherent structure and pattern of information; there are relationships within a file and between combinations of files. Enabling technology that harvests the necessary data files and successfully identifies the digital patterns within them, and the relationships among them, would help leverage the information they contain. This would also require an army of ‘subject matter experts’ in place of ‘data managers’.
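To make the idea of patterns within files and relationships between files concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not a description of any particular product: the file names, sample text, stop-word list and similarity threshold are all invented, and simple keyword overlap (Jaccard similarity) stands in for whatever pattern-recognition technology a real platform might use.

```python
import re
from itertools import combinations

# Hypothetical sample "files"; names and contents are invented for illustration.
DOCS = {
    "q3_review.txt": "Sales fell 12% in Q3 2014 after the supplier delay in Pune.",
    "expert_note.txt": "The supplier delay in Pune was caused by flooding in 2014.",
    "hr_memo.txt": "Annual leave policy is updated from January.",
}

# A tiny, assumed stop-word list; a real system would use a fuller one.
STOPWORDS = {"the", "in", "by", "was", "is", "after", "from", "a", "of"}


def extract_patterns(text):
    """Pull simple 'structured' facts (years, percentages) out of free text."""
    return {
        "years": re.findall(r"\b(?:19|20)\d{2}\b", text),
        "percentages": re.findall(r"\b\d+(?:\.\d+)?%", text),
    }


def keywords(text):
    """Return the set of lower-cased words, minus stop words and very short tokens."""
    words = re.findall(r"[a-z]+", text.lower())
    return {w for w in words if w not in STOPWORDS and len(w) > 2}


def related_files(docs, threshold=0.2):
    """Link pairs of files whose keyword sets overlap (Jaccard similarity)."""
    links = []
    for (name_a, text_a), (name_b, text_b) in combinations(docs.items(), 2):
        a, b = keywords(text_a), keywords(text_b)
        score = len(a & b) / len(a | b) if a | b else 0.0
        if score >= threshold:
            links.append((name_a, name_b, round(score, 2)))
    return links


if __name__ == "__main__":
    for name, text in DOCS.items():
        print(name, extract_patterns(text))
    print(related_files(DOCS))
```

In this toy example the sales review and the expert note are linked because both mention the supplier delay in Pune, while the HR memo stays unconnected. The point is that the reason behind the fall in sales sits in free text, and deciding which patterns matter is where the subject matter expert comes in.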

Move over BIG data, we need BIG information!