Departments preside over a mother lode of unstructured data, which has largely
been overlooked as a source of insight in the gold rush for big data. Unstructured
data comprises documents like PowerPoint decks describing company strategy,
lead list spreadsheets, emails between coworkers, and social media interactions
of customers, and therefore represents valuable information.
Information workers spend a good portion of the workweek creating and
managing PowerPoint presentations, spreadsheets, email, and other unstructured
data; this is a big investment in time, and the products of that labor are
invaluable. But all too often, the right data doesn’t reach the right people at
the right time, and insights are missed or the work is re-created. This
translates into lost time and money for organizations.
Extracting value from this data is not easy. Unstructured data can’t be sorted, searched,
visualized, or analyzed in the same way as, say, stock prices; it frequently requires
new tools and processes to extract intelligence, share information, and deliver
value. Organizations need a better way to mine for this gold. To get to that
value, you will need to consider these four steps.
1: Draw a map to the gold – Identify sources
A crucial step is to identify the sources of unstructured data of importance to
your organization. These typically include file servers, collaboration tools
like SharePoint, and even virtual machines run by the IT department. When
analyzing and sharing this data, keep in mind that each of these sources may
have its own security settings.
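A first pass at this inventory can be automated. The following is a minimal sketch, assuming a source is reachable as a mounted directory; the function name inventory_source is hypothetical, and the result only covers what the account running it is permitted to read.

```python
import os
from collections import Counter
from pathlib import Path


def inventory_source(root):
    """Count files and total bytes per file extension under one source root."""
    counts = Counter()
    sizes = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            # Normalize case so ".PPTX" and ".pptx" are counted together.
            ext = path.suffix.lower() or "(none)"
            try:
                size = path.stat().st_size
            except OSError:
                # Skip entries we cannot read; security settings differ per source.
                continue
            counts[ext] += 1
            sizes[ext] += size
    return {ext: {"files": counts[ext], "bytes": sizes[ext]} for ext in counts}
```

Running this against each file share in turn gives a rough picture of where presentations, spreadsheets, and mail archives are concentrated, which is one way to decide which sources to analyze first.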
2: Create a legend – Add context and automate
Midsize organizations in particular need ways to cut the time it takes to
answer strategic business questions based on unstructured data. Marketing teams
want to know what content, customer, and collaboration trends are evident in
their social channels. Security and compliance leaders need faster discovery
methods. Healthcare professionals want to improve patient care by learning
which workflows are effective in other parts of their organizations. Lawyers
want to make connections between events to piece together complete pictures of
past activities. In short, business users need all the help they can get to
efficiently identify the information they need.
Metadata (i.e., information about the data that describes who created it, what
it’s about, and what other documents it references) are key to this process. They
allow you to cross data silos and provide stakeholders with cohesive,
contextual, and complete answers. Generating this context can’t be done
manually at any significant scale; instead, organizations need the
means to track or calculate it automatically and in real time, and to do so in a way
that doesn’t overwhelm file servers, destroy storage budgets, or depend upon
expert data scientists.
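To make the idea of automatically generated metadata concrete, here is a minimal sketch that derives the three kinds of context mentioned above (who created a document, what it is about, and which other documents it references) from plain text. Everything in it is an assumption for illustration: the DocMetadata record, the build_metadata function, the tiny stopword list, and the use of simple word frequency as a stand-in for "what it's about". A production system would use far more robust extraction.

```python
import re
from collections import Counter
from dataclasses import dataclass, field

# A tiny illustrative stopword list; a real deployment would use a fuller one.
STOPWORDS = {"the", "and", "are", "for", "with", "that", "this", "see", "from"}


@dataclass
class DocMetadata:
    name: str
    author: str
    keywords: list = field(default_factory=list)
    references: list = field(default_factory=list)


def build_metadata(name, author, text, known_names, top_n=3):
    """Build a metadata record for one document from its raw text."""
    lowered = text.lower()
    # "What other documents it references": known file names mentioned in the text.
    references = [n for n in known_names if n != name and n.lower() in lowered]
    # Remove the mentioned file names so they do not pollute the keywords.
    for ref in references:
        lowered = lowered.replace(ref.lower(), " ")
    # "What it's about": the most frequent terms that are not stopwords.
    words = [w for w in re.findall(r"[a-z]{3,}", lowered) if w not in STOPWORDS]
    keywords = [w for w, _count in Counter(words).most_common(top_n)]
    return DocMetadata(name, author, keywords, references)
```

Because the references field links one document to another, records like these can be joined across file servers and collaboration tools, which is what lets an answer span data silos.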
3: Use your mining equipment – Visualize data
Tools for unstructured data analysis will need to incorporate visualization if users
are to have any hope of deriving useful intelligence from their data.
Structured data analysis tools — or the mining equipment in this analogy — have
done this well for some time. Users get an executive-level view of information,
and then they have the option to drill down through greater and greater levels
of detail. Similar options are becoming available for unstructured data
analysis, as well. Through visualization tools, users can easily spot anomalies
and determine the information they need to respond quickly and accurately.
Technology designed for unstructured data analysis must be able to synthesize
information from multiple sources and deliver the results in a unified fashion.
To increase their value to end users, such tools should be able to overlay
results with structured data, as well. That kind of capability could, for
example, allow an analyst to evaluate social media messages, news stories, and
other unstructured information during a certain time period and then relate it
to structured information about something quantifiable, such as stock prices.
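As a rough illustration of such an overlay, the sketch below counts unstructured items (for example, social media messages) per day and lines the counts up against a structured daily series (for example, closing prices). The function name overlay_by_day and the input shapes are assumptions made for this example, not a description of any particular product.

```python
from collections import Counter
from datetime import date


def overlay_by_day(messages, closing_prices):
    """Align daily message counts with a daily structured series.

    messages: iterable of (day, text) pairs, where day is a datetime.date.
    closing_prices: dict mapping datetime.date to a price.
    Returns rows of (day, message_count, price), sorted by day; price is
    None on days that have messages but no recorded price.
    """
    mentions = Counter(day for day, _text in messages)
    days = sorted(set(mentions) | set(closing_prices))
    return [(day, mentions.get(day, 0), closing_prices.get(day)) for day in days]


if __name__ == "__main__":
    # Placeholder values for illustration only; not real market data.
    sample_messages = [
        (date(2024, 1, 2), "new product announced"),
        (date(2024, 1, 2), "customers react to the launch"),
    ]
    sample_prices = {date(2024, 1, 2): 10.0, date(2024, 1, 3): 10.5}
    for row in overlay_by_day(sample_messages, sample_prices):
        print(row)
```

Rows in this shape can be fed directly to a charting tool, so that a spike in message volume and a movement in the structured series appear on the same time axis.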
4: Enable profit – Act on the intelligence
IDC estimates that information
workers now burn about 20 percent of their time on the job just tracking down
data. It’s easy to see why. In most companies, data is created and accessed ad
hoc, and it isn’t organized to encourage accessibility. Without clearly defined
schemas for the vast majority of the data companies are generating, end users
can’t massage it, visualize it, or manipulate it in any meaningful way.
Building clusters of servers or hiring data scientists are not realistic solutions for
most companies. Instead, organizations need solutions capable of automatically
analyzing this unstructured data and presenting it visually for more effective
analysis. Once they have those tools, organizations can begin to act on these
insights and profit from their valuable stores of data and the intelligence
that was previously buried within.
Steve Kearns is the director of product
management for DataGravity,
focused on defining and delivering data intelligence. He has spoken at
conferences around the world about the power of search and analytics and has
worked with many of the world’s most successful companies and government
agencies implementing these technologies.