Nevertheless, graph databases are worth talking about in the big data and analytics context because, behind the scenes, their capabilities improve the ability to analyze complex data relationships and give organizations greater ability to move reporting into a real-time or near-real-time mode. Both of these trends also characterize the big data movement today, so in a very real sense, the general shift of corporate reporting from a transactional to a relational context is likely an outcome of the corporate focus on big data and analytics.
"What makes a graph database so effective is its ability to be a highly intuitive data model and also to reflect how the world really operates by being able to find the relational connections between objects and data," said Ryan Boyd, head of developer relations North America for Neo4j, a graph database solutions provider.
Boyd said that graph databases are being adopted in companies because these databases can so effectively and intuitively describe the world through their data handling; because graphs can be very high-performance databases when compared with the performance of traditional relational databases; and because graph databases are agile and can easily optimize new and existing data models with less work. Why is this?
"In a relational database, every JOIN statement requires the application to look at another index to another dataset," said Boyd. "We have enterprise clients that tell us that some of their SQL queries might require over 20 of these JOINs—and this can make data queries really slow. With a graph database, you find a logical starting point and you branch out from there and identify the relationships. For instance, you might write a query that asks, 'Find all of the friends of the friends of John.' Instead of having to JOIN many different indexes, the graph database uses pointer arithmetic that is in-memory or in cache and performs the operation." The result is less compute-intensive and faster processing.
Boyd said that Neo4j has more than 200 enterprise clients that are using the graph database so they can explore more complex data relationships and associations and also bring more of their analytics into a real-time processing mode. Even for organizations that rely primarily on batch analytics reporting (and most do), plugging into a graph database can dramatically shrink the batch-processing window. This is why the movement into graph databases is an important bellwether for future analytics.
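To make the "friends of the friends of John" example concrete, here is a minimal Python sketch of the traversal idea Boyd describes: start at one node and follow its relationships outward, rather than joining tables through indexes. This is an illustration only, not Neo4j's engine or its query language; the names, the sample friendships, and the helper functions are all hypothetical.

```python
from collections import defaultdict


def build_graph(friendships):
    """Build an undirected adjacency map from (person, person) pairs."""
    graph = defaultdict(set)
    for a, b in friendships:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def friends_of_friends(graph, person):
    """Return people exactly two hops away from `person`.

    Direct friends and the person themselves are excluded.
    """
    direct = graph.get(person, set())
    two_hops = set()
    for friend in direct:
        # Follow each friend's own relationships; no table-wide lookup needed.
        two_hops |= graph.get(friend, set())
    return two_hops - direct - {person}


# Hypothetical sample data, invented for this illustration.
FRIENDSHIPS = [
    ("John", "Ana"),
    ("John", "Raj"),
    ("Ana", "Raj"),
    ("Ana", "Mei"),
    ("Raj", "Mei"),
    ("Raj", "Omar"),
]

graph = build_graph(FRIENDSHIPS)
print(sorted(friends_of_friends(graph, "John")))  # ['Mei', 'Omar']
```

The work done here depends on how many friends John and his friends have, not on the total number of people stored, which is the intuition behind the performance claim above. A production graph database adds persistence, indexing of starting points, and a query language on top of this basic idea.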
How are organizations putting graph databases to use?
"Financial services companies are using graph databases to assist them in discovering instances of both internal and external fraud," said Boyd. "In retail, companies are using the technology to help them with purchase recommendations for customers. In logistics, graph databases are being used to plan package routings — and in networking and IT, it is being used in root cause analysis."
Why graph databases should matter to data analysts
Even so, some might argue that the graph database is simply a NoSQL database alternative to traditional relational databases. Purists can also argue that graph databases focus more on transactional data, so they are technically not big data tools.
However, by advancing the case for real-time analytics and by making it practical to delve into highly complex data relationships, graph databases are raising overall corporate awareness of the importance of data analytics and of being able to identify relationships and meaning in data from many different sources, which is what big data is all about.
Mary E. Shacklett is president of Transworld Data, a technology research and market development firm. Prior to founding the company, Mary was Senior Vice President of Marketing and Technology at TCCU, Inc., a financial services firm; Vice President of Product Research and Software Development for Summit Information Systems, a computer software company; and Vice President of Strategic Planning and Technology at FSI International, a multinational manufacturing company in the semiconductor industry. Mary is a keynote speaker and has more than 1,000 articles, research studies, and technology publications in print.