Most organizations use a mix of structured, fixed-record transactional data and unstructured big data from sources such as videos, photos, social media, graphics, and hard-copy records stored in file cabinets. The structured and unstructured data are then combined in a master data repository used for analytics, and analytics queries enter that repository through the data keys found in the transactional data.
But how should you properly integrate all that information to get the best use of it?
Let’s say that a company wants to learn everything about a particular product it is having trouble with. It takes the data key from the master item file in its transactional systems (such as an item number) and then queries its data repository that has a mix of both structured and unstructured data. Because the company has both structured and unstructured data about the item in its analytics data repository, the company can see the item’s order and returns history, as well as its CAD drawing, photos, bills of material and revision levels, and even videos of manufacturing processes.
The ability to view the item being studied from so many different vantage points enables a salesperson to view social media and returns history to see what customers didn’t like about the product. Alternatively, a product engineer can take a look at product specs, tolerances, design documents, material lists, etc., to see if there is a way to improve the product, and a manufacturing manager can take a look at videos that show the product being routed and assembled on the floor to determine if any manufacturing processes should be altered or improved.
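The item lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layouts, the `ORDERS`, `RETURNS`, and `ASSETS` collections, the file paths, and the `build_item_view` helper are all hypothetical stand-ins for whatever the analytics repository actually exposes. What it shows is the single transactional key (the item number) acting as the entry point to both structured and unstructured data.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ItemView:
    """Everything the repository holds about one item, grouped by data type."""
    item_number: str
    structured: Dict[str, List[dict]] = field(default_factory=dict)
    unstructured: Dict[str, List[str]] = field(default_factory=dict)


# Hypothetical structured (transactional) records, keyed by item number.
ORDERS = [
    {"item_number": "A-100", "order_id": 1, "qty": 5},
    {"item_number": "A-100", "order_id": 2, "qty": 3},
    {"item_number": "B-200", "order_id": 3, "qty": 1},
]
RETURNS = [
    {"item_number": "A-100", "order_id": 2, "reason": "cracked housing"},
]

# Hypothetical catalog of unstructured assets. Each entry tags a file with
# the item number it relates to and the kind of asset it is.
ASSETS = [
    {"item_number": "A-100", "kind": "cad_drawing", "path": "cad/A-100.dwg"},
    {"item_number": "A-100", "kind": "video", "path": "video/A-100-assembly.mp4"},
    {"item_number": "B-200", "kind": "photo", "path": "photo/B-200.jpg"},
]


def build_item_view(item_number: str) -> ItemView:
    """Collect structured and unstructured data that share one item key."""
    view = ItemView(item_number)
    view.structured["orders"] = [
        row for row in ORDERS if row["item_number"] == item_number
    ]
    view.structured["returns"] = [
        row for row in RETURNS if row["item_number"] == item_number
    ]
    for asset in ASSETS:
        if asset["item_number"] == item_number:
            view.unstructured.setdefault(asset["kind"], []).append(asset["path"])
    return view
```

A salesperson might read only `view.structured["returns"]`, while a manufacturing manager might open the paths under `view.unstructured["video"]`. Both start from the same key.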
To combine and tune structured and unstructured data so they work well together, the data must be properly integrated. This integration effort must also extend to other IT processes and assets.
Here are five IT integration areas that should be developed and tuned for transactional and big data to work well together:
1. The data
As described above, structured and unstructured data must be carefully selected and combined for the execution of queries in an analytics data repository. There are limitless ways of combining data for a data repository. It’s up to business users, business analysts, data scientists, and data administrators to determine the best types of data to select and combine, based on the business cases that the company wants to solve.
2. The people
Different company functions (end users, business analysts, data scientists, database professionals, etc.) must all work together as a cohesive team. This requires cross-organizational integration of people and functions into a single project. IT leadership plays a major role in facilitating this integration and supporting the team.
At the same time, IT technical staff must support two different types of processing and storage infrastructure: the kind that drives and supports transactional data, and the kind that supports unstructured big data. Often the same team of system support personnel is cross-trained on both architectures. In other cases, transactional and unstructured data engineers come together to support all architectures and to interact with each other. This multifaceted infrastructure support team can exist in the data center, in the cloud, or in both. From an IT standpoint, supporting two different data architectures requires some reorganization of the systems support team, as well as new decisions about which daily processes and functions IT must supply and how they can all be integrated into a single support plan.
4. Frontline analysts
There must be team integration between business analysts, data scientists, and end users. It’s the only way to ensure that the essence of each business case remains properly aligned from start to finish in the analytics.
5. Governance
Whether data under management is structured or unstructured, the same rules for security, data safekeeping and retention, data handling, authorization and access, etc., must be followed. This is one of the most difficult integration points for IT, which must ensure that big data projects, many of which have been run iteratively and as pilots, conform to the same corporate governance standards expected for transactional data.
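One way to picture "the same rules" is a single policy table that never looks at whether an asset is structured or unstructured. The sketch below is a hypothetical illustration, not a prescribed design: the classification labels, role names, retention periods, and the `can_access` and `is_past_retention` helpers are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class DataAsset:
    name: str
    kind: str            # "structured" or "unstructured"; ignored by the policy
    classification: str  # hypothetical labels: "public", "internal", "restricted"
    created: date


# Hypothetical corporate policy, keyed by classification only.
RETENTION_DAYS = {"public": 365, "internal": 3 * 365, "restricted": 7 * 365}
ALLOWED_ROLES = {
    "public": {"sales", "analyst", "engineer"},
    "internal": {"analyst", "engineer"},
    "restricted": {"engineer"},
}


def can_access(role: str, asset: DataAsset) -> bool:
    """Authorization depends on classification, never on asset.kind."""
    return role in ALLOWED_ROLES[asset.classification]


def is_past_retention(asset: DataAsset, today: date) -> bool:
    """Retention depends on classification and age, never on asset.kind."""
    limit = timedelta(days=RETENTION_DAYS[asset.classification])
    return today - asset.created > limit
```

Because neither function reads `kind`, an order table and an assembly video with the same classification and creation date always receive the same access and retention decisions.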