As you plan your big data strategy for next year, keep these seven goals in mind.
In 2021, corporate big data leaders will be looking to improve data quality and turnaround of big data projects, as well as performance in meeting business objectives. While 2020 hasn't been a normal year for anyone, you still have to plan for the future and get ready for what may come. Here are seven key big data areas of focus for 2021.
1. Manage data better
Big data continues to enter corporate networks at torrential rates, and the poor-quality data that companies obtain or use costs the US economy an estimated $3.1 trillion annually. More effort is needed to screen data as it comes in, and to clean and prepare it properly before it is added to corporate data repositories.
At IBM Research Switzerland, artificial intelligence (AI) and machine learning helped researchers plow through reams of scientific papers and journals in search of information relevant to a molecular drug design. The researchers recognized that much of the worldwide information the AI would be reviewing had no relevance to the problem they were trying to address, so the company decided up front not to import data from non-relevant sources. This saved hours of AI time, gave the researchers a highly relevant set of data, and eliminated data storage waste.
Once the data passes incoming criteria, it should also be cleaned and properly prepared before it is uploaded into a data repository. This means checking for incomplete, duplicate, and inaccurate data, and also normalizing data so it can be blended with other source data for analytics.
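As a rough illustration of the checks described above, here is a minimal Python sketch that screens out incomplete, inaccurate, and duplicate records and normalizes what remains. The field names (customer_id, email, amount) and the validity rule for amount are assumptions made up for this example, not part of any particular product or schema.

```python
import math
from typing import Dict, Iterable, List

# Hypothetical required fields; a real pipeline would define its own schema.
REQUIRED_FIELDS = ("customer_id", "email", "amount")


def clean_records(records: Iterable[Dict]) -> List[Dict]:
    """Drop incomplete, inaccurate, and duplicate records; normalize the rest."""
    seen_ids = set()
    cleaned = []
    for record in records:
        # Incomplete: a required field is missing or empty.
        if any(record.get(field) in (None, "") for field in REQUIRED_FIELDS):
            continue
        # Inaccurate (assumed rule): amount must be a finite, non-negative number.
        try:
            amount = float(record["amount"])
        except (TypeError, ValueError):
            continue
        if not math.isfinite(amount) or amount < 0:
            continue
        # Normalize so the data can be blended with other sources.
        customer_id = str(record["customer_id"]).strip()
        email = str(record["email"]).strip().lower()
        # Duplicate: this customer_id has already been accepted.
        if customer_id in seen_ids:
            continue
        seen_ids.add(customer_id)
        cleaned.append(
            {"customer_id": customer_id, "email": email, "amount": round(amount, 2)}
        )
    return cleaned
```

In practice the rules for what counts as inaccurate or duplicate depend on the data source, so a function like this would normally be driven by a schema agreed on with the business users.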
2. Speed and monitor the process
By now, most organizations are well underway with an iterative, DevOps-style development approach for big data and analytics. Now it's time to formalize the process so users and IT/data science know when a big data analytics model is mature enough to be placed into and maintained in production.
The benchmark for corporate readiness is that big data analytics results must reach a threshold of 95% accuracy and must consistently deliver this level of performance. Since business and outside conditions change over time, it's possible that a big data application in production can start falling below 95% accuracy.
IT and data science should establish a maintenance policy that remeasures apps for accuracy each year to ensure that they are still delivering accurate results.
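A periodic accuracy check of this kind might be sketched in Python as follows. The function names are hypothetical, and accuracy is assumed here to mean the fraction of model outputs that match observed outcomes; an organization may define it differently.

```python
ACCURACY_THRESHOLD = 0.95  # the 95% benchmark described above


def accuracy(predictions, actuals):
    """Fraction of predictions that match the observed outcomes."""
    if len(predictions) != len(actuals):
        raise ValueError("predictions and actuals must have the same length")
    if not predictions:
        raise ValueError("at least one prediction is required")
    correct = sum(1 for p, a in zip(predictions, actuals) if p == a)
    return correct / len(predictions)


def meets_benchmark(predictions, actuals, threshold=ACCURACY_THRESHOLD):
    """True if the model's measured accuracy is at or above the threshold."""
    return accuracy(predictions, actuals) >= threshold


def consistently_meets_benchmark(period_accuracies, threshold=ACCURACY_THRESHOLD):
    """True only if every measured period reached the threshold."""
    return all(a >= threshold for a in period_accuracies)
```

Running the same check against fresh production data at each review would flag an application whose accuracy has drifted below the benchmark as conditions changed.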
3. Formalize a hybrid architecture for big data and analytics
IT, data science, and end users have all budgeted for and independently developed big data and analytics applications. Some of these systems run on premises, while others run on public and private cloud platforms.
As the need grows to pull more data together from disparate sources, an overarching hybrid architecture that spans cloud and on-prem platforms should be formalized, with enterprise security and governance applied uniformly throughout. Few organizations have formalized this hybrid architecture for big data. 2021 is the year to do so.
4. Build bridges between IT, data science, and users
As more vendors simplify AI solutions, there has been growth in citizen AI, where business units develop their own AI and big data applications. Later, when users want to train these apps and integrate them with other company data and platforms, they need IT and data science departments to help them.
If IT and data science professionals actively collaborate with business users early in their application processes, many of these follow-on integration difficulties can be avoided. Developing productive relationships with business units throughout the company should be a major big data and analytics goal for IT.
5. Improve security, especially for IoT
Many Internet of Things (IoT) devices have proprietary operating systems and security presets that won't meet company security and governance standards.
With security intrusion attempts occurring daily, it is paramount to review these devices, work with vendors, and ensure that device settings can be configured to conform to corporate security and governance standards.
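One way to make such a device review repeatable is to compare each device's reported settings against a written policy. The Python sketch below assumes a made-up policy with three settings; the setting names are hypothetical and would need to be mapped to whatever a given vendor actually exposes.

```python
# Hypothetical corporate policy: setting name -> required value.
SECURITY_POLICY = {
    "default_password_changed": True,
    "encryption_enabled": True,
    "remote_admin_enabled": False,
}


def find_violations(device_settings, policy=SECURITY_POLICY):
    """Return the sorted names of policy settings the device does not satisfy.

    A setting that the device does not report at all counts as a violation.
    """
    return sorted(
        name
        for name, required in policy.items()
        if device_settings.get(name) != required
    )
```

A device that returns an empty list conforms to the policy as written; anything else gives IT a concrete list of settings to raise with the vendor.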
6. Review dashboard and report results
The dashboards and drop-down reports that your analytics produce might work flawlessly from a technical standpoint, but are they remaining relevant? IT should visit with end users annually to review report usage and to determine whether dashboards and reports need to be revised or even replaced.
7. Improve communications with management
Although management is well aware of the importance of big data, AI, and IoT, it doesn't hurt to brief managers regularly on projects and new developments. This keeps management in the loop and helps to ensure ongoing support.
Regular project communications to management in plain English should be a goal in bold on every IT plan in 2021.