
Acceldata has announced the release of a new open-source version of its data platform. Unlike many traditional data platforms, this new release is multilayered, promising integrated insights from data pipelines, data quality tools and data systems.


With the new open-source Acceldata platform, enterprises will be able to innovate with up-to-date data observability solutions at a lower cost. TechRepublic spoke to Chandrakant Sharma, senior director of customer engineering at Acceldata, as well as other members of the Acceldata team, to get the inside story on Acceldata’s open-source data platform, its capabilities, possible challenges and potential business opportunities.


The open-source and data observability markets

In a recent article for TechRepublic, Ali Azhar explained that data observability is becoming increasingly valuable and widely used in organizations, especially those that want to “keep the data in good health through the entire data value chain.” Observability can provide accurate data to consumers, partners and decision-makers, giving them control over data usage and planning.

On a similar growth trajectory is corporate interest in open-source software implementations. As datasets grow and hoped-for use cases expand, many companies are turning to open-source software for its malleability and lower costs.


“Data explosion is showing no signs of a slowdown,” Sharma said. Citing Statista, Sharma added that the total amount of data created, captured, copied and consumed globally is forecast to increase from 64.2 zettabytes in 2020 to more than 180 zettabytes in 2025.
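To put that forecast in perspective, the short sketch below works out the compound annual growth rate implied by the two Statista figures Sharma cited. The function name is illustrative, and 180 zettabytes is treated as the 2025 value even though the forecast says "more than 180," so the result is best read as a lower bound.

```python
def implied_annual_growth(start: float, end: float, years: int) -> float:
    """Compound annual growth rate that takes `start` to `end` over `years`."""
    return (end / start) ** (1 / years) - 1


# Figures quoted above: 64.2 ZB in 2020, more than 180 ZB forecast for 2025.
rate = implied_annual_growth(64.2, 180.0, 5)
print(f"{rate:.1%}")  # prints 22.9%
```

In other words, the forecast assumes global data volume grows by roughly 23% or more every year over the period.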

But despite this data growth and the parallel increase in demand for open-source data management software, many companies do not have the knowledge, access or developer skill sets to leverage open-source data solutions.

“For enterprises that continue to build and manage data products, the options to move to a completely open-source, community-based data platform are limited,” Sharma noted.

What Acceldata’s new open-source platform can do

The new open-source version of the Acceldata platform delivers stable and community-validated versions of the data platform and data observability libraries, and it supports public, private and hybrid environments to meet the changing requirements of enterprises.


Acceldata’s new initiative includes a data platform and six projects, which are available under the Apache License Version 2.0 and can be downloaded for free. According to Acceldata, large enterprises in the fintech, telecom and data solution provider spaces have contributed to, verified and already adopted this new open-source platform.

The benefits of using Acceldata’s open-source data platform

Solutions for several big data management problems

With the new Acceldata platform, data teams can solve the following problems:

  • Identify operational bottlenecks to optimize data and analytics platform scaling and detect performance issues earlier.
  • Provide operational visibility, guardrails and proactive alerts to prevent cost and resource overruns.
  • Monitor data reliability across the data supply chain to improve data quality and lessen the impact and frequency of data outages.
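As a rough illustration of what monitoring data reliability with proactive alerts can look like in practice, here is a minimal, generic sketch. It does not use Acceldata's API; the `Record` type, the `check_batch` function and the thresholds are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class Record:
    # Hypothetical record shape, not taken from any real platform.
    order_id: Optional[str]
    loaded_at: datetime


def check_batch(
    records: List[Record],
    now: datetime,
    max_null_rate: float = 0.05,               # assumed threshold
    max_age: timedelta = timedelta(hours=1),   # assumed threshold
) -> List[str]:
    """Return human-readable alerts for one batch (empty list if healthy)."""
    if not records:
        return ["batch is empty"]

    alerts = []

    # Completeness: share of records missing their key field.
    null_rate = sum(r.order_id is None for r in records) / len(records)
    if null_rate > max_null_rate:
        alerts.append(f"null rate {null_rate:.0%} exceeds {max_null_rate:.0%}")

    # Freshness: how long ago the newest record was loaded.
    age = now - max(r.loaded_at for r in records)
    if age > max_age:
        alerts.append(f"newest record is {age} old")

    return alerts


now = datetime(2023, 1, 1, 12, 0, tzinfo=timezone.utc)
batch = [
    Record("a1", now - timedelta(hours=3)),
    Record(None, now - timedelta(hours=4)),
]
alerts = check_batch(batch, now)
for alert in alerts:
    print(alert)
```

Running checks like these on every batch, rather than after a downstream report breaks, is the general idea behind catching data quality problems before they become outages.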

Open-source currency and low costs

Because it is open source, the new release offers users several benefits. First and foremost, open-source software like this lets companies apply regular software updates while also collaborating with other users to improve the product.

“With the unrelenting data deluge, enterprise data observability continues to be an emerging, high-growth market,” said Ashwin Rajeeva, co-founder and CTO of Acceldata. “Our founding team has contributed extensively to Apache projects over the years.

“We know that a reliable, low-cost data platform that’s in step with the latest Apache open-source code will not only help to advance innovation in data observability but also benefit from the larger community collaboration with a shared mission.”

Customizability and elasticity

The new platform provides flexibility to adopt technologies that enable elasticity and on-demand services in any environment. Deployment is also automated, covering manageability, observability and package management operations.

Tenants can consume and upgrade services without impacting other tenants. To support efficient operations and consistency, the company says it keeps components stable through change cycles.

The benefits of open-source data platforms

With an open-source data platform aligned with the latest Apache open-source code, companies can optimize data platform licensing costs and leverage a scalable open-source platform for large volumes of data, Sharma explained.

Additionally, they can expand their data footprint without being burdened with additional licensing costs, while focusing on building data products rather than operating the data platform.

“The open-source community continues to be as strong as ever, if not better than before, and new code finds its way consistently into the Apache GitHub repositories of popular projects such as Hive, Spark and Kafka, among many others,” Sharma said.


Data-driven companies aligning data management with their business priorities demand highly customizable solutions, transparency and ownership. Naturally, open-source technology — with source code anyone can inspect, modify and augment — is appealing for advanced data teams.

“The dream of an open-source data platform has been a broken one until now,” said Rohit Choudhary, founder and CEO of Acceldata. “The guardians of open-source have a responsibility to be open over protectionism, and we take that role seriously as we continue to participate in, support and advance the community.”

Challenges and opportunities for open-source data management

Some of the biggest challenges of open-source software include operational efficiency, security and maintenance. Acceldata explains that its new release periodically synchronizes with open-source branches to ensure alignment with the latest code and new development. This approach allows flexibility for adding new components as community and industry innovation continues to evolve.

Acceldata recently surveyed over 200 chief data officers, vice presidents of data platforms, data engineers and other data leaders to learn more about pain points for data-driven companies.

Almost half (45%) of those surveyed admitted to having experienced data pipeline failure 11 to 25 times in the past two years due to data quality problems or errors discovered too late. These incidents affected customer experience in 63% of cases. Visibility was identified as the top problem by data industry experts.

Managing the complexities of cross-platform solutions

Today’s data systems and architectures leverage the benefits of cross-platform environments. Edge-cloud, on-premises and hybrid cloud environments can reduce costs, increase performance and provide companies with access to the latest technology. However, managing and visualizing data in these complex environments has presented many challenges, such as security.


Acceldata says this new version of its platform leverages contributions from community members, many of whom work at large and small enterprises.

“We regularly scan the code base for vulnerabilities,” Sharma said. “The community decides what is important, what to work on, and develops a timeline for the projects. Our role is to collectively maintain the platform to keep the philosophy of providing access to open-source data platforms and libraries.”

The future of open-source data observability

What role will multilayered data observability play throughout 2023? According to Sharma, observability will be essential to overcome data sprawl, data tool proliferation and talent shortages.

In search of new solutions, data leaders are increasingly turning to open-source and data observability options to better understand what is happening to their data, its quality, costs and pipelines. The main goal? To enhance performance, modernize and drive business-critical operations.

