Big data will continue to pique the interest of the enterprise for the foreseeable future.

As time progresses, though, big data will go from a buzzword to a more serious pursuit. One step in increasing the role and impact of data within the enterprise is deciding on big data integration software.

TechRepublic spoke via email with Ashley Stirrup, CMO of open-source data integration software provider Talend.

Stirrup talked about what it takes to become a data-driven enterprise, and highlighted many of the questions businesses should be asking when selecting the right big data tool for the job.

TechRepublic: In your experience, what are the characteristics of data-driven organizations?

Ashley Stirrup: Simply stated, data-driven companies are able to harness the power of their data and leverage it as a key corporate asset and competitive differentiator. Rather than data being the domain of IT or a single business team, data is a company-wide discipline, and it’s used to make informed decisions across the organization. The path to becoming data-driven requires a common set of steps. It starts with a strong business case and a clear understanding of how and why big data will be useful. Companies finding success are inevitably using big data to answer very important questions about their business, customers, or operations. Also consistent among data-driven companies is a very well-defined project management process that includes everything from ensuring broad buy-in to identifying exactly which data sets will be used on any given project.

TechRepublic: What are the major trends in your competitive space, data integration solutions?

Ashley Stirrup: First and foremost, the rise of Apache Spark is a major trend. We are seeing across-the-board interest in Spark as a result of its ability to enable faster data integration and real-time big data analytics. Companies are quick to realize the potential of Spark and the high value of quickly turning insight into action. Our recent benchmark tests against Informatica reinforce the advantages of Spark, with Talend Big Data completing projects as much as 10 times faster than Informatica Big Data Edition.

In addition, as more companies get into production with big data, we are also seeing first-hand the benefits of open source and code generation, with companies needing less staff and fewer skilled workers to tackle projects. It is also clear that enterprises no longer want to treat traditional data warehouse environments and big data projects separately. What they are seeking is a single solution to address all data management needs, not to mention bulk, batch and real-time streaming data. Finally, it’s impossible to ignore the growing impact of the cloud as it relates to data integration. Clearly, cloud is a trend that will continue and accelerate in 2016 with more companies looking to recognize the value and ease of cloud-based integration.

TechRepublic: What are the main reasons companies choose open-source providers of data integration?

Ashley Stirrup: Although happening more broadly than just data integration, the move to open source is largely driven by the growing number of companies that want to move away from walled-in, proprietary technologies. The growth of data and rapidly evolving business needs require the flexibility and agility that open source software can provide. In particular, this is true in big data where new innovations are being released on an almost weekly basis. Of course, customers are also motivated by the more attractive economics or lower total cost of ownership associated with open source.

TechRepublic: What’s your take on Hadoop becoming a mainstream enterprise tool?

Ashley Stirrup: Without question, Hadoop will become a mainstream enterprise tool. This time last year, we probably had more customers in trials than we did in production. Today, the picture has changed dramatically, with the majority of our customers entering into full-blown deployment. I think we are clearly entering the early majority stage at this juncture. The economics are simply too good and the competitive threats too obvious for companies to stay on the sidelines any longer.

TechRepublic: How would you describe the Talend Big Data product to a potential client?

Ashley Stirrup: I usually describe what I believe are the key questions businesses should ask prior to making a decision about big data integration software.

  • Is it easy to use? Ask to see the user interface. Is it simple or complex? Does the application automatically generate code or does it force you to do it by hand? Can you perform tasks using drag-and-drop actions? Does the platform offer a single, consistent workflow and UI or does it look like a mix of separate applications?
  • Is it unified? Does the platform enable the integration of all types of data (cloud, on-premises, IoT, etc.) and can you perform both batch and real-time processing within the same solution?
  • Does it fully leverage the power of Hadoop? Some tools require that you process and transform data before loading into Hadoop. Not only does this data movement slow projects down, but it also means you are not fully exploiting the processing power of Hadoop.
  • Is it up to date? Is the software based on open source or is it proprietary? Open source solutions are proven to better keep pace with the rate of big data innovation and enable you to remain agile and more responsive to the needs of the business.
  • Is it fast? Does it utilize Spark and Spark Streaming within Hadoop to process data? Or is it stuck in the days of YARN?
  • Is it cost-effective? What’s the total cost of ownership? Is it reasonable and based on the number of developers, or is it based on data volumes, connectors or CPUs?