Big Data

Microsoft streamlines big data analytics with new Azure services

New database offerings and services in Microsoft's Azure cloud platform were unveiled today at Microsoft's Connect(); 2017 conference, alongside GitHub's adoption of GVFS.

Microsoft today revealed a range of tools for developers, designed to simplify the process of extracting useful insights from big data.

The new database offerings and services in Microsoft's Azure cloud platform were unveiled today at Microsoft's Connect(); 2017 conference.

The major new announcement was the availability of Azure Databricks, which is designed to simplify big data analytics.

Azure Databricks is a service designed to provide one-click setup of analytics jobs on top of Apache Spark, the framework for analyzing large volumes of data distributed across compute clusters. The service, now available in preview, promises to streamline workflows when carrying out Spark analytics, and offers a workspace for interacting with jobs. The service is built upon the Databricks platform, which lets users run Spark workloads with less human intervention.

Azure Databricks has native integration with Azure SQL Data Warehouse, Azure Storage, Azure Cosmos DB, Azure Active Directory and Power BI.

"By using Spark to create modern analytics, customers can reason over any kind of data, as well as having the ability to inspect real-time data patterns," said Microsoft corporate VP of communications Frank Shaw, speaking ahead of event.

"This enables real-world scenarios, like a hotel being able to reason from structured and unstructured data, like video and sound, to discover the best type of lobby flow and check-in desk configuration for a better guest experience."

Microsoft says Databricks should simplify the creation of modern data warehouses, enabling organizations to provide self-service analytics, with enterprise-grade performance and governance.

Another database upgrade on Azure will see the globally distributed database platform, Azure Cosmos DB, making data available via an API for the Cassandra NoSQL database. Access to the API is available in preview, and expands the range of data models supported by Azure Cosmos DB, which includes document, graph, key-value, table, and columnar.

SEE: Research: How big data is driving business insights in 2017 (Tech Pro Research)

GitHub, the global code repository platform also announced that it will adopt Microsoft's open-source Git Virtual File System (GVFS). Microsoft uses the GVFS to allow it to use the Git version control system with its code for Windows, which runs to 3.5 million files weighing in at about 300GB. Git was never designed to be used with that large an amount of files, and using GVFS allows Git commands to be run on the codebase without taking an unreasonable amount of time. Microsoft says GitHub's decision to adopt GVFS makes the file-system the industry-standard for "enterprise-scale" Git repositories.

Microsoft has also joined the MariaDB foundation as a platinum member, and today announced an upcoming preview of the Azure Database for MariaDB, the popular open-source MySQL fork, which will offer a fully managed MariaDB service in the cloud.

Also see

About Nick Heath

Nick Heath is chief reporter for TechRepublic. He writes about the technology that IT decision makers need to know about, and the latest happenings in the European tech scene.

Editor's Picks

Free Newsletters, In your Inbox