8 Steps for a Developer to Learn Apache Spark with Delta Lake

For data engineers, building fast, reliable pipelines is only the beginning. Today, you also need to deliver clean, high-quality data that downstream users can rely on for BI and ML.

Apache Spark™ and Delta Lake deliver fast, reliable data to your data teams for data engineering, data science, machine learning, and business analytics use cases. Both projects are open source and use open formats, so you can access your data with the tools of your choice.

  • Why Apache Spark and Delta Lake
  • Apache Spark and Delta Lake concepts, key terms and keywords
  • Advanced Apache Spark internals and core
  • DataFrames, Datasets and Spark SQL essentials
  • Graph processing with GraphFrames
  • Continuous applications with Structured Streaming
  • Machine learning for humans
  • Data reliability challenges for data lakes
  • Delta Lake for ACID transactions, schema enforcement and more
  • Unifying batch and streaming data pipelines

Resource Details

Provided by: Databricks
Topic: Networking
Format: PDF