This resource is no longer available

For data engineers, building fast, reliable pipelines is only the beginning. Today, you also need to deliver clean, high-quality data that downstream users can rely on for BI and ML.

Apache Spark and Delta Lake deliver fast, reliable data to your data teams for all your data engineering, data science, machine learning, and business analytics use cases. These projects are open source and use open formats, allowing you to easily access your data.


• Why Apache Spark and Delta Lake
• Apache Spark and Delta Lake concepts, key terms and keywords
• Advanced Apache Spark internals and core
• DataFrames, Datasets and Spark SQL essentials
• Machine learning for humans
• Data reliability challenges for data lakes
• Delta Lake for ACID transactions, schema enforcement and more
• Unifying batch and streaming data pipelines


Vendor: Databricks
Posted: Feb 8, 2021
Published: May 21, 2020
Format: PDF
Type: White Paper
