Introduction
Welcome
What you should know
Exercise files
Set up the environment
1. Data Engineering Overview
What is data engineering?
Stages of data engineering
Data engineering challenges with big data
Spark and Kafka for data engineering
2. Moving Data with Kafka
Use Kafka connectors
Code: Read from a file source
Code: Write to an HDFS sink
Code: Read from a JDBC source
Code: Write to a Spark sink
3. Spark High-Performance Processing
Data engineering with Spark
How Spark works
Optimize for lazy evaluation
Work with dependencies
Complex accumulators
4. Use Case Project
Problem statement
Solution overview
Process US sales data
Process EU sales data
Process web hits data
Process tweet data
Scale the solution
Ex_Files_Apache_Spark_EssT_BigData.zip
(105.0 MB)