Welcome to Apache Spark SQL for Data Analysts-About Apache Spark SQL for Data Analysts
Course goals
Before you begin
Spark makes big data easy-Introduction to Big Data
Introduction to module 2
What is big data?
Common struggles with big data
Big Data Needs
Spark makes big data easy-Introduction to Apache Spark(TM)
Apache Spark Intro
Spark SQL
Using Spark SQL on Databricks-Working through this course
Introduction to Module 3
Signing up for Databricks Community Edition
Preparing your workspace
Course Materials
Working with notebooks
Using course materials
Using Spark SQL on Databricks-Basic Queries with Spark SQL on Databricks
Basic queries with Spark SQL reading introduction
Basic Queries reading activity
Using Spark SQL on Databricks-Data Visualization
Data Visualization on Databricks reading introduction
Data Visualization reading activity
Data visualization tools
Using Spark SQL on Databricks-Lab: Exploratory Data Analysis
Exploratory Data Analysis lab introduction
Your turn! Exploratory Data Analysis lab
Spark Under the Hood-Spark SQL Powered Queries
Introduction to module 4
Understanding optimizations
The physical cluster
Spark Under the Hood-Using the Spark User Interface
The SparkUI and SQL tab
Optimizing query logic
Impact of Caching
Optimizing with selective data loading
Complex Queries-Manage nested data structures
Introduction to module 5
What is nested data?
Introduction to managing nested data
Managing Nested Data reading activity
Complex Queries-Manipulating Data
Introduction to Manipulating Data
Manipulating Data reading activity
Complex Queries-Lab: Data Munging
Introduction to Data Munging
5.3 Data Munging Lab
Applied Spark SQL-Higher Order Functions
Introduction to module 6
Complex data - common strategies
About higher-order functions
Higher-order functions introduction
Higher Order Functions reading activity
Applied Spark SQL-Aggregating and Summarizing
Introducing Aggregating and Summarizing Data
Aggregating and Summarizing Data reading activity
Applied Spark SQL-Partitioning Tables
Partitioning Tables Introduction
Partitioning Tables
Applied Spark SQL-Lab: Sharing Insights
Sharing Insights Lab Introduction
Sharing Insights
Data Storage and Optimization-Modern Data Storage
Introduction to module 7
A quick refresher
Data Warehouses
Data Lakes
Data Lakes vs Data Warehouses
Introducing a new data management paradigm
The Lakehouse
Data Storage and Optimization-Using Delta Lake
Introduction to the lesson
What is Delta Lake
Delta Lake with Spark SQL-Building and maintaining Delta tables
Introduction to the module
Intro to Using Delta reading
8.1 Using Delta
Managing Records in a Delta table
8.2 Managing records
Delta Lake with Spark SQL-Delta Engine Optimization
Delta Engine Optimization Introduction
8.3 Optimizing Delta
Delta Lake with Spark SQL-Delta Lake Lab
Delta Lake Lab Introduction
Delta Lab
SQL Coding Challenges-Completing Coding Challenges
SQL coding challenges