Overview and Introduction to PySpark-Course Overview
Welcome to Data Engineering Platforms with Python!
Meet your Co-Instructor: Kennedy Behrman
Meet your Co-Instructor: Noah Gift
Overview and Introduction to PySpark-Big Data Platforms: Hadoop and Spark
Overview of Big Data Platforms
Getting Started with Hadoop
What is Apache Hadoop?
Getting Started with Spark
What is Apache Spark?
Use Apache Spark in Azure Databricks (optional)
Choosing between Hadoop and Spark
Overview and Introduction to PySpark-Resilient Distributed Datasets
Introduction to Resilient Distributed Datasets (RDD)
What are RDDs?
Resilient Distributed Datasets (RDD) Demo
Getting Started: Creating RDDs with PySpark
Overview and Introduction to PySpark-Spark SQL
Introduction to Spark SQL
Spark SQL, DataFrames, and Datasets
PySpark and Spark SQL
PySpark DataFrame Demo: Part 1
PySpark DataFrame Demo: Part 2
Snowflake-Snowflake Concepts and Architecture
What is Snowflake?
Accessing Snowflake
Detailed View Inside Snowflake
Snowflake Layers
Snowflake Web UI
Snowflake-Snowflake Demo
Navigating Snowflake
Snowsight: The Snowflake Web Interface
Creating a Table in Snowflake
Working with Warehouses
Snowflake Warehouses
Snowflake-Python Connector
Writing to Snowflake
Reading from Snowflake
Python Connector Documentation
Azure Databricks and MLflow-Getting Started with Databricks and Spark
What is Azure Databricks?
Accessing Databricks
Spark Notebooks with Databricks
Using Data with Databricks
Working with Workspaces in Databricks
Advanced Capabilities of Databricks
Azure Databricks and MLflow-Foundational PySpark Skills and Databricks
Introduction to Databricks Machine Learning
PySpark Introduction on Databricks
Exploring Azure Databricks Features
What is the Databricks File System (DBFS)?
Using the DBFS to AutoML Workflow
Load, Register, and Deploy ML Models
Serverless Compute with Databricks
Databricks Model Registry
Model Serving on Databricks
Azure Databricks and MLflow-Using MLflow with Databricks
What is MLOps?
MLOps Workflow on Azure Databricks
Exploring Open-Source MLflow Frameworks
Running MLflow with Databricks
Run MLflow Projects on Azure Databricks
End-to-End Databricks MLflow
Databricks Autologging with MLflow
Databricks Autologging
DataOps and Operations Methodologies-Getting Started with Kaizen Methodology for Data
Kaizen Methodology for Data
Introducing GitHub Codespaces
GitHub Codespaces Overview
Compiling Python in GitHub Codespaces
Getting Started with Amazon SageMaker Studio Lab
Walking through SageMaker Studio Lab
Teaching MLOps at Scale with GitHub (Optional)
Pytest Master Class (Optional)
DataOps and Operations Methodologies-Getting Started with DevOps for Data
What is DevOps?
DevOps Key Concepts
Getting Started with DevOps and Cloud Computing
Continuous Integration Overview
Build an NLP in Cloud9 with Python
Build a Continuously Deployed Containerized FastAPI Microservice
Hugo Continuous Deploy on AWS
Container-Based Continuous Delivery
DataOps and Operations Methodologies-Composing DataOps Solutions
What is DataOps?
DataOps and MLOps with Snowflake
Building Cloud Pipelines with Step Functions and Lambda
What is a Data Lake?
Data Warehouse vs. Feature Store
Big Data Challenges
Types of Big Data Processing
Real-World Data Engineering Pipeline
Data Feedback Loop
Benefits of Serverless ETL Technologies