Introduction to Data Privacy
What's private, and why do we care?
Data masking and data generation with Faker
Anonymizing with data generalization
Reducing identification risk with generalization
More on Privacy-Preserving Techniques
Anonymizing categorical data
Sampling from the same probability distribution
Anonymizing continuous data
Sampling from the best continuous distribution
Introduction to K-anonymity
Generalizing data using hierarchies
Using hierarchies for categorical data
Differential Privacy
Introduction to differential privacy
Privacy budgets
Exploring data with a privacy budget accountant
Differentially private machine learning models
Build a differentially private classifier
Predicting salaries
Differentially private clustering models
Anonymizing and Releasing Datasets
PCA for anonymization
Data masking with PCA
Generating realistic datasets with Faker
Creating synthetic datasets using scikit-learn
Generating datasets for classification
Generating datasets for clustering
Safely release datasets to the public
Great work!
HR_IBM.zip (50 KB)
income-dataset.csv (3.8 MB)
Mall_Customers.csv (3 KB)
2017-18_NBA_salary.csv (64 KB)