Introduction
Wrangling high-volume data with R
Sample data set
1. Problems and Opportunities with High-Volume Data
Perspectives on high-volume data
Big data and available memory
Code: Finding available memory
Big data and CPU cycles
Code: How fast is your computer?
2. Visualizing High-Volume Data
High-volume data and visualizations
Code: Graphs for high-volume data
Code: rug() and jitter()
Code: Applying statistics to plots
Code: Subsampled graphs for high-volume data
Code: Trellising data across multiple charts
3. Working within the R Programming Language
R programming tools for high-volume data
Downsampling
Profile R code to find inefficiencies
Code: Profile R code to find inefficiencies
Avoid the copy-on-modify problem with R
Code: Avoid copy-on-modify with data.table
Optimization versus readability
4. Advanced High-Volume Techniques
Compile R functions
Parallel processing with R
Code: Parallel R functions
bigmemory, LaF, and ff packages
5. Use R with External Big Data Solutions
Store high-volume data in a database
Code: R with databases
Cloud computing with R
Sparklyr with R
Code: R with Sparklyr
Conclusion
Summary of high-volume data with R
Ex_Files_R_Programming_Data_Volume.zip (1.0 MB)