- Step 1: Teaching Sessions -- How-to sessions with basic experiments to get familiar with finding datasets and doing something useful with them.
- Step 2: Do-it-yourself -- UB-CSE mentors + HS Students
- Step 3: A contest of some sort (Data Hackathon)
- Step 4: Pursue 1-2 instances?
After completing the program, participants should know how to...
- ... perform basic statistical and visual analysis using tools like Excel, Google Sheets, and Google Fusion Tables.
- ... use analytics languages/tools like Python, NumPy, and Jupyter.
- ... generate commonly used data visualizations including 2- and 3-D plots, scatter plots, histograms, and geospatial plots.
- ... locate datasets pertinent to a topic of interest, understand formatting conventions (JSON, CSV, XML), and obtain 'deep web' data through REST APIs.
- ... prepare the data for analysis.
- ... evaluate the quality and reliability of a dataset.
1) Basic Data Vis
Tools: Google Sheets, Google Maps, possibly Fusion Tables
Concepts: Basic data formats (CSV, TSV), Data errors (outliers, missing data), Summarizing data, Visualizing data, Visualizing spatial data in Google Earth
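The session itself is planned around spreadsheets, but the data-error concepts can also be sketched in a few lines of Python for mentors who want a reference. This is a minimal sketch under stated assumptions: the sample readings, the names `load_temps`, `find_outliers`, and `summarize`, and the "5 times the median absolute deviation" threshold are all illustrative choices, not part of the program materials.

```python
import csv
import io
import statistics

# Hypothetical sample data: the blank reading and the 250.0 are deliberate errors.
RAW_CSV = """city,temp_f
Buffalo,68.0
Rochester,70.5
Syracuse,
Albany,71.0
Ithaca,250.0
Utica,69.5
"""


def load_temps(text):
    """Parse CSV text into ({city: reading}, [cities with a missing reading])."""
    readings, missing = {}, []
    for row in csv.DictReader(io.StringIO(text)):
        value = row["temp_f"].strip()
        if value:
            readings[row["city"]] = float(value)
        else:
            missing.append(row["city"])
    return readings, missing


def find_outliers(readings, k=5.0):
    """Flag cities whose reading is more than k median absolute deviations
    from the median (one simple rule of thumb among many)."""
    values = list(readings.values())
    center = statistics.median(values)
    mad = statistics.median(abs(v - center) for v in values)
    if mad == 0:
        return []
    return [city for city, v in readings.items() if abs(v - center) > k * mad]


def summarize(values):
    """Return the basic summary statistics a spreadsheet would show."""
    values = list(values)
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
    }


if __name__ == "__main__":
    readings, missing = load_temps(RAW_CSV)
    outliers = find_outliers(readings)
    clean = [v for city, v in readings.items() if city not in outliers]
    print("missing:", missing)
    print("outliers:", outliers)
    print("summary:", summarize(clean))
```

The same steps map directly onto spreadsheet work: blank cells are the missing data, the flagged row is the outlier, and the summary corresponds to functions such as AVERAGE and MEDIAN.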
Suggested Resources for Open Data
2) Advanced Data Vis
Tools: Python, Jupyter, NumPy
Concepts: Notebook UIs, Data import (JSON, XML), Programmatic Data Vis (NumPy)
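As a possible starting point for the data-import concept, the sketch below reads the same records from JSON and from XML using only the Python standard library. The sample documents, the field names (`city`, `temp_f`), and the helper names `from_json` and `from_xml` are assumptions made for illustration. In a notebook, the resulting list could then be handed to NumPy or a plotting library.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample documents carrying the same two records.
JSON_TEXT = """[
  {"city": "Buffalo", "temp_f": 68.0},
  {"city": "Albany", "temp_f": 71.0}
]"""

XML_TEXT = """<readings>
  <reading city="Buffalo"><temp_f>68.0</temp_f></reading>
  <reading city="Albany"><temp_f>71.0</temp_f></reading>
</readings>"""


def from_json(text):
    """Parse a JSON array of objects into (city, temperature) tuples."""
    return [(rec["city"], float(rec["temp_f"])) for rec in json.loads(text)]


def from_xml(text):
    """Parse <reading> elements into (city, temperature) tuples."""
    root = ET.fromstring(text)
    return [
        (node.get("city"), float(node.findtext("temp_f")))
        for node in root.iter("reading")
    ]


if __name__ == "__main__":
    print(from_json(JSON_TEXT))
    print(from_xml(XML_TEXT))
```

Both functions return the same list, which makes the point that the format is only packaging: once imported, the data is analyzed the same way.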
3) Web APIs
Tools: Python, Jupyter, Twitter, WUnderground
Concepts: REST, URL requests, Caching, Access Tokens/Developer Keys
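The concepts for this session (URL requests, developer keys, caching) could be illustrated roughly as follows. Everything service-specific here is hypothetical: `https://api.example.com/v1/conditions` is a placeholder endpoint, the `city` and `key` query parameters are invented, and `CachedClient` is an illustrative name. Real services such as Twitter or WUnderground define their own URLs, parameters, and authentication schemes, so their documentation should be consulted.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint, used only to show the shape of a REST request.
BASE_URL = "https://api.example.com/v1/conditions"


def build_url(city, api_key):
    """Encode the query parameters, including the developer key, into a URL."""
    query = urllib.parse.urlencode({"city": city, "key": api_key})
    return f"{BASE_URL}?{query}"


def http_get(url):
    """Fetch a URL and return the response body as text."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8")


class CachedClient:
    """Remember responses so a repeated request does not hit the network again."""

    def __init__(self, api_key, fetch=http_get):
        self.api_key = api_key
        self.fetch = fetch  # injectable, so the class can be tried offline
        self.cache = {}

    def conditions(self, city):
        url = build_url(city, self.api_key)
        if url not in self.cache:
            self.cache[url] = json.loads(self.fetch(url))
        return self.cache[url]


if __name__ == "__main__":
    # Offline demonstration with a stand-in for the network call.
    def fake_fetch(url):
        print("fetching", url)
        return '{"temp_f": 68.0}'

    client = CachedClient("DEMO_KEY", fetch=fake_fetch)
    print(client.conditions("Buffalo"))
    print(client.conditions("Buffalo"))  # served from the cache, no second fetch
```

Caching matters in a classroom setting because many APIs limit the number of requests per key, and a room full of students re-running notebook cells can exhaust that limit quickly.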
4) Data On The Net
Tools: Python, Jupyter, Wrangler, Search engines
Concepts: Finding Data, Preparing Data, Evaluating Data Quality
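One way to make "preparing data" and "evaluating data quality" concrete is sketched below: normalize field names and values, drop duplicate rows, then report how complete each field is. The sample records and the helper names `prepare` and `completeness` are hypothetical, and completeness is only one of several possible quality measures.

```python
# Hypothetical raw records with inconsistent headers, stray whitespace,
# a duplicate row, and a blank value.
RAW = [
    {"Name ": " Alice", "Score": "90"},
    {"name": "Alice", "score": "90"},
    {"name": "Bob", "score": ""},
    {"name": "Cara", "score": "85"},
]


def prepare(records):
    """Normalize keys and values, turn blanks into None, drop duplicate rows."""
    cleaned, seen = [], set()
    for rec in records:
        row = {}
        for key, value in rec.items():
            if isinstance(value, str):
                value = value.strip() or None  # blank string becomes None
            row[key.strip().lower()] = value
        fingerprint = tuple(sorted(row.items(), key=lambda kv: kv[0]))
        if fingerprint not in seen:
            seen.add(fingerprint)
            cleaned.append(row)
    return cleaned


def completeness(rows):
    """Return, for each field, the fraction of rows with a non-missing value."""
    fields = {key for row in rows for key in row}
    return {
        field: sum(row.get(field) is not None for row in rows) / len(rows)
        for field in fields
    }


if __name__ == "__main__":
    rows = prepare(RAW)
    print(rows)
    print(completeness(rows))
```

A low completeness score for a field is a prompt for discussion rather than a verdict: students can ask why the values are missing and whether the dataset is still reliable enough for their question.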