ds4ad

Data science for administrative datasets

This course is based in part on the Software Carpentry curriculum.

Setup:

Day 1: Foundations

9 - 10: Introduction (Ariel)

10 - noon: Foundations of programming in Python (Jose)

Noon - 1 PM: lunch

1 - 4 PM: git and GitHub (Bryna)

Homework day 1:

Set up your own GitHub project

Code Challenges

Write two functions, outer and fence, that behave as follows:

>>> print(outer('helium'))
hm
>>> print(fence('name', '*'))
*name*
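
If you get stuck, here is one possible sketch of the two functions. It assumes, based only on the examples above, that outer returns the first and last characters of a string and that fence wraps a string in a given character. The parameter names (text, original, wrapper) are made up for illustration, and there are other equally valid ways to write these.

```python
def outer(text):
    """Return the first and last characters of text.

    Assumes text is non-empty; an empty string would raise an IndexError.
    """
    return text[0] + text[-1]


def fence(original, wrapper):
    """Return original with wrapper added at both ends."""
    return wrapper + original + wrapper


print(outer('helium'))
print(fence('name', '*'))
```

Try your own version first, then compare: indexing with -1 picks the last character, and + joins strings.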

Bring a use-case

Tomorrow, tell us about a data use-case that you have in mind for your work:

  1. What is the data?
  2. What are some questions you would like to answer with this data?
  3. How is the data currently stored?

Day 2: Data munging

9 - noon: Introducing Pandas (Bryna)

Noon - 1 PM: lunch

1 - 4 PM: Manipulating data with Pandas (Ariel)

Day 3: Data analysis and data visualization

9 - noon: Computations and statistics with Pandas DataFrames (Ariel)

Noon - 1 PM: lunch

Afternoon:

1 - 2:30 PM: Visualizing data with Matplotlib (Jose)

2:30 - 4 PM: Next steps – where do we go from here? (Bryna + Jose (+ Ariel))