Sociology 504: Advanced data analysis for the social sciences

Princeton University, Spring 2015


Instructor: Matthew Salganik

Preceptor: Angela Dixon

Overview

This course is the second in the two-semester sequence for Ph.D. students in Sociology. In this course, students will learn the statistical and computational principles necessary to perform modern, flexible, and creative analysis of quantitative social data. This course aims to transform students from consumers of quantitative research into producers of it.

See the logistics page for more information about time and location, prerequisites, software, code conventions, collaboration policy, Piazza, and inspirations.

Goals

By the end of the semester, you will be able to:

Further, because we cannot possibly cover everything that you will need to know during your career as a researcher, there are two final long-term goals. After this course is over, you will be able to:

Assignments

There are three main types of assignments for students:

GitHub

All class materials are available from our class GitHub page.

Open access

I have marked open access and closed access materials with distinct icons. If you do not have access to a university library, copies of many of the closed access articles can be found through Google Scholar.

Schedule

Week | Monday | Monday Lab | Wednesday
1 | Introduction | Lab-Transforming data with dplyr | Doing data analysis: An introduction to software engineering
2 | Visualization | Lab-Visualizing data with ggplot2 | Version control with git and github
3 | Regression and diagnostics | Lab-Running regressions | Multiple regression and diagnostics
4 | Dummy variables and interaction | Lab-Dummy variables and interactions | Dummy variables and interactions in practice
5 | Statistical inference for regression | Lab-Loops and functions | Beyond star gazing
6 | Matrix approach to regression | Lab-Reproduction studio | Maximum likelihood approach to regression
Spring Break
7 | Causal inference and potential outcomes | Lab-Turning tables into graphs | Causal graphs
8 | Conditioning and matching for causal inference | Lab-Replication studio | Regression, causal inference, and shoe leather
9 | Logit and probit models for categorical response variables | Lab-Working with logit and probit coefficients | Logit and probit models: Not as simple as you thought
10 | Models for polytomous data | Lab-Working with models for polytomous data | Generalized linear model and models for count data
11 | Making simple (and complex) models more flexible and interesting | Lab-Hurricanes! | Multilevel modeling
12 | Sampling, networks, and hidden populations | Lab-Project presentations | Cautions, warnings, and wisdom


Introduction, 2015-02-02

Before class: Optional after class:

Lab-Transforming data with dplyr, 2015-02-02

Before class: Optional after class:

Doing data analysis: An introduction to software engineering, 2015-02-04

Before class: Optional after class:

Visualization, 2015-02-09

Before class: Optional after class:

Lab-Visualizing data with ggplot2, 2015-02-09

Before class: Optional after class:

Version control with git and github, 2015-02-11

Before class: Optional after class:

Regression and diagnostics, 2015-02-16

Before class: Optional after class:

Lab-Running regressions, 2015-02-16

Before class: Optional after class:

Multiple regression and diagnostics, 2015-02-18

Before class: Optional after class:

Dummy variables and interaction, 2015-02-23

Before class: Optional after class:

Lab-Dummy variables and interactions, 2015-02-23

Before class: Optional after class:

Dummy variables and interactions in practice, 2015-02-25

Before class: Optional after class:

Statistical inference for regression, 2015-03-02

Before class: Optional after class:

Lab-Loops and functions, 2015-03-02

Before class: Optional after class:

Beyond star gazing, 2015-03-04

Before class: Optional after class:

Matrix approach to regression, 2015-03-09

Before class: Optional after class:

Lab-Reproduction studio, 2015-03-09

Before class: Optional after class:

Maximum likelihood approach to regression, 2015-03-11

Before class: Optional after class:

Causal inference and potential outcomes, 2015-03-23

Before class:

Lab-Turning tables into graphs, 2015-03-23

Before class:

Causal graphs, 2015-03-25

Before class: Optional after class:

Conditioning and matching for causal inference, 2015-03-30

Before class: Optional after class:

Lab-Replication studio, 2015-03-30

Before class:

Regression, causal inference, and shoe leather, 2015-04-01

Before class:

Logit and probit models for categorical response variables, 2015-04-06

Before class:

Lab-Working with logit and probit coefficients, 2015-04-06

Before class:

Logit and probit models: Not as simple as you thought, 2015-04-08

Before class: Optional after class:

Models for polytomous data, 2015-04-13

Before class:

Lab-Working with models for polytomous data, 2015-04-13

Before class:

Generalized linear model and models for count data, 2015-04-15

Before class:

Making simple (and complex) models more flexible and interesting, 2015-04-20

Before class:

Lab-Hurricanes!, 2015-04-20

Before class:

Multilevel modeling, 2015-04-22

Before class:

Sampling, networks, and hidden populations, 2015-04-27

Before class: Optional after class:

Lab-Project presentations, 2015-04-27

Before class:

NOTE: This lab will end at 4:30 so that we can attend the Tumin Lecture.

Cautions, warnings, and wisdom, 2015-04-29

Before class:


This work is licensed under a Creative Commons Attribution 4.0 International License.