Spring 2013

This course provides an introduction to social statistics.

There will be weekly problem sets given out each Wednesday and due the following Wednesday. Many of the problem sets will involve the statistical package R. You will be required to submit the code you used to complete your assignment and that code must comply with Google's R style guide. I know that the choice of R will lead to difficulties in the beginning of the semester, but there will be big payoffs later as you become more familiar with it. Regarding R, we will start slowly and assume that you do not have any programming experience.

You are encouraged to work together on the problem sets, but you must type all of your code yourself. That is, no copy-and-paste from other people's code. You would not copy-and-paste from someone else's paper, and you should treat code the same way.

In addition to problem sets, students will be expected to complete a final project. I will provide additional details about the final project in class. These final projects will be due Tuesday, May 14th (Dean's Day); no extensions will be given.

This class is required for all first-year Ph.D. students in Sociology. If you are not a graduate student in Sociology, please talk to me about whether this is the right course for you.

We will be using one book this semester:

- Applied Regression Analysis and Generalized Linear Models by John Fox (2nd edition).

We will also read chapters from the following books, but these will be available on Blackboard:

- Visual Explanations by Edward Tufte.
- Regression Analysis: A Constructive Critique by Richard Berk.
- Observational Studies by Paul Rosenbaum.
- Design of Observational Studies by Paul Rosenbaum.
- Counterfactuals and Causal Inference by Stephen Morgan and Christopher Winship.

**A note on the use of open access scholarship**: Because of the prohibitive cost of academic journals, many of the assigned readings for this course are available only to people with access to a university library. I have marked these closed access articles with a . Fortunately, some of the more recent scholarship in this area is freely available to everyone in the world. I have marked these open access articles with a . It is my hope that eventually I will be able to construct this syllabus using exclusively open access scholarship. In the meantime, copies of many of the closed access articles can be found through Google Scholar.

Below are the reading assignments for each week. You should come to class having looked at this material, and you should read it roughly in the order listed. I will distinguish the Fox books by calling them Fox and Fox (R book).

- Vance, A. (2008). "Data Analysts Captivated by R's Power." *The New York Times*.
- Google's R style guide
- Help with installing R and RStudio

- Tufte, Visual Explanations, Chapter 2. (Available from Blackboard)
- Gelman, A., Pasarica, C., and Dodhia, R. (2002). "Let's Practice What We Preach: Turning Tables into Graphs." *The American Statistician* 56(2):121-130.
- Kastellec, J.P. and Leoni, E.L. (2007). "Using Graphs Instead of Tables in Political Science." *Perspectives on Politics* 5(4):755-771. [see also the code repository]
- Fox, Chapter 3.

- Google's R style guide
- Code repository from Kastellec, J.P. and Leoni, E.L. (2007). "Using Graphs Instead of Tables in Political Science." *Perspectives on Politics* 5(4):755-771.

- Fox, Chapter 5.2
- Berk, Chapter 6. (Available from Blackboard)
- Berk, Chapter 7. (Available from Blackboard)

- No new reading

- Taubes, G. (2007). "Do we really know what makes us healthy?" *New York Times Magazine*.
- Morgan, S.L. and Winship, C. (2007). *Counterfactuals and Causal Inference*, Chapters 1 and 2. (Available from Blackboard)
- Rosenbaum, P. (2010). *Design of Observational Studies*, Chapter 1. (Available from Blackboard)

- Cornfield, J. et al. (1959). "Smoking and lung cancer: Recent evidence and a discussion of some questions." *Journal of the National Cancer Institute*, 22(1):173-203. (Available from Blackboard)
- Nasar, S. (1993). "David Card and Alan Krueger; Two Economists Catch Clinton's Eye By Bucking the Common Wisdom." *New York Times*.
- Card, D. and Krueger, A. B. (1994). "Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New Jersey and Pennsylvania." *The American Economic Review*, 84(4):772-793.
- Neumark, D. and Wascher, W. (2000). "Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New Jersey and Pennsylvania: Comment." *The American Economic Review*, 90(5):1362-1396.
- Card, D. and Krueger, A. B. (2000). "Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New Jersey and Pennsylvania: Reply." *The American Economic Review*, 90(5):1397-1420.
- Pager, D., Bonikowski, B., and Western, B. (2009). "Discrimination in a Low-Wage Labor Market: A Field Experiment." *American Sociological Review*, 74(5):777-799.

- No new reading

- Brambor, T., Clark, W.R., and Golder, M. (2006). Understanding Interaction Models: Improving Empirical Analyses. *Political Analysis* 14:63-82.
- Greenman, E. and Xie, Y. (2008). Double Jeopardy? The Interaction of Gender and Race on Earnings in the United States. *Social Forces* 86(3):1217-1244.

- No new reading

- Kline, R. (2004). *Beyond Significance Testing*, Chapter 3. [Available from Blackboard]
- King, G., Tomz, M., and Wittenberg, J. (2000). Making the Most of Statistical Analyses: Improving Interpretation and Presentation. *American Journal of Political Science*, 44(2):341-355.
- Ward, M.D., Greenhill, B.D., and Bakke, K.M. (2010). The perils of policy by p-value: Predicting civil conflicts. *Journal of Peace Research*, 47(4):363-375.

- No new reading

- Fox, Appendix B.1: Matrices.
- Fox, Chapter 9, Sections: 9.1 - 9.2. (skip sections 9.1.1 and 9.1.2)

- Little, R.J.A. and Rubin, D.B. (2002). *Statistical Analysis with Missing Data*: Chapter 1 (Introduction). [Available from Blackboard]

- Fox, Chapter 11. (read all starred sections; Sec. 11.8 will be a good review of Chapter 9)
- Fox, Chapter 12. (read all starred sections)
- Fox, Chapter 13. (skip all starred sections)

- Fountain, H. (2006). The lonely American just got a bit lonelier. *The New York Times*, July 2.
- McPherson, M., Smith-Lovin, L., and Brashears, M.E. (2006). Social isolation in America: Changes in core discussion networks over two decades. *American Sociological Review*, 71(3):353-375.
- Fischer, C. (2009). The 2004 GSS finding of shrunken social networks: An artifact? *American Sociological Review*, 74(4):657-669.
- McPherson, M., Smith-Lovin, L., and Brashears, M.E. (2009). Models and marginals: Using survey evidence to study social networks. *American Sociological Review*, 74(4):670-681.
- Holford, T.R. (2005). Age-Period-Cohort Analysis. In *Encyclopedia of Biostatistics*. (NOTE: Please only read section 1)
- Jasso, G. (1985). Marital Coital Frequency and the Passage of Time: Estimating the Separate Effects of Spouses' Ages and Marital Duration, Birth and Marriage Cohorts, and Period Influences. *American Sociological Review*, 50(2):224-241.
- Kahn, J.R. and Udry, J.R. (1986). Marital Coital Frequency: Unnoticed Outliers and Unspecified Interactions Lead to Erroneous Conclusions. *American Sociological Review*, 51(5):734-737.
- Jasso, G. (1986). Is It Outlier Deletion or Is It Sample Truncation? Notes on Science and Sexuality. *American Sociological Review*, 51(5):738-742.

- No new reading

- Fox, Chapter 14.1
- Hanmer, M.J. and Kalkan, K.O. (2013). Behind the Curve: Clarifying the Best Approach to Calculating Predicted Probabilities and Marginal Effects from Limited Dependent Variable Models. *American Journal of Political Science*, 57(1):263-277.
- Greenhill, B., Ward, M.D., and Sacks, A. (2011). The Separation Plot: A new visual method for evaluating the fit of binary models. *American Journal of Political Science*, 55(4):991-1002.

- Slez, A. (2012). What's the matter with logistic regression? Blog post at Bad Hessian.
- Mood, C. (2010). Logistic Regression: Why We Cannot Do What We Think We Can Do, and What We Can Do About It. *European Sociological Review*, 26(1):67-82.
- Berry, W.D., DeMeritt, J.H.R., and Esarey, J. (2010). Testing for Interaction in Binary Logit and Probit Models: Is a Product Term Essential? *American Journal of Political Science*, 54(1):248-266.

- No new reading

- Fox, Chapter 14.2.
- Quillian, L. and Pager, D. (2001). Black Neighbors, Higher Crime? The Role of Racial Stereotypes in Evaluations of Neighborhood Crime. *American Journal of Sociology*, 107(3):717-767.

- Fox, Chapter 15.1.
- Zheng, T., Salganik, M.J., and Gelman, A. (2006). How Many People Do You Know in Prison?: Using Overdispersion in Count Data to Estimate Social Structure in Networks. *Journal of the American Statistical Association*, 101(474):409-423.

- No new reading

- Snijders, T.A.B. and Bosker, R.J. (2012). *Multilevel Analysis (2nd edition)*: Chapter 1 (Introduction), Chapter 2 (Multilevel Theories, Multistage Sampling, and Multilevel Models), Chapter 4 (The Random Intercept Model), Chapter 5 (The Hierarchical Linear Model). [Available from Blackboard]
- Xie, Y. and Hannum, E. (1996). Regional Variation in Earnings Inequality in Reform-Era Urban China. *American Journal of Sociology*, 101(4):950-992. (Note: Focus especially on pages 950-972 and the appendix)

- Thompson, C. (2009). Are your friends making you fat? *New York Times Magazine*, September 13.
- Christakis, N.A. and Fowler, J.H. (2007). The spread of obesity in a large social network over 32 years. *New England Journal of Medicine*, 357:370-379.
- Kolata, G. (2011). Catching obesity from friends may not be so easy. *New York Times*, August 8.
- Lyons, R. (2011). The spread of evidence-poor medicine via flawed social-network analysis. *Statistics, Politics and Policy*, 2(1): Article 2.
- Wimmer, A. and Lewis, K. (2010). Beyond and below racial homophily: ERG models of a friendship network documented on Facebook. *American Journal of Sociology*, 116(2):583-642.

- No new reading.

- Clampet-Lundquist, S. and Massey, D.S. (2008). Neighborhood Effects on Economic Self-Sufficiency: A Reconsideration of the Moving to Opportunity Experiment. *American Journal of Sociology*, 114(1):107-143.
- Ludwig, J. et al. (2008). What Can We Learn about Neighborhood Effects from the Moving to Opportunity Experiment? *American Journal of Sociology*, 114(1):144-188.
- Sampson, R.J. (2008). Moving to Inequality: Neighborhood Effects and Experiments Meet Social Structure. *American Journal of Sociology*, 114(1):189-231.

- Fox, Chapter 1.
- Berk, R. (2003). *Regression Analysis: A Constructive Critique*: Chapter 11 (What to do). [Available from Blackboard]
- Rosenbaum, P.R. (2002). *Observational Studies*: Chapters 11 (Planning an observational study) and 12 (Some strategic issues). [Available from Blackboard]

- Freese, J. (2009). "Secondary Analysis of Large Social Surveys." In *Research Confidential: Solutions to Problems Most Social Scientists Pretend They Never Have*, Hargittai, E. (ed). [Available from Blackboard]

- Gelman, A. (2007). "Struggles with Survey Weighting and Regression Modeling." *Statistical Science* 22(2):153-164. (see also discussions and rejoinder)
- Winship, C. and Radbill, L. (1994). "Sampling Weights and Regression Analysis." *Sociological Methods and Research*, 23(2):230-257.
- Korn, E.L. and Graubard, B.I. (1995). "Examples of Differing Weighted and Unweighted Estimates from a Sample Survey." *The American Statistician*, 49(3):291-295.
- Survey analysis in R
- Kish, L. (1992). "Weighting for Unequal Pi." *Journal of Official Statistics*, 8(2):183-200.
- Karlson, K.B., Holm, A., and Breen, R. (2012). Comparing Regression Coefficients Between Same-sample Nested Models Using Logit and Probit: A New Method. *Sociological Methodology*, 42(1):286-313.

This course material is licensed under a Creative Commons Attribution 3.0 Unported License.