(Sociology 596) Computational Social Science: Social Research in the Digital Age

Note: This course is officially listed as "Web-based Social Research"
Princeton University
Fall 2012
Time: Tuesday 2pm-5pm (second half of semester)
Location: 190 Wallace Hall
Instructor: Matthew Salganik

In the last decade we have witnessed the birth and rapid growth of Wikipedia, Google, Facebook, iPhones, Wi-Fi, YouTube, Twitter, and numerous other marvels of the digital age. In addition to changing the way we live, these tools---and the technological revolution they are a part of---have fundamentally changed the way that we can learn about the social world. We can now collect data about human behavior on a scale never before possible and with tremendous granularity and precision. The ability to collect and process "big data" enables researchers to address core questions in the social sciences in new ways and opens up new areas of inquiry.

This course on computational social science will emphasize social science rather than computation. We will focus on how traditional concepts of research design in the social sciences can inform our understanding of new data sources, and how these new data sources might require us to update our thinking on research design.

Now a little about mechanics. Each three-hour class will consist of a general discussion based on several readings. Then, students will take turns presenting specific papers that apply the ideas from the general discussion. Students are expected to come to class prepared for the general discussion and to present a few articles over the course of the semester. There will be no exam.

Your grade will be based on two components: preparation for and participation in the general discussions, and your presentations of assigned papers.

There are no official prerequisites for the course, and students from all departments are welcome. Undergraduates interested in taking the course should contact the instructor for permission.

Meeting 1 (11/6/12) Introduction and Ethics

In this first class we will cover a broad overview of web-based research, focusing on both strengths and weaknesses. We will also discuss ethical issues that will arise throughout the course.

For general discussion

  • Lazer, D. et al. 2009. Computational social science. Science, 323:721-723.
  • Watts, D.J. 2007. A twenty-first century science. Nature, 445:489.
  • Anderson, C. 2008. The End of Theory: The Data Deluge Makes the Scientific Method Obsolete. Wired.
  • Simonite, T. 2012. What Facebook Knows. MIT Technology Review.
  • Barbaro and Zeller. 2006. A Face is Exposed for AOL Searcher No. 4417749. New York Times, August 9.
  • Nissenbaum, H. 2010. Privacy in Context, Stanford University Press. Introduction.
  • King, G. 2011. Ensuring the Data-Rich Future of the Social Sciences. Science, 331(6018):719-721.
  • Zimmer, M. 2010. "But the data is already public": on the ethics of research in Facebook. Ethics and Information Technology, 12:313-325.
  • boyd and Crawford. 2011. Six Provocations for Big Data. Working paper.
Meeting 2 (11/13/12) Individual experiments

The web offers numerous advantages over the traditional laboratory for conducting social science experiments. First, the web allows researchers to conduct experiments at a completely different scale: lab experiments are limited to hundreds of participants, but web-based experiments involving tens of thousands of participants have already been conducted, and larger experiments are becoming increasingly practical. The web also gives researchers access to a much broader pool of participants and allows them to study decision making in a more natural environment. But conducting experiments on the web also has drawbacks, including unknown participant pools and limited control over participants. In this meeting we will discuss four types of web-based experiments where the unit of analysis is an individual: A/B tests on existing sites, overlaid experiments on existing sites, quasi-experiments, and experiments using micro-payment platforms (e.g., Amazon's Mechanical Turk). The strengths and weaknesses of the various approaches will be compared.
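
To make the mechanics concrete, here is a minimal sketch, in Python, of the randomization and analysis behind a simple A/B test. The data, effect size, and function names are all invented for illustration; production systems typically assign conditions using a stable hash of the user id rather than a fresh random draw.

    import random
    from math import sqrt
    from statistics import mean, stdev

    def assign_condition(p_treatment=0.5):
        # Each arriving user is randomized into one of two conditions.
        return "treatment" if random.random() < p_treatment else "control"

    def difference_in_means(treated, control):
        # Estimated treatment effect and a rough z-statistic.
        effect = mean(treated) - mean(control)
        se = sqrt(stdev(treated) ** 2 / len(treated) +
                  stdev(control) ** 2 / len(control))
        return effect, effect / se

    # Simulated outcomes (e.g., clicks per session) under each condition.
    random.seed(0)
    control = [random.gauss(1.00, 0.5) for _ in range(10000)]
    treated = [random.gauss(1.05, 0.5) for _ in range(10000)]
    effect, z = difference_in_means(treated, control)
    print("estimated effect: %.3f (z = %.1f)" % (effect, z))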

For general discussion

  • Doleac and Stein. 2010. The Visible Hand: Race and Online Market Outcomes. Working paper.
  • Kohavi, Deng, Frasca, Longbotham, Walker, and Yu. 2012. Trustworthy Online Controlled Experiments: Five Puzzling Outcomes Explained. KDD.
  • Bakshy, Eckles, Yan, and Rosenn. 2012. Social Influence in Social Advertising: Evidence from Field Experiments. EC.
  • Mas and Moretti. 2009. Peers at Work. American Economic Review, 99(1):112-145.
  • Einav, Kuchler, Levin, and Sundaresan. 2011. Learning from Seller Experiments in Online Markets. NBER Working Paper No. 17385.
  • Berinsky, Huber, and Lenz. 2012. Evaluating Online Labor Markets for Experimental Research: Amazon.com's Mechanical Turk. Political Analysis, 20:351-368.
  • Horton, Rand, and Zeckhauser. 2011. The online laboratory: conducting experiments in a real labor market. Experimental Economics, 14(3):399-425.
For presentation

  • Kohli, Bachrach, Stillwell, Kearns, Herbrich, and Graepel. 2012. Colonel Blotto on Facebook: The Effect of Social Relations on Strategic Interaction. Web Science'12.
  • Bakshy, Rosenn, Marlow, and Adamic. 2012. The role of social networks in information diffusion. WWW.
  • Aral and Walker. 2011. Creating Social Contagion Through Viral Product Design: A Randomized Trial of Peer Influence in Networks. Management Science, 57: 1623-1639.
  • Restivo and van de Rijt. 2012. Experimental Study of Informal Rewards in Peer Production. PLoS ONE, 7:e34358.
  • Broockman and Green. 2012. Can Facebook Advertisements Increase Political Candidates' Name Recognition and Favorability? Evidence from a Randomized Field Experiment. Working paper.
  • Horton. 2012. Computer-Mediated Matchmaking: Facilitating Employer Search and Screening. Working paper.
  • Tucker and Zhang. 2011. How Does Popularity Information Affect Choices? A Field Experiment. Management Science, 57:828-842.
  • Mason and Watts. 2009. Financial incentives and the performance of crowds. KDD.
  • Horton. 2010. Employer Expectations, Peer Effects and Productivity: Evidence from a Series of Field Experiments. Working paper.
  • Mason and Suri. 2012. A Guide to Behavioral Experiments on Mechanical Turk. Behavior Research Methods, 44(1), 1-23.
Meeting 3 (11/20/12) Collective experiments

The web also allows for collective experiments, where the unit of analysis is a group rather than an individual. These collective experiments introduce numerous logistical complications, but they can be used to address questions that are otherwise extremely difficult to study.
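
As a toy illustration of the multiple-worlds design in the Salganik, Dodds, and Watts reading below, here is a minimal Python simulation in which several independent groups of listeners generate their own popularity rankings under social influence. Every parameter and number is invented; this is a sketch of the logic, not the actual experimental platform.

    import random

    def gini(xs):
        # Gini coefficient: 0 = perfect equality, 1 = maximal inequality.
        xs = sorted(xs)
        n = len(xs)
        cum = sum((i + 1) * x for i, x in enumerate(xs))
        return (2 * cum) / (n * sum(xs)) - (n + 1) / n

    # Simulate several independent "worlds" in which early, random
    # differences in popularity are amplified by social influence.
    random.seed(1)
    n_worlds, n_songs, n_listeners = 8, 48, 5000
    for world in range(n_worlds):
        downloads = [1] * n_songs  # every song starts with one download
        for _ in range(n_listeners):
            # Listeners choose songs in proportion to current popularity.
            song = random.choices(range(n_songs), weights=downloads)[0]
            downloads[song] += 1
        top = max(range(n_songs), key=downloads.__getitem__)
        print("world %d: Gini = %.2f, top song = #%d"
              % (world, gini(downloads), top))

Because each world is an independent replicate, variation across worlds in which song ends up on top is itself a measure of unpredictability.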

For general discussion

  • Hedstrom. 2006. Experimental macro sociology: Predicting the next best seller. Science, 311:786-787.
  • Salganik, Dodds, and Watts 2006. Experimental study of inequality and unpredictability in an artificial cultural market. Science, 311:854-856 (also read supporting online materials).
  • Suri and Watts. 2011. Cooperation and contagion in web-based, networked public goods experiments. PLoS ONE, 6:e16836.
  • van der Leij. 2011. Experimenting with Buddies. Science, 334:1220-1221.
  • Centola. 2010. The spread of behavior in an online social network experiment. Science, 329:1194-1197 (also read supporting online materials).
  • Centola. 2011. An Experimental Study of Homophily in the Adoption of Health Behavior. Science, 334:1269-1272 (also read supporting online material).
For presentation

  • Salganik and Watts. 2008. Leading the herd astray: Experimental study of self-fulfilling prophecies in an artificial cultural market. Social Psychology Quarterly, 71:338-355.
  • Salganik and Watts. 2009. Web-Based Experiments for the Study of Collective Social Dynamics in Cultural Markets. Topics in Cognitive Science, 1:439-468.
  • Mason and Watts. 2012. Collaborative learning in networks. Proceedings of the National Academy of Sciences, 109(3):764-769.
  • Wang, Suri, and Watts. 2012. Cooperation and assortativity with dynamic partner updating. Proceedings of the National Academy of Sciences, 109(36):14363-14368.
  • Isaac, Walker, and Williams. 1994. Group size and the voluntary provision of public goods: Experimental evidence utilizing large groups. Journal of Public Economics, 54:1-36.
Meeting 4 (11/27/12) Mobile phones and wearable sensors

There are approximately four billion mobile phones in the world. While these devices are often thought of as "phones," the newest wave of "smartphones" that are increasingly dominant in developed countries are actually sophisticated mobile computers that offer amazing opportunities for researchers. In this class we will discuss the two main forms of research using mobile phones and wearable sensors: research that uses data collected from individual devices and research that uses aggregate data collected by mobile phone companies. Within the category of research that uses individual devices, we will distinguish between research that uses phones and research that uses custom-built devices. We will also distinguish between active and passive data collection.
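
As a small example of what passive data collection can yield, here is a Python sketch that turns a hypothetical proximity log, the kind of record produced by the sensor-based studies below, into a weighted contact network. The log format, device names, and threshold are invented for illustration.

    from collections import defaultdict

    # Hypothetical passive-sensing log: (timestamp, device_a, device_b)
    # means the two devices were within radio range of each other.
    proximity_log = [
        (100, "A", "B"), (160, "A", "B"), (220, "A", "B"),
        (300, "B", "C"), (305, "A", "C"),
    ]

    def contact_network(log, min_observations=2):
        # Count co-detections per pair, then keep only pairs seen at
        # least min_observations times to filter out chance passes.
        counts = defaultdict(int)
        for _, a, b in log:
            counts[tuple(sorted((a, b)))] += 1
        return {pair: n for pair, n in counts.items()
                if n >= min_observations}

    print(contact_network(proximity_log))  # {('A', 'B'): 3}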

For general discussion

  • Eagle. 2010. Mobile Phones as Sensors for Social Research. In The Handbook of Emergent Technologies in Social Research. Hesse-Biber (Ed.). [to be posted on Blackboard]
  • Kaplan and Stone. 2012. Bringing the Laboratory and Clinic to the Community: Mobile Technologies for Health Promotion and Disease Prevention. Annual Review of Psychology, in press.
  • Palmer, Espenshade, Bartumeus, Chung, Ozgencil, and Li. New Approaches to Human Mobility: Using Mobile Phones for Demographic Research. Demography, forthcoming.
  • Gething and Tatem. 2011. Can Mobile Phone Data Improve Emergency Response to Natural Disasters? PLoS Medicine, 8(8):e1001085.
  • Bengtsson, Lu, Thorson, Garfield, von Schreeb. 2011. Improved Response to Disasters and Outbreaks by Tracking Population Movements with Mobile Phone Network Data: A Post-Earthquake Geospatial Study in Haiti. PLoS Medicine 8(8):e1001083.
  • Salathe, Kazandjieva, Lee, Levis, Feldman, Jones. 2010. A high-resolution human contact network for infectious disease transmission. Proceedings of the National Academy of Sciences, 107(51):22020-22025.
For presentation

  • Miller. 2012. The Smartphone Psychology Manifesto. Perspectives on Psychological Science, 7(3):221-237.
  • Raento, Oulasvirta, and Eagle. 2009. Smartphones: An Emerging Tool for Social Scientists. Sociological Methods and Research, 37(3):426-454.
  • Kazandjieva, Lee, Salathe, Feldman, Jones, Levis. 2010. Experiences in Measuring a Human Contact Network for Epidemiological Research. HotEmNets.
  • Chittaranjan, Blom, and Gatica-Perez. 2011. Mining large-scale smartphone data for personality studies. Personal and Ubiquitous Computing.
  • Blumenstock, Eagle, and Fafchamps. 2011. Charity and Reciprocity in Mobile Phone-Based Giving: Evidence in the Aftermath of Earthquakes and Natural Disasters. Working paper.
  • Wyatt, Choudhury, Bilmes, and Kitts. 2011. Inferring Colocation and Conversation Networks from Privacy-Sensitive Audio with Implications for Computational Social Science. ACM Transactions on Intelligent Systems and Technology, 2(1).
  • Bagrow, Wang, and Barabasi. 2011. Collective Response of Human Populations to Large-Scale Emergencies. PLoS ONE, 6(3):e17680.
  • Onnela, et al. 2007. Structure and tie strengths in mobile communication networks. Proceedings of the National Academy of Sciences, 104(18):7332-7336.
  • Wuchty. 2009. What is a social tie? Proceedings of the National Academy of Sciences, 106(36):15099-15100.
  • Eagle, Pentland, Lazer. 2009. Inferring social network structure using mobile phone data. Proceedings of the National Academy of Sciences, 106(36):15274-15278. with Comment and Reply.
  • Wesolowski, Eagle, Noor, Snow, and Buckee. 2012. Heterogeneous Mobile Phone Ownership and Usage Patterns in Kenya. PLoS ONE 7(4): e35319.
  • Wesolowski and Eagle. 2010. Parameterizing the dynamics of slums. 2010 AAAI Spring Symposium Series.
Meeting 5 (12/4/12) Digital traces

Human behavior in the digital age often leaves behind traces, and these traces are being aggregated on a scale that is difficult to comprehend. In this meeting we will discuss the strengths and weaknesses of using these traces for social research.
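
Several of the readings below (e.g., on flu surveillance) share a common "nowcasting" logic: fit a simple model of an official indicator on search volume, then use the always-current search series to estimate the indicator before the official statistics are released. Here is a minimal Python sketch of that idea; all numbers are invented.

    # Toy weekly data: share of searches for a flu-related query, and the
    # official rate of doctor visits for influenza-like illness (ILI).
    search_volume = [2.1, 2.4, 3.0, 3.8, 4.5, 5.1, 4.2, 3.1]
    ili_rate      = [1.0, 1.2, 1.5, 1.9, 2.3, 2.6, 2.1, 1.5]

    def ols(xs, ys):
        # Ordinary least squares for a single predictor.
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        beta = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
                / sum((x - mx) ** 2 for x in xs))
        return my - beta * mx, beta  # intercept, slope

    a, b = ols(search_volume, ili_rate)
    latest_queries = 4.8  # this week's search volume, known immediately
    print("nowcast of this week's ILI rate: %.2f" % (a + b * latest_queries))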

For general discussion

  • Polgreen, Chen, Pennock, Nelson, Weinstein. 2008. Using Internet Searches for Influenza Surveillance. Clinical Infectious Diseases, 47(11):1443-1448.
  • Helft. 2008. Google Uses Searches to Track Flu's Spread. New York Times.
  • Ginsberg, Mohebbi, Patel, Brammer, Smolinski, and Brilliant. 2009. Detecting influenza epidemics using search engine query data. Nature, 457:1012-1014.
  • Butler. 2008. Web data predict flu. Nature, 456:287-288.
  • Goel, Hofman, Lahaie, Pennock, and Watts. 2010. Predicting consumer behavior with Web search. Proceedings of the National Academy of Sciences, 107(41):17486-17490.
  • Cook, Conrad, Fowlkes, and Mohebbi. 2011. Assessing Google Flu Trends Performance in the United States during the 2009 Influenza Virus A (H1N1) Pandemic. PLoS ONE, 6(8):e23610.
  • Google Correlate: The Comic Book.
  • Ugander, Backstrom, Marlow, and Kleinberg. 2012. Structural diversity in social contagion. Proceedings of the National Academy of Sciences, 109(16):5962-5966.
  • Kossinets and Watts. 2009. Origins of Homophily in an Evolving Social Network. American Journal of Sociology, 115(2):405-450.
For presentation

  • Schneider and Buckley. 2002. What do parents want from schools? Evidence from the internet. Educational Evaluation and Policy Analysis, 24(2):133-144.
  • Backstrom, Sun, and Marlow. 2010. Find Me If You Can: Improving Geographical Prediction with Social and Spatial Proximity. WWW.
  • Wimmer and Lewis. 2010. Beyond and below racial homophily: ERG models of a friendship network documented on Facebook. American Journal of Sociology, 116(2):583-642.
  • Wuchty and Uzzi. 2011. Human Communication Dynamics in Digital Footsteps: A Study of the Agreement between Self-Reported Ties and Email Networks. PLoS ONE, 6(11):e26972.
  • De Choudhury, Mason, Hofman, Watts. 2010. Inferring relevant social networks from interpersonal communication. WWW.
  • Aral and Van Alstyne. 2011. The Diversity-Bandwidth Trade-off. American Journal of Sociology, 117(1):90-171.
  • Lewis, Gonzalez, and Kaufman. 2012. Social selection and peer influence in an online social network. Proceedings of the National Academy of Sciences, 109(1):68-72.
  • Baker and Fradkin. 2011. What drives job search? Evidence from Google search data. Working paper.
  • Stephens-Davidowitz. 2012. The Effects of Racial Animus on a Black Presidential Candidate: Using Google Search Data to Find What Surveys Miss. Working paper.
  • Golder and Macy. 2011. Diurnal and Seasonal Mood Vary with Work, Sleep and Daylength Across Diverse Cultures. Science, 333:1878-81.
Meeting 6 (12/11/12) Crowdsourcing, Citizen Science, and Conclusions

Anyone who has used Wikipedia understands the power of large-scale social collaboration. How can we harness this collective power for other intellectual challenges?
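
A recurring technical question in the readings below is how to turn many noisy volunteer judgments into a reliable answer. Here is a minimal Python sketch of majority-vote aggregation with an agreement threshold, loosely in the spirit of Galaxy Zoo-style classification; the labels, image names, and threshold are invented.

    from collections import Counter

    # Hypothetical citizen-science labels: several volunteers classify
    # each galaxy image, and we aggregate their votes per image.
    labels = {
        "galaxy_001": ["spiral", "spiral", "spiral", "elliptical"],
        "galaxy_002": ["elliptical", "spiral", "elliptical"],
        "galaxy_003": ["spiral", "elliptical"],
    }

    def aggregate(votes, min_agreement=0.75):
        # Accept the majority label only if enough volunteers agree;
        # otherwise return None to flag the image for expert review.
        winner, count = Counter(votes).most_common(1)[0]
        return winner if count / len(votes) >= min_agreement else None

    for image, votes in labels.items():
        print(image, aggregate(votes) or "needs expert review")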

For general discussion

  • Watch Luis von Ahn's talk at Google on human computation.
  • The Economist. 2007. Spreading the load. The Economist, Dec 8.
  • Markoff. 2010. In a Video Game, Tackling the Complexities of Protein Folding. New York Times, August 9.
  • Barker. 2008. Trying to Design a Truly Entertaining Game Can Defeat Even a Certified Genius. Wired.
  • Cooper et al. 2010. Predicting protein structures with a multiplayer online game. Nature, 466:756-760.
  • Fortson, Masters, Nichol, Borne, Edmondson, Lintott, Raddick, Schawinski, and Wallin. 2011. Galaxy Zoo: Morphological Classifications and Citizen Science. Advances in Machine Learning and Data Mining for Astronomy, in press.
  • Tuite, Snavely, Hsiao, Tabing, and Popovic. 2011. PhotoCity: Training Experts at Large-scale Image Acquisition Through a Competitive Game. CHI.
For presentation

  • von Ahn and Dabbish. 2008. Designing games with a purpose. Communications of the ACM, 51(8):58-67.
  • von Ahn, et al. 2008. reCAPTCHA: Human-based character recognition via web security measures. Science, 321(5895):1465-1468.
  • Khatib, Cooper, Tyka, Xu, Makedon, Popovic, Baker, and Foldit Players. 2011. Algorithm discovery by protein folding game players. Proceedings of the National Academy of Sciences, 108(47):18949-18953.
  • Thompson. 2008. If You Liked This, You're Sure to Love That. New York Times.
  • Bell, Koren, and Volinsky. 2010. All Together Now: A Perspective on the Netflix Prize. Chance, 23(1):24-29.