Sociology 596: Web-based Social Research
Princeton University
Fall 2010
Time: Tuesday 1pm-4pm (second half of semester)
Location: 190 Wallace Hall
Instructor: Matthew Salganik
The World Wide Web has changed the way we live and work, but it has only started to change the way we conduct social science research. This six-week seminar will provide students with an overview of the new types of research that the web makes possible, including online experiments, digital trace data, and crowd-sourcing. These new data collection possibilities should help researchers better understand core issues in the social sciences related to both individual behavior and collective social dynamics. There are no official prerequisites for the course, and students from other departments are welcome. Undergraduates interested in taking the course should contact the instructor for permission.
Now a little about mechanics. Each three-hour class will begin with a general discussion based on several readings. Then, students will take turns presenting specific papers that apply the ideas from the general discussion. Students are expected to come to class prepared for the general discussion and to present a few articles during the course of the semester. There will be no exam, but students will be expected to complete a final paper or project.
Your grade will be based on the following components:
- Class participation and in-class presentations: 25%
Each student will be expected to present a few articles during the course of the semester. Each presentation should begin with a 30-second summary of the article, and then move to a more elaborate discussion of the key issues in the paper. The student presenter will be expected to answer any questions that come up from the class.
- Response papers: 25%
Each student will be expected to write two short response papers (2-3 pages) that address the readings of the week. Students should view them as a chance to play with the ideas in the readings: look for contradictions, establish connections to your own research, develop empirical tests, etc. The response papers should not be simple summaries of the readings. Students can choose the two weeks to which they would like to respond, and all papers should be sent to me by midnight on Monday, the day preceding the class.
- Final project: 50%
Each student will do some kind of final project. Since the class is only six weeks long, these projects will necessarily be somewhat limited in scope. One natural project would be to attempt to reproduce the analysis of a paper that you really like, as this can often lead to new ideas and new papers. Another idea would be to try your own data collection on Mechanical Turk. Many of you already have research interests, and you should view these projects as a chance to further develop those interests. We will talk more about the final projects in class. A one-paragraph draft proposal will be due the third week of class.
Introduction (11/9/10)
In this first class we will cover a broad overview of web-based research, focusing on both strengths and weaknesses.
For general discussion
- Skitka, L.J. and Sargis, E.G. (2006). The internet as psychological laboratory. Annual Review of Psychology, 57:529-555.
- Watts, D.J. (2007). A twenty-first century science. Nature, 445:489.
- Check out some existing on-line studies
- Lazer, D. et al. (2009). Computational Social Science. Science, 323(5915):721-723.
- Bell, G., Hey, T., and Szalay, A. (2009). Beyond the Data Deluge. Science, 323(5919):1297-1298.
- Anderson, C. (2008). The End of Theory: The Data Deluge Makes the Scientific Method Obsolete. Wired.
- Nosek, B.A., Banaji, M.R., and Greenwald, A.G. (2002). E-research: Ethics, security, design and control in psychological research on the internet. Journal of Social Issues, 58(1):161-176.
For presentation
Experiments: part one (11/16/10)
The web offers numerous advantages over the traditional laboratory for the conduct of social science experiments. First, the web allows researchers to conduct experiments on a completely different scale: lab experiments are typically limited to hundreds of participants, but web-based experiments involving tens of thousands of participants have already been conducted, and larger experiments are becoming increasingly practical. The web also gives researchers access to a much broader pool of participants and allows them to study decision making in a more natural environment. However, conducting experiments on the web also has drawbacks, including unknown participant pools and limited control over participants. Over the next two meetings, we will discuss five types of web-based experiments: those on self-standing sites, A/B tests on existing sites, "parasitic" experiments on existing sites, traditional lab experiments that use the web, and experiments using micro-payment platforms (e.g. Amazon's Mechanical Turk). We will compare the strengths and weaknesses of the various approaches.
For general discussion
- Reips, U.D. (2002). Standards for internet-based experimenting. Experimental Psychology, 49(4):243-256.
- Salganik, M.J., Dodds, P.S., Watts, D.J. (2006). Experimental study of inequality and unpredictability in an artificial cultural market. Science, 311:854-856 (also read supporting online materials).
- Salganik, M.J., and Watts, D.J. (2008). Leading the herd astray: Experimental study of self-fulfilling prophecies in an artificial cultural market. Social Psychology Quarterly, 71:338-355.
- Centola, D. (2010). The spread of behavior in an online social network experiment. Science, 329(5996):1194-1197 (also read supporting online materials).
- Hanson, W.A. and Putler, D.S. (1996). Hits and misses: Herd behavior and online product popularity. Marketing Letters, 7(4):297-305.
For presentation
Self-standing experiments
- Dodds, P. S., Muhamad, R., and Watts, D. J. (2003). An experimental study of search in global social networks. Science, 301:827-829 (also read supporting online materials).
- Nosek, B. A., Banaji, M. R., and Greenwald, A. G. (2002). Harvesting implicit group attitudes and beliefs from a demonstration web site. Group Dynamics, 6(1):101-115.
- Salganik, M.J. and Watts, D.J. (2009). Web-Based Experiments for the Study of Collective Social Dynamics in Cultural Markets. Topics in Cognitive Science, 1(3):439-468.
Experiments on other sites
Experiments: part two (11/23/10)
For general discussion
- Check out Mechanical Turk
- Paolacci, G., Chandler, J., Ipeirotis, P.G. (2010). Running experiments on Amazon Mechanical Turk. Judgment and Decision Making (in press).
- Mason, W. and Suri, S. (2010). Conducting Behavioral Research on Amazon's Mechanical Turk. Working paper.
- Horton, J.J., Rand, D.G., Zeckhauser, R.J. (2010). The Online Laboratory: Conducting Experiments in a Real Labor Market. Working paper.
- Buhrmester, M. D., Kwang, T., & Gosling, S. D. (in press). Amazon's Mechanical Turk: A new source of inexpensive, yet high-quality data? Perspectives on Psychological Science.
- Berkowitz, L. and Donnerstein, E. (1982). External Validity is More than Skin Deep: Some Answers to Criticism of Laboratory Experiments. American Psychologist, 37(3):245-257.
- Kohavi, R., Longbotham, R., Sommerfield, D. and Henne, R.M. (2009). Controlled experiments on the web: survey and practical guide. Data Mining and Knowledge Discovery, 18(1):140-181.
For presentation
Experiments using micro-payment platforms
- Chesney, T., Chuah, S.-H., and Hoffmann, R. (2009). Virtual world experimentation: An exploratory study. Journal of Economic Behavior & Organization, 72(1):618-635.
- Mason, W. and Watts, D. J. (2009). Financial incentives and the performance of crowds. SIGKDD Workshop on Human Computation, 77-85.
- Suri, S. and Watts, D.J. (2010). Cooperation and Contagion in Networked Public Goods Experiments. Working paper.
- Heer, J. and Bostock, M. (2010). Crowdsourcing graphical perception: using Mechanical Turk to assess visualization design. CHI.
- Henrich, J., Heine, S.J., and Norenzayan, A. (2010). The weirdest people in the world? Behavioral and Brain Sciences, 33:61-83, and see also: Gosling, S.D., Sandy, C.J., John, O.P., and Potter, J. (2010). Wired but not WEIRD: The promise of the Internet in reaching more diverse samples.
A/B tests
Lab experiments that use the web
Digital traces, click-streams, wearable sensors, and surveys: part one (11/30/10)
In addition to experiments, the web, and new electronic technology more generally, allows researchers to record human behavior at a massive scale and with incredible granularity. The web is also an enormous corpus of text and images waiting to be mined. All of this data presents researchers with a number of opportunities, challenges, and ethical issues. These topics will be covered in the next two weeks, and I've tried to roughly divide the readings into stuff that you can do if you partner with a large company and stuff that you can do yourself.
For general discussion
- The Economist. (2007). Learning to live with Big Brother. The Economist, Sept. 27.
- Barbaro, M. and Zeller, T. (2006). A Face is Exposed for AOL Searcher No. 4417749. New York Times, August 9.
- Bucklin, R. E., et al. (2002). Choice and the internet: From clickstream to research stream. Marketing Letters, 13(3):245-258.
- Schneider, M. and Buckley, J. (2002). What do parents want from schools? Evidence from the internet. Educational Evaluation and Policy Analysis, 24(2):133-144.
- Butler, D. (2007). Data sharing threatens privacy. Nature, 449:644-645.
- Goel, S. et al. (2010). Predicting consumer behavior with Web search. PNAS, 107(41):17486-17490. (see blog post and if you are interested in graphs see this blog post)
- Play with MouselabWEB
For presentation
- Ohm, P. (2009). Broken Promises of Privacy: Responding to the Surprising Failure of Anonymization. Working paper.
- Polgreen, P.M., Chen, Y., Pennock, D.M., and Nelson, F.D. (2008). Using Internet Searches for Influenza Surveillance. Clinical Infectious Diseases, 47:1443-1448.
- Ginsberg, J. et al. (2009). Detecting influenza epidemics using search engine query data. Nature, 457:1012-1014.
- Kossinets, G. and Watts, D. J. (2006). Empirical analysis of an evolving social network. Science, 311:88-90.
- Onnela, J.P. et al. (2007). Structure and tie strength in mobile communication networks. PNAS, 104(18):7332-7336.
- Hitsch, G.J., Hortacsu, A., and Ariely, D. (2006). What makes you click? Mate preferences in online dating. Quantitative Marketing and Economics, (in press).
- Bollen, J. et al. (2009). Clickstream Data Yields High-Resolution Maps of Science. PLoS One, 4(3):e4803.
- Gentzkow, M. and Shapiro, J.M. (2010). Ideological segregation online and offline. Working paper.
- Castronova, E. (2006). On the research value of large games: Natural experiments in Norrath and Camelot. Games and Culture, 1(2):163-186.
- Lofgren, E.T. and Fefferman, N.H. (2007). The untapped potential of virtual game worlds to shed light on real world epidemics. The Lancet Infectious Diseases, 7:625-629.
- Balicer, R.D. (2007). Modeling infectious diseases dissemination through online role-playing games. Epidemiology, 18:260-261.
- Burt, R.S. (2010). Structural Holes in Virtual Worlds. Working paper.
- Rivers, D. (2007). Sampling for web surveys. Presented at Joint Statistical Meetings [available from blackboard].
Digital traces, click-streams, wearable sensors, and surveys: part two (12/7/10)
For general discussion
- Dodds, P.S. and Danforth, C.M. (2009). Measuring the Happiness of Large-Scale Written Expression: Songs, Blogs, and Presidents. Journal of Happiness Studies.
- Ingram, P. and Morris, M.W. (2007). Do people mix at mixers? Structure, homophily, and the "life of the party." Administrative Science Quarterly, 52:558-585.
- Goel, S., Mason, W., and Watts, D.J. (2010). Real and Perceived Attitude Agreement in Social Networks. Journal of Personality and Social Psychology (in press).
For presentation
- Chiasson, M.A. et al. (2006). HIV Behavioral Research Online. Journal of Urban Health, 83(1):73-85.
- Srivastava, S., John, O.P., Gosling, S.D., and Potter, J. (2003). Development of personality in early and middle adulthood: Set like plaster or persistent change? Journal of Personality and Social Psychology, 84(5):1041-1053.
- Saiz, A. and Simonsohn, U. (2008). Downloading Wisdom from Online Crowds. Working paper.
- Foster, A. M., et al. (2006). Providing medical abortion information to diverse communities: Use patterns of a multilingual web site. Contraception, 74:264-271.
- Wynn, L. and Trussell, J. (2005). The morning after on the internet: Usage and questions to the emergency contraception website. Contraception, 72:5-13.
- Stewart, D. (2005). Social status in an open-source community. American Sociological Review, 70:823-842.
- Eagle, N., Pentland, A. and Lazer, D. (2009). Inferring Social Network Structure using Mobile Phone Data. PNAS, 106(36):15274-15278 with Comment and Reply.
- Olivola, C.Y. and Todorov, A. (2010). Fooled by first impressions? Reexamining the diagnostic value of appearance-based inferences. Journal of Experimental Social Psychology, 46:315-327.
- Raento, M., Oulasvirta, A., and Eagle, N. (2009). Smartphones: An Emerging Tool for Social Scientists. Sociological Methods & Research, 37:426.
- Wesolowski, A. and Eagle, N. (2010). Parameterizing the Dynamics of Slums. AAAI Spring Symposium Series.
- Berger, J. and Milkman, K. (2010). Social Transmission, Emotion, and the Virality of Online Content. Working paper.
- Hargittai, E. and Karr, C. (2009). "Wat R U Doin?: Studying the Thumb Generation Using Text Messaging" in Research Confidential: Solutions to Problems Most Social Scientists Pretend They Never Have.
- Williams, D. and Xiong, L. (2009). "Herding Cats Online: Real Studies of Virtual Communities" in Research Confidential: Solutions to Problems Most Social Scientists Pretend They Never Have.
Games, crowd-sourcing, wikis, and citizen science (12/14/10)
Anyone who has used Wikipedia understands the power of large-scale social collaboration. Is it possible to harness this collective power for research?
For general discussion
- Watch Luis von Ahn's talk at Google on human computation
- von Ahn, L. and Dabbish, L. (2008). General techniques for designing games with a purpose. Communications of the ACM, 58-67.
- The Economist. (2007). Spreading the load. The Economist, Dec 8.
- Markoff, J. (2010). In a Video Game, Tackling the Complexities of Protein Folding. New York Times, August 9.
- Giles, J. (2007). Life's a game. Nature, 445:18-20.
- Barker, C. (2008). Trying to Design a Truly Entertaining Game Can Defeat Even a Certified Genius. Wired.
- Shneiderman, B. (2009). A National Initiative for Social Participation. Science, 323(5920):1426-1427.
- Shneiderman, B. and Preece, J. (2007). 911.gov. Science, 315(5814):944.
- Check out Galaxy Zoo.
For presentation
- Cooper, S. et al. (2010). Predicting protein structures with a multiplayer game. Nature, 466(5):756-760.
- von Ahn, L. et al. (2008). reCAPTCHA: Human-based character recognition via web security measures. Science, 321(5895):1465-1468.
- Doan, A., Ramakrishnan, R., and Halevy, A. (2010). Mass Collaboration Systems on the World-Wide Web. Working paper.
- Malone, T.W., Laubacher, R., and Dellarocas, C. (2009). Harnessing Crowds: Mapping the Genome of Collective Intelligence. Working paper.
- Bovey, J. and Rodgers, P. (2007). A method for testing graph visualizations using games, in Visualization and Data Analysis 2007, volume 6495 of Proceedings Electronic Imaging. SPIE.
- Brockmann, D., Hufnagel, L., and Geisel, T. (2006). The scaling laws of human travel. Nature, 439:462-465.
- Beenen, G., et al. (2004). Using social psychology to motivate contributions to online communities. CSCW '04: Proceedings of the 2004 ACM conference on Computer supported cooperative work, 212-221.
- Lintott, C.J. et al. (2008). Galaxy Zoo: morphologies derived from visual inspection of galaxies from the Sloan Digital Sky Survey. Monthly Notices of the Royal Astronomical Society, 389(3):1179-1189.
- Raddick, M.J., et al. (2010). Galaxy Zoo: Exploring the Motivations of Citizen Science Volunteers. Astronomy Education Review, 9(1), 010103.
- Banerji, M. et al. (2010). Galaxy Zoo: Reproducing Galaxy Morphologies Via Machine Learning. Working paper.
- Snavely, N., Garg, R., Seitz, S.M., and Szeliski, R. (2008). Finding Paths through the World's Photos. ACM Transactions on Graphics (SIGGRAPH 2008). [see also video]
- Crandall, D., Backstrom, L., Huttenlocher, D., and Kleinberg, J. (2009). Mapping the World's Photos. Proc. 18th International World Wide Web Conference + examples.
- Cheshire, C. and Antin, J. (2008). The Social Psychological Effect of Feedback on the Production of Internet Information Pools. Journal of Computer-Mediated Communication, 13:705-727.
- Aanensen, D.M. et al. (2009). EpiCollect: Linking Smartphones to Web Applications for Epidemiology, Ecology, and Community Data Collection. PLoS One, 4(9):e6968.