Sociology 596: Web-based Social Research

Princeton University
Spring 2008
Tuesday, 2:30-5:30
Location: 190 Wallace Hall
Instructor: Matthew Salganik

The Internet has already changed the way we live and work, but it has only started to change the way we conduct social science research. This seminar will provide students with an overview of the new types of research that the Internet makes possible, including online experiments, games, wikis, and web crawling. These new data collection possibilities may help researchers better understand both individual behavior and collective social dynamics.

Each class will consist of a general discussion of several readings followed by student presentations of specific research projects. Students are expected to come to class prepared for discussion and to present a few articles during the course of the semester. There will be no exam, but students will be expected to complete a final paper or project.

1. Introduction (Feb 5)

In this first class we will take a broad look at web-based research, focusing on both its strengths and its weaknesses.

2. Experiments (Feb 12, but will be changed)

The web offers numerous advantages over the traditional laboratory for the conduct of social science experiments. First, the web allows researchers to conduct experiments on a completely different scale; lab experiments are limited to hundreds of participants, but web-based experiments involving tens of thousands of participants have already been conducted, and larger experiments are becoming increasingly practical. The web also gives researchers access to a much broader pool of participants and allows them to study decision making in a more natural environment. However, conducting experiments on the web also has some drawbacks, mostly related to limited control over experimental participants. We will discuss two types of web-based experiments: those on self-standing sites and those on pre-existing sites.

For general discussion

For presentation

Self-standing experiments

Experiments on other sites

3. Screen scraping, APIs, web crawling, and text mining (Feb 19)

There is a tremendous amount of data available on the web, but it's often not in the format that we want. For example, Godard and Mears were interested in the movement of models between fashion houses, and style.com presents photos from fashion shows in which all the models are labeled (for example, labeled photos from the Spring 2008 Marc Jacobs ready-to-wear runway show). It would be possible to extract the names of these models by hand, but that would be laborious and time consuming. Instead, we can write a program to extract that information for us automatically.
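As a rough illustration of what such a program might look like, here is a minimal screen-scraping sketch in Python that uses only the standard library. The page markup is an assumption made for illustration: the sample HTML, the `model-name` class, and the placeholder names "Model A" and "Model B" are all hypothetical, and the markup of any real site would have to be inspected before writing the parser.

```python
from html.parser import HTMLParser


class ModelNameParser(HTMLParser):
    """Collect the text inside every <span class="model-name"> element.

    The tag and class name are hypothetical; a real page would need
    its own selector.
    """

    def __init__(self):
        super().__init__()
        self.names = []
        self._in_name = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (attribute, value) pairs
        if tag == "span" and ("class", "model-name") in attrs:
            self._in_name = True
            self.names.append("")  # start collecting a new name

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_name = False

    def handle_data(self, data):
        if self._in_name:
            self.names[-1] += data


def extract_model_names(html):
    """Return the cleaned, non-empty names found in an HTML string."""
    parser = ModelNameParser()
    parser.feed(html)
    parser.close()
    return [name.strip() for name in parser.names if name.strip()]


# Hypothetical page fragment standing in for a downloaded runway-show page.
SAMPLE_PAGE = """
<div class="look"><img src="look1.jpg"><span class="model-name">Model A</span></div>
<div class="look"><img src="look2.jpg"><span class="model-name"> Model B </span></div>
<span class="caption">Look 2</span>
"""

if __name__ == "__main__":
    for name in extract_model_names(SAMPLE_PAGE):
        print(name)
```

In practice the HTML string would come from downloading each page, and the same function could then be applied to every show in a season to build a table of models by fashion house.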

For general discussion

For presentation

4. Virtual worlds, MMORPGs, and large games (Feb 26)

Virtual worlds, such as World of Warcraft and Second Life, are environments where thousands of people come together and interact, offering us a chance to study social dynamics on a massive scale. Edward Castronova has gone so far as to predict that these virtual worlds will become "the supercolliders of social science."

For general discussion

For presentation

5. Clickstream and digital traces (March 4)

Tremendous amounts of data about our behavior, purchases, and whereabouts are now being collected automatically. In addition to raising privacy concerns, these automatically collected "digital traces" provide exciting opportunities for research. Clickstream data -- records of click behavior at websites that are automatically recorded in the server log files of the website owner -- are a particularly exciting new avenue for studying choice behavior in a natural environment.
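To make the idea of clickstream data concrete, the sketch below parses a few server log lines and counts how often each page was successfully requested. It assumes the logs are in the widely used Common Log Format; the sample lines, addresses, and paths are invented for illustration, and a real server may be configured to log in a different format.

```python
import re
from collections import Counter

# One request per line, assuming Common Log Format:
# host ident user [time] "method path protocol" status size
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\S+)'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def page_counts(lines):
    """Count successful (status 200) requests for each path."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record and record["status"] == "200":
            counts[record["path"]] += 1
    return counts


# Invented sample lines; the addresses come from a range reserved for documentation.
SAMPLE_LOG = [
    '192.0.2.1 - - [04/Mar/2008:14:30:00 -0500] "GET /songs/1 HTTP/1.1" 200 512',
    '192.0.2.1 - - [04/Mar/2008:14:30:09 -0500] "GET /songs/2 HTTP/1.1" 200 498',
    '192.0.2.7 - - [04/Mar/2008:14:31:12 -0500] "GET /songs/1 HTTP/1.1" 200 512',
    '192.0.2.7 - - [04/Mar/2008:14:31:20 -0500] "GET /missing HTTP/1.1" 404 0',
]

if __name__ == "__main__":
    for path, count in page_counts(SAMPLE_LOG).most_common():
        print(path, count)
```

Grouping the parsed records by host and sorting them by time would give the sequence of choices made by each visitor, which is usually the unit of interest when studying choice behavior.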

For general discussion

For presentation

6. Games, crowdsourcing, wikis, and citizen science (March 11)

Anyone who has used Wikipedia understands the power of large-scale social collaboration. Is it possible to harness this collective power for other types of research?

For general discussion

For presentation

Stuff for next year

This is interesting stuff that I found while the course was progressing. Hopefully, we can include this in a future class.