The Playing Field Shifts: Predicting the Seats-Votes Curve in the 2008 U.S. House Election (with Andrew Gelman and Jamie Chandler). 2008. PS: Political Science & Politics. 41(4):729-32.





Abstract: This paper predicts the seats-votes curve for the 2008 U.S. House elections. We document how the electoral playing field has shifted from a Republican advantage between 1996 and 2004 to a Democratic tilt today. Due to the shift in incumbency advantage from the Republicans to the Democrats, compounded by a greater number of retirements among Republican members, we show that the Democrats now enjoy a partisan bias, and can expect to win more seats than votes for the first time since 1992. While this bias is not as large as the advantage the Republicans held in 2006, it is likely to help the Democrats win more seats than votes and thus expand their majority.

Click here to download a pdf copy of the paper.

Click here for our paper predicting the 2006 seats-votes curve.

Replication Information


With the data and code described below, researchers can replicate our results and use the data for further study. Note that all the files referenced below, including CSV versions of the datasets, can be found in this zip file.



We used three datasets in the paper: a district-level dataset containing information on every race in each House election from 1946 to 2004; an aggregate-level dataset containing the total number of votes and seats gained by each party in the same elections; and a dataset containing information on each district that we used to make predictions for the 2006 and 2008 elections.

a)     Individual House Races Data, 1946-2006

This dataset, which was given to us by Gary Jacobson, contains various information on every House race from 1946-2004, such as the vote share of the Democratic candidate and incumbency status; complete coding information is available here.  We modified and recoded this data using this Stata do-file.  Coding information for the updated dataset, which we use for the analysis that appears in the paper, is available here.

b)     Individual House Race Data for Predicting 2006 and 2008

This dataset, which we used for our paper predicting the 2006 seats-votes curve, contains information about the 2006 election, including incumbency status and lagged vote leading up to the election, along with information about the winner and vote margins in the 2006 election. Coding information is available here (this coding also applies to the 2008 datasets).

For the analyses used in the paper predicting the 2008 seats-votes curve, we used information available as of July 2008.  That database is available here.  After the election, we updated the dataset to include vote totals and information on uncontested races.  That dataset is available here. Note that the vote totals we used are unofficial results, as reported by CNN.

Statistical Code

All statistical analysis that appears in the paper was conducted using R.  Code for the pre-election analysis is available here. Code for the post-election analysis is available here.
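For readers new to the terminology, the sketch below shows what a seats-votes curve and a partisan bias measure look like in their simplest form. It is written in Python and is not the R code used in the paper: it relies on the textbook uniform-partisan-swing simplification, and the function names and the five district vote shares are hypothetical values chosen for illustration only. It also uses the average district vote share as a stand-in for the national vote share, which ignores differences in turnout across districts.

```python
from typing import List, Tuple


def seats_votes_curve(dem_shares: List[float],
                      swings: List[float]) -> List[Tuple[float, float]]:
    """Return (average district vote share, seat share) for each swing.

    Assumes uniform partisan swing: every district's Democratic share
    moves by the same amount. Shares are not clipped to [0, 1], so the
    sketch is only sensible for modest swings.
    """
    n = len(dem_shares)
    base_mean = sum(dem_shares) / n
    curve = []
    for swing in swings:
        # A district is won when the shifted Democratic share exceeds 50%.
        seats_won = sum(1 for share in dem_shares if share + swing > 0.5)
        curve.append((base_mean + swing, seats_won / n))
    return curve


def partisan_bias(dem_shares: List[float]) -> float:
    """Democratic seat share at a 50% average vote, minus 0.5.

    Positive values indicate a Democratic tilt, negative values a
    Republican tilt, under the same uniform-swing assumption.
    """
    n = len(dem_shares)
    swing_to_even = 0.5 - sum(dem_shares) / n
    seats_won = sum(1 for share in dem_shares if share + swing_to_even > 0.5)
    return seats_won / n - 0.5


# Hypothetical Democratic two-party vote shares in five districts.
example_shares = [0.30, 0.45, 0.48, 0.52, 0.70]
example_curve = seats_votes_curve(example_shares, [-0.03, 0.0, 0.03])
example_bias = partisan_bias(example_shares)
```

Here `example_curve` traces how the seat share responds as the vote shifts, and `example_bias` reports how far the seat share sits from one half when the average vote is split evenly. The analysis in the paper instead accounts for incumbency, retirements and district-level uncertainty, so the replication code linked above should be used for any substantive work.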


Post-Election Analysis

We re-estimated the seats-votes curve after the election using information on uncontested races. An updated version of Figure 1 in the paper appears below, along with the actual election results and seat shares. As the left figure makes clear, the Democrats won fewer seats in 2008 than our seats-votes curve predicted, even though they significantly increased their overall seat share. For more analysis of the House races, see here, here and here.