Currie highlights importance of 'Big Data' for child-health research

Princeton University professor Janet Currie tackles big questions about child health with the help of big sets of data.

Using birth records of more than 1 million babies and their mothers, she and a fellow researcher correlated maternal weight gain with child birth weight, a predictor of birth complications and future obesity of the child. Such records also helped her show that implementing electronic toll collection cut down on exhaust from idling engines and reduced the number of low-birth-weight and premature babies born to women who live near toll plazas, a sign of the impact of air pollution.

Currie, Princeton's Henry Putnam Professor of Economics and Public Affairs and director of the Center for Health and Wellbeing, focuses her research on the health and well-being of children. She has written about early intervention programs, programs to expand health insurance and improve health care, public housing, and food and nutrition programs. Her current research focuses on socioeconomic differences in child health, and on environmental threats to children's health from sources such as toxic pollutants.

She recently wrote an article in the journal Pediatrics highlighting the importance of giving researchers access to so-called "Big Data," such as the records she has used, while safeguarding patient privacy. Currie describes the benefits and risks of research using Big Data:

"There have lately been many discussions of the potential of Big Data to answer important questions. In medicine, much of this discussion revolves around gene sequencing and the use of bio-samples. But Big Data also exist in the form of birth and death records, hospital records, insurance claims, disease registries, and other administrative records. It is important for researchers to have access to these data while safeguarding patient privacy.  

"Some of my previous work shows just how useful these records can be. For example, in an article in the Lancet, David Ludwig, a professor of pediatrics and nutrition at Harvard University, and I used information on birth records to compare siblings and showed using a sample of over 500,000 women and 1.1 million babies that the odds of giving birth to a baby over 4,000 grams (about 9 pounds) were over twice as large for women gaining over 24 kilograms (53 pounds) during pregnancy, relative to women gaining only 8 to 10 kilograms (18-22 pounds) during pregnancy. Larger birth weights are important because they are predictive of complications of labor and delivery, and of a higher probability of obesity later in life.

"In another study with Reed Walker, an assistant professor of economics at the University of California-Berkeley, I examined the effect of electronic toll collections (E-ZPass) on infant health in New Jersey and Pennsylvania. E-ZPass greatly reduced idling and emissions around toll plazas. By focusing on infants born to women living near toll plazas before and after E-ZPass implementation and comparing them to women located along the same busy highways, but farther away from toll plazas, we were able to show that the implementation of E-ZPass reduced the incidence of low birth weight and prematurity by 8 to 10 percent in the vicinity of the toll plazas. This result sheds light on the harmful effects of air pollution from automobile traffic on infant health.

"The use of administrative medical records for research raises ethical issues having to do with weighing the benefits of the research against risks to subjects. The main risk is that sensitive medical information could be disclosed. I have taken many different approaches to minimizing the risk of disclosure, ranging from working within state governments (where most such information is housed) to create anonymized data for research to adding small amounts of 'noise' to key variables (like latitude and longitude of residence) so individuals cannot be identified.

"Good stewardship demands that we use health information efficiently in order to advance the public good. If a question could easily be answered using existing data, it is perhaps immoral to needlessly subject a new set of subjects to the risk of medical experiments.

"Here are five examples of questions that could be answered using existing administrative data sets: 

  • Using linked birth records, hospital discharge data and emergency room visit records, it would be possible to ask whether children born with the aid of assisted reproductive technology are more likely than other children (or their own siblings) to have subsequent health problems;
  • Using birth records linked to educational records, researchers could ask whether children whose mothers smoked during pregnancy were more likely to have Attention Deficit Hyperactivity Disorder than siblings born when the mothers did not smoke; 
  • Using hospital records linked with education records, it would be possible to examine the impacts of head injuries such as concussion on educational outcomes; 
  • Using data on hospital and ER visits for asthma linked to data from air pollution monitors, it would be possible to see whether new cases of asthma were more likely to develop in high pollution areas, as well as how children with asthma responded to variations in pollution levels; and 
  • Using birth data linked to data from autism registries and special education records, it would be possible to see when children who eventually end up in autism registries enter the special education system and what sort of diagnoses they receive.

"These are the types of questions I am pursuing in my continuing research."