Frontiers of health: Deep in data
Integrated approach reveals clues to cancer in genetic databases
“We have millions of times more biological data now than we did just a few years ago,” says Olga Troyanskaya. “But we don’t know millions of times more about biology, at least not yet.”
The assistant professor of computer science and genomics is doing her part to change that by designing computer systems that analyze massive amounts of genetic data. The freely available systems are providing researchers throughout the world with the ability to generate new insights into cancer and a multitude of other diseases.
“Olga’s discoveries reveal how much useful information is hiding in the published data,” says Maitreya Dunham, a fellow in Princeton’s Lewis-Sigler Institute for Integrative Genomics. “It is also very forward-thinking of her to make the tools and programs developed by her lab really accessible to the average biologist. You don’t have to be a programmer to interact with her systems.”
Troyanskaya and Dunham are part of a team that created a quick, effective and powerful way to detect small chromosomal changes that take place when cells become cancerous. They are now working on a version of the system for clinical use.
A different project, called Biological Process Inference from Experimental Interaction Evidence (bioPIXIE, for short), combines data sets from a number of sources, such as microarrays, DNA sequences and protein interaction maps, and gives each source a weight suited to the question being addressed. Launched in 2005, the system has already shed light on the chromosome segregation process in yeast, which could have implications for human genetic disorders. Another recent bioPIXIE analysis revealed information about the Hsp90 protein, a promising target for anticancer drugs.
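To make the idea of question-dependent weighting concrete, here is a minimal Python sketch. It is not bioPIXIE's actual algorithm, which this article does not describe; it simply takes a weighted average of per-source evidence scores. The names `CONTEXT_WEIGHTS` and `combined_score`, the context labels, and every number are hypothetical and chosen only for illustration.

```python
from typing import Dict

# Hypothetical weights: how much each evidence source is trusted for a given
# biological question. These values are illustrative, not taken from bioPIXIE.
CONTEXT_WEIGHTS: Dict[str, Dict[str, float]] = {
    "chromosome_segregation": {
        "microarray": 0.2,
        "sequence": 0.1,
        "protein_interaction": 0.7,
    },
    "protein_folding": {
        "microarray": 0.5,
        "sequence": 0.2,
        "protein_interaction": 0.3,
    },
}


def combined_score(evidence: Dict[str, float], context: str) -> float:
    """Weighted average of per-source scores (each in [0, 1]) for one gene pair.

    Sources with no weight in the chosen context are ignored, and the weights
    of the remaining sources are renormalized so the result stays in [0, 1].
    """
    weights = CONTEXT_WEIGHTS[context]
    used = [source for source in evidence if source in weights]
    total_weight = sum(weights[source] for source in used)
    if total_weight == 0:
        return 0.0
    weighted_sum = sum(weights[source] * evidence[source] for source in used)
    return weighted_sum / total_weight


# Made-up evidence that two genes are functionally related.
evidence = {"microarray": 0.9, "sequence": 0.5, "protein_interaction": 0.1}
segregation_score = combined_score(evidence, "chromosome_segregation")
folding_score = combined_score(evidence, "protein_folding")
```

The same evidence produces a lower score under the first context than under the second, because the first context places most of its trust in the protein interaction data, where this pair scores poorly.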
Collaborations between computer scientists and biologists are essential to ensure that biological understanding keeps pace with technology, according to Troyanskaya. Her own research partnerships on campus include ongoing projects with Dunham; molecular biologists Hilary Coller, Manuel Llinas, and Ihor Lemischka; chemist Joshua Rabinowitz; and David Botstein, director of the Lewis-Sigler Institute.
Troyanskaya also works with computer science professors David Blei, Robert Schapire, and Kai Li to further improve the analysis of genetic data. With Li, the Charles C. Fitzmorris Professor of Computer Science, she is developing techniques to visually display information from multiple data sets at once. These techniques represent a major improvement over existing methods, which cannot handle information generated from more than a single study.