Princeton researchers develop faster, inexpensive way to sequence genes

While the ability to sequence genomes has revolutionized the way biologists conduct research, the work can be time-consuming and expensive. Princeton researchers have developed a straightforward, cost-effective new method that provides key data in days rather than months.

The approach, described in the March 9 issue of Science, involves using a new type of microarray and innovative computational techniques to compare sequences and identify subtle differences between genomes.

"It really is like finding that needle in a haystack," said Leonid Kruglyak, a professor of ecology and evolutionary biology and a resident faculty member in Princeton's Lewis-Sigler Institute for Integrative Genomics, who is the corresponding author of the paper.

The microarrays -- small, computer-chip-like devices spotted with segments of DNA -- enable scientists to quickly establish which genes are expressed at what level in a cell. Until now, microarray experiments to detect small differences in sequence have been feasible only for individual human genes and small genomes. With the new method, Princeton scientists can extract large amounts of DNA sequence information with very high precision in a short time.

Kruglyak's interests in the genetic basis of variation intersected with those of David Gresham, a research associate in the group of David Botstein, professor of molecular biology and director of the Lewis-Sigler Institute. The group had been searching for new ways of finding mutations, and Kruglyak realized that a company called Affymetrix, which developed the first DNA microarray, was offering a new product that could potentially solve the problem.

The research focused on yeast, a commonly used model system for various cellular processes, such as DNA repair and cell division. The yeast genome is a few hundred times larger than any sequence previously analyzed in this way. It contains 12 million pairs of nucleotides -- the basic building blocks of DNA. Affymetrix's new product, made available to Princeton researchers in July 2005, offered for the first time the entire yeast genome on one microarray at sufficient density to carry out these types of experiments.

The availability of that microarray enabled the scientists to do large-scale comparisons between strains of yeast. In a typical experiment, Gresham and researcher Joseph Schacherer apply DNA extracted from mutant yeast to the microarray, and then measure the interactions between the DNA strands using a laser that scans the chip.

Complementary DNA strands will bind together, as they would in a double helix. If there is a mismatch, this particular microarray is sensitive to the difference, and the DNA strands won’t stick together as well. Based on where the weak interaction is on the microarray, the group can map the mutation to a precise location.
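The pairing principle described above can be illustrated with a minimal Python sketch. It is only an illustration of complementary binding, not the chip's actual probe design: the sequences, the 25-nucleotide probe length and the function names are all assumptions made up for this example.

```python
# Hypothetical illustration of complementary pairing; sequences and
# names are invented for this sketch and do not come from the study.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the strand that pairs perfectly with seq."""
    return seq.translate(COMPLEMENT)[::-1]


def count_mismatches(probe: str, sample_strand: str) -> int:
    """Count positions where the sample strand fails to pair with the probe."""
    if len(probe) != len(sample_strand):
        raise ValueError("probe and sample strand must have the same length")
    ideal_partner = reverse_complement(probe)
    return sum(a != b for a, b in zip(ideal_partner, sample_strand))


# A made-up 25-nucleotide probe (the length is an assumption).
probe = "ACGTTGCAAGGCTTACGATCGGATC"

# DNA from an unchanged strain pairs perfectly with the probe.
reference_strand = reverse_complement(probe)

# DNA from a mutant strain differs at one position (index 12 here).
swapped = "A" if reference_strand[12] != "A" else "C"
mutant_strand = reference_strand[:12] + swapped + reference_strand[13:]

print(count_mismatches(probe, reference_strand))
print(count_mismatches(probe, mutant_strand))
```

On a real chip the mismatch is not counted directly; it shows up as weaker binding at that probe, which is what the laser scan measures.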

The situation is complicated by the fact that one experiment can generate some 2.5 million measurements. In order to analyze this large volume of data, Douglas Ruderfer and Stephen Pratt, researchers in ecology and evolutionary biology and the Lewis-Sigler Institute, developed and executed a computer program that sifts through each data point, measuring how well the DNA strands stick to each other and comparing that to expected values for different matches and mismatches. The method pinpointed differences as small as one nucleotide between individual yeast genomes.
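The article does not describe the program's statistical model, so the following Python sketch is only one plausible way to compare a measurement with expected values for a match and a mismatch. The Gaussian noise model, the threshold, the field names and the sample numbers are all assumptions, not the researchers' method.

```python
import math
from dataclasses import dataclass


@dataclass
class Probe:
    position: int             # genome coordinate covered by the probe (hypothetical)
    observed: float           # measured log-intensity from the scan
    expected_match: float     # expected log-intensity if the sample matches
    expected_mismatch: float  # expected log-intensity if there is a mismatch


def gaussian_log_likelihood(x: float, mean: float, sd: float) -> float:
    """Log-density of x under a normal distribution (assumed noise model)."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def mismatch_score(probe: Probe, sd: float = 0.5) -> float:
    """Positive when the mismatch model explains the measurement better."""
    return (gaussian_log_likelihood(probe.observed, probe.expected_mismatch, sd)
            - gaussian_log_likelihood(probe.observed, probe.expected_match, sd))


def candidate_positions(probes, threshold: float = 2.0, sd: float = 0.5):
    """Return positions whose score exceeds an (arbitrary) threshold."""
    return [p.position for p in probes if mismatch_score(p, sd) > threshold]


# Invented measurements: only the middle probe binds weakly.
probes = [
    Probe(1000, 9.8, 10.0, 8.0),
    Probe(1001, 8.1, 10.0, 8.0),
    Probe(1002, 10.2, 10.0, 8.0),
]

print(candidate_positions(probes))  # → [1001]
```

Scoring each probe against both expectations, rather than applying a single intensity cutoff, allows different probes to have different baseline binding strengths.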

"This is a really great example of how traditional biology and computational biology can work together to complement each other," said Gresham.

For the biggest test of the method, Maitreya Dunham, a Lewis-Sigler Fellow in Princeton's Center for Quantitative Biology, contributed yeast that had been deprived of certain nutrients in the lab, and had experimentally evolved to adapt to the deficiency. The mutations that enabled the yeast to survive and grow were completely unknown, but the experiment successfully identified the changes.

It takes only one person, using one microarray chip that costs $500 to $600, to conduct the experiment over one or two days, according to Kruglyak and Botstein. "If we did an experiment like this by doing traditional DNA sequencing and then comparing the sequences, we’d be looking at weeks or months of time invested, at a cost in the range of many tens of thousands of dollars," said Kruglyak, adding that only a small number of large institutes are equipped to do that work.

While this project studies changes between individual yeast genomes, Kruglyak is confident that the computer program could be applied to other genomes of similar size and complexity, once the right microarrays become available. The human genome is roughly 300 times the size of the yeast genome and is more complex, but one eventual goal would be to develop the technology further -- to the point that it could detect single nucleotide changes underlying human diseases.

"This research really exemplifies the ideas behind the Lewis-Sigler Institute and the Center for Quantitative Biology," said Botstein. The institute was established in 1998 to innovate in research and teaching at the interface of modern biology and the more quantitative sciences.

In the immediate future, Botstein looks forward to the possibility of using the microarray chips as a teaching tool, bringing the interface of natural and quantitative science into the classroom. "Our undergraduates will initially have unique access to this class of experiments."

The research was funded by grants from the National Institute of Mental Health, the James S. McDonnell Foundation, the National Institute of General Medical Sciences and the National Human Genome Research Institute.