Tracking genes on the path to genetic treatment

By John Sullivan, Office of Engineering Communications on Feb. 27, 2014, 9:15 a.m.

Before doctors like Matthias Kretzler can begin using the results of molecular research to treat patients, they need science to find an effective way to match genes with the specific cells involved in disease. As Kretzler explains, finding that link would eventually let physicians create far more effective diagnostic tools and treatments.

"Among many uses, it would allow us to develop cell-type targeted therapies," said Kretzler, a University of Michigan professor of internal medicine and computational medicine and bioinformatics. He recently collaborated with Princeton University professor Olga Troyanskaya on a way to match genes to cells. "If you identify a [disease] that is in the liver or in the kidney, you could target those areas and not affect other parts of the body," he said.

Although scientists have decoded the human genome — the list of all the genes in human cells — they still have great difficulty determining the specific genes that are activated to make a kidney cell as opposed to a liver or heart cell.

In theory, an easy way to link genes to cells would be to isolate a cell and test it. However, solid human tissue is so closely packed that even the finest surgical techniques cannot separate types of cells efficiently enough for analysis. A kidney biopsy, for example, produces a mix of several different types of cells that Kretzler dismisses as "kidney soup."

That is where computers come in. Troyanskaya and her postdoctoral fellow and graduate students at Princeton have developed a system that allows computers to "virtually dissect" a kidney in a way that surgery cannot. The machine uses data from an array of gene-activity measurements in patients' kidney biopsies to separate cells mathematically and identify genes that are turned on in a specific cell type.

Princeton University and University of Michigan researchers have developed a system that allows computers to "virtually dissect" a kidney in a way that surgery cannot. The machine uses data from an array of gene-activity measurements in patients' kidney biopsies to mathematically separate cells and identify genes that are turned on in a specific cell type. The researchers identified 136 genes involved in the creation of a critical kidney cell called a podocyte, tiny cells that serve as filters in the kidneys and are frequently involved in kidney disease. The above image is of podocytes taken with a scanning electron microscope. The overlaying curve and formula provide an artistic representation of the computational approach that can separate podocyte-specific genes from genes expressed in other kidney cells. (Micrograph image courtesy of Matthias Kretzler, University of Michigan; art courtesy of Ruth Dannenfelser, Princeton University)

"We call it in-silico nano-dissection," said Troyanskaya, a professor of computer science and the Lewis-Sigler Institute for Integrative Genomics. Using a large database of such gene-activity measurements to track genetic lineage allows scientists to refine their analysis through thousands of measurements, something that would be impossible with individual cell cultures, she said.

The method has proven far faster and significantly more effective than current techniques. In findings published Nov. 3 in the journal Genome Research, researchers from Kretzler's lab at Michigan and Troyanskaya's at Princeton reported that they had identified 136 genes involved in the creation of a critical kidney cell called a podocyte. In decades of research, only 46 had been previously identified.

"The potential for this is huge," said Behzad Najafian, a University of Washington assistant professor of pathology who specializes in renal pathology. "I believe this novel technique, which is a significant improvement in cell lineage-specific gene-expression analysis, will not only help us understand the pathophysiology of kidney diseases better through biopsy studies, but also provides a strong tool for discovery or validation of cell-specific urine or plasma biomarkers."

The researchers focused on the glomerulus, an area of the kidney where the podocyte cells filter the waste from blood that will eventually leave the body as urine. One of the main reasons the researchers chose to track the podocytes is that the tiny cells are frequently involved in kidney disease. The researchers wanted to identify genes active in the podocytes and thus determine which genes cause the cell to be able to perform the podocyte's filtering function, differentiating it from other cell types in the kidneys.

It is not an easy job. Even a biopsy precise enough to sample only the glomerulus leaves doctors with a mix of four cell types including the podocyte. This "soup" yields activity measurements for tens of thousands of molecular markers, called RNA.

"It's a little more complicated than this, but you can think of RNA as the instructions that come from the DNA, and we need to identify which of these instructions are active in the podocytes" said Casey Greene, who worked on the project as a postdoctoral researcher with Troyanskaya and is now an assistant professor of genetics at Dartmouth College.

Kretzler's team in Michigan first obtained data from the biopsies of 452 patients, each containing RNA from roughly 20,000 genes. The more RNA found in the sample from a particular gene, the more active that gene.

The problem was that there was no easy way to link the 20,000 RNA markers in the "soup" with the cells that they came from. The researchers' task was to fit the pieces of this puzzle together and to trace connections between cells and their corresponding genes. They started with some knowledge: years of previous studies of people with hereditary kidney disease had identified 46 genes associated with podocytes.

To begin making connections beyond those 46, the researchers organized the data as a giant matrix. "Each column is a patient. Each row is a gene, indicated by an RNA level," Troyanskaya said. "The problem is which of these is specifically coming from the podocytes."

Troyanskaya, Kretzler and their collaborators took advantage of each patient's sample being unique. Patients' genetic backgrounds are different and their personal histories include many small perturbations — caused by inflammation, disease, medication and many other environmental differences — that can change which genes are turned on in their samples and to what extent. Those variations allowed Troyanskaya and her team to identify patterns in the behaviors of the 46 known genes and look for that pattern in the activities of the thousands of unknown ones.

The team was able to identify 136 genes linked to the podocytes. Two of those genes have been shown experimentally to be able to cause kidney disease. The computer's identification of genes linked to podocytes was verified by staining the cell samples with antibodies — each of which reacts to a specific protein constructed from the RNA instructions. The researchers found that the computer's predictions were 65 percent accurate. The accuracy of the best existing method, which involves experimentally isolating the podocyte cells in mice and measuring their expression patterns, is only 23 percent.

Troyanskaya said the goal is to train the computer to come up with a mathematical formula that identifies links between similar patterns and what distinguishes them from other, unrelated patterns. It is essentially the general type of approach that companies use to evaluate customers' buying habits to suggest new movies or purchases. "The genes that we know are specifically active in podocytes — they are the movies that we like," Troyanskaya said.

Although the researchers used kidney cells, Troyanskaya said the program also will work with other cell types, including other solid tissues that cannot be experimentally micro-dissected in humans. The program is available free to researchers on Princeton's website.

"We are very excited about these results and applying this approach to a variety of cell types and disease settings," Troyanskaya said.

Besides Troyanskaya and Kretzler, the researchers involved in the work included: Casey Greene, who worked on the project as a postdoctoral researcher with Troyanskaya and is now an assistant professor of genetics at Dartmouth College; Young-suk Lee and Qian Zhu, graduate students in computer science and genomics at Princeton; Markus Bitzer, Felix Eichinger, Jeffrey Hodgin, Song Jiang, Viji Nair, Wenjun Ju, of the University of Michigan; Masami Kehata, Min Li, and Maria Pia Rastaldi of Fondazione IRCCS Ca' Grande Ospedale Maggiore Policlinico, Milan, Italy; and Clemens Cohen of the University of Zurich.

Support for the work was provided by the National Institutes of Health and the National Science Foundation. Troyanskaya is a senior fellow of the Canadian Institute for Advanced Research.

Tracking genes on the path to genetic treatment

Data science tool that reveals molecular causes of disease shows power in infant cancer analysis .

In the tissues of a tiny worm, a close-up view of where genes are working .

Researchers uncover potential cancer-causing mutations in genes’ control switches .

Beyond genomics, biologists and engineers decode the next frontier .

Artificial intelligence detects a new class of mutations behind autism .

New website to help translate genetic data into medical therapies .

Bioinformatics points the way to treating deadly pancreatic cancer .