A four-year project to quantify relationships between the genotype and phenotype of colorectal cancer using neural network analysis is under way. This project is identified as Specific Aim 3 of Project 3 in the Program Project Grant, New Methods for Cancer Detection, sponsored by the National Cancer Institute. Combining external databases with those created in the overall program, we are developing mathematical models and numerical analyses for classifying colon cancer in a clinical setting. We are using the results of DNA and RNA analysis collected by other members of the research team, together with available clinical information, to identify groups of genes that belong to the same pathway or network. Particular emphasis is placed on identifying possible predictors of metastasis and genetic signatures of the transitions between different stages of tumor characterization and progression. The principal focus is on supervised methods of data analysis. Team members come from Princeton's Department of Molecular Biology, Weill Medical College of Cornell University, Memorial Sloan Kettering Cancer Center, Rockefeller University, the University of Medicine and Dentistry of New Jersey, and the Weizmann Institute of Science.
Our immediate aim is to develop algorithms that reliably classify genetic and epigenetic data according to known properties of the tissue from which samples have been obtained. Such algorithms can play an important role in molecular profiling of colon cancer, as they establish critical relationships between genotype and phenotype. Particular attention is paid to processing data from oligonucleotide and cDNA microarrays. Early application of the methodology will focus on gene discovery, materially assisting the selection of genes and ESTs for analysis and for inclusion on a custom microarray. Later application will address the prediction of outcomes, including the likelihood of early-stage colorectal tumors progressing to metastasis. We will also show that the neural networks can be incorporated into gene regulation networks, which describe signaling and causal pathways between genes and cells. Our fundamental approach is to apply supervised neural networks to this analysis.
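To make the supervised approach concrete, the following is a minimal sketch of a neural network classifier trained on labeled expression profiles. It is an illustration only, not the project's actual model: the data are synthetic stand-ins for microarray measurements, and the function names (make_synthetic_expression, train_mlp, predict), the single hidden layer, and all parameter values are assumptions chosen for brevity.

```python
import numpy as np


def make_synthetic_expression(n_samples=120, n_genes=50, n_informative=5, seed=0):
    """Hypothetical stand-in for microarray data: rows are tissue samples,
    columns are genes, and y is a known binary tissue label."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n_samples)
    X = rng.normal(size=(n_samples, n_genes))
    # Shift a few "informative" genes upward in samples labeled 1.
    X[:, :n_informative] += 2.0 * y[:, None]
    return X, y


def train_mlp(X, y, n_hidden=8, lr=0.2, epochs=1000, seed=0):
    """Train a one-hidden-layer network by full-batch gradient descent
    on the mean cross-entropy loss."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.1, size=(X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(scale=0.1, size=n_hidden)
    b2 = 0.0
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)                  # hidden activations
        p = 1.0 / (1.0 + np.exp(-(H @ W2 + b2)))  # predicted P(label = 1)
        err = (p - y) / len(y)                    # loss gradient w.r.t. output logit
        dH = np.outer(err, W2) * (1.0 - H ** 2)   # backpropagate through tanh
        W2 -= lr * (H.T @ err)
        b2 -= lr * err.sum()
        W1 -= lr * (X.T @ dH)
        b1 -= lr * dH.sum(axis=0)
    return W1, b1, W2, b2


def predict(params, X):
    """Return the predicted class (0 or 1) for each sample."""
    W1, b1, W2, b2 = params
    H = np.tanh(X @ W1 + b1)
    return (H @ W2 + b2 > 0).astype(int)


if __name__ == "__main__":
    X, y = make_synthetic_expression()
    # Hold out the last 40 samples to estimate accuracy on unseen tissue.
    params = train_mlp(X[:80], y[:80])
    print("held-out accuracy:", (predict(params, X[80:]) == y[80:]).mean())
```

In this sketch the known tissue property plays the role of the training label, and accuracy is measured on held-out samples because the number of genes is typically large relative to the number of samples, which makes overfitting a central concern.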