On the computational properties of gene assembly

Laura Landweber and Lila Kari

 

The process of gene unscrambling in hypotrichous ciliates represents one of nature’s ingenious solutions to the computational problem of gene assembly. With some essential genes fragmented in as many as 50 pieces, these organisms rely on a set of sequence and structural clues to detangle their coding regions. For example, pointer sequences present at the junctions between coding and noncoding sequences permit reassembly of the functional copy. As the process of gene unscrambling appears to follow a precise algorithm or set of algorithms, the question remains: what is the actual problem being solved?

The first proposed step in gene unscrambling—alignment or combinatorial pattern matching—may involve searches through several possible matches, via either intramolecular or intermolecular strand associations. We have conjectured that this part could be similar to Adleman’s (1994) DNA solution of a directed Hamiltonian path problem. The second step—homologous recombination at aligned repeats—involves the choice of whether to retain the coding or the noncoding segment between each pair of recombination junctions. This decision process could even be equivalent to solving an n-bit instance of a satisfiability problem, where n is the number of scrambled segments. We use our knowledge of the first step to develop a model for the guided homologous recombinations and prove that such a model has the computational power of a Turing machine, the accepted formal model of computation. This indicates that, in principle, these unicellular organisms may have the capacity to perform at least any computation carried out by an electronic computer.

figures (high resolution slides, big file): ftp://rnaworld.princeton.edu/pub/export/Ciliate_PNAS_folder/figures.pdf

related paper (includes figures, smaller file): ftp://rnaworld.princeton.edu/pub/export/Ciliate_PNAS_folder/LandweberCiliate.pdf