EVOLUTION:
Tracking the History of the Genetic Code

Gretchen Vogel

Computer analyses and experiments with RNA molecules offer new insight into the forces that may have shaped the genetic code over time

VANCOUVER--For the 3 decades since biologists cracked the genetic code--the key to translating DNA into proteins--they have debated its origins. Some claimed it must be a random accident forever frozen in time, while others argued that the code, like all other features of organisms, was shaped by natural selection. Most of those debates have been philosophical, with little data to back up either side. But at the annual meeting of the Society for the Study of Evolution held here last month, two speakers presented evidence suggesting that forces other than chance shaped the code's origin and history.

Experiments with RNA have shown that chemical attractions between the genetic material and the components of proteins may have helped shape the original code, reported one speaker. Another researcher, using powerful computer analyses, suggested that the modern code is the product of evolution because it is so error-proof: Only one in a million other possible codes is better at producing a workable protein even when the DNA carries mistakes.

Doubters such as evolutionary biologist Niles Lehman of the State University of New York, Albany, remain unconvinced that the code is anything but an accident. But he and others say that new studies such as these, as well as other work probing the history of individual genetic "words" (see sidebar, p. 330), are beginning to make a dent in their skepticism. "We're at a turning point" for probing the origins and history of the code, says Lehman.

Living things use DNA to store the instructions for making the proteins that build cells and direct them to develop into a complete organism. The four different subunits, or bases, that make up the DNA chain are grouped into three-letter "words" called codons, and each codon specifies a protein's amino acid building block. Specialized cellular machinery copies the DNA code into RNA--which has a similar code--and then reads the RNA to piece together the amino acids to make proteins. A codon "means" the same thing in a koala as it does in a rose or a bacterium. Yet there's no clear pattern in the pairing of codons and amino acids, which has persuaded many scientists that the code arose by accident.
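
For readers who want to see the code's layout concretely, here is a minimal Python sketch of the standard codon table. Four bases taken three at a time give 64 codons, which map onto 20 amino acids plus a "stop" signal. The names `CODON_TABLE`, `translate`, and `codons_for` are made up for this illustration, amino acids are written in the usual one-letter abbreviations (R is arginine), and `*` marks a stop codon.

```python
from itertools import product

# Standard genetic code, DNA alphabet, in the conventional T-C-A-G ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # codons starting with T
    "LLLLPPPPHHQQRRRR"   # codons starting with C
    "IIIMTTTTNNKKSSRR"   # codons starting with A
    "VVVVAAAADDEEGGGG"   # codons starting with G
)

# product() varies the last base fastest: TTT, TTC, TTA, TTG, TCT, ...
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(dna):
    """Read a DNA string three bases at a time until a stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

def codons_for(amino_acid):
    """List every codon that specifies the given amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)

if __name__ == "__main__":
    print(len(CODON_TABLE))          # number of codons
    print(translate("ATGCGTTAA"))    # ATG, CGT, then a stop codon
    print(codons_for("R"))           # the codons for arginine
```

The table is redundant: several codons can specify the same amino acid, and arginine is one of the amino acids with the most codons.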

But test tube experiments now suggest that before cellular machinery had evolved to read the code and build proteins, the code could have been shaped by affinities between specific base sequences and amino acids. Many scientists have speculated about such a scenario, but new data from experiments in which short strands of RNA are chosen based on their affinity for an amino acid are allowing them to test the idea. Several years ago, Michael Yarus of the University of Colorado, Boulder, noticed that in his experiments, the RNA strands that were best at binding a given amino acid tended to contain codons for that amino acid. But because the three-base codons often show up at random, the data were inconclusive.

Now evolutionary biologist Laura Landweber and graduate student Rob Knight of Princeton University have done a more careful analysis, looking specifically at where the amino acid arginine binds to random RNA strands generated in several researchers' experiments. If there is no real affinity, they reasoned, codons for arginine will appear as often in the regions where the amino acid does not bind as in regions that arginine homes in on. They found, instead, that while arginine codons made up 30% of the nonbinding RNA sites--the expected percentage, given that arginine has many possible codons--they made up 72% of the sequences in the binding regions. That suggests, says Landweber, that it's no accident that these codons specify arginine.
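
The logic of that comparison can be sketched in a few lines of Python. This is an illustration only, not the researchers' procedure: the article does not say exactly how the percentages were computed, so the sketch assumes one plausible measure, the share of bases in a region that fall inside an arginine triplet in any reading frame. The function names are hypothetical, and the RNA string and binding positions are invented toy input, not experimental data.

```python
# The six arginine codons, written in the RNA alphabet (U instead of T).
ARG_CODONS = {"CGU", "CGC", "CGA", "CGG", "AGA", "AGG"}

def arg_codon_coverage(rna):
    """Mark each base that lies inside an arginine triplet, in any frame."""
    covered = [False] * len(rna)
    for i in range(len(rna) - 2):
        if rna[i:i + 3] in ARG_CODONS:
            covered[i] = covered[i + 1] = covered[i + 2] = True
    return covered

def fraction_covered(rna, positions):
    """Share of the given positions that sit inside an arginine triplet."""
    covered = arg_codon_coverage(rna)
    positions = list(positions)
    return sum(covered[p] for p in positions) / len(positions)

if __name__ == "__main__":
    # Invented toy strand; positions 2..9 are pretended to be the binding site.
    rna = "UUUAGGCGAUUUCAU"
    binding = range(2, 10)
    nonbinding = [p for p in range(len(rna)) if p not in binding]
    print("binding:   ", fraction_covered(rna, binding))
    print("nonbinding:", fraction_covered(rna, nonbinding))
```

If arginine had no special affinity for its own codons, the two fractions should be about the same across many real strands; a much higher fraction in the binding regions is the pattern the analysis looks for.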

The arginine evidence is intriguing, says evolutionary biologist Leslie Orgel of the Salk Institute in La Jolla, California. "But it's premature to draw any very strong conclusions" from data on the affinities of a single amino acid, he says. Researchers are delighted, however, that experimenters are now tackling the question. "Previously we had to rely solely on theory," says Lehman, but "if [Landweber's analysis] holds up, it will provide a convincing body of evidence" that basic chemical forces helped to shape the code.

Once the code was born, a different kind of pressure, the need to minimize errors, might have refined it. While some researchers have argued that any changes to the code over its 3.5-billion-year history would have been like switching the keys on a typewriter, leading to hopelessly garbled proteins, others argued that the existing code is so good at its job that it must have been shaped by natural selection. For example, in 1991, evolutionary biologists Laurence Hurst of the University of Bath in England and David Haig of Harvard University showed that of all the possible codes made from the four bases and the 20 amino acids, the natural code is among the best at minimizing the effect of mutations. They found that single-base changes in a codon are likely to substitute a chemically similar amino acid and therefore make only minimal changes to the final protein.

Now Hurst's graduate student Stephen Freeland at Cambridge University in England has taken the analysis a step further by taking into account the kinds of mistakes that are most likely to occur. First, the bases fall into two size classes, and mutations that swap bases of similar size are more common than mutations that switch base sizes. Second, during protein synthesis the first and third members of a codon are much more likely to be misread than the second one. When those mistake frequencies are factored in, the natural code looks even better: Only one in a million randomly generated codes was more error-proof.
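
A rough Python sketch shows how such a comparison might be set up. Everything specific in it is an assumption made for illustration and is not taken from the study: the Kyte-Doolittle hydropathy scale stands in for "chemical similarity," the weights for same-size base swaps and for codon positions are placeholder values, and alternative codes are made by shuffling the 20 amino acids among the natural code's codon blocks while leaving stop codons in place.

```python
import random
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
STANDARD = {"".join(t): aa
            for t, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Kyte-Doolittle hydropathy values: a stand-in measure of chemical similarity.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
    "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
    "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
    "Y": -1.3, "V": 4.2,
}

PURINES = {"A", "G"}                # the larger bases; C and T are the smaller
SAME_SIZE_WEIGHT = 2.0              # placeholder: same-size swaps count double
POSITION_WEIGHT = (1.0, 0.5, 1.0)   # placeholder: middle base misread less

def same_size(a, b):
    """True when both bases belong to the same size class."""
    return (a in PURINES) == (b in PURINES)

def code_cost(code):
    """Weighted mean squared hydropathy change over single-base errors."""
    total = weight_sum = 0.0
    for codon, aa in code.items():
        if aa == "*":
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                new_aa = code[codon[:pos] + base + codon[pos + 1:]]
                if new_aa == "*":
                    continue        # errors that create stops are ignored here
                w = POSITION_WEIGHT[pos]
                if same_size(codon[pos], base):
                    w *= SAME_SIZE_WEIGHT
                total += w * (HYDROPATHY[aa] - HYDROPATHY[new_aa]) ** 2
                weight_sum += w
    return total / weight_sum

def shuffled_code(rng):
    """Reassign amino acids to codon blocks at random; stops stay put."""
    aas = sorted(HYDROPATHY)
    permuted = aas[:]
    rng.shuffle(permuted)
    mapping = dict(zip(aas, permuted), **{"*": "*"})
    return {codon: mapping[aa] for codon, aa in STANDARD.items()}

def fraction_better(n_samples=1000, seed=0):
    """Share of random codes with a lower cost than the natural code."""
    rng = random.Random(seed)
    natural = code_cost(STANDARD)
    better = sum(code_cost(shuffled_code(rng)) < natural
                 for _ in range(n_samples))
    return better / n_samples

if __name__ == "__main__":
    print(fraction_better())
```

The fraction this prints depends heavily on the similarity scale, the weights, and the number of samples, so it should not be expected to reproduce the one-in-a-million figure.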

That suggests, Freeland says, that the code has been optimized over the eons and isn't simply the product of chance. Lehman agrees that the one-in-a-million result looks impressive, but cautions that the statistics could be misleading. A high degree of similarity within one clan of amino acids could account for the code's apparent resistance to error, and the rest of the code could be random, he says.

With both the genesis and history of the code looking less and less accidental, Landweber and Freeland plan to collaborate next year, hoping to "build a grand scheme of the code's raison d'être," Landweber says--whether it be accident or design.

Volume 281, Number 5375 Issue of 17 Jul 1998, p 329
©1998 by The American Association for the Advancement of Science.