How the genetic code became degenerate
September 22, 2010
Our genetic code consists of 64 different combinations of four RNA nucleotides: adenine, guanine, cytosine, and uracil. These four molecules can be arranged in groups of three in 64 different ways, because each of the three positions can hold any one of the four nucleotides: 4 x 4 x 4 = 64.
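To make the counting concrete, here is a minimal Python sketch that simply lists every possible triplet. The names BASES and codons are illustrative choices, not anything from the biology itself.

```python
from itertools import product

BASES = "UCAG"  # the four RNA nucleotides, by their one-letter abbreviations

# Every ordered triplet of bases: 4 choices at each of 3 positions.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]

print(len(codons))   # the number of possible triplets, 4 * 4 * 4
print(codons[:4])    # the first few triplets in the list
```

Listing the triplets rather than just multiplying shows that order matters: UUA and AUU count as two different code words.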
Shorthand for the language of proteins
This code is cellular shorthand for the language of proteins. A group of three nucleotides, called a codon, is a code word for an amino acid. A protein is, at its simplest level, a string of amino acids, which are its building blocks. So a string of codons provides the language that the cell can “read” to build a protein. When the code is copied from the DNA, the process is called transcription, and the resulting string of nucleotides is messenger RNA. In eukaryotes, this messenger takes the code from the nucleus to the cytoplasm, where it is decoded in a process called translation. During translation, the code is “read,” and amino acids are assembled in the sequence the code indicates.
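The two steps can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a model of the cellular machinery: the DNA string is a made-up example, it is treated as the coding strand (so transcription amounts to swapping T for U), and the codon table is deliberately partial, holding only the four codons this message uses.

```python
# Partial codon table: only the four codons this toy message uses.
MINI_CODON_TABLE = {
    "AUG": "Met",  # methionine, also the usual start signal
    "CUU": "Leu",  # leucine
    "UUA": "Leu",  # leucine again, via a different codon
    "UAA": None,   # stop codon: no amino acid is added
}

def transcribe(coding_dna):
    """Return the mRNA for a DNA coding strand (T is replaced by U)."""
    return coding_dna.upper().replace("T", "U")

def translate(mrna):
    """Read the message three bases at a time until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = MINI_CODON_TABLE[mrna[i:i + 3]]
        if amino_acid is None:  # stop codon reached
            break
        protein.append(amino_acid)
    return protein

dna = "ATGCTTTTATAA"    # hypothetical toy gene, not a real sequence
mrna = transcribe(dna)
print(mrna)             # AUGCUUUUAUAA
print(translate(mrna))  # ['Met', 'Leu', 'Leu']
```

A codon missing from the partial table would raise a KeyError, which is acceptable for a sketch but is the first thing a fuller version would replace with a complete table.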
The puzzling degeneracy of genetics
So given that there are 64 possible triplet combinations for these codons, you might think that there are 64 amino acids, one per codon. But that’s not the case. The code specifies only 20 amino acids, so it is “degenerate”: in some cases, more than one triplet of nucleotides provides a code word for the same amino acid. Thus, these redundant codons are all synonyms for the same protein building block. For example, six different codons indicate the amino acid leucine: UUA, UUG, CUA, CUG, CUC, and CUU. When any one of these codons turns up in the message, the cellular protein-building machinery inserts a leucine into the growing amino acid chain.
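The redundancy is easy to see by grouping codons according to the amino acid they specify. The sketch below assumes the standard genetic code, written compactly as one letter per codon in UCAG order with "*" marking a stop codon. The variable names are illustrative.

```python
from collections import defaultdict
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; "*" marks a stop.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Collect the synonymous codons for each amino acid (or stop signal).
synonyms = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    synonyms[aa].append(codon)

print(sorted(synonyms["L"]))  # the six leucine codons
print(len(synonyms) - 1)      # amino acids encoded, excluding the stop signal
print(len(synonyms["*"]))     # codons that signal "stop"
```

Under this table the 64 codons cover 20 amino acids plus three stop codons, and only methionine (M) and tryptophan (W) have a single codon each.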
This degeneracy of the genetic code has puzzled biologists since the code was cracked. Why would Nature produce redundancies like this? One suggestion is that Nature did not use a triplet code originally, but a doublet code. Francis Crick, of double-helix fame, posited that a two-letter code probably preceded the three-letter code. But he did not devise a theory to explain how Nature made the universal shift from two to three letters.
A two-letter code?
There are some intriguing bits of evidence for a two-letter code. One of the players in translation is transfer RNA (tRNA), a small RNA molecule that carries a triplet complementary to a codon in the messenger RNA. In addition to this complementary triplet, called an anticodon, each tRNA also carries the single amino acid that matches the codon it complements. Thus, when a codon for leucine (UUA, for example) is “read” during translation, a tRNA with the complementary anticodon AAU (written 3' to 5', so that each base lines up with its partner in the codon) will donate the leucine it carries to the growing amino acid chain.
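The codon-anticodon match is plain base pairing, which a short Python sketch can show. It assumes strict A-U and G-C pairing only. Real tRNAs also tolerate some "wobble" at the third position and carry modified bases, which this sketch ignores. The function names are illustrative.

```python
# Strict RNA base pairing: A pairs with U, G pairs with C.
PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}

def anticodon_3_to_5(codon):
    """Complement each base in place, aligned with a codon read 5' to 3'."""
    return "".join(PAIRS[base] for base in codon)

def anticodon_5_to_3(codon):
    """The same anticodon written in the conventional 5' to 3' direction."""
    return anticodon_3_to_5(codon)[::-1]

print(anticodon_3_to_5("UUA"))  # AAU
print(anticodon_5_to_3("UUA"))  # UAA
```

The two outputs describe the same molecule. The paired strands run in opposite directions, so the anticodon reads AAU when lined up against the codon and UAA when written in the usual 5' to 3' order.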
Aminoacyl tRNA synthetases are enzymes that link an amino acid to the appropriate tRNA. Each amino acid has its own specific synthetase, and some of these synthetases use only the first two nucleotide bases of the anticodon to decide which amino acid to attach. If you look at the code words for leucine, for example, you’ll see that four of the six begin with “CU.” The only difference among these four is the third position in the codon: A, U, G, or C. Thus, for codons like these, the synthetases need to rely only on the doublets to be correct.
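One way to see how far a doublet alone can go is to ask, for each of the 16 possible two-letter prefixes, whether all four third-position variants specify the same amino acid. The sketch below does this for the standard genetic code (written as one letter per codon in UCAG order, "*" for stop). The helper name doublet_families is a made-up label for this illustration.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; "*" marks a stop.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def doublet_families(table):
    """Map each doublet whose four codons agree to their shared amino acid."""
    families = {}
    for first, second in product(BASES, repeat=2):
        doublet = first + second
        encoded = {table[doublet + third] for third in BASES}
        if len(encoded) == 1:  # the third base makes no difference
            families[doublet] = encoded.pop()
    return families

families = doublet_families(CODON_TABLE)
print(families["CU"])  # L: every CU_ codon means leucine
print(len(families))   # doublets that fully determine the amino acid
```

Under this table, 8 of the 16 doublets settle the amino acid on their own, CU among them. For the rest, such as UU (which splits between phenylalanine and leucine), the third base still matters.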
Math and doublets
Scientists at Harvard believe that they have solved the evolutionary mystery of how the triplet form arose from the doublet. They suggest that the doublet code was actually read in groups of three doublets, but with only the first two “prefix” or last two “suffix” pairs actually being read. Using mathematical modeling, these researchers have shown that all but two amino acids can be coded for using two, four, or six doublet codons.
Too hot in the early Earth kitchen for some
The two exceptions are glutamine and asparagine, which at high temperatures break down into the amino acids glutamic acid and aspartic acid. The inability of glutamine and asparagine to retain their structure in hot environments suggests that in the early days of life on Earth, when doublet codes were in use, the primordial soup must have been too hot for stable synthesis of heat-intolerant, triplet-coded amino acids like glutamine and asparagine.