It may be well over 60 years since Francis Crick and James Watson first published on the structure of DNA, but we’re still discovering whole new layers of coding in the molecule. We’ve known for a long time that the sequence of the 4 ‘letters’ of the code (A, T, C and G) is what determines the ‘words’ of the code and that those words, which we call codons, are all only three letters long. Each word codes for an amino acid, and it’s long strings of these amino acids that form proteins. The function of any given protein is highly dependent upon its shape, and its shape is governed by the way the chain of amino acids is folded as it is created. This will all be relevant in a moment.
Now, amino acids are coded for by a sequence of three bases in the code. For example, the amino acid methionine is encoded by the sequence ATG. However, nearly all amino acids have several codons that will code for them. Check out the table below.
This is the table we geneticists use to tell us which codons give which amino acids. Let’s do an example: the codon CGT. The first base is a C, so on the left-hand side look for the C row; the second base is G, so along the top row look for G; the final base is a T, so go down the list until you see the T. You’ll see that this sequence codes for the amino acid Arg, which is short for arginine. You’ll also notice that that whole box codes for arginine, which means that it didn’t actually matter what the 3rd base was: the resulting amino acid would have been exactly the same. And if you look in the box below, you’ll see that there are two more possible codons for arginine, AGA and AGG; that makes six in total.
This is because there is redundancy in the code. The total number of possible codon combinations is 64 (4³, that is 4 possible letters in each of 3 positions) but there are only 20 amino acids, plus a stop signal, to code for. This is a good thing because it means that there are some mutations that, although they change our DNA code, don’t actually result in an altered protein, and so our bodies are able to carry on functioning. These are called silent mutations, and when we find one at work we can be fairly confident that it isn’t pathogenic, because if it doesn’t affect the protein sequence then it can’t be doing very much. It also gives Mother Nature a bit of wriggle room; this flexibility in the code is one of the reasons we can evolve.
Something we’ve also known for some time is that different organisms have different preferences for which codons they use for their proteins. This is the case for every organism we’ve ever checked. Which is a lot. So Pan troglodytes (the common chimp) might prefer the codon CGC to encode arginine whilst Mus musculus (the house mouse) might prefer AGA; the result is the same. I’ve made up those examples but you get the picture.
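If you’d rather let a computer do the table lookup, here is a minimal Python sketch of the same idea. It assumes the standard genetic code written in DNA letters; the names CODON_TABLE, synonymous_codons and is_silent are made up for illustration and don’t come from any library. Amino acids are given as one-letter codes (R is arginine, M is methionine) and * marks a stop codon.

```python
from itertools import product

# Bases in the order used by the usual compact listing of the standard code.
BASES = "TCAG"

# Standard genetic code, one letter per codon, '*' = stop.
# The first base varies slowest and the third base fastest, each over T, C, A, G.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # first base T
    "LLLLPPPPHHQQRRRR"   # first base C
    "IIIMTTTTNNKKSSRR"   # first base A
    "VVVVAAAADDEEGGGG"   # first base G
)

# Map every three-letter codon to its amino acid (64 entries in total).
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(amino_acid):
    """Return every codon that codes for the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def is_silent(codon_before, codon_after):
    """True if swapping one codon for the other leaves the amino acid unchanged."""
    return CODON_TABLE[codon_before] == CODON_TABLE[codon_after]
```

With this in place, synonymous_codons("R") lists the six arginine codons, is_silent("CGT", "CGC") comes back True because both are arginine, and is_silent("CGT", "CAT") comes back False because CAT codes for histidine.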
So far so good; none of this is new knowledge. What researchers from the University of Texas have shown, though, is that we may have the entire concept of a silent mutation wrong. It’s possible that these mutations do affect protein function even though the protein sequence has not changed. This could make the interpretation of genetic variants hugely more complicated, and it opens up a whole new field of genetic research. Let me explain how.
The way we get from DNA to a protein is this: DNA is transcribed into an intermediary we call RNA, and the RNA is able to leave the cell nucleus (if we’re dealing with a eukaryote) and head off into the cell. Here, small structures called ribosomes are able to latch onto the RNA molecule and recruit the appropriate amino acid depending on which codons have been used. The ribosome shuttles along the molecule adding amino acid after amino acid to the chain. As it does so, the amino acid chain folds up to give the protein its unique, and functionally crucial, shape. The new paper, published in Molecular Cell, has shown that organisms have a preference for certain codons for a reason. It takes a certain amount of time for a ribosome to translate a given codon into its corresponding amino acid, and it turns out that that period of time varies between codons that code for the same amino acid. This has a knock-on effect on protein folding: if it takes too long for the next link in the chain to be added, the protein can’t fold properly, and if a protein isn’t the right shape then it won’t work.
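To make the timing idea concrete, here is a toy Python sketch. Everything in it is an assumption for illustration: the decoding times are invented numbers in arbitrary units, not measurements from the paper, and the names DECODE_TIME, translate and elongation_time are hypothetical. It only covers the six arginine codons, which is enough to show two sequences that give the same protein but take different amounts of time to read.

```python
# All six of these codons code for arginine in the standard genetic code.
ARGININE_CODONS = {"CGT", "CGC", "CGA", "CGG", "AGA", "AGG"}

# INVENTED decoding times in arbitrary units, purely for illustration.
# Real values would have to be measured for each organism.
DECODE_TIME = {
    "CGT": 1.0, "CGC": 1.0, "CGA": 3.0,
    "CGG": 2.0, "AGA": 1.5, "AGG": 2.5,
}


def split_codons(sequence):
    """Cut a DNA sequence into consecutive three-letter codons."""
    if len(sequence) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [sequence[i:i + 3] for i in range(0, len(sequence), 3)]


def translate(sequence):
    """Translate a sequence made only of arginine codons into amino acids."""
    protein = []
    for codon in split_codons(sequence):
        if codon not in ARGININE_CODONS:
            raise ValueError(f"{codon} is outside this toy example")
        protein.append("Arg")
    return protein


def elongation_time(sequence):
    """Add up the (invented) time the ribosome spends on each codon."""
    return sum(DECODE_TIME[codon] for codon in split_codons(sequence))


quick = "CGTCGCCGT"   # three 'fast' arginine codons in this toy model
slow = "CGAAGGCGA"    # three 'slow' arginine codons in this toy model

same_protein = translate(quick) == translate(slow)
time_difference = elongation_time(slow) - elongation_time(quick)
```

Both sequences translate to the same three arginines, so by the traditional definition the difference between them is silent, yet in this toy model the second one takes longer for the ribosome to get through. Whether a delay like that actually matters for folding is exactly the question the paper raises.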
This is potentially a very significant finding. It could turn the field of diagnostic mutation analysis on its head by forcing us to consider a mechanism that we weren’t even aware of before. Unfortunately it will be some time before this could ever be taken into account in our analyses. We don’t have any data yet on how these timings might play out in humans, or on whether a change completely ruins the protein or just makes it slightly less effective, and there are a host of other questions, many of which I suspect we don’t even know to ask yet. This is also working on the assumption that this work holds up across the board. They were working with the mould Neurospora crassa, so we’ll have to see how much this applies to other species. Personally, I think it will hold up. They did a good job of proving the effect in the mould and, as all life on earth comes from one common ancestor, the process of translation is mostly unaltered across the whole Shrubbery of Life. It’s certainly something we need to keep an eye on as more data becomes available.
From a purely academic curiosity point of view it’s also fascinating. There are still masses we do not know about genetics, lots of known unknowns; but, without wanting to get too Rumsfeldian on you, I don’t think there were many people out there who thought there was still an unknown unknown of this scale and significance lurking around the corner. This is a whole new tier of coding that we just hadn’t spotted. It’s also why any good scientist will always caveat any factual claim they make with something like ‘to the best of our knowledge so far’ or ‘as far as the current data will allow us’. Scientists are often accused of being dogmatic, closed to new ideas and unwilling to change. Nothing could be further from the truth. Present us with new, high-quality evidence and any good scientist will happily, excitedly even, adopt the new idea into their thinking and practices. It’s what we live for; it’s what we love to do. We want to know that what we know and what we do is correct and is an honest and accurate reflection of reality. That’s the whole point of science.