Puzzles of the Genome

December, 2006
David Plaisted PhD

The genome of an animal contains the DNA that specifies the characteristics of the animal. This is in the form of a sequence of four bases; the sequence of the human genome is over three billion bases long. Of course, different individuals have different sequences. A few years ago the human genome project completed a description of the sequence of the human genome, and several other animals’ genomes have been sequenced since then. Scientists sometimes claim that these genomes provide evidence for the theory of evolution. However, recent results show how little we really know about the genome, and therefore it is unreasonable to assert that the genome provides evidence for evolution, when we understand it so poorly.

The DNA specifies the characteristics of an organism in the following way. The DNA contains genes that are transcribed (copied) into RNA. RNA is a very long molecule much like DNA but with a somewhat different makeup. This “messenger RNA” (mRNA) is then translated into proteins, which are very large molecules that guide the body’s metabolism and make up much of its structure. Recently it has been found that humans have only about 20,000 to 25,000 genes, not many more than other organisms that are much simpler. Even evolutionists are puzzled by this, because humans at least appear to be much more complex than these other organisms. However, we now know that one gene can produce more than one protein by a mechanism known as “alternative splicing.” This mechanism is more common in humans than in lower organisms, and scientists now believe that the human genome can manufacture 90,000 or more proteins. Moreover, these proteins can be modified by the addition of various chemical groups, so that the complexity of the organism is much greater than the number of genes would indicate. Also, a recent article in Science admitted that we still don’t know what a gene is or how many of them there are.1 Another article admits that “the identification of genes based solely on the human genome sequence…will not be practical in the foreseeable future.”2
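
The two steps described above, DNA to messenger RNA and messenger RNA to protein, can be illustrated with a small Python sketch. It is only a toy model: the example sequence is made up, the function names are hypothetical, and the codon table holds just six of the 64 codons of the standard genetic code. Real cells also splice and otherwise process the RNA, which is not modeled here.

```python
# Toy model of transcription and translation (illustration only).
# The codon table is deliberately partial: the standard genetic code has 64 codons.
CODON_TABLE = {
    "AUG": "Met",   # methionine, also the usual start codon
    "UUU": "Phe",   # phenylalanine
    "GGC": "Gly",   # glycine
    "UAA": "STOP",
    "UAG": "STOP",
    "UGA": "STOP",
}


def transcribe(coding_strand):
    """Return the mRNA matching a DNA coding strand (T is replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Read the mRNA three bases at a time and stop at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        amino_acid = CODON_TABLE.get(codon, "???")  # "???" = not in this partial table
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein


# Made-up example sequence, not taken from any real gene.
mrna = transcribe("ATGTTTGGCTAA")
print(mrna)
print(translate(mrna))
```

Alternative splicing would correspond to joining different pieces of the RNA before the translation step, so that one gene yields several different inputs to `translate`.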

The protein coding portions of genes comprise only about 1.5 percent of the human genome, raising the question of the purpose of the remaining 98.5 percent of the genome. For a long time it was assumed that this was mostly “junk” DNA with little or no purpose. However, this view is now being questioned. For example, about 5 percent of the human genome has corresponding sequences in other mammals, suggesting that at least 5 percent of the human genome has a function of some kind.3 Another amazing fact is that the great majority of the human genome is transcribed into RNA.4 In fact, millions of distinct transcribed RNA sequences are known for the human genome.5 Why is so much of the genome transcribed, if only 1.5 percent of it has a function? What is the purpose of this transcription? It is true that a single gene can produce more than one distinct sequence. However, even after clustering these sequences according to their similarity, one compilation contains over 800,000 distinct clusters!5 An average of 10 percent of the genome is included in such sequences.6 Moreover, parts of these non-coding RNAs (ncRNAs) are more conserved across organisms than the corresponding parts of protein coding genes, suggesting that they have an important function.7 Can it be that 10 percent or more of the genome has a function?
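
To put these percentages in terms of actual bases, the arithmetic can be sketched in a few lines of Python. The genome length of exactly three billion bases is a rounded assumption (the figure given earlier is only "over three billion"), and the function name is made up for this illustration.

```python
# Rounded assumption: the article says only "over three billion" bases.
GENOME_LENGTH = 3_000_000_000


def bases_for(percent, genome_length=GENOME_LENGTH):
    """Return the number of bases that a given percentage of the genome represents."""
    return round(genome_length * percent / 100)


for label, percent in [
    ("protein coding", 1.5),
    ("shared with other mammals", 5),
    ("in clustered transcripts", 10),
]:
    print(f"{label}: {percent} percent is about {bases_for(percent):,} bases")
```

Under this assumption, 1.5 percent is about 45 million bases, 5 percent about 150 million, and 10 percent about 300 million.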

Scientists have recently learned that much of the DNA is transcribed into RNA that is not translated into proteins but apparently has a function in regulating the genes. There are several known types of such RNA, and more may exist. One type is microRNA (miRNA), which consists of single-stranded RNA molecules about 21 to 23 nucleotides long. These miRNAs bind to messenger RNA and prevent it from being translated into protein, so they appear to play a role in gene regulation. Each miRNA is thought to regulate multiple genes, and it is believed that higher organisms have hundreds of miRNAs, so they can have a significant impact on the organism. miRNAs also seem to have a role in the development of cancer, as well as in early development, cell proliferation, cell death (including apoptosis, or cell suicide), fat metabolism, and cell differentiation. They may also have a role in brain development, leukemia, and viral infection. Another kind of RNA known as short interfering RNA (siRNA) has been found in plants and appears to prevent viral RNA from functioning in the plant cells. A third type of such RNA, called piRNA (Piwi-interacting RNA), has recently been discovered and appears to play a role in germline development.8 This RNA was found in a portion of the genome that was not thought to be transcribed. There is even a possibility that small RNA genes may activate protein coding genes instead of only repressing them.9

Another puzzle of the genome is that the same stretch of DNA can be transcribed as part of more than one gene. It is as if two genes overlap, using the same DNA in different ways, something like a crossword puzzle. In fact, a large proportion of the genome can produce transcripts from both strands (in both directions).10 How could evolution produce a genetic code that could be read in two directions? The constraints are so severe that this seems to be an impossibility. It is also now known that the nucleus has transcription “factories” where DNA is copied into messenger RNA. Furthermore, related genes that are very widely separated on the genome somehow appear to converge on the same factory.11 It appears that the DNA has to bend in some way to make this possible.
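
What it means to read the same DNA from both strands can be shown with a short Python sketch. The two strands of DNA are complementary and run in opposite directions, so the sequence read from the opposite strand is the "reverse complement" of the first. The example sequence and the function name are invented for illustration only.

```python
# Base-pairing rules: A pairs with T, and C pairs with G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the sequence as it would be read from the opposite strand."""
    return sequence.upper().translate(COMPLEMENT)[::-1]


forward = "ATGCCGTA"  # made-up example, not from a real genome
print("forward strand: ", forward)
print("opposite strand:", reverse_complement(forward))
```

A transcript made from the opposite strand therefore shares every base pair with the forward transcript, which is why any change to that stretch of DNA affects both readings at once.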

Another puzzle of the genome is the “transposons,” which are portions of DNA that can move from one place in the DNA to another. In fact, some scientists believe that as much as half of the human genome consists of such transposons.12 Recently it has been suggested that these transposons play a role in the developing brain, causing each individual to have a distinct makeup because of the manner in which the transposons alter the DNA during brain development.13

Considering all of these facts sheds additional light on the human-chimp similarity question. It is not just the protein coding genes that matter, but also the RNA genes. In fact, it was recently found that the human genome differs from that of chimpanzees by about 5 percent when insertions and deletions are counted,14 not the one percent that has often been quoted. Even more puzzling, there are a few large insertions and deletions between the human and chimpanzee genomes. How can this be explained on the basis of the theory of evolution?

Another issue is the question of harmful mutations. If a large proportion of the genome is functional, then more mutations will be harmful. This means that under reasonable assumptions, known rates of mutation will introduce so many errors into the genome as to cause a rapid degeneration of the human species, instead of evolution to greater fitness as specified by the theory of evolution. However, a proper consideration of this question involves some deep issues in population genetics.

Clearly there are mysteries in the operation of life that we are only beginning to understand. Will we ever fully understand the operation of life, or will it continue to reveal new layers of complexity to our study? It always seems as if we are about to understand the answer, but then we find that we are not quite there. At one time we thought that just about the only thing that mattered in the DNA was the protein coding genes, and we thought that we could identify them. Now we realize that we don’t know what the function of much of the DNA is. We don’t even really know what a gene is, or how many of them there are. We don’t know why so much of the genome is transcribed into RNA if it does not have a function, and if it does have a function, we don’t know what it is. We are finding that much of the DNA can be read in more than one direction. We are continuing to discover new types of RNA genes and their functions. How could such a marvelous and breathtaking system evolve by blind chance? And yet scientists are sure that the genome provides evidence for the theory of evolution. In the face of such overwhelming complexity and evidence of a Master Designer, it seems that the proper attitude would be one of complete humility and admitting that we cannot say that the genomes that have been sequenced so far provide evidence in favor of the theory of evolution. 


1 Pennisi E (2003) Gene Counters Struggle to Get the Right Answer. Science 301 (5636):1040 

2 Snyder M, Gerstein M (2003) Defining Genes in the Genomics Era. Science 300 (5617):258

3 Fox M (2003 Aug 13) Genome Hunt Shows Humans Closer to Rats than Cats. Accessed 2006 Nov 22

4 Mattick J (2005) The Functional Genomics of Noncoding RNA. Science 309 (5740):1527-1528

5 Nekrutenko A (2004) Reconciling the Numbers: ESTs Versus Protein-Coding Genes. Mol. Biol. Evol. 21(7):1278-1282

6 Claverie J (2005) Fewer Genes, More Noncoding RNA. Science 309 (5740):1529-1530

7 Carninci P, Kasukawa T, Katayama S, Gough J, Frith MC, Maeda N, Oyama R, Ravasi T, et al. from The FANTOM Consortium (2005) The transcriptional landscape of the mammalian genome. Science 309 (5740):1559-1563

8 Carthew R (2006) A New RNA Dimension to Genome Control. Science 313 (5785):305

9 Garber K (2006) Small RNAs Reveal an Activating Side. Science 314 (5800):741

10 Katayama S, Tomaru Y, Kasukawa T, Waki K, Nakanishi M, Nakamura M, Nishida H, Yap CC, et al. from the RIKEN Genome Exploration Research Group (2005) Antisense transcription in the mammalian transcriptome. Science 309 (5740):1564-1566

11 Pennisi E (2006) Genes Commute to Factories Before They Start Work. Science 312 (5778):1304

12 Gould F (2006) The Dark Side of DNA. Am. Sci. 94:552

13 Miller G (2005) Jumping DNA Mixes It Up in the Developing Brain. Science 308 (5729):1729

14 Britten RJ (2002) Divergence between samples of chimpanzee and human DNA sequences is 5% counting indels. Proc. Nat. Acad. Sci. 99:13633-13635