What's the percentage similarity between human individuals (and other primates) when comparing only exons?

In popular science books and articles, I often see it stated that humans are >99% similar to each other (Wikipedia has it at 99.5%, referencing Craig Venter and this PLOS Biology article) and ~96-99% similar to chimpanzees or bonobos (Smithsonian Institution, National Geographic). I had previously thought that this referred to the entire genome, but the wording on the Smithsonian page I linked makes it appear that it might refer to genes only, which also seems to be the case according to the answers to this post: "Do apes and humans share 99% of DNA or 99% of genes? What is the difference?"

How big are these percentages when comparing only the coding regions (exons) instead of entire genes, both within our species and compared to other species, such as primates? And, not as importantly, what are these numbers when including the entire genome with all non-coding DNA?

One of the references to the chimpanzee genome paper linked by Maximillian is Inferring Nonneutral Evolution from Human-Chimp-Mouse Orthologous Gene Trios, where the authors seem to have focused on exons only:

Here we apply evolutionary tests to identify genes and pathways from a new collection of more than 200,000 chimpanzee exonic sequences that show patterns of divergence consistent with natural selection along the human and chimpanzee lineages.

They conclude that:

Perhaps the best way to understand the relation between DNA sequence divergence and the differences between human and chimpanzee physiology and morphology is to compare these differences to the variability among humans. Human-chimp DNA sequence divergence is roughly 10 times the divergence between random pairs of humans.

Unfortunately, they didn't compare this to the fold-difference of non-exonic DNA, so the comparative quantification between coding vs non-coding regions remains to be determined.

However, this does directly answer the question of how big the genomic differences are between individuals of the same species vs those of other species, when comparing only coding regions. I also found it useful to quantify the differences in these relative terms in addition to the previously stated percentages.

This paper seems to have relevant estimates for human-ape divergences. Note again that it isn't using the entire genome, but selected regions of exons, introns, pseudogenes, etc. The chimpanzee genome paper has a nice bullet-point list of relevant statistics for that species towards the beginning; the divergence reported there is on the order of 1% overall. The section on "gene evolution" has some specific numbers for coding sequences.

For among-human variation, there is a bionumbers page on that subject based on 1000 genomes project numbers, that comes out at 0.1% divergence among humans.
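To put the two figures on the same scale, here is a minimal Python sketch. It uses only the rounded values quoted in the answers above (about 0.1% divergence between two humans, on the order of 1% between human and chimpanzee); the variable and function names are illustrative, not taken from either paper.

```python
# Rounded per-base divergence figures quoted in the answers above.
human_human_divergence = 0.001   # ~0.1% between two random humans
human_chimp_divergence = 0.01    # on the order of 1% for human vs chimpanzee


def similarity_percent(divergence):
    """Convert a per-base divergence fraction into a percent similarity."""
    return 100.0 * (1.0 - divergence)


# How many times larger the between-species divergence is.
fold_difference = human_chimp_divergence / human_human_divergence

print(f"human vs human: {similarity_percent(human_human_divergence):.1f}% similar")
print(f"human vs chimp: {similarity_percent(human_chimp_divergence):.1f}% similar")
print(f"fold difference in divergence: {fold_difference:.0f}x")
```

With these rounded inputs the ratio comes out at about ten, which agrees with the "roughly 10 times" statement in the orthologous gene trios paper.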

A New Way To Compare Human And Other Primate Genomes

BERKELEY, CA -- Scientists with the U.S. Department of Energy's Joint Genome Institute (JGI) in Walnut Creek, Calif., and the Lawrence Berkeley National Laboratory (Berkeley Lab) have developed a powerful new technique for deciphering biological information encoded in the human genome.

Called "phylogenetic shadowing," this technique enables scientists to make meaningful comparisons between DNA sequences in the human genome and sequences in the genomes of apes, monkeys, and other nonhuman primates. With phylogenetic shadowing, scientists can now study biological traits that are unique to members of the primate family.

"Now that the sequence of the human genome has almost been completed the next challenge will be the development of a vocabulary to read and interpret that sequence," says Edward Rubin, M.D., director of the Joint Genome Institute (JGI) for the U.S. Department of Energy, and Berkeley Lab's Genomics Division, who led the development of the phylogenetic shadowing technique.

"The ability to compare DNA sequences in the human genome to sequences in nonhuman primates will enable us in some ways to better understand ourselves than the study of evolutionarily far-distant relatives such as the mouse or the rat," Rubin adds. "This is important because as valuable as models like the mouse have been, there are many physical and biochemical attributes of humans that only other primates share."

Using phylogenetic shadowing, Rubin and his colleagues were able to identify the DNA sequences that regulate the activation or "expression" of a gene that is an important indicator of the risk for heart disease and is found only in primates. The results of this research are reported in a paper published in the February 28 issue of the journal Science. Co-authoring the paper with Rubin were Dario Boffelli, Dmitriy Ovcharenko, Keith Lewis and Ivan Ovcharenko of Berkeley Lab, plus Jon McAuliffe and Lior Pachter, of the University of California at Berkeley.

Comparative genomics, comparing segments of DNA in the human genome to DNA segments in the genomes of other organisms that have been sequenced, such as the mouse, the puffer fish or the sea squirt, has proven to be an effective means of identifying genes, the DNA sequences that code for proteins, and gene regulatory sequences, the DNA sequences which control when a gene is turned on or off.

"The rationale for comparing the genomes of different animals to identify those sequences that are important is based on the understanding that today's different animals arose from common ancestors tens of millions of years ago," Rubin explains. "If segments of the genomes of two different organisms have been conserved (meaning the sequences are the same in both) over the millions of years since those organisms diverged, then the DNA sequences within those segments probably encode important biological functions."

The search for functional DNA sequences that have been conserved between two different organisms across a large distance in evolution is the classical approach to comparative genomics that has been used to interpret the information in the human genome. In order for this technique to work, the conserved functional sequences have to stand out as distinct from the nonfunctional sequences which were not conserved. That degree of distinction requires the passage of time - lots of it - in order for mutations and the lack of selection pressures to cause the nonfunctional sequences in the two genomes to drift apart.

For example, mice and humans last shared a common ancestor about 75 million years ago, plenty of time for the nonfunctional sequences in their respective genomes to go their separate ways. Only about five percent of the two genomes is conserved, and it has been shown that most of the genes and regulatory sequences that have been discovered lie within these conserved DNA segments. On the other hand, humans and nonhuman primates shared common ancestors as recently as 6 to 14 million years ago for apes, 25 million years ago for Old World (African) monkeys, and 40 million years ago for New World (South American) monkeys. This is insufficient time for much genetic divergence to have taken place. Consequently, nonhuman primates have been largely ignored in the effort to interpret the human genome.

"Comparative genomics studies between evolutionarily distant species will readily identify regions of the human genome performing basic biological functions shared with most mammals," says Rubin. "However, it will invariably miss recent changes in DNA sequence that account for primate-specific biological traits."

Rubin has likened comparisons between the human and mouse genomes to comparisons between an automobile and a go-cart: "Only the very basic parts and design features are similar." Whereas, he argues, comparing the human genome to that of a chimp or a baboon, is like comparing a sedan to a station wagon: "Nearly all the parts and design features are almost interchangeable."

Until now, however, comparing the human genome to that of a chimp or baboon has been a problem since both genomes are so much alike.

As Boffelli, who works with Rubin at both Berkeley Lab and JGI, explains, "There is only about a 5-percent difference between the human and the baboon genomes. When you run comparisons between the two, all of the sequences look just about the same. We can't distinguish functional from nonfunctional sequences."

Rubin and his colleagues overcame this lack of distinction by comparing segments of the human genome to segments of not one but anywhere from 5 to 15 different genomes of nonhuman primates, including chimpanzees and gorillas, orangutans, baboons, and Old World and New World monkeys. By sequencing specific segments within each of the genomes of the different primates being analyzed, the researchers found enough small differences from genome to genome in the nonhuman primates that could be combined to create a phylogenetic "shadow" which could then be compared to the human genome.
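The idea of pooling many small differences can be illustrated with a short Python sketch. This is a deliberately simplified stand-in, not the published method: it assumes a plain mismatch fraction per aligned column and a fixed sliding window, and it does not model the primate phylogeny. The toy alignment, the function names, and the window and threshold values are all hypothetical.

```python
def column_variation(alignment):
    """Fraction of non-reference sequences that differ from the first
    (reference) sequence at each aligned column."""
    reference, others = alignment[0], alignment[1:]
    return [
        sum(seq[i] != ref_base for seq in others) / len(others)
        for i, ref_base in enumerate(reference)
    ]


def conserved_windows(variation, window, threshold):
    """Start positions of windows whose mean variation is <= threshold."""
    starts = []
    for start in range(len(variation) - window + 1):
        mean = sum(variation[start:start + window]) / window
        if mean <= threshold:
            starts.append(start)
    return starts


# Hypothetical toy alignment: the first row stands in for human, the others
# for nonhuman primates. Real input would be a multiple alignment of a
# genomic segment many kilobases long.
toy_alignment = [
    "ACGTACGTAC",
    "ACGTACGAAC",
    "ACGTTCGTAC",
    "ACGTACCTAT",
]

variation = column_variation(toy_alignment)
print(variation)
print(conserved_windows(variation, window=4, threshold=0.0))
```

Each nonhuman sequence differs from the reference at only one or two columns, but taken together the differences mark most of the right-hand side as variable and leave the first four columns as the only fully conserved window.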

"The additive collective sequence differences or divergence of these nonhuman primates as a group was comparable to that of humans and mice," Rubin says. "This suggests that deep sequence comparisons of numerous primate species should be sufficient to identify significant regions of conservation that encode functional elements shared by all primates including humans."

The phylogenetic shadow that Rubin and his colleagues created was distinct enough for them to see the boundaries between exons (protein-coding DNA sequences) and introns (noncoding DNA sequences) for several genes, in addition to discovering the regulatory elements for a gene named "apo(a)" which is associated with low-density lipoproteins (LDLs) in the bloodstream of humans. An evolutionary newcomer, apo(a) is found in humans, apes, and Old World monkeys but appears to be lacking in nearly all other mammals. Biomedical researchers want to know the regulatory sequences of apo(a) because high blood levels of apo(a) are an important risk predictor for cardiovascular disease. The desire to study apo(a) is the reason Rubin and his research group began the development of their phylogenetic shadowing technique.

"We could not study apo(a) by comparing human DNA sequences to the sequences of evolutionarily distant species as those species don't have apo(a) so we had to find an alternative method," Rubin says.

Rubin's research group at Berkeley Lab has been at the forefront of using transgenic mice and the mouse genome to decipher the human genome and to identify and study important genetic risk factors in the development of human heart disease. He and his group believe that the ability to do comparative genomic studies with nonhuman primates will prove especially beneficial to human medical research. Their data from this study suggest that sequencing the genomes of as few as four to six primate species in addition to humans may be enough to identify many of the conserved functional DNA sequences in the human genome.

"The argument for sequencing a broad variety of evolutionarily distant species, like the mouse and puffer fish, has been that they would be needed for us to gain a good understanding of the human genome," Rubin says. "These evolutionarily distant creatures have been incredibly useful but maybe now we should be focusing our effort on sequencing the genomes of not one but several different nonhuman primates. Their collective sequences will tell us things about the human genome that we will never be able to learn from our more distant relatives in the animal kingdom."

This research was funded by a grant from the National Heart, Lung, and Blood Institute.

Story Source:

Materials provided by Lawrence Berkeley National Laboratory. Note: Content may be edited for style and length.

Analysis of Rhesus Monkey Genome Uncovers Genetic Differences With Humans, Chimps

DNA Comparison Provides New Clues to Primate Biology.

An international consortium of researchers has published the genome sequence of the rhesus macaque monkey and aligned it with the chimpanzee and human genomes. Published April 13 in a special section of the journal Science, the analysis reveals that the three primate species share about 93 percent of their DNA, yet have some significant differences among their genes.

In its paper, the Rhesus Macaque Genome Sequence and Analysis Consortium, supported in part by the National Human Genome Research Institute (NHGRI), one of the National Institutes of Health (NIH), compared the genome sequences of rhesus macaque (Macaca mulatta) with that of human (Homo sapiens) and chimp (Pan troglodytes), the primate most closely related to humans. Four companion papers that relied on the rhesus sequence also appear in the same issue.

The rhesus macaque is the second non-human primate, after the chimp, to have its genome sequenced and is the first of the Old World monkeys to have its DNA deciphered.

"The sequencing of the rhesus macaque genome, combined with the availability of the chimp and human genomes, provides researchers with another powerful tool to advance our understanding of human biology in health and disease," said NHGRI Director Francis S. Collins, M.D., Ph.D. "As we build upon the foundation laid by the Human Genome Project, it has become clear that comparing our genome with the genomes of other organisms is crucial to identifying what makes the human genome unique."

The rhesus, because of its response to the simian immunodeficiency virus (SIV), is widely recognized as the best animal model for human immunodeficiency virus (HIV) infection. The rhesus genome sequence will also serve to enhance essential research in neuroscience, behavioral biology, reproductive physiology, endocrinology and cardiovascular studies. In addition, the rhesus serves as a valuable model for studying other human infectious diseases and for vaccine research.

The sequencing of the rhesus genome was conducted at the Baylor College of Medicine Human Genome Sequencing Center in Houston, the Genome Sequencing Center at Washington University School of Medicine in St. Louis and the J. Craig Venter Institute in Rockville, Md., which are part of the NHGRI-supported Large-Scale Sequencing Research Network. The DNA used in the sequencing was obtained from a female rhesus macaque at the Southwest National Primate Research Center (NPRC) in San Antonio, which is supported by the National Center for Research Resources, part of NIH.

Independent assemblies of the rhesus genome data were carried out at each of the three sequencing centers using different and complementary approaches and then combined into a single "melded assembly." In their analysis, scientists from 35 institutions compared this melded assembly to the reference sequence of the human genome, a newer unpublished draft sequence of the chimp genome, the sequence of more than a dozen other more distant species already in the public databases, the human HapMap, and the Human Gene Mutation Database that lists known human mutations that lead to genetic disease.

"This study of the rhesus genome is invaluable because it gives researchers a perspective to observe what has been added or deleted in each primate genome during evolution of rhesus, chimp, and the human from their common ancestors," said Richard Gibbs, Ph.D., director of Baylor College of Medicine's Human Genome Sequencing Center in Houston and the project leader.

One of the most useful features of the rhesus genome is that it is less closely related to the human genome than the chimp genome is. This means that important features that have been conserved in primates over time can be seen more easily by comparing rhesus to human than by comparing chimp to human.

By adding the rhesus genome to the primate comparison, researchers identified nearly 200 genes likely to be key players in determining differences among primate species. These include genes involved in hair formation, immune response, membrane proteins and sperm-egg fusion. Many of these genes are located in areas of the primate genome that have been subject to duplication, indicating that having an extra copy of a gene may enable it to evolve more rapidly and that small duplications are a key feature of primate evolution.

The analysis also revealed a few instances in which whole families of genes were radically different in the rhesus, containing more copies of certain genes than in the chimp or human. These gene families include important immune related genes, as well as genes with functions not yet fully known.

In addition to comparing the rhesus with the chimp and human genomes, the group also studied genetic variation in macaque populations, and developed a set of "single nucleotide polymorphisms" or SNPs (single base DNA differences) that can be used for future analysis of inheritance of biomedically important traits in rhesus. The rhesus genomic DNA samples used for these studies were contributed by the California NPRC, Oregon NPRC, Southwest NPRC and Yerkes NPRC. This advance in macaque genetics will enhance the use of macaques for the study of genetic diseases of man.

Understanding the similarities and differences among people occupies psychologists, anthropologists, artists, doctors and, of course, many biologists. Even when zooming in on only the genetic differences among people there is a dazzling range of issues to discuss. The day that DNA extracted at a crime scene can lead to a mug shot portrait seems to have already arrived, at least according to a recent publication on modeling 3D facial shape from DNA (P. Claes et al, PLOS Genetics, 10:e1004224, 2014). In the spirit of cell biology by the numbers, can we get some basic intuition from logically analyzing the implications of a few key numbers that pertain to the question of genetic diversity in humans?

We begin by focusing on single base pair differences, or polymorphisms (SNPs). Other components of variation like insertions and deletions, varying number of gene repeats (part of what are known as copy number variations, or CNVs) and transposable elements will be touched upon below. How many single base pair variations would you expect between yourself and a randomly selected person from a street corner? Sequencing efforts such as the 1000 genomes project give us a rule of thumb. They find about one SNP per 1000 bases. This, other components set aside, is the basis for the claim that people are 99.9% genetically similar. But this genetic similarity raises the question: how come we feel so different from that person we run into on the street? Well, keep on reading to learn of other genetic differences, but one should also appreciate how our brains are tuned to notice and amplify differences and dispense with the unifying properties such as all of us having two hands, one nose, a big brain and so forth. To an alien we probably would all look identical, just like you may see two mice and if their fur coat is the same they would seem like clones even if one is the Richard Feynman of his clan and the other the Winston Churchill.

Back to the numbers. Let’s check on the accuracy and implications of the rule of thumb of one SNP per 1000 bases. The human genome is about 3 Gbp long. This suggests about 3 million SNPs between two random people. This is indeed the reported value to within 10%, which is no surprise as this is the origin of the rule of thumb (BNID 110117). What else can we say about this number? With about 20,000 genes each having a coding sequence (exons) about 1.5 kb long (i.e. a protein about 500 amino acids long on average), the human coding sequence covers 30 Mbp or about 1 percent of the genome. If SNPs were randomly distributed along the genome, that would suggest about 30,000 SNPs across the genome coding sequence, or just over 1 per gene coding sequence. The measured value is about 20,000 SNPs, which gives a sense of how wrong we were in our assumption that the SNPs are distributed randomly. So we are statistically wrong, as any statistical test would give an impressively low probability for this lower value to appear by chance. This is probably an indication of stronger purifying selection on coding regions. At the same time, for our practical terms this less than 2 fold variation suggests that this bias is not very strong and that the 1 SNP per gene is a reasonable rule of thumb.
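The back-of-the-envelope estimate above can be reproduced in a few lines of Python. All inputs are the rounded values quoted in the paragraph; the variable names are illustrative only.

```python
# Rounded inputs taken from the paragraph above.
genome_length_bp = 3_000_000_000   # ~3 Gbp
snp_rate = 1 / 1000                # ~1 SNP per 1000 bases between two people
gene_count = 20_000
coding_bp_per_gene = 1_500         # ~1.5 kb, i.e. ~500 amino acids
observed_coding_snps = 20_000      # measured value quoted in the text

# Genome-wide expectation.
expected_snps = genome_length_bp * snp_rate

# Size of the coding sequence and its share of the genome.
coding_bp = gene_count * coding_bp_per_gene
coding_fraction = coding_bp / genome_length_bp

# Expectation in coding sequence if SNPs were spread uniformly.
expected_coding_snps = coding_bp * snp_rate

# Ratio of observed to expected coding SNPs.
observed_over_expected = observed_coding_snps / expected_coding_snps

print(f"expected SNPs genome-wide: {expected_snps:,.0f}")
print(f"coding sequence: {coding_bp / 1e6:.0f} Mbp ({coding_fraction:.0%} of genome)")
print(f"expected coding SNPs: {expected_coding_snps:,.0f}")
print(f"observed / expected in coding sequence: {observed_over_expected:.2f}")
```

The observed-to-expected ratio of about two thirds is the "less than 2 fold" deviation from uniformity that the text attributes to purifying selection.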

How does this distribution of SNPs translate into amino acid changes in proteins? Let’s again assume a homogeneous distribution among amino acid changing mutations (non-synonymous) and those that do not affect the amino acid identity (synonymous). From the genetic code the number of non-synonymous changes when there is no selection or bias of any sort should be about four times that of synonymous mutations (i.e. synonymous mutations are about 20% of the possible mutations, BNID 111167). That is because there are more base substitutions that change an amino acid than ones that keep the amino acid identity the same. What does one find in reality? About 10,000 mutations of each type are actually found (BNID 110117), showing that indeed there is a bias towards under-representation of non-synonymous mutations, but in our order of magnitude world view it is not a major one.
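The claim that the genetic code itself favors non-synonymous changes can be checked by enumeration. The sketch below walks over every single-base substitution in every sense codon of the standard genetic code and classifies the result. It assumes that all codons and all substitutions are equally likely, which real genomes do not satisfy (codon usage and transition/transversion bias both matter), so it is a rough check rather than a reproduction of the BNID figure.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering; '*' marks a stop.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitutions(table):
    """Count single-base substitutions out of sense codons by their effect."""
    counts = {"synonymous": 0, "missense": 0, "nonsense": 0}
    for codon, aa in table.items():
        if aa == "*":
            continue  # only mutate codons that encode an amino acid
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                new_aa = table[mutant]
                if new_aa == aa:
                    counts["synonymous"] += 1
                elif new_aa == "*":
                    counts["nonsense"] += 1
                else:
                    counts["missense"] += 1
    return counts


counts = classify_substitutions(CODON_TABLE)
total = sum(counts.values())
for kind, n in counts.items():
    print(f"{kind}: {n} ({n / total:.1%})")
```

Under these uniform assumptions the synonymous share comes out at roughly a quarter of all substitutions, the same ballpark as the ~20% figure cited above.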

One type of mutation that can be especially important though is the nonsense mutation that creates a stop codon that will terminate translation early. How often might we naively expect to find such mutations given the overall load of SNPs? Three of the 64 codons are stop codons, so we would crudely expect 20,000*3/64 ≈ 1000 early stop mutations. Observations show about 100 such nonsense mutations, indicating a strong selective bias against such mutations. Still, we find it interesting to look at the person next to us and think what 100 proteins in our genomes are differentially truncated. Thanks to the diploid nature of our genomes, there is usually another fully intact copy of the gene (the situation is known as heterozygosity) that can serve as backup.
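The naive stop-codon estimate and its comparison with observation can be written out as a short Python sketch. The inputs are the values quoted in the paragraph, and the names are illustrative.

```python
coding_snps = 20_000        # coding SNPs between two people (from the text)
stop_codons = 3
total_codons = 64
observed_nonsense = 100     # observed nonsense mutations (from the text)

# Crude expectation: a SNP lands on a stop codon in proportion to the share
# of stop codons among all 64 codons.
naive_expected = coding_snps * stop_codons / total_codons

# How strongly nonsense mutations are depleted relative to that expectation.
depletion_fold = naive_expected / observed_nonsense

print(f"naive expectation: {naive_expected:.0f} nonsense mutations")
print(f"observed: {observed_nonsense}")
print(f"depletion: about {depletion_fold:.0f}-fold")
```

The roughly tenfold shortfall is what the text reads as strong selection against premature stop codons.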

How different is your genotype from each of your parents? Assuming they have unrelated genotypes, the values above should be cut in half as you share half of your father's and half of your mother's genome. So still quite a few truncated genes and substituted amino acids. The situation with your brother or sister is quantitatively similar as you again share, on average, half of your genomes (assuming you are not identical twins…). Actually, for about 1/4 of your genome, you and your sibling are like identical twins, i.e. you have the same two parental copies of the DNA. Insertions and deletions (nicknamed indels) of up to about 100 bases are harder to enumerate but an order of magnitude of 1 million per genome is observed, about 3000 of them in coding regions (so an underrepresentation of about half an order of magnitude). Larger variations of longer stretches including copy number variations are in the tens of thousands per genome but because they are such long stretches their summed length might be longer than the number of bases in SNPs.
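The sibling figures (half shared on average, a quarter shared on both copies) follow from enumerating the equally likely inheritance outcomes at a single locus. The Python sketch below does this under simplifying assumptions: unrelated parents, four distinguishable parental copies labeled P1, P2, M1, M2 (hypothetical labels), and independent inheritance by each sibling.

```python
from fractions import Fraction
from itertools import product

# Each child receives one of the father's two copies and one of the mother's.
paternal = ("P1", "P2")
maternal = ("M1", "M2")
genotypes = list(product(paternal, maternal))  # 4 equally likely genotypes

# Count, over all pairs of sibling genotypes, how many parental copies match.
shared_counts = {0: 0, 1: 0, 2: 0}
for sib_a, sib_b in product(genotypes, repeat=2):
    shared = (sib_a[0] == sib_b[0]) + (sib_a[1] == sib_b[1])
    shared_counts[shared] += 1

total = sum(shared_counts.values())
probabilities = {k: Fraction(v, total) for k, v in shared_counts.items()}

# Average fraction of the two copies that the siblings share.
mean_shared_fraction = sum(k * p for k, p in probabilities.items()) / 2

for k, p in probabilities.items():
    print(f"share {k} of 2 copies: {p}")
print(f"mean shared fraction: {mean_shared_fraction}")
```

The case of two shared copies is the "like identical twins" quarter of the genome mentioned above.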

The ability to comprehensively characterize these variations is a very recent scientific achievement, starting only in the third millennium with the memorable race between the human genome project consortium and the group led by Craig Venter. Comparing the genome of Craig Venter to the consensus human genome reference sequence, one finds about a 1.2% difference when indels and CNVs are considered, 0.1% when SNPs are considered, and ≈0.3% when inversions are considered, for a grand total of 1.6% (BNID 110248). In the decade that followed the sequencing of the human genome, technologies were moving forward extremely rapidly, leading to the 1000 Genomes Project that might seem like a rotation project to some of our readers by the time they read these words. Who knows how soon the reader could actually check on our quoted numbers by loading his or her genome from their medical report and comparing it to that of some random friend.
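Because indels and CNVs contribute more differing bases than SNPs do, the reported similarity depends heavily on whether gapped positions are counted. The following Python sketch shows the effect on a toy pairwise alignment. The sequences are hypothetical, and the `percent_identity` helper is an illustration rather than a function from any particular library.

```python
def percent_identity(seq_a, seq_b, count_gaps):
    """Percent identity of two aligned, equal-length sequences.

    With count_gaps=False, columns containing a gap ('-') are skipped, so
    only substitutions lower the score. With count_gaps=True, every gap
    column counts as a difference.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(seq_a, seq_b):
        if "-" in (a, b):
            if count_gaps:
                compared += 1
            continue
        compared += 1
        matches += a == b
    return 100.0 * matches / compared


# Hypothetical toy alignment, not real genomic data:
# one substitution and one 4-base deletion.
aligned_a = "ACGTACGTACGTACGTACGT"
aligned_b = "ACGTACGAAC----GTACGT"

print(f"substitutions only: {percent_identity(aligned_a, aligned_b, False):.2f}%")
print(f"gaps counted:       {percent_identity(aligned_a, aligned_b, True):.2f}%")
```

The same pair of sequences scores noticeably lower once the deleted bases are counted, which is one reason why quoted similarity percentages vary between sources.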

The Institute for Creation Research

In 2003, the human genome was heralded as a near-complete DNA sequence, except for the repetitive regions that could not be resolved due to the limitations of the prevailing DNA sequencing technologies. 1 The chimpanzee genome was subsequently finished in 2005 with the hope that its completion would provide clear-cut DNA similarity evidence for an ape-human common ancestry. 2 This similarity is frequently cited as proof of man's evolutionary origins, but a more objective explanation tells a different story, one that is more complex than evolutionary scientists seem willing to admit.

Genomics and the DNA Revolution

One of the main problems with a comparative evolutionary analysis between human and chimp DNA is that some of the most critical DNA sequence is often omitted from the scope of the analysis. Another problem is that only similar DNA sequences are selected for analysis. As a result, estimates of similarity become biased towards the high side. An inflated level of overall DNA sequence similarity between humans and chimps is then reported to the general public, which obviously supports the case for human evolution. Since most people are not equipped to investigate the details of DNA analysis, the data remains unchallenged.

The supposed fact that human DNA is 98 to 99 percent similar to chimpanzee DNA is actually misleading.

The availability of the chimp genome sequence in 2005 has provided a more realistic comparison. It should be noted that the chimp genome was sequenced to a much less stringent level than the human genome, and when completed it initially consisted of a large set of small un-oriented and random fragments. To assemble these DNA fragments into contiguous sections that represented large regions of chromosomes, the human genome was used as a guide or framework to anchor and orient the chimp sequence. Thus, the evolutionary assumption of a supposed ape to human transition was used to assemble the otherwise random chimp genome.

At this point in time, a completely unbiased whole genome comparison between chimp and human has not been done and certainly should be. Despite this fact, several studies have been performed where targeted regions of the genomes were compared and overall similarity estimates as low as 86 percent were obtained. 3 Once again, keep in mind that these regions were hand-picked because they already showed similarity at some level. The fact remains that there are large blocks of sequence anomalies between chimp and human that are not directly comparable and would actually give a similarity of 0 percent in some regions. In addition, the loss and addition of large DNA sequence blocks are present in humans and gorillas, but not in chimps and vice versa. This is difficult to explain in evolutionary terms since the gorilla is lower on the primate tree than the chimp and supposedly more distant to humans. How could these large blocks of DNA--from an evolutionary perspective--appear first in gorillas, disappear in chimps, and then reappear in humans?

Analyzing the Source of Similarity

So how exactly did scientists come up with the highly-touted 98 to 99 percent similarity estimates?

First, they used only human and chimp DNA sequence fragments that already exhibited a high level of similarity. Sections that didn't line up were tossed out of the mix. Next, they only used the protein coding portions of genes for their comparison. Most of the DNA sequence across the chromosomal region encompassing a gene is not used for protein coding, but rather for gene regulation, like the instructions in a recipe that specify what to do with the raw ingredients. 3 The genetic information that is functional and regulatory is stored in "non-coding regions," which are essential for the proper functioning of all cells, ensuring that the right genes are turned on or off at the right time in concert with other genes. When these regions of the gene are included in a similarity estimate between human and chimp, the values can drop markedly and will vary widely according to the types of genes being compared.

The diagram in Figure 1 illustrates how a gene is typically represented as a portion of a chromosome. As indicated, there is considerably more non-coding sequence ahead of the gene, within it ("introns"), and behind it. The 98 to 99 percent sequence similarity estimates are often derived from the small pieces of coding sequence ("exons"). Other non-coding sequences, including the introns and sequences flanking the gene region, are often omitted in a "gene for gene" comparative analysis. The critical importance of the non-coding sequences in the function of the genome was not well understood until recently, but this does not excuse the bias of the "98 to 99 percent similarity" claim.

Another important factor concerns the potential for variants of the same protein to have different functions that can perform different tasks in different tissues. There is now no doubt that gene or protein sequence similarities, in and of themselves, are not as significant as other functional and regulatory information in the cell. Unfortunately, evolutionary assumptions drove a biased approach of simple sequence comparisons, providing few answers as to why humans and chimps are obviously so different.

Interestingly, current research is confirming that most of what makes humans biologically unique when compared to chimps and other animals is how genes are controlled and regulated in the genome. Several studies within the past few years are demonstrating clear differences in individual gene and gene network expression patterns between humans and chimps in regard to a wide number of traits. 4, 5 Of course, the largest differences are observed in regard to brain function, dexterity, speech, and other traits with strong cognitive components. To make the genetic landscape even more complicated, a number of recent studies are also confirming that close to 93 percent of the genome is transcriptionally active (functional). 6 Not so long ago, scientists thought that only the 3 to 5 percent of the genome that contained the protein coding regions was functional; the rest was considered "junk DNA."

So what is an appropriate response to the assertion that a 99 percent similarity exists between human and chimp DNA, and thus proves common ancestry?

One can simply say that the whole genomes have never really been compared, only hand-selected regions already known to be similar have been examined, and the data is heavily biased. In fact, due to limitations in DNA sequencing technology, researchers do not even have the complete genomic sequence for human or chimp at present. In the sequence that they do have, much more analysis needs to be done.

Here are a number of key points that counter the evolutionary claims of close human-chimp similarity:

  • The chimp genome is 10 to 12 percent larger than the human genome and is not in a near-finished state like the human genome; it is considered a rough draft.
  • When large regions of the two genomes are compared, critical sequence dissimilarities become evident.
  • Extremely large blocks of dissimilarity exist on a number of key chromosomes, including marked structural differences between the entire male (Y) chromosomes.
  • Distinct differences in gene function and regulation are now known to be a more significant factor in determining differences in traits between organisms than the gene sequence alone. Research in this area has clearly demonstrated that this is the case with humans and apes, where marked dissimilarities in expression patterns are evident.

It is clear that the only way to obtain extreme DNA-based similarity between man and chimpanzee is to use comparative analyses that are heavily skewed by an evolutionary bias where one picks and chooses what data or what part of the genome to use. At present, the DNA sequence differences between these genomes clearly indicate a much lower level than 98 to 99 percent. In fact, one evolutionary study suggests it may be as low as 86 percent or less. In addition, the complex functional aspects of genes and their regulatory networks differ markedly between humans and chimps and play a more important role than DNA sequence by itself.

The DNA data, both structural and functional, clearly supports the concept of humans and chimps created as distinct separate kinds. Not only are humans and chimps genetically distinct, but only man has the innate capacity and obligation to worship his Creator. 7

  1. International Human Genome Sequencing Consortium. 2004. Finishing the euchromatic sequence of the human genome. Nature. 431 (7011): 931-945.
  2. The Chimpanzee Sequencing and Analysis Consortium. 2005. Initial sequence of the chimpanzee genome and comparison with the human genome. Nature. 437 (7055): 69-87.
  3. Anzai, T. et al. 2003. Comparative sequencing of human and chimpanzee MHC class I regions unveils insertions/deletions as the major path to genomic divergence. Proceedings of the National Academy of Sciences. 100 (13): 7708-13.
  4. Calarco, J. et al. 2007. Global analysis of alternative splicing differences between humans and chimpanzees. Genes & Development. 21: 2963-2975.
  5. Cáceres, M. et al. 2003. Elevated gene expression levels distinguish human from non-human primate brains. Proceedings of the National Academy of Sciences. 100 (22): 13030-13035.
  6. The ENCODE Project Consortium. 2007. Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature. 447 (7146): 799-816.
  7. Criswell, Daniel. 2006. What Makes Us Human? Acts & Facts. 35 (1).

* Dr. Tomkins is Research Associate at the Institute for Creation Research.

Cite this article: Tomkins, J. 2009. Human-Chimp Similarities: Common Ancestry or Flawed Research? Acts & Facts. 38 (6): 12.

All Similarities Are Not Equal

A high degree of sequence similarity does not equate to proteins having exactly the same function or role. For example, the FOXP2 protein, which has been shown to be involved in language, has only 2 out of about 700 amino acids which are different between chimpanzees and humans.15 This means they are 99.7 percent identical. While this might seem like a trivial difference, consider exactly what those differences are. In the FOXP2 protein, humans have the amino acid asparagine instead of threonine at position 303 and then a serine that is in place of an asparagine at 325. Although apparently a minor alteration, the second change can make a significant difference in the way the protein functions and is regulated.16 Thus, a very high degree of sequence similarity can be irrelevant if the amino acid that is different plays a crucial role. Indeed, many genetic defects are the result of a single change in an amino acid. For example, sickle cell anemia results from a valine replacing glutamic acid in the hemoglobin protein. It does not matter that every other amino acid is exactly the same.

Usually people think that differences in amino acid sequence only alter the three-dimensional shape of a protein. FOXP2 demonstrates how a difference in one amino acid can yield a protein that is regulated differently or has altered functions. Therefore, we should not be too quick to trivialize even very small differences in gene sequences. Further, slight differences in regions that don’t code for proteins can impact how protein levels are regulated. This alteration can change the amount of protein that is produced or when it is produced. In such cases, the high degree of similarity is meaningless because of the significant functional differences that result from altered protein levels.

DNA Comparisons between Humans and Chimps: A Response to the Venema Critique of the RTB Human Origins Model, Part 2

What is the best evidence for human evolution? For many biologists it’s the high degree of genetic similarity between humans and chimpanzees. The similarity in DNA sequences between humans and chimps is often regarded as evidence that we evolved from a common ancestor, making chimps our closest living “relative” in the animal kingdom.

As I discussed last week and elsewhere, it’s possible to understand the DNA similarity between humans and other animals (including chimps) as reflecting the work of a Creator. The biological similarities can be understood as a Creator’s use of the same materials and the same design templates to make humans and chimps. Likewise, some of the genetic difference could be understood as intentionally introduced by the Creator to make each creature unique. (A detailed presentation of the RTB human origins model can be found in our book Who Was Adam?)

As part of our critique of human evolution, we question whether there is as strong a genetic connection between humans and chimps as is commonly communicated to the general public. Most people are familiar with the claim that there is a 99 percent DNA similarity between humans and chimpanzees. Our assertion, however, is that the genetic commonality between these two primates is closer to 90 percent. 1

This claim has prompted one biologist, Dennis Venema of Trinity Western University, to critique our human origins model 2 in an article published on The Biologos Foundation website. Venema complains 1) that we intentionally ignored key scientific papers about genetic comparisons between humans and chimpanzees that went against our model and 2) that we erroneously claim that the genetic similarity between humans and chimpanzees is around 90 percent, not 95 percent (or about 99 percent) as the most recent scientific papers report. 3

How Similar Is Human and Chimp DNA?
As discussed in Who Was Adam?, researchers have performed a number of studies that indicate a 98 to 99 percent genetic similarity between humans and chimps. But as Hugh Ross and I point out, these comparisons were based on relatively limited genetic regions and focused on a single type of genetic difference (called substitutions or single nucleotide polymorphisms, SNPs). Comparisons that encompass larger regions of the genomes and include other types of genetic differences (like indels) show that the DNA similarity between humans and chimpanzees is much less than 98 to 99 percent.

Consider several studies discussed in Who Was Adam?. One study, conducted by Roy J. Britten and mentioned in Venema’s critique, compared five regions of the chimpanzee genome collectively (encompassing about 780,000 nucleotide base pairs) with corresponding regions of the human genome. Britten found a 1.4 percent difference when substitutions were considered, but a 3.4 percent difference when these five regions were examined for indels. Both types of differences combined show a 95 percent genetic similarity—not 99 percent. 4

Another study we referred to used this same approach. These researchers uncovered a more limited genetic similarity when comparing a 1,870,955 base-pair segment of the chimp genome with the corresponding human genome region. When substitutions only were considered, the sequence similarity proved about 98.6 percent. Including indels in the comparison dropped the similarity down to 86.7 percent. 5 Venema notes this study in his critique as well, but dismisses it because it focused on genes that are part of the immune system, making them variable inherently. According to Venema, this property in immune system genes means that the genetic comparison in this study shouldn’t reflect comparisons made with the whole genomes. He makes a good point. However, he also overlooks the larger point we are trying to make: namely, that including indels reduces the genetic similarity between humans and chimps.
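The effect of counting or ignoring indels can be illustrated with a minimal sketch. The function name `percent_identity` and the two toy sequences below are hypothetical and are not taken from any of the cited studies; the sketch also assumes that every gap column counts as one difference, whereas published comparisons may count indels per event or per base in other ways.

```python
def percent_identity(a: str, b: str, count_gaps: bool) -> float:
    """Percent identity of two already-aligned sequences of equal length.

    Columns containing a gap ('-') are either skipped (substitution-only
    comparison) or counted as differences (indels included).
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    columns = 0
    for x, y in zip(a, b):
        is_gap = x == "-" or y == "-"
        if is_gap and not count_gaps:
            continue  # substitution-only comparison ignores indel columns
        columns += 1
        if not is_gap and x == y:
            matches += 1
    if columns == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / columns


# Hypothetical toy alignment, not real human or chimp sequence.
human = "ACGTACGTAC----GTACGTAC"
chimp = "ACGTACGAACGGTTGTACGTAC"

subs_only = percent_identity(human, chimp, count_gaps=False)
with_indels = percent_identity(human, chimp, count_gaps=True)
print(f"substitutions only: {subs_only:.1f}%")
print(f"including indels:   {with_indels:.1f}%")
```

In this toy case a single four-base gap lowers the figure far more than the single substitution does, which is the general pattern the studies above describe.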

We also cite other studies in Who Was Adam? of which Venema makes no mention. These studies affirm our main point: indels appear to account for significant differences between human and chimpanzee genomes. For example, we cite a comparative analysis of 27,000,000 base pairs of human chromosome 21 with the corresponding chimpanzee chromosome (number 22). This analysis identified 57 indels ranging in size from 200 to 800 base pairs, 21 of which were found in regions containing genes. 6 In spring of 2004, The International Chimpanzee Chromosome 22 Consortium confirmed this initial observation when they generated a detailed sequence of chimp chromosome 22 and compared it to human chromosome 21. 7 They discovered a 1.44 percent sequence difference when they lined up the two chromosomes and made a base-by-base comparison. But, they also discovered 68,000 indels in the two sequences with some indels up to 54,000 nucleotides in length.

Who Was Adam? also discusses work that compared a 1,800,000 base-pair region of human chromosome 7 with the corresponding region in the genomes of several vertebrates. Only a third of the differences between humans and chimpanzees involved substitutions. Indels accounted for roughly two-thirds of the sequence differences between these two primates and about one-half of these were greater than 100 base-pairs long. 8

In fall of 2005, The Chimpanzee Sequencing and Analysis Consortium published a rough draft sequence of the chimp genome and performed an extensive comparison with the human genome. Though a rough draft, about 98 percent of the chimp genome was considered of extremely high quality. This work’s results remained consistent with those of the earlier studies. Considering substitutions only produced a 1.23 percent difference between the genomes, which amounted to about 35 million base pairs (genetic letters). But including indels in the comparison uncovered another 1.5 percent difference, corresponding to another 5 million changes. So instead of being 99 percent, the actual genetic similarity is around 97 percent.
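The arithmetic behind these figures can be checked with a short sketch. The function names are hypothetical, and the only inputs are the percentages and base counts quoted in the paragraph above; the second function simply back-calculates the sequence length that a given count and percentage would imply.

```python
def similarity_percent(substitution_pct: float, indel_pct: float) -> float:
    """Similarity left after subtracting both kinds of difference."""
    return 100.0 - (substitution_pct + indel_pct)


def implied_length(differing_bases: float, difference_pct: float) -> float:
    """Sequence length implied by a base count and the percentage it represents."""
    return differing_bases / (difference_pct / 100.0)


# Figures quoted in the text: 1.23% substitutions (about 35 million bases)
# and a further 1.5% attributed to indels.
combined = similarity_percent(1.23, 1.5)
length = implied_length(35e6, 1.23)

print(f"combined similarity: {combined:.2f}%")
print(f"implied sequence length: {length / 1e9:.2f} billion base pairs")
```

With the figures quoted above, this gives a combined similarity of about 97.3 percent.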

This figure, however, overestimates genetic similarity. When performing the comparison, the researchers examined only about 2.4 billion base pairs, which represent around 75 to 80 percent of the genomes. As the authors note:

Best reciprocal nucleotide-level alignments of the chimpanzee and human genomes cover 2.4 gigabases (Gb) of high-quality sequence, including 89 Mb from chromosome X and 7.5 Mb from chromosome Y. 9

The reason for this limited comparison stems from the fact that they struggled to get a significant fraction of the genomes to align, in part, because of differences. The authors of the study described the nature of the difficulties:

On the basis of comparisons with the primary donor, some small supercontigs (most <5 kb) have not been positioned within large supercontigs (1 event per 100 kb); these are not strictly errors but nonetheless affect the utility of the assembly. There are also small, undetected overlaps (all <1 kb) between consecutive contigs (1.2 events per 100 kb) and occasional local misordering of small contigs (0.2 events per 100 kb). No misoriented contigs were found. Comparison with the finished chromosome 21 sequence yielded similar discrepancy rates (see Supplementary Information “Genome sequencing and assembly”). The most problematic regions are those containing recent segmental duplications. Analysis of BAC clones from duplicated (n = 75) and unique (n = 28) regions showed that the former tend to be fragmented into more contigs (1.6-fold) and more supercontigs (3.2-fold). Discrepancies in contig order are also more frequent in duplicated than unique regions (0.1 events per 100 kb). The rate is twofold higher in duplicated regions with the highest sequence identity (> 98%). If we restrict the analysis to older duplications (≤98% identity) we find fewer assembly problems: 72% of those that can be mapped to the human genome are shared as duplications in both species. These results are consistent with the described limitations of current WGS assembly for regions of segmental duplication. 10

Given that the reason for the investigation’s failure to align 0.6 to 0.8 billion base pairs in the two genomes stems from the extensive genetic differences, it is unlikely that these regions display only a 3 percent difference, as is the case for the rest of the genomes. Instead the genetic difference in these regions must be greater. When this greater genetic difference is considered, it is reasonable to conclude that the overall similarity between humans and chimpanzees is less than 97 percent and may well be as low as about 90 percent. In direct response to Venema’s criticisms, this is why we state the genetic similarity between humans and chimpanzees may be as low as 90 percent, not 95.
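The reasoning in this paragraph amounts to a weighted average over the aligned and unaligned fractions of the genomes. The sketch below, with a hypothetical function name, solves for the similarity the unaligned portion would need in order to reach a given overall figure. It assumes that a single similarity percentage can be meaningfully assigned to sequence that does not align, which the text asserts rather than demonstrates.

```python
def required_unaligned_similarity(
    overall: float, aligned_fraction: float, aligned_similarity: float
) -> float:
    """Solve overall = f * aligned + (1 - f) * unaligned for the unaligned term.

    All similarities are percentages; aligned_fraction is between 0 and 1.
    """
    unaligned_fraction = 1.0 - aligned_fraction
    if unaligned_fraction <= 0.0:
        raise ValueError("aligned_fraction must be below 1")
    return (overall - aligned_fraction * aligned_similarity) / unaligned_fraction


# The text gives an aligned fraction of 75 to 80 percent at about 97 percent
# similarity and proposes an overall figure of about 90 percent.
results = {
    fraction: required_unaligned_similarity(90.0, fraction, 97.0)
    for fraction in (0.75, 0.80)
}
for fraction, needed in results.items():
    print(f"aligned fraction {fraction:.2f}: unaligned part would need {needed:.1f}%")
```

Under these assumptions, an overall figure of 90 percent requires the unaligned portion to be roughly 62 to 69 percent similar.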

Earlier work presaged the Chimpanzee Sequencing and Analysis Consortium’s struggle in their attempts to align large regions of the human and chimp genomes. In early 2002, The International Consortium for the Sequencing of Chimpanzee Chromosome 22 reported one of the first studies to make a large scale genome-to-genome comparison. 11 To make this comparison, the Chimpanzee Genome Project team cut the chimp genome into fragments, sequenced them, then compared them to corresponding sequences found in the Human Genome Database. The team found that those chimp DNA fragments able to align with human sequences displayed a 98.77 percent agreement. However, the researchers also found that about 15,000 of the 65,000 chimp DNA fragments did not align with any sequence in the Human Genome Database. These fragments appeared to represent unique genetic regions. Furthermore, during a detailed comparison of the chimp DNA fragments with human chromosome 21, the team discovered that this human chromosome possesses two regions apparently unique to humans.

A few months later, a team from the Max Planck Institute achieved a similar result when they compared over 10,000 regions (encompassing nearly 3,000,000 nucleotide base pairs). Only two-thirds of the sequences from the chimp genome aligned with the sequences in the human genome. As expected in those that did align, a 98.76 percent genetic similarity was measured—yet one-third found no matches. 12

It is interesting that when evolutionary biologists discuss genetic comparisons between human and chimpanzee genomes, the fact that as much as 25 percent of the two genomes won’t align receives no mention. Instead, the focus is only on the portions of the genome that display a high degree of similarity. This distorted emphasis makes the case for the evolutionary connection between humans and chimps seem more compelling than it may actually be.

In many respects this discussion is moot, unless there is a clear understanding as to how the genetic differences between humans and chimpanzees translate into the biological and profound behavioral differences between these two species. In Who Was Adam? and elsewhere we have made the point that these types of genetic comparisons are meaningless. Next week, I will address why and, in doing so, respond to another charge leveled against Hugh Ross and me by Dennis Venema: namely, that this assertion doesn’t reflect, but rather misrepresents, the scientific community’s viewpoint.

Articles About DNA Similarities

Results from this study negate the concept of the 98.5% DNA similarity myth and highlight the extremely flawed and humanized nature of the panTro4 version of the chimpanzee genome.

Venema took Tomkins’ claims to task on the BioLogos website regarding the supposed remnants of an egg-laying gene (vitellogenin) in human DNA.

BioLogos has engaged in systematic scientific error on one of their “evidences” for evolution, and they have misrepresented the arguments for several years.

When you hear stories about the astonishing similarity between human and chimp DNA, there’s something they’re not telling you . . .

Taken together, genomic data for both the alleged fusion and cryptic centromere sites refute the concept of fusion in a human-chimpanzee common ancestor.

The current chimpanzee genome assembly has problems that reduce its veracity as an authentic representation.

Some think acorn worms, virtually unchanged since the Cambrian explosion, represent an evolutionary link between invertebrates and humans.

Interestingly, BioLogos is probably the only evolutionary group that puts such a high level of focus on this hypothesis as key evidence for evolution.

Past evolutionary research in comparative DNA analysis between chimps and humans has employed a great deal of preferential and selective data analysis.

Some assert that humans and chimpanzees are only 1 to 2% different, but careful re-tallying suggests there is a gigantic genetic gap between the two species.

With mice and men, practice makes perfect, but a mouse with a man’s FOXP2 gene achieves perfection faster.

When evidence is interpreted in a particular worldview, it can sound very convincing that the evidence supports that worldview.

DNA similarity could easily be explained as a result of a common Creator.

A major argument supposedly supporting human evolution from a common ancestor with chimpanzees is the “chromosome 2 fusion model.”

The Human Genome Project supposedly disproved the possibility of all humans being descended from one man and woman. But what does the science really show?

Revolutionary DNA sequencing technique said to be “a powerful new tool to fish for genes that have recently evolved.”

Scientists recently analyzed the bonobo ape's genome.

To provide a global set of analyses, large-scale comparative DNA sequence alignments between the chimpanzee and human genomes were performed with the BLASTN algorithm.

The author believes that his use of the Y-chromosome comparison example was misinterpreted and desires to clarify.

Turtles in search of their long lost ancestor discover genes trump holes in the head.

When evaluating comparisons between genomes using DNA sequence, it is important to understand the nature of how that sequence was obtained and bioinformatically manipulated before drawing conclusions.

No one, not even evolutionists, disputes that humans have crossed a threshold that sets us apart from the rest of the animal kingdom—even chimpanzees, our “close evolutionary relatives.” But according to new research, it’s actually the genes we don’t have that sets us apart.

When considered alongside humans and chimps, the orangutan is the genomic “odd man out.” Is that because it hasn’t evolved as quickly?

Evolutionists often emphasize our genetic similarity to chimpanzees, but our genetic connections don’t end there.

DNA resembles a language in many uncanny ways.

According to American Demographics, 113 million Americans have begun to trace their roots.

We’ve all heard that humans and chimps share up to 98% of their DNA. But new studies are accentuating the differences between humans and chimps.

Twice in the past five years, our alleged ancestry with apes has made the cover of Time magazine.

Can a simple “yes or no” answer be adequate for a question about Adam and Eve’s genetic code and today’s human traits?

It seems that if a protein performs a certain function in one organism, then that same protein should perform the same function in a variety of organisms.

There are many anatomical similarities between humans and apes. Our chromosomes are similar as well. But do human chromosomes hint of chimp ancestry?

Scientists publishing in the journal Genetics last week have shown that “[m]any more genes separate humans from chimpanzees than scientists believed.”

The chimp and human genomes have more differences than previously thought.

All humans today can be traced back to the same small group of people.

His name notwithstanding, the current legal case for the personhood of Mr. Matthew Hiasl Pan (a chimp) is in jeopardy, reports the Associated Press from Vienna, Austria.

The inadequacy of similar “genetic potential” in explaining organisms’ similarity is perhaps most notable in comparisons of chimps and humans.

Sponge nerve system genes correlate with human nervous system genes by 25%.

A sensational headline ran across the science media this week: “Chimps More Evolved than Humans.”

The news has been buzzing lately about two recent papers that are reporting the sequencing of up to one million bases of the Neanderthal genome.

I have a soft spot for Twycross Zoo. It is a favourite with my children and me, since it is only a 30-minute drive from our house in the west of Leicestershire.

While there is much similarity in DNA sequences and gene expression among them, there are also important differences. In the case examined, as in other cases, the differences make the difference.

In the current controversies about teaching about the origin of life in public schools, there is a general misunderstanding of the differences between “origin science” and “operation science.”

Last week, in a special issue of Nature devoted to chimpanzees, researchers report the initial sequence of the chimpanzee genome.

It is conventionally held that humans and chimps differ only very slightly in their DNA. However, new evidence suggests that the difference might be much more drastic.

'Do you realise our DNA is 98.5% identical?’ These are the words in an advertisement for the first-class stamp in a new series called ‘The secret of life,’ released by Royal Mail (UK).

A new report in the Proceedings of the National Academy of Sciences suggests that the common value of >98% similarity of DNA between chimp and humans is incorrect.

The claim that pseudogenes and their respective variations are shared between primates in a nested hierarchy, and can only be explained through common evolutionary descent, is found wanting.

Comparison of bonobo anatomy to humans offers evolutionary clues

Percentage of muscle distribution to upper and lower limbs in Pongo pygmaeus, Gorilla gorilla, P. paniscus, and H. sapiens. Credit: (c) Adrienne L. Zihlman, PNAS, doi: 10.1073/pnas.1505071112

A pair of anthropology researchers, one with the University of California, the other with Modesto College, have found what they believe are clues to human evolutionary development by conducting a long-term study of bonobo anatomy. In their paper published in Proceedings of the National Academy of Sciences, Adrienne Zihlman and Debra Bolter describe their anatomy studies and their ideas on why what they found offers new clues to why humans developed in the ways we did.

Scientists looking to understand how humans evolved have studied a lot of fossils, but such samples are of bones, which means there is little to no evidence of what organs, muscle, or fat looked like in our ancestors. Questions therefore remain about what proportion of the body was fat or muscle, where it was located, and what the organs were like. In this new study, the research pair sought to uncover clues by studying bonobos, apes that look a lot like chimpanzees and are considered to be our closest relative.

To learn more about bonobo anatomy, the researchers performed autopsies on thirteen of the apes that had died naturally over the course of three decades, carefully jotting down seldom noted information such as fat and muscle percentages. In so doing, they came to see that bonobos have considerably less fat on their bodies than do humans, even those that lived a similar sedentary life due to living in captivity. They also found that the apes had more upper body mass than humans as a rule and less leg muscle—bonobos also have a lot more skin.

In analyzing their results, the researchers suggest that the differences likely came about as early human ancestors began walking around upright, causing the need for more leg muscle and more fat—because a nomadic lifestyle would necessitate a fat store to prevent starvation during lean times, especially for females if they were to successfully bear offspring. They also believe that we humans have less skin because as we moved around and moved faster on two legs—our skin developed an ability to sweat as a means to keep cool and that led to thinner skin.

The human body has been shaped by natural selection during the past 4–5 million years. Fossils preserve bones and teeth but lack muscle, skin, fat, and organs. To understand the evolution of the human form, information about both soft and hard tissues of our ancestors is needed. Our closest living relatives of the genus Pan provide the best comparative model to those ancestors. Here, we present data on the body composition of 13 bonobos (Pan paniscus) measured during anatomical dissections and compare the data with Homo sapiens. These comparative data suggest that both females and males (i) increased body fat, (ii) decreased relative muscle mass, (iii) redistributed muscle mass to lower limbs, and (iv) decreased relative mass of skin during human evolution. Comparison of soft tissues between Pan and Homo provides new insights into the function and evolution of body composition.

What is Animal Skeleton

Animal skeleton is the structural framework of animals. Based on the structure, three types of skeletons occur in animals: endoskeletons, exoskeletons, and hydrostatic skeletons.


Endoskeleton

Endoskeleton is the internal skeleton made up of bones and cartilage. It occurs inside the body of vertebrates, including mammals (humans among them), birds, reptiles, amphibians, and fish. Moreover, it develops from the mesoderm and is a living structure. It grows as the body grows, and a single skeleton is maintained throughout the lifetime of the animal.

Figure 3: Endoskeleton

The two main parts of the endoskeleton are the axial skeleton and the appendicular skeleton. The axial skeleton consists of the skull, backbone, and rib cage. The main function of the skull is to protect the brain. The backbone protects the spinal cord. The appendicular skeleton provides support to the appendages while protecting internal organs.

Though the main function of the endoskeleton is to provide structural support and aid movement, it is also involved in the protection of internal body organs. In addition, it produces blood cells in a process called hematopoiesis. Also, the bone matrix serves as a storage compartment for calcium, iron, ferritin, and phosphate. In addition, bone cells perform an endocrine function by secreting hormones like osteocalcin, which regulates blood sugar levels and fat deposition.


Exoskeleton

Exoskeleton is the external skeleton of arthropods; it is made up of chitin. It occurs in diplopods, chilopods, arachnids, crustaceans, and insects. The main characteristic feature of the exoskeleton is molting. Arthropods have to shed their skeleton since it lies outside the body and restricts the growth of the body. Therefore, they develop several exoskeletons during their lifetime. In addition, mollusks have an exoskeleton made up of calcium compounds. However, they do not shed their skeleton.

Figure 4: Exoskeleton

Hydrostatic Skeleton

Hydrostatic skeleton is a fluid-filled compartment inside the body called the coelom. Here, hydrostatic pressure is the main factor that provides structural support. It also supports the internal organs. This type of skeleton is found in soft-bodied invertebrates such as earthworms, sea anemones, and other cnidarians.

Figure 5: Hydrostatic Skeleton

Bioinformatics Exam #1

(3) Protein sequences can be used to find a common ancestor from over 1 billion years ago, whereas DNA sequences can only go back about 600 million years.

BLOSUM62 & PAM120: Go to alignments

At some point two homologous proteins are too divergent for the alignment to be recognized as significant.

For PAM matrices, there is something called the Twilight Zone. After

The goal of Needleman and Wunsch is to identify an optimal alignment. For sequences of lengths m and n, you create a new matrix of size (m+1) by (n+1), because you will be assigning each pair a score. Gap penalties (-2 for each gap position) are placed along the first row and column. This allows a terminal gap of any length to be introduced.
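The matrix setup described above can be sketched as a small global alignment routine. The gap penalty of -2 comes from the notes; the match score of +1 and mismatch score of -1 are assumptions chosen only for illustration, and the function name is hypothetical.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Global alignment; returns (score, aligned_a, aligned_b)."""
    m, n = len(a), len(b)
    # (m+1) by (n+1) matrix; row 0 and column 0 hold the terminal gap penalties.
    score = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        score[i][0] = i * gap
    for j in range(1, n + 1):
        score[0][j] = j * gap

    def pair(i: int, j: int) -> int:
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill: each cell takes the best of diagonal (pair), up (gap in b), left (gap in a).
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair(i, j),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Traceback from the bottom-right corner to recover one optimal alignment.
    i, j = m, n
    top, bottom = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return score[m][n], "".join(reversed(top)), "".join(reversed(bottom))


total, row_a, row_b = needleman_wunsch("ACGT", "AGT")
print(total)
print(row_a)
print(row_b)
```

When several alignments tie for the best score, this traceback prefers the diagonal move, so it reports only one of them.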

One main difference is that a score cannot be negative: any cell whose score would be negative gets a score of zero instead. Scoring: +1 for a match, -0.33 for a mismatch, and -1.3 for a gap of length 1 (the larger the gap, the harsher the penalty).
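The zero floor and the scoring values above match the local alignment scheme usually attributed to Smith and Waterman; reading the note that way is an assumption, since the notes do not name the method. The sketch below uses +1 for a match and -1/3 for a mismatch, and it assumes a gap penalty of 1 + k/3 for a gap of length k, which gives about 1.33 for a single gap and grows with length. The function names are hypothetical, and only the best score and its end position are returned.

```python
def gap_penalty(k: int) -> float:
    # Assumed form: 1 + k/3, so a gap of length 1 costs about 1.33.
    return 1.0 + k / 3.0


def smith_waterman_score(a: str, b: str, match: float = 1.0, mismatch: float = -1.0 / 3.0):
    """Best local alignment score and the (i, j) cell where it ends (1-based)."""
    m, n = len(a), len(b)
    h = [[0.0] * (n + 1) for _ in range(m + 1)]
    best, best_pos = 0.0, (0, 0)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            diag = h[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Try every possible gap length ending at this cell.
            up = max(h[i - k][j] - gap_penalty(k) for k in range(1, i + 1))
            left = max(h[i][j - k] - gap_penalty(k) for k in range(1, j + 1))
            # Scores are floored at zero, so a poor region never drags down a later match.
            h[i][j] = max(0.0, diag, up, left)
            if h[i][j] > best:
                best, best_pos = h[i][j], (i, j)
    return best, best_pos


best_score, end_cell = smith_waterman_score("TTTACGTTT", "GGACGGG")
print(best_score, end_cell)
```

Checking every gap length in each cell keeps the code short but makes it slower than the affine-gap formulations used in practice.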

BLASTN: compares DNA to DNA (nucleotides to nucleotides)

BLASTX: translates DNA into six protein sequences using all six possible reading frames, and then compares each of these proteins to a protein database.

TBLASTN: translates every DNA sequence in a database into six potential proteins, and then compares your protein query against each of those translated proteins.
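The six reading frames mentioned for BLASTX and TBLASTN can be illustrated with a minimal sketch: three offsets on the given strand and three on its reverse complement. This is not how the BLAST programs are implemented; the function names are hypothetical, the codon table is the standard genetic code, and incomplete trailing codons are simply dropped.

```python
BASES = "TCAG"
# Standard genetic code, ordered by first, second, and third base over T, C, A, G.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def reverse_complement(dna: str) -> str:
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate_dna(dna: str) -> str:
    """Translate complete codons; unknown codons become 'X', stops become '*'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna: str) -> dict:
    """Map frame labels (+1, +2, +3, -1, -2, -3) to translated sequences."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate_dna(seq[offset:])
    return frames


frames = six_frame_translations("ATGGCCTAA")
for label, protein in frames.items():
    print(label, protein)
```

Each of the six translations would then be compared against protein sequences, which is why these searches can detect similarity that a nucleotide-to-nucleotide comparison misses.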
