Is there a specific term to describe variants of existing genes in bacteria?

I have a question regarding a specific term that describes variants of existing genes. I am analyzing whole-genome sequencing data from a bacterial isolate. I found a large number of genes that have only partial sequence identity or subject/query coverage relative to known reference genes.

I know that sequence identity and coverage alone may not be enough to analyze these genes. Some genes can share only 30% sequence identity with the original gene and still have a product that folds the same way. However, let's say these genes are truly variants of existing genes based on sequence identity. What is the term for them?

My PI told me they are rapidly evolving genes. I searched the literature and found that rapidly evolving genes are genes subject to positive selection. While I agree that some of these are probably rapidly evolving genes under positive selection, I cannot see how all of them would be undergoing selection at the same time. I am wondering if there is another term for these gene variants. A paper that defines them would be great.



From wikipedia > allele

An allele is a variant form of a given gene

I somewhat disagree with this definition in the sense that it is not general enough. An allele is a variant at any locus (locus = position in the genome). You can talk about alleles at non-coding sequences. Also, if you are talking about a coding sequence, you can define different sequences as belonging to different alleles whether or not their protein products differ in function. The proteins might even be exactly the same for two different alleles if the only difference between these alleles is a synonymous mutation (or a mutation in an intron).

You can feel free to define different sequences as being different alleles by any kind of arbitrary threshold measure. You will just need to state it. For example, you can define different sequences as different alleles only if they have at least 5% pairwise differences, or if you find by some biophysical computation that they ought to fold differently, or by any other arbitrary measure.
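As a rough illustration of the threshold idea, here is a minimal Python sketch. It assumes the two sequences are already aligned to the same length, and the function names, the example sequences and the 5% default are all made up for illustration, not taken from any standard tool.

```python
def pairwise_difference(seq_a: str, seq_b: str) -> float:
    """Fraction of aligned positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return mismatches / len(seq_a)


def are_different_alleles(seq_a: str, seq_b: str, threshold: float = 0.05) -> bool:
    """Arbitrary rule: call two sequences different alleles if they differ
    at a fraction of positions greater than or equal to `threshold`."""
    return pairwise_difference(seq_a, seq_b) >= threshold


# Made-up 20 bp sequences, for illustration only.
reference = "ATGGCTAAAGGTCTGACCGA"
variant = "ATGGCTAAAGGTCTGTCCGT"  # differs from the reference at 2 of 20 positions

print(pairwise_difference(reference, variant))
print(are_different_alleles(reference, variant))
print(are_different_alleles(reference, reference))
```

Whatever rule you pick, the point is only that it must be stated explicitly, since a different threshold would partition the same sequences into a different set of alleles.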

The term allele is used very often in evolutionary biology and you'll find its usage in any intro course to evolutionary biology.

Rapidly evolving genes

As far as I know, there is no commonly agreed-upon definition of what a rapidly evolving gene is. To me, a rapidly evolving gene is simply a coding sequence which evolved more rapidly than other reference sequences. Although this would very likely involve positive selection, I don't think it is a requisite of the definition of rapidly evolving genes. It could, for example, be caused by a very high mutation rate at this particular sequence or by balancing selection causing any new rare alleles to be beneficial.

Horizontal gene transfer

Horizontal gene transfer, also called lateral gene transfer, is the transmission of DNA (deoxyribonucleic acid) between different genomes. Horizontal gene transfer is known to occur between different species, such as between prokaryotes (organisms whose cells lack a defined nucleus) and eukaryotes (organisms whose cells contain a defined nucleus), and between the three DNA-containing organelles of eukaryotes: the nucleus, the mitochondrion, and the chloroplast. Acquisition of DNA through horizontal gene transfer is distinguished from the transmission of genetic material from parents to offspring during reproduction, which is known as vertical gene transfer.

Horizontal gene transfer is made possible in large part by the existence of mobile genetic elements, such as plasmids (extrachromosomal genetic material), transposons (“jumping genes”), and bacteria-infecting viruses (bacteriophages). These elements are transferred between organisms through different mechanisms, which in prokaryotes include transformation, conjugation, and transduction. In transformation, prokaryotes take up free fragments of DNA, often in the form of plasmids, found in their environment. In conjugation, genetic material is exchanged during a temporary union between two cells, which may entail the transfer of a plasmid or transposon. In transduction, DNA is transmitted from one cell to another via a bacteriophage.

In horizontal gene transfer, newly acquired DNA is incorporated into the genome of the recipient through either recombination or insertion. Recombination essentially is the regrouping of genes, such that native and foreign (new) DNA segments that are homologous are edited and combined. Insertion occurs when the foreign DNA introduced into a cell shares no homology with existing DNA. In this case, the new genetic material is embedded between existing genes in the recipient’s genome.

Compared with prokaryotes, the process of horizontal gene transfer in eukaryotes is much more complex, mainly because acquired DNA must pass through both the outer cell membrane and the nuclear membrane to reach the eukaryote’s genome. Subcellular sorting and signaling pathways play a central role in the transport of DNA to the genome.

Prokaryotes can exchange DNA with eukaryotes, although the mechanisms behind this process are not well understood. Suspected mechanisms include conjugation and endocytosis, such as when a eukaryotic cell engulfs a prokaryotic cell and gathers it into a special membrane-bound vesicle for degradation. It is thought that in rare instances in endocytosis, genes escape from prokaryotes during degradation and are subsequently incorporated into the eukaryote’s genome.

Horizontal gene transfer plays an important role in adaptation and evolution in both prokaryotes and eukaryotes. For example, the transfer of a gene encoding a unique metabolic enzyme from a species of Pasteurella bacteria to the protozoan parasite Trichomonas vaginalis is suspected to have facilitated the latter organism’s adaptation to its animal hosts. Likewise, the exchange of a gene from a human cell to the bacterium Neisseria gonorrhoeae—a transfer that appears to have occurred relatively recently in the bacterium’s evolution—may have enabled the organism to adapt and survive in humans. Scientists have proposed too that the recent evolution of the methylaspartate pathway of metabolism in the halophilic (salt-loving) archaean Haloarcula marismortui originated with the organism’s acquisition of a specialized set of genes via horizontal transfer.


The accumulation of biomedical knowledge is growing exponentially. There has been tremendous effort to structure research findings as annotations on biological entities (e.g., genes, genetic variants, and pathways). However, these annotations are fragmented among many resources that range greatly in terms of size, funding, and visibility (see, e.g., Ensembl [1], UniProt [2], PROSITE [3], and Reactome [4]). Tools for knowledge integration enable more efficient analysis of genome-scale data sets and discovery of relationships between biological entities.

Bioinformaticians facing data integration problems generally pursue one of two strategies: data warehousing or data federation. Data warehousing involves downloading flat files from various sources, writing parsers to process the files, and then loading the parsed data into a local database. This strategy has the advantage of very high performance, but it also requires significant initial effort to write the parsers and ongoing effort to keep the resource up to date. On the other hand, data federation works by accessing remote data resources through web services. Federated data solutions are always up to date, but extra care is required to maintain the links, and large queries may take a long time to return due to server and network limitations. Moreover, the dependability of federated solutions is entirely dependent on the stability of the remote resources.

Results and Discussion

To find out how many members of the SDR family are present in E. coli K-12 MG1655, henceforth E. coli, we assembled enzymes identified with an EC number 1.1.1.x. Among these are enzymes with the structural and sequence characteristics of the SDR superfamily. Initially we used the AllAllDb program of the Darwin system [14] (after first separating independent, fused proteins into their components) to collect all sequence-related E. coli enzymes from this group. Parameters of the initial pair-wise similarity search were set as requiring a Pam value of no more than 200, an alignment of at least 83 residues and an involvement of at least 50% of the length of the smaller protein of any sequence-similar pair. Related enzymes were assembled by transitive relationship. To extend membership in the groups to include proteins whose sequence may have diverged further, we submitted all members to PSI-BLAST analysis [15].
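The assembly of groups "by transitive relationship" can be pictured as single-linkage clustering over the pairs that pass the similarity criteria. The following Python sketch is only an illustration of that idea, not the Darwin AllAllDb implementation. It reads the Pam criterion as an upper bound of 200, consistent with the later descriptions of the method, and the protein names, lengths and hit values are hypothetical.

```python
from collections import defaultdict

MAX_PAM = 200        # upper bound on Pam distance (assumed reading of the criterion)
MIN_ALIGNED = 83     # minimum alignment length in residues
MIN_FRACTION = 0.5   # alignment must cover this fraction of the shorter protein


def is_related(pam, aligned_len, len_a, len_b):
    """Return True if a pairwise hit passes all three criteria."""
    return (
        pam <= MAX_PAM
        and aligned_len >= MIN_ALIGNED
        and aligned_len >= MIN_FRACTION * min(len_a, len_b)
    )


def group_transitively(lengths, hits):
    """Group proteins so that A~B and B~C places A, B and C in one family."""
    parent = {name: name for name in lengths}

    def find(x):
        # Follow parent links to the group representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, pam, aligned_len in hits:
        if is_related(pam, aligned_len, lengths[a], lengths[b]):
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for name in lengths:
        groups[find(name)].add(name)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


# Hypothetical proteins and pairwise hits: (protein_a, protein_b, pam, aligned_length).
lengths = {"P1": 250, "P2": 240, "P3": 230, "P4": 400, "P5": 100}
hits = [
    ("P1", "P2", 150, 200),  # passes
    ("P2", "P3", 180, 120),  # passes, so P1, P2, P3 form one family
    ("P3", "P4", 250, 300),  # fails: Pam distance too large
    ("P4", "P5", 100, 60),   # fails: alignment shorter than 83 residues
]

print(group_transitively(lengths, hits))
```

Note that under transitive grouping P1 and P3 end up in the same family even though no direct P1/P3 hit is listed, which is why a large family can contain members that are only distantly related to each other.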

E. coli has 15 members of the SDR family whose substrates and reactions are known (Table 1). We found that the entire superfamily could be subdivided based on sequence similarity into two separate groups. One of these groups contained all the dehydrogenase/reductases, the other all the epimerase/dehydratases. Although the reactions of the second group are not oxidative, the apparent anomaly is explained by their reaction mechanisms. For SDR enzymes, reactions of epimerization, dehydration or isomerization are promoted with an oxidation-reduction type of chemistry that promotes both loss and gain of a proton so as to change the placement of the moieties of the substrate or to promote dehydration. Both types of reactions are facilitated by a Ser-Tyr-Lys catalytic triad whose spatial configuration and charge distribution is affected by the binding of each substrate [16].

Examination of the sequence alignments of the E. coli SDR enzymes revealed four regions that aligned for all members of the extended family: the substrate binding site, the NAD(P)/H-binding Rossmann fold and two sites of unknown function, likely to be important for folding (Fig. 2). Each of the conserved sequences occurs in approximately the same region within each protein. Small changes in the residues in conserved regions have large effects on the affinity for particular substrates and on the specific reaction that is catalyzed.

Alignment of E. coli SDR family members. The enzymes of the family members are listed in Table 1. Four conserved regions of the proteins are shown. The protein sequences were aligned with ClustalW 2.0.11. Identical residues are highlighted in dark grey while conserved and semi-conserved residues are highlighted in light grey.

Table 1 shows the separation into two types of SDR enzymes and the variety of pathways and resulting phenotypes served by the SDR superfamily. Some pathways are used by many organisms, such as fatty acid synthesis, but many products and processes are characteristic of the enteric organisms only, such as bile acid emulsification and the biosynthesis of colanic acid, lipid A, enterobactin and enterobacterial common antigen. It appears that the process of duplication and divergence has contributed to the metabolic characteristics of a unique phylogenetic group of bacteria.

One can ask how broad the phenomenon of families is among E. coli enzymes. Even before the sequence of the E. coli genome was completed, the existence of families of related sequence within its genome was observed [17, 18]. Such sequence-related families are viewed as paralogous families that arose by duplication of genes within the genome of the organism itself or in that of an ancestor, although as previously mentioned some members of these families could have been introduced by lateral gene transfer. After completion of the full genomic sequence of E. coli [19], the complete set of paralogous families in relation to the whole genome could be determined. Pair-wise related sequences from the entire genome were assembled, using as criteria of similarity Pam values below 200 and alignments of at least 83 residues. By requiring an alignment of 83 amino acids or more we seek to avoid grouping sequences by small common domains or motifs, such as DNA-binding domains; instead we detect protein-level duplications. For example, in the RbsR/RbsD case, the 45 amino acid DNA-binding domain (PF00356) is present in 14 additional E. coli transcriptional regulators. Since the main components of these proteins, the ligand-binding domains, are not related to RbsR, we do not consider them paralogs. Our groups ranged in size from 92 members in the largest group down to simple pairs in the smallest. Over half of the E. coli proteins resided in these sequence-related groups [20–22].

The existence of families of sequence-similar proteins making up a large fraction of the genomic content supports the proposal that duplication followed by divergence is an important mechanism of molecular evolution. The largest groups in the E. coli genome were those of related transport proteins, regulatory proteins, and redox (i.e. iron-sulfur) subunits of enzyme complexes. Groups of sequence-similar enzymes were smaller (had fewer members) than the groups of transporters and regulators. However, we concentrated on the class of enzymes because studying families of enzymes has the advantage of being able to draw on the detailed knowledge in the extensive biochemical literature concerning their properties, prosthetic groups, the mechanisms of the reactions they catalyze and the pathways they belong to. One is in a position to link genetic information with biochemical information and thus with phenotypes of the organism. Examining the members of enzyme families of E. coli allowed a view at the molecular level of what kind of creation of function occurred as a consequence of presumed duplication and divergence.

Another superfamily that is structurally and mechanistically related but catalyzes diverse reactions is the crotonase family. The family was originally characterized by similarities in the three-dimensional structure of four enzymes derived from different sources. Although structurally related, sequence related and mechanistically related, their biochemistry showed that they catalyzed four different reactions [23]. Subsequent investigation has shown that the crotonase enzymes are related in sequence, though often distantly, and catalyze a broad range of reactions, i.e. dehalogenation, hydration/dehydration, decarboxylation, formation/cleavage of carbon-carbon bonds and hydrolysis of thioesters [24].

To look at crotonases in an evolutionary context, one can ask if they could have arisen by duplication and divergence. To approach this question, one could enumerate all crotonases in one organism. Starting with a crotonase in E. coli, encoded in the N-terminal portion of FadB (here designated FadB_1) with demonstrable structural similarity at the active site to the rat liver crotonase, we assembled the group of sequence-similar enzymes in E. coli as before by the Darwin AllAllDb program. Figure 3 presents the alignment of residues at the active site for the E. coli crotonase family. The greatest amino acid conservation is seen for the residues involved in acyl-CoA binding and the catalytic site. There is a CoA-binding site and an expandable acyl-binding pocket as well as an oxyanion hole for binding the thioester C=O bond, crucial to the reaction catalyzed by members of this superfamily [23, 25]. Variations in residues at critical positions in the active sites dictate which of the related reactions occurs. Again, as for the SDR family, one can visualize that the broad family of crotonases, spanning several kinds of reactions, could have arisen by gene duplication and divergence early in evolutionary time.

Alignment of E. coli crotonase family members. Protein family membership was determined as proteins having sequence similarity of 200 Pam units or less over at least 50% of their length. Members of the E. coli crotonase family are listed in Table 3. The protein sequences were aligned with ClustalW 2.0.11. Identical residues are highlighted in dark grey while conserved and semi-conserved residues are highlighted in light grey. Residues forming the FadB oxyanion hole used to stabilize reaction intermediates are shown in bold-face. The FadB reaction center is outlined.

By assembling the crotonase family members in a few organisms, one expects that some individual enzymes will be present in all the organisms as they are virtually universal. However other members of the crotonase family are expected to differ from one organism to another. We expect that bacteria in separate lineages would have some enzymes that catalyze different reactions. Differentiation of bacteria as they evolved along different lineages is expected to be partly as a consequence of generating different enzyme family members in the course of the divergence process. Other molecular evolution events are occurring at the same time as the duplication and divergence, such as lateral transfers and gene loss. To focus on gene duplication we decided to look at families of enzymes in a set of both similar and distant bacteria.

We asked whether members of three enzyme families are the same in the bacteria examined or whether there are differences dictated by separate evolutionary histories and separate selective pressures. Three enzyme families were compared in four bacteria. The families chosen for comparison were the crotonases, pyridoxal phosphate-requiring aminotransferases Class III, and thiamin diphosphate-requiring decarboxylases. The four bacteria are E. coli, Salmonella enterica subsp. enterica serovar Typhimurium LT2 (henceforth S. enterica), the distant γ-proteobacterium Pseudomonas aeruginosa PAO1 and the Gram-positive bacterium Bacillus subtilis subsp. subtilis strain 168 (henceforth B. subtilis).

The families of enzymes were assembled for the other three organisms using the same methods as for E. coli. Tables 2, 3, and 4 list members of the aminotransferase, decarboxylase, and crotonase superfamilies, respectively. Known enzymes and strongly predicted enzymes present in each of the four bacteria are shown as well as the number of proteins currently of unknown function.

We note that some of the enzymes are present in all four bacteria, suggesting they are integral parts of core metabolic functions. This is supported by the pathways they participate in: biotin synthesis and porphyrin synthesis (BioA and HemL), aminobutyrate utilization (GabT), pyruvate oxidation (PoxB/YdaP), and fatty acid oxidation (FadB). One supposes such commonly held important functions are conserved in many bacteria in many taxa.

Other enzymes differ in their distribution (presence or absence) among the four organisms. This is presumably a result of different evolutionary histories in different lineages during the divergence processes, leading to establishment of bacterial taxa with biochemical and metabolic differences. For example, the MenD decarboxylase and MenB crotonase used for menaquinone biosynthesis are absent from P. aeruginosa and present in the other three organisms. This distribution reflects the fact that the Pseudomonads use only ubiquinone, and not both ubiquinone and menaquinone, as electron carriers for respiration. Gcl, tartronate-semialdehyde synthase of glyoxylate utilization, is present in three bacteria, and not in B. subtilis. Degradation of glyoxylate in B. subtilis has been shown to occur by a different pathway from the other three organisms. In the two enteric organisms, their particular paths of metabolizing putrescine and carnitine are reflected in the presence of putrescine aminotransferase (PatA) and carnityl-CoA dehydratase (CaiD) in both E. coli and S. enterica.
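The presence/absence reasoning above can be made concrete with a small lookup table. The Python sketch below is only an illustration: it encodes a handful of the enzymes whose distribution is stated explicitly in the text, it is not a reproduction of Tables 2 to 4, and the `classify` helper is a hypothetical name.

```python
ORGANISMS = ("E. coli", "S. enterica", "P. aeruginosa", "B. subtilis")

# Distribution of a few enzymes as described in the text (not the full tables).
presence = {
    "BioA": set(ORGANISMS),
    "GabT": set(ORGANISMS),
    "FadB": set(ORGANISMS),
    "MenD": {"E. coli", "S. enterica", "B. subtilis"},
    "MenB": {"E. coli", "S. enterica", "B. subtilis"},
    "Gcl": {"E. coli", "S. enterica", "P. aeruginosa"},
}


def classify(presence, organisms):
    """Split enzymes into those found in every organism ("core") and those
    missing from at least one, recording where each one is absent."""
    everyone = set(organisms)
    core = sorted(name for name, orgs in presence.items() if orgs == everyone)
    variable = {
        name: sorted(everyone - orgs)
        for name, orgs in presence.items()
        if orgs != everyone
    }
    return core, variable


core, variable = classify(presence, ORGANISMS)
print("Present in all four:", core)
for name, missing in sorted(variable.items()):
    print(f"{name} is absent from: {', '.join(missing)}")
```

A table of this shape, filled in for whole families, is what allows the comparison of enzyme complements between closely and distantly related organisms.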

Several of the aminotransferases are involved in arginine metabolism, and the occurrences of these enzymes also vary among the organisms. E. coli and its close relative S. enterica both have ArgD and AstC for biosynthesis and degradation of arginine, respectively. AruC is used by P. aeruginosa for both arginine synthesis and degradation. In B. subtilis, ArgD is used for arginine synthesis, while RocD, another member of the aminotransferase family, is used to degrade arginine by a different pathway. We observe that the two more closely related enteric organisms have a higher similarity in their aminotransferase content.

Some of the protein family members represent isozymes, sequence-similar enzymes that catalyze the same reaction but with definable differences such as substrate breadth, feedback inhibition, binding constants, reaction rates and the like. Based on the common nature of the isozymes, we suppose they have arisen by gene duplication and slight divergence. Examples of isozymes are the trio of acetolactate synthases IlvB, IlvI and IlvG, found in E. coli and S. enterica. These isozymes function in the isoleucine and valine biosynthesis pathway, each responding to distinct feedback. One copy, IlvG, is mutated and inactive in E. coli, rendering E. coli valine sensitive. This phenotype is used in identification protocols to distinguish E. coli and S. enterica. A second type of acetolactate synthase (AlsS) is also present in B. subtilis, but this enzyme is used exclusively for catabolism and not synthesis of isoleucine and valine.

E. coli and S. enterica have another set of isozymes, FadB and FadJ. Both enzymes are used for fatty acid oxidation, but FadB is used under aerobic conditions and FadJ is used under anaerobic conditions. Other isozymes are GabT and PuuE in E. coli, GsaB and HemL in B. subtilis. Isozymes are often specific to pathways, such as PuuE, which is specific to putrescine utilization. One supposes that simply by small changes in duplicate genes, pathway content and biochemical capability of an organism can expand.

In addition there are protein family members that are unique to only one of the four organisms and absent in the other three. These enzymes often confer metabolic properties unique to their host. An example is oxalyl-CoA decarboxylase (Oxc), which is present in E. coli, where it is believed to confer oxalate-degrading capabilities. As is the case for any of the enzymes present in one organism and not the others, the gene could have been acquired by lateral transmission [26]. However, when an enzyme like oxalyl-CoA decarboxylase is found in many bacteria, it is at least as possible that it arose by gene duplication and divergence. Other organism-specific enzymes, in this case in B. subtilis, include IolD for myo-inositol degradation and the crotonases PksH and PksI used for polyketide synthesis. Polyketides are a group of secondary products peculiar to the Bacilli. Other unique B. subtilis enzymes, AlsS, GsaB and RocD, have been mentioned above. It seems evident that the formation of different enzymes by unique divergence events adds up to the creation of taxa with different metabolic characteristics.

P. aeruginosa has the largest number of unique, or organism-specific, enzymes in our dataset. This is shown for all three enzyme families (Tables 2, 3, 4). These Pseudomonas-specific enzymes include those for synthesis of the siderophore pyoverdine (PvdH) and for utilization of mandelate (MdlC), leucine and isovalerate (LiuC), and acyclic terpenes (AtuE). Other predicted family members include two aminotransferases: PA5313, evidently an isozyme for 4-aminobutyrate, and OapT, likely a beta-alanine:pyruvate enzyme. Each of these enzymes contributes to the distinct metabolic character of P. aeruginosa as a pseudomonad. In addition there are 5 aminotransferases, 5 decarboxylases and 14 crotonases whose functions remain unknown in P. aeruginosa. Our phylogenetic analysis [9] suggests that these are unique enzymes representing additional functions yet to be discovered. Combining genes of known and unknown function for the three families, the number of unique P. aeruginosa genes (33) far surpasses that of B. subtilis (12), E. coli (2) and S. enterica (1). The large number of Pseudomonas-specific enzymes detected is in agreement with the well-documented metabolic versatility of this group [27, 28].

These examples of differences among enzyme families in four organisms suggest that the distinct events of divergence in genes of protein families over time have generated taxa of bacteria that are distinguished in part by their metabolic differences. Bacteria that are closely related have fewer differences in these families. For all three enzyme families we noted that the two most closely related organisms, E. coli and S. enterica, contain the most similar complement of enzymes. Larger differences in both number of dissimilar enzymes and enzyme functions were seen when comparing either B. subtilis or P. aeruginosa to any of the other three.

Overall, our protein family analysis includes several examples of how the functional and metabolic diversity of today's organisms is reflected in a history of duplicated and diverged gene copies in their genome sequences. In some instances the gene copies are the same in all the bacteria. These are enzymes for universal functions. Some of the gene copies did not undergo much divergence and resulted in isozymes catalyzing the same reactions but with different properties. Such enzymes usually contribute to phenotypic differences, for instance by changes in substrate specificity or regulation. Still other gene copies were not found in other bacteria. These were functions characteristic of the phenotype of the particular organism. We do not suggest that duplication of genes was the only source of diversity in these organisms. In addition, lateral transfer could have introduced new functions, and gene losses would have changed the composition of protein families. Some analyses suggest that lateral gene transfer has played a large role in assembling gene families [29]. However, one needs to take into account the lack of congruence between organism trees and gene trees, the latter being affected by different selective pressures on individual enzymes (such as gene family composition and cofactor/substrate availability) compared to those affecting the organism as a whole. Lawrence and Hendrickson [30] have discussed in a thoughtful way the difficulties in distinguishing horizontal transmission from duplication of existing genes. We have therefore not attempted to identify laterally transferred genes in our enzyme families. While they are possibly present, we do not expect them to predominate. In summary, it is a combination of all these genetic changes (duplications, divergence, loss and acquisitions) in ancestors of contemporary organisms that has generated the characteristic phenotypes of today's organisms.

The Flea Kind and Unique Anatomy

There are 1,830 different kinds of fleas known throughout the world. They commonly appear on cats, dogs, and other pets. Today, we know them as ectoparasites and vectors of disease, plague, and other pestilence.

Fleas are small (1.5 to 3.3 mm long), laterally flattened wingless insects forming the order Siphonaptera (figs. 3 and 4). Fleas have well-designed hind jumping legs and mouthparts with a prominent proboscis adapted for piercing tissues (probably plants in pre-Fall era) such as the skin and sucking blood (probably plant juices in pre-Fall era). They eat plants, detritus, and organic matter as larvae and stages prior to becoming adults. The adults primarily need blood for producing eggs.

Figure 3. Basic Flea Anatomy. Image reproduced from Wikimedia Commons.

Figure 4. Advanced Flea Anatomy. Image reproduced from Wikimedia Commons.

Fleas have a unique organ called the pygidium (plural pygidia), the terminal segment of the flea designed to detect air currents. Pygidia may help in “hitchhiking” on a host (Roberts, Janovy, and Nadler, 2013). Yet, as one looks at a flea under the microscope, the most impressive feature is its exceptionally long hind legs. Most fleas can jump vertically about a foot (some as high as 34 inches!), or about 150 times their own length. This is certainly worth an Olympic medal, and would be the equivalent of a human leaping up over 1,000 feet. A flea can jump horizontally about 13 inches, or 100 times its body length (Marquardt, Demaree, and Grieve, 2000). Lyon (2007) reports some can “long jump” horizontally up to 200 times their body length. The hind legs of fleas are linked to a zone where kinetic energy for jumping is released. The “spring” of the flea comes from an amazing “rubber-like” cross-linked protein called resilin. This substance gives fleas the capacity to easily leap upon and hitchhike with a mammal or bird host.

Most evolutionary biologists believe that fleas once had wings that were lost through time while larger hind legs evolved and that they are descendants of Mecoptera (scorpion flies). But we believe they were specially created as a distinct “kind”, designed fully formed to jump (not fly) and travel via an animal host. Fossil fleas consist mainly of modern-looking species that supposedly go back 65 million years. A possible large, non-jumping variant has been found as low as the Jurassic. Fossils document that fleas have always been fleas (Rothschild et al., 1973).

Mechanisms of Genetic Variation

In this article we will discuss the mechanisms that decrease and increase genetic variation.

Mechanisms that Decrease Genetic Variation:

Some types of organisms within a population leave more offspring than others. Over time, the frequency of the more prolific type will increase. The difference in reproductive capability is called natural selection. Natural selection is the only mechanism of adaptive evolution; it is defined as differential reproductive success of preexisting classes of genetic variants in the gene pool.

The most common action of natural selection is to remove unfit variants as they arise via mutation. In other words, natural selection usually prevents new alleles from increasing in frequency. This led a famous evolutionist, George Williams, to say “Evolution proceeds in spite of natural selection.”

Natural selection can maintain or deplete genetic variation depending on how it acts. When selection acts to weed out deleterious alleles, or causes an allele to sweep to fixation, it depletes genetic variation. When heterozygotes are fitter than either of the homozygotes, however, selection causes genetic variation to be maintained.

This is called balancing selection. An example of this is the maintenance of sickle-cell alleles in human populations subject to malaria. Variation at a single locus determines whether red blood cells are shaped normally or sickled. If a human has two alleles for sickle-cell, he/she develops anemia, because the shape of sickle-cells precludes them carrying normal levels of oxygen.

However, heterozygotes that have one copy of the sickle-cell allele coupled with one normal allele enjoy some resistance to malaria, because the shape of sickled cells makes it harder for the plasmodia (malaria-causing agents) to enter the cell. Thus, individuals homozygous for the normal allele suffer more malaria than heterozygotes.

Individuals homozygous for the sickle-cell allele are anemic. Heterozygotes have the highest fitness of these three types. Heterozygotes pass on both sickle-cell and normal alleles to the next generation. Thus, neither allele can be eliminated from the gene pool. The sickle-cell allele is at its highest frequency in regions of Africa where malaria is most pervasive.
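The way heterozygote advantage keeps both alleles in the population can be sketched with the standard one-locus, two-allele selection model. In the Python sketch below the fitness costs `s` (normal homozygotes, from malaria) and `t` (sickle-cell homozygotes, from anemia) are illustrative numbers chosen for the example, not measured values, and the function names are hypothetical.

```python
def next_frequency(q, s, t):
    """Sickle-cell allele frequency after one generation of selection.

    q is the current frequency of the sickle-cell allele.
    Relative fitnesses: normal homozygote = 1 - s, heterozygote = 1,
    sickle-cell homozygote = 1 - t.
    """
    p = 1.0 - q
    mean_fitness = p * p * (1 - s) + 2 * p * q + q * q * (1 - t)
    return (p * q + q * q * (1 - t)) / mean_fitness


def iterate(q0, s, t, generations):
    """Apply selection repeatedly, starting from frequency q0."""
    q = q0
    for _ in range(generations):
        q = next_frequency(q, s, t)
    return q


def equilibrium(s, t):
    """Predicted stable frequency of the sickle-cell allele: s / (s + t)."""
    return s / (s + t)


# Illustrative fitness costs only.
s, t = 0.1, 0.8

print("predicted equilibrium:", equilibrium(s, t))
print("starting rare (q = 0.01):", iterate(0.01, s, t, 500))
print("starting common (q = 0.90):", iterate(0.90, s, t, 500))
```

Under these assumptions the allele frequency moves toward the same intermediate value whether the sickle-cell allele starts out rare or common, which is the sense in which neither allele can be eliminated.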

Balancing selection is rare in natural populations. Only a handful of other cases beside the sickle-cell example have been found. At one time population geneticists thought balancing selection could be a general explanation for the levels of genetic variation found in natural populations.

That is no longer the case. Balancing selection is only rarely found in natural populations. And, there are theoretical reasons why natural selection cannot maintain polymorphisms at several loci via balancing selection.

Individuals are selected. Dark colored moths had a higher reproductive success because light colored moths suffered a higher predation rate. The decline of light colored alleles was caused by light colored individuals being removed from the gene pool (selected against). Individual organisms either reproduce or fail to reproduce and are hence the unit of selection.

One way alleles can change in frequency is to be housed in organisms with different reproductive rates. Genes are not the unit of selection (because their success depends on the organism’s other genes as well), and neither are groups of organisms. There are some exceptions to this “rule” but it is a good generalization.

Organisms do not perform any behaviours that are for the good of their species. An individual organism competes primarily with others of its own species for its reproductive success. Natural selection favors selfish behaviour because any truly altruistic act increases the recipient’s reproductive success while lowering the donor’s.

Altruists would disappear from a population as the non-altruists would reap the benefits, but not pay the costs, of altruistic acts. Many behaviours appear altruistic. Biologists, however, can demonstrate that these behaviours are only apparently altruistic. Cooperating with or helping other organisms is often the most selfish strategy for an animal. This is called reciprocal altruism.

A good example of this is blood sharing in vampire bats. In these bats, those who are lucky enough to find a meal will often share part of it with an unsuccessful bat by regurgitating some blood into the other’s mouth.

Biologists have found that these bats form bonds with partners and help each other out when the other is needy. If a bat is found to be a “cheater,” his partner will abandon him. The bats are thus not helping each other altruistically; they form pacts that are mutually beneficial.

Helping closely related organisms can appear altruistic but this is also a selfish behaviour. Reproductive success (fitness) has two components: direct fitness and indirect fitness. Direct fitness is a measure of how many alleles, on average, a genotype contributes to the subsequent generation’s gene pool by reproducing.

Indirect fitness is a measure of how many alleles identical to its own it helps to enter the gene pool. Direct fitness plus indirect fitness is inclusive fitness. J. B. S. Haldane once remarked he would gladly drown, if by doing so he saved two siblings or eight cousins. Each of his siblings would share one half his alleles; his cousins, one eighth. They could potentially add as many of his alleles to the gene pool as he could.
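Haldane’s arithmetic can be checked in a few lines. This is only a sketch of the bookkeeping in the remark above; the function name is hypothetical and the shared fractions are the ones quoted in the text.

```python
from fractions import Fraction


def genome_equivalents(n_relatives, shared_fraction):
    """Copies of one's own alleles carried, in total, by the rescued relatives."""
    return n_relatives * shared_fraction


siblings = genome_equivalents(2, Fraction(1, 2))   # two siblings, one half each
cousins = genome_equivalents(8, Fraction(1, 8))    # eight cousins, one eighth each
print(siblings, cousins)
```

Both totals come to one full set of his alleles, which is the break-even point in the remark.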

Natural selection favors traits or behaviours that increase a genotype’s inclusive fitness. Closely related organisms share many of the same alleles. In diploid species, siblings share on average at least 50% of their alleles. The percentage is higher if the parents are related. So, helping close relatives to reproduce gets an organism’s own alleles better represented in the gene pool.

The benefit of helping relatives increases dramatically in highly inbred species. In some cases, organisms will completely forgo reproducing and only help their relatives reproduce. Ants, and other eusocial insects, have sterile castes that only serve the queen and assist her reproductive efforts. The sterile workers are reproducing by proxy.

The words selfish and altruistic have connotations in everyday use that biologists do not intend. Selfish simply means behaving in such a way that one’s own inclusive fitness is maximized; altruistic means behaving in such a way that another’s fitness is increased at the expense of one’s own. Use of the words selfish and altruistic is not meant to imply that organisms consciously understand their motives.

The opportunity for natural selection to operate does not induce genetic variation to appear — selection only distinguishes between existing variants. Variation is not possible along every imaginable axis, so all possible adaptive solutions are not open to populations. To pick a somewhat ridiculous example, a steel shelled turtle might be an improvement over regular turtles.

Turtles are killed quite a bit by cars these days because when confronted with danger, they retreat into their shells — this is not a great strategy against a two ton automobile. However, there is no variation in metal content of shells, so it would not be possible to select for a steel shelled turtle.

Here is a second example of natural selection. Geospiza fortis lives on the Galapagos islands along with fourteen other finch species. It feeds on the seeds of the plant Tribulus cistoides, specializing on the smaller seeds. Another species, G. magnirostris, has a larger beak and specializes on the larger seeds.

The health of these bird populations depends on seed production. Seed production, in turn, depends on the arrival of wet season. In 1977, there was a drought. Rainfall was well below normal and fewer seeds were produced. As the season progressed, the G. fortis population depleted the supply of small seeds. Eventually, only larger seeds remained.

Most of the finches starved; the population plummeted from about twelve hundred birds to fewer than two hundred. Peter Grant, who had been studying these finches, noted that larger beaked birds fared better than smaller beaked ones. These larger birds had offspring with correspondingly large beaks.

Thus, there was an increase in the proportion of large beaked birds in the population the next generation. To prove that the change in bill size in Geospiza fortis was an evolutionary change, Grant had to show that differences in bill size were at least partially genetically based.

He did so by crossing finches of various beak sizes and showing that a finch’s beak size was influenced by its parents’ genes. Large beaked birds had large beaked offspring; beak size was not due to environmental differences (in parental care, for example).

Natural selection may not lead a population to have the optimal set of traits. In any population, there would be a certain combination of possible alleles that would produce the optimal set of traits (the global optimum) but there are other sets of alleles that would yield a population almost as adapted (local optima).

Transition from a local optimum to the global optimum may be hindered or forbidden because the population would have to pass through less adaptive states to make the transition. Natural selection only works to bring populations to the nearest optimal point. This idea is Sewall Wright’s adaptive landscape. This is one of the most influential models that shape how evolutionary biologists view evolution.

Natural selection does not have any foresight. It only allows organisms to adapt to their current environment. Structures or behaviours do not evolve for future utility. An organism adapts to its environment at each stage of its evolution. As the environment changes, new traits may be selected for.

Large changes in populations are the result of cumulative natural selection. Changes are introduced into the population by mutation; the small minority of these changes that result in a greater reproductive output of their bearers are amplified in frequency by selection.

Complex traits must evolve through viable intermediates. For many traits, it initially seems unlikely that intermediates would be viable. What good is half a wing? Half a wing may be no good for flying, but it may be useful in other ways. Feathers are thought to have evolved as insulation (ever worn a down jacket?) and/or as a way to trap insects.

Later, proto-birds may have learned to glide when leaping from tree to tree. Eventually, the feathers that originally served as insulation now became co-opted for use in flight. A trait’s current utility is not always indicative of its past utility. It can evolve for one purpose, and be used later for another.

A trait evolved for its current utility is an adaptation; one that evolved for another utility is an exaptation. An example of an exaptation is a penguin’s wing. Penguins evolved from flying ancestors; now they are flightless and use their wings for swimming.

In many species, males develop prominent secondary sexual characteristics. A few often cited examples are the peacock’s tail coloring and patterns in male birds in general, voice calls in frogs and flashes in fireflies. Many of these traits are a liability from the standpoint of survival. Any ostentatious trait or noisy, attention getting behaviour will alert predators as well as potential mates. How then could natural selection favor these traits?

Natural selection can be broken down into many components, of which survival is only one. Sexual attractiveness is a very important component of selection, so much so that biologists use the term sexual selection when they talk about this subset of natural selection.

Sexual selection is natural selection operating on factors that contribute to an organism’s mating success. Traits that are a liability to survival can evolve when the sexual attractiveness of a trait outweighs the liability incurred for survival. A male who lives a short time, but produces many offspring is much more successful than a long lived one that produces few.

The former’s genes will eventually dominate the gene pool of his species. In many species, especially polygynous species where only a few males monopolize all the females, sexual selection has caused pronounced sexual dimorphism.

In these species males compete against other males for mates. The competition can be either direct or mediated by female choice. In species where females choose, males compete by displaying striking phenotypic characteristics and/or performing elaborate courtship behaviours.

The females then mate with the males that most interest them, usually the ones with the most outlandish displays. There are many competing theories as to why females are attracted to these displays.

The good genes model states that the display indicates some component of male fitness. A good genes advocate would say that bright coloring in male birds indicates a lack of parasites. The females are cueing on some signal that is correlated with some other component of viability.

Selection for good genes can be seen in sticklebacks. In these fish, males have red coloration on their sides. Milinski and Bakker showed that intensity of color was correlated to both parasite load and sexual attractiveness. Females preferred redder males. The redness indicated that he was carrying fewer parasites.

Evolution can get stuck in a positive feedback loop. Another model to explain secondary sexual characteristics is called the runaway sexual selection model. R. A. Fisher proposed that females may have an innate preference for some male trait before it appears in a population.

Females would then mate with male carriers when the trait appears. The offspring of these matings have the genes for both the trait and the preference for the trait. As a result, the process snowballs until natural selection brings it into check. Suppose that female birds prefer males with longer than average tail feathers.

Mutant males with longer than average feathers will produce more offspring than the short feathered males. In the next generation, average tail length will increase. As the generations progress, feather length will increase because females do not prefer a specific length tail, but a longer than average tail.

Eventually tail length will increase to the point where the liability to survival is matched by the sexual attractiveness of the trait and an equilibrium will be established. Note that in many exotic birds male plumage is often very showy and many species do in fact have males with greatly elongated feathers. In some cases these feathers are shed after the breeding season.

None of the above models are mutually exclusive. There are millions of sexually dimorphic species on this planet and the forms of sexual selection probably vary amongst them.

Mechanisms that Increase Genetic Variation:

The cellular machinery that copies DNA sometimes makes mistakes. These mistakes alter the sequence of a gene. This is called a mutation. There are many kinds of mutations. A point mutation is a mutation in which one “letter” of the genetic code is changed to another. Lengths of DNA can also be deleted or inserted in a gene; these are also mutations. Finally, genes or parts of genes can become inverted or duplicated.

Typical rates of mutation are between 10^-10 and 10^-12 mutations per base pair of DNA per generation.

Most mutations are thought to be neutral with regards to fitness. Only a small portion of the genome of eukaryotes contains coding segments. And although some non-coding DNA is involved in gene regulation or other cellular functions, it is probable that most base changes would have no fitness consequence.

Most mutations that have any phenotypic effect are deleterious. Mutations that result in amino acid substitutions can change the shape of a protein, potentially changing or eliminating its function. This can lead to inadequacies in biochemical pathways or interfere with the process of development.

Organisms are sufficiently integrated that most random changes will not produce a fitness benefit. Only a very small percentage of mutations are beneficial. The ratio of neutral to deleterious to beneficial mutations is unknown and probably varies with respect to details of the locus in question and environment.

Mutation limits the rate of evolution. The rate of evolution can be expressed in terms of nucleotide substitutions in a lineage per generation. Substitution is the replacement of an allele by another in a population.

This is a two-step process. First a mutation occurs in an individual, creating a new allele. This allele subsequently increases in frequency to fixation in the population. The rate of evolution is k = 2Nvu (in diploids), where k is the rate of nucleotide substitution, N is the effective population size, v is the rate of mutation and u is the proportion of mutants that eventually fix in the population.

Mutation need not be limiting over short time spans. The rate of evolution expressed above is given as a steady state equation; it assumes the system is at equilibrium. Given the time frames for a single mutant to fix, it is unclear if populations are ever at equilibrium. A change in environment can cause previously neutral alleles to have selective values; in the short term, evolution can run on “stored” variation and thus is independent of mutation rate.

Other mechanisms can also contribute selectable variation. Recombination creates new combinations of alleles (or new alleles) by joining sequences with separate microevolutionary histories within a population. Gene flow can also supply the gene pool with variants. Of course, the ultimate source of these variants is mutation.

Mutation creates new alleles. Each new allele enters the gene pool as a single copy amongst many. Most are lost from the gene pool because the organism carrying them fails to reproduce, or reproduces but does not pass on that particular allele. A mutant’s fate is shared with the genetic background it appears in.

A new allele will initially be linked to other loci in its genetic background, even loci on other chromosomes. If the allele increases in frequency in the population, initially it will be paired with other alleles at that locus—the new allele will primarily be carried in individuals heterozygous for that locus.

The chance of it being paired with itself is low until it reaches intermediate frequency. If the allele is recessive, its effect won’t be seen in any individual until a homozygote is formed. The eventual fate of the allele depends on whether it is neutral, deleterious or beneficial.

Most neutral alleles are lost soon after they appear. The average time (in generations) until loss of a neutral allele is 2(Ne/N) ln(2N), where Ne is the effective population size (the number of individuals contributing to the next generation’s gene pool) and N is the total population size.

Only a small percentage of alleles fix. Fixation is the process of an allele increasing to a frequency at or near one. The probability of a neutral allele fixing in a population is equal to its frequency. For a new mutant in a diploid population, this frequency is 1/2N.
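The 1/2N fixation probability can be checked with a small simulation. The sketch below assumes a Wright-Fisher style model (each generation’s 2N gene copies are drawn at random from the previous generation), which the text does not name explicitly; the function name, population size and replicate count are arbitrary choices for illustration.

```python
import random


def neutral_fixation_fraction(n_diploid, replicates, seed=0):
    """Fraction of new neutral mutants that drift to fixation.

    Each replicate starts with a single mutant copy among 2N gene copies
    and is followed until the mutant is either lost or fixed.
    """
    rng = random.Random(seed)
    total_copies = 2 * n_diploid
    fixed = 0
    for _ in range(replicates):
        count = 1
        while 0 < count < total_copies:
            freq = count / total_copies
            # Binomial sampling of the next generation's gene copies.
            count = sum(1 for _ in range(total_copies) if rng.random() < freq)
        if count == total_copies:
            fixed += 1
    return fixed / replicates


n = 10
estimate = neutral_fixation_fraction(n, replicates=5000)
print(f"simulated: {estimate:.3f}   expected 1/(2N): {1 / (2 * n):.3f}")
```

The simulated fraction should land close to 1/2N, with some scatter because the number of replicates is finite.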

If mutations are neutral with respect to fitness, the rate of substitution (k) is equal to the rate of mutation (v). This does not mean every new mutant eventually reaches fixation. Alleles are added to the gene pool by mutation at the same rate they are lost to drift. For neutral alleles that do fix, it takes an average of 4N generations to do so.

However, at equilibrium there are multiple alleles segregating in the population. In small populations, few mutations appear each generation. The ones that fix do so quickly relative to large populations. In large populations, more mutants appear over the generations. But, the ones that fix take much longer to do so. Thus, the rate of neutral evolution (in substitutions per generation) is independent of population size.
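The independence from population size follows directly from the two quantities already given: the rate of evolution k = 2Nvu and the neutral fixation probability 1/2N. A short derivation, using the same symbols as the text:

```latex
\begin{align*}
k &= 2N v\,u            && \text{(rate of substitution in diploids)} \\
u &= \frac{1}{2N}       && \text{(fixation probability of a new neutral mutant)} \\
k &= 2N v \cdot \frac{1}{2N} = v
\end{align*}
```

The factor 2N counts how many new mutants arise per generation, and 1/2N is the chance that any one of them fixes, so the two cancel.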

The rate of mutation determines the level of heterozygosity at a locus according to the neutral theory. Heterozygosity is simply the proportion of the population that is heterozygous. Equilibrium heterozygosity is given as H = 4Nv / (4Nv + 1) (for diploid populations). H can vary from a very small number to almost one.

In small populations, H is small (because the equation is approximately a very small number divided by one). In (biologically unrealistically) large populations, heterozygosity approaches one (because the equation is approximately a large number divided by itself).
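The two limits are easy to see numerically. This is a minimal sketch of the formula H = 4Nv / (4Nv + 1); the function name, population sizes and mutation rate are assumptions chosen only to show the trend.

```python
def equilibrium_heterozygosity(n_effective, mutation_rate):
    """Neutral equilibrium heterozygosity H = 4Nv / (4Nv + 1) for diploids."""
    four_nv = 4 * n_effective * mutation_rate
    return four_nv / (four_nv + 1)


v = 1e-9  # assumed mutation rate, for illustration only
for n in (1e2, 1e6, 1e9, 1e12):
    print(f"N = {n:.0e}  H = {equilibrium_heterozygosity(n, v):.6f}")
```

For the smallest N the numerator is tiny and H is close to zero; for the largest N the "+ 1" in the denominator barely matters and H is close to one.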

Directly testing this model is difficult because N and v can only be estimated for most natural populations. But, heterozygosities are believed to be too low to be described by a strictly neutral model. Solutions offered by neutralists for this discrepancy include hypothesizing that natural populations may not be at equilibrium.

At equilibrium there should be a few alleles at intermediate frequency and many at very low frequencies. This is the Ewens-Watterson distribution. New alleles enter a population every generation; most remain at low frequency until they are lost. A few drift to intermediate frequencies, and a very few drift all the way to fixation.

In Drosophila pseudoobscura, the protein xanthine dehydrogenase (Xdh) has many variants. In a single population, Keith et al. found that 59 of 96 proteins were of one type, two others were represented ten and nine times and nine other types were present singly or in low numbers.

Deleterious Alleles:

Deleterious mutants are selected against but remain at low frequency in the gene pool. In diploids, a deleterious recessive mutant may increase in frequency due to drift. Selection cannot see it when it is masked by a dominant allele. Many disease causing alleles remain at low frequency for this reason.

People who are carriers do not suffer the negative effect of the allele. Unless they mate with another carrier, the allele may simply continue to be passed on. Deleterious alleles also remain in populations at a low frequency due to a balance between recurrent mutation and selection. This is called the mutation load.

Most new mutants are lost, even beneficial ones. Wright calculated that the probability of fixation of a beneficial allele is 2s, where s is the fitness benefit. (This assumes a large population size, a small fitness benefit, and that heterozygotes have an intermediate fitness. A fixation probability of 2s yields an overall rate of evolution of k = 4Nvs, where v is the mutation rate to beneficial alleles.)

An allele that conferred a one percent increase in fitness only has a two percent chance of fixing. The probability of fixation of a beneficial type of mutant is boosted by recurrent mutation. The beneficial mutant may be lost several times, but eventually it will arise and stick in a population. (Recall that even deleterious mutants recur in a population.)
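The numbers in this passage can be reproduced with a short sketch. The function names are hypothetical, and the "expected origins" figure additionally assumes that each new appearance of the mutant is an independent attempt with success probability 2s, which the text implies but does not state.

```python
def fixation_probability(s):
    """Approximate fixation probability of a new beneficial allele (2s)."""
    return 2 * s


def expected_origins_until_fixation(s):
    """Mean number of independent origins needed before one fixes.

    Assumes each origin succeeds independently with probability 2s,
    so the count follows a geometric distribution with mean 1 / (2s).
    """
    return 1 / fixation_probability(s)


def beneficial_substitution_rate(n, v, s):
    """k = 2N * v * 2s = 4Nvs, with v the mutation rate to beneficial alleles."""
    return 2 * n * v * fixation_probability(s)


s = 0.01  # one percent fitness benefit, as in the text
print(fixation_probability(s))
print(expected_origins_until_fixation(s))
print(beneficial_substitution_rate(n=1000, v=1e-9, s=s))  # n and v are made up
```

Under these assumptions a one percent benefit means the mutant would be lost far more often than it fixes, which is why recurrent mutation matters.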

Directional selection depletes genetic variation at the selected locus as the fitter allele sweeps to fixation. Sequences linked to the selected allele also increase in frequency due to hitchhiking. The lower the rate of recombination, the larger the window of sequence that hitchhikes. Begun and Aquadro compared the level of nucleotide polymorphism within and between species with the rate of recombination at a locus.

Low levels of nucleotide polymorphism within species coincided with low rates of recombination. This could be explained by molecular mechanisms if recombination itself was mutagenic. In this case, recombination would also be correlated with nucleotide divergence between species.

But, the level of sequence divergence did not correlate with the rate of recombination. Thus, they inferred that selection was the cause. The correlation between recombination and nucleotide polymorphism leaves the conclusion that selective sweeps occur often enough to leave an imprint on the level of genetic variation in natural populations.

One example of a beneficial mutation comes from the mosquito Culex pipiens. In this organism, a gene that was involved with breaking down organophosphates (common insecticide ingredients) became duplicated. Progeny of the organism with this mutation quickly swept across the worldwide mosquito population.

There are numerous examples of insects developing resistance to chemicals, especially DDT which was once heavily used in this country. And, most importantly, even though “good” mutations happen much less frequently than “bad” ones, organisms with “good” mutations thrive while organisms with “bad” ones die out.

If beneficial mutants arise infrequently, the only fitness differences in a population will be due to new deleterious mutants and the deleterious recessives. Selection will simply be weeding out unfit variants. Only occasionally will a beneficial allele be sweeping through a population.

The general lack of large fitness differences segregating in natural populations argues that beneficial mutants do indeed arise infrequently. However, the impact of a beneficial mutant on the level of variation at a locus can be large and lasting. It takes many generations for a locus to regain appreciable levels of heterozygosity following a selective sweep.

Have you heard? A revolution has seized the scientific community. Within only a few years, research labs worldwide have adopted a new technology that facilitates making specific changes in the DNA of humans, other animals, and plants. Compared to previous techniques for modifying DNA, this new approach is much faster and easier. This technology is referred to as “CRISPR,” and it has changed not only the way basic research is conducted, but also the way we can now think about treating diseases [1,2].

What is CRISPR

CRISPR is an acronym for Clustered Regularly Interspaced Short Palindromic Repeats. This name refers to the unique organization of short, partially palindromic repeated DNA sequences found in the genomes of bacteria and other microorganisms. While seemingly innocuous, CRISPR sequences are a crucial component of the immune systems [3] of these simple life forms. The immune system is responsible for protecting an organism’s health and well-being. Just like us, bacterial cells can be invaded by viruses, which are small, infectious agents. If a viral infection threatens a bacterial cell, the CRISPR immune system can thwart the attack by destroying the genome of the invading virus [4]. The genome of the virus includes genetic material that is necessary for the virus to continue replicating. Thus, by destroying the viral genome, the CRISPR immune system protects bacteria from ongoing viral infection.

How does it work?

Figure 1. The steps of CRISPR-mediated immunity. CRISPRs are regions in the bacterial genome that help defend against invading viruses. These regions are composed of short DNA repeats (black diamonds) and spacers (colored boxes). When a previously unseen virus infects a bacterium, a new spacer derived from the virus is incorporated amongst existing spacers. The CRISPR sequence is transcribed and processed to generate short CRISPR RNA molecules. The CRISPR RNA associates with and guides bacterial molecular machinery to a matching target sequence in the invading virus. The molecular machinery cuts up and destroys the invading viral genome. Figure adapted from Molecular Cell 54, April 24, 2014 [5].

Interspersed between the short DNA repeats of bacterial CRISPRs are similarly short variable sequences called spacers (FIGURE 1). These spacers are derived from DNA of viruses that have previously attacked the host bacterium [3]. Hence, spacers serve as a ‘genetic memory’ of previous infections. If another infection by the same virus should occur, the CRISPR defense system will cut up any viral DNA sequence matching the spacer sequence and thus protect the bacterium from viral attack. If a previously unseen virus attacks, a new spacer is made and added to the chain of spacers and repeats.

The CRISPR immune system works to protect bacteria from repeated viral attack via three basic steps [5]:

Step 1) Adaptation – DNA from an invading virus is processed into short segments that are inserted into the CRISPR sequence as new spacers.

Step 2) Production of CRISPR RNA – CRISPR repeats and spacers in the bacterial DNA undergo transcription, the process of copying DNA into RNA (ribonucleic acid). Unlike the double-chain helix structure of DNA, the resulting RNA is a single-chain molecule. This RNA chain is cut into short pieces called CRISPR RNAs.

Step 3) Targeting – CRISPR RNAs guide bacterial molecular machinery to destroy the viral material. Because CRISPR RNA sequences are copied from the viral DNA sequences acquired during adaptation, they are exact matches to the viral genome and thus serve as excellent guides.
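The three steps above can be mimicked with a toy model. This is a deliberately simplified sketch: the class and method names are hypothetical, the sequences are made up, and matching is done by plain string search rather than by the base pairing and protein machinery a real cell uses.

```python
class ToyCrisprLocus:
    """Toy model of a CRISPR array: spacers remembered from past infections."""

    def __init__(self):
        self.spacers = []  # DNA strings, oldest first

    def adapt(self, viral_dna, start, length=8):
        """Step 1: store a short segment of viral DNA as a new spacer."""
        spacer = viral_dna[start:start + length]
        if spacer not in self.spacers:
            self.spacers.append(spacer)
        return spacer

    def crispr_rnas(self):
        """Step 2: 'transcribe' each spacer into RNA (T becomes U)."""
        return [spacer.replace("T", "U") for spacer in self.spacers]

    def targets(self, invader_dna):
        """Step 3: report whether any CRISPR RNA matches the invader."""
        invader_as_rna = invader_dna.replace("T", "U")
        return any(rna in invader_as_rna for rna in self.crispr_rnas())


phage = "ATGCCGTTAGGCTAACGT"   # made-up viral sequence
locus = ToyCrisprLocus()
print(locus.targets(phage))    # no spacer stored yet
locus.adapt(phage, start=4)    # first infection leaves a spacer behind
print(locus.targets(phage))    # the same phage is now recognized
```

The list of spacers plays the role of the "genetic memory": an invader is only recognized if a piece of it was stored during an earlier infection.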

The specificity of CRISPR-based immunity in recognizing and destroying invading viruses is not just useful for bacteria. Creative applications of this primitive yet elegant defense system have emerged in disciplines as diverse as industry, basic research, and medicine.

What are some applications of the CRISPR system?

The inherent functions of the CRISPR system are advantageous for industrial processes that utilize bacterial cultures. CRISPR-based immunity can be employed to make these cultures more resistant to viral attack, which would otherwise impede productivity. In fact, the original discovery of CRISPR immunity came from researchers at Danisco, a company in the food production industry [2,3]. Danisco scientists were studying a bacterium called Streptococcus thermophilus, which is used to make yogurts and cheeses. Certain viruses can infect this bacterium and damage the quality or quantity of the food. It was discovered that CRISPR sequences equipped S. thermophilus with immunity against such viral attack. Expanding beyond S. thermophilus to other useful bacteria, manufacturers can apply the same principles to improve culture sustainability and lifespan.

Beyond applications encompassing bacterial immune defenses, scientists have learned how to harness CRISPR technology in the lab [6] to make precise changes in the genes of organisms as diverse as fruit flies, fish, mice, plants and even human cells. Genes are defined by their specific sequences, which provide instructions on how to build and maintain an organism’s cells. A change in the sequence of even one gene can significantly affect the biology of the cell and in turn may affect the health of an organism. CRISPR techniques allow scientists to modify specific genes while sparing all others, thus clarifying the association between a given gene and its consequence to the organism.

Rather than relying on bacteria to generate CRISPR RNAs, scientists first design and synthesize short RNA molecules that match a specific DNA sequence—for example, in a human cell. Then, like in the targeting step of the bacterial system, this ‘guide RNA’ shuttles molecular machinery to the intended DNA target. Once localized to the DNA region of interest, the molecular machinery can silence a gene or even change the sequence of a gene (Figure 2)! This type of gene editing can be likened to editing a sentence with a word processor to delete words or correct spelling mistakes. One important application of such technology is to facilitate making animal models with precise genetic changes to study the progress and treatment of human diseases.

Figure 2. Gene silencing and editing with CRISPR. Guide RNA designed to match the DNA region of interest directs molecular machinery to cut both strands of the targeted DNA. During gene silencing, the cell attempts to repair the broken DNA, but often does so with errors that disrupt the gene, effectively silencing it. For gene editing, a repair template with a specified change in sequence is added to the cell and incorporated into the DNA during the repair process. The targeted DNA is now altered to carry this new sequence.
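The word processor analogy can be made literal with a toy sketch. Everything here is an illustrative assumption: the function names are hypothetical, the guide is written as a DNA string, the sequences are made up, and error-prone repair is reduced to dropping a single base at the cut site. Real guide design involves further requirements that this article does not cover.

```python
def silence(dna, guide):
    """Cut at the guide match and mimic faulty repair by losing one base."""
    site = dna.find(guide)
    if site == -1:
        return dna  # no match: the DNA is left untouched
    cut = site + len(guide) // 2
    return dna[:cut] + dna[cut + 1:]


def edit(dna, guide, repair_template):
    """Cut at the guide match and write the repair template in its place."""
    site = dna.find(guide)
    if site == -1:
        return dna
    return dna[:site] + repair_template + dna[site + len(guide):]


gene = "AAACCCGGGTTT"                   # made-up sequence
print(silence(gene, "CCCGGG"))          # one base lost at the cut site
print(edit(gene, "CCCGGG", "CCCTGG"))   # targeted region rewritten
print(edit(gene, "GATTACA", "CCCTGG"))  # guide matches nothing: unchanged
```

The last call shows the point about specificity: a guide that matches nothing leaves every other part of the sequence alone.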

With early successes in the lab, many are looking toward medical applications of CRISPR technology. One application is for the treatment of genetic diseases. The first evidence that CRISPR can be used to correct a mutant gene and reverse disease symptoms in a living animal was published earlier this year [7]. By replacing the mutant form of a gene with its correct sequence in adult mice, researchers demonstrated a cure for a rare liver disorder that could be achieved with a single treatment. In addition to treating heritable diseases, CRISPR can be used in the realm of infectious diseases, possibly providing a way to make more specific antibiotics that target only disease-causing bacterial strains while sparing beneficial bacteria [8]. A recent SITN Waves article discusses how this technique was also used to make white blood cells resistant to HIV infection [9].

The Future of CRISPR

Of course, any new technology takes some time to understand and perfect. It will be important to verify that a particular guide RNA is specific for its target gene, so that the CRISPR system does not mistakenly attack other genes. It will also be important to find a way to deliver CRISPR therapies into the body before they can become widely used in medicine. Although a lot remains to be discovered, there is no doubt that CRISPR has become a valuable tool in research. In fact, there is enough excitement in the field to warrant the launch of several Biotech start-ups that hope to use CRISPR-inspired technology to treat human diseases [8].

Ekaterina Pak is a Ph.D. student in the Biological and Biomedical Sciences program at Harvard Medical School.


2. Pennisi, E. (2013). The CRISPR craze. Science 341, 833–836.

3. Barrangou, R., Fremaux, C., Deveau, H., Richards, M., Boyaval, P., Moineau, S., Romero, D.A., and Horvath, P. (2007). CRISPR provides acquired resistance against viruses in prokaryotes. Science 315, 1709–1712.

4. Brouns, S.J., Jore, M.M., Lundgren, M., Westra, E.R., Slijkhuis, R.J., Snijders, A.P., Dickman, M.J., Makarova, K.S., Koonin, E.V., and van der Oost, J. (2008). Small CRISPR RNAs guide antiviral defense in prokaryotes. Science 321, 960–964.

5. Barrangou, R., and Marraffini, L. (2014). CRISPR-Cas systems: Prokaryotes upgrade to adaptive immunity. Molecular Cell 54, 234–244.

6. Jinek, M. et al. (2012). A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science 337, 816–821.

What is the latest research on autism?

Doctors have defined autism spectrum disorder (ASD) as a neurobiological developmental condition that can impact communication, sensory processing, and social interactions. Although recent research has advanced the understanding of autism, there is much more to learn about the factors that influence this neurotype.


As of March 26, 2021, the Centers for Disease Control and Prevention (CDC) report that among 8-year-old children, one in 54 are autistic. This number has increased from the one in 59 prevalence reported in previous estimates.
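As a quick check on what the change from one in 59 to one in 54 means, the short calculation below converts both figures to percentages and works out the relative increase. The variable names are illustrative only; the two ratios are the ones quoted above.

```python
from fractions import Fraction

previous = Fraction(1, 59)  # earlier estimate
current = Fraction(1, 54)   # estimate cited above

print(f"previous: {float(previous):.2%}")  # previous: 1.69%
print(f"current:  {float(current):.2%}")   # current:  1.85%

# Relative change between the two estimates: (1/54) / (1/59) - 1 = 5/54
relative_increase = current / previous - 1
print(f"relative increase: {float(relative_increase):.1%}")  # relative increase: 9.3%
```

In other words, the reported prevalence rose from roughly 1.7% to roughly 1.9% of 8-year-olds, an increase of about 9% in relative terms.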

With autism rates on the increase, the scientific community has become all the more interested in uncovering the factors linked with autism.

Some scientists speculate that gene variants cause autism, while others believe environmental factors, such as exposure to toxins, contribute to this neurotype. Still others theorize imbalances in the intestinal microbiome may be at play.

The latest autism research includes investigations into environmental factors, genetic variants, gut microbiome imbalances, and neurological factors that may contribute to this neurotype.

In this Special Feature, Medical News Today examines the latest scientific discoveries and what researchers have learned about autism.

A multiyear study funded by the CDC is underway to learn more about factors potentially linked to autism.

The Study to Explore Early Development is a collaboration between six study sites in the United States. These sites are part of the Autism and Developmental Disabilities Research and Epidemiology network and focus on children aged 2–5 years.

One of the goals of the study is discovering what health conditions occur in autistic and neurotypical children and what factors are associated with the likelihood of developing ASD.

Another objective of the study is to differentiate the physical and behavioral characteristics of autistic children, children with other developmental conditions, and those without these conditions.

This ongoing research has already produced several published studies. The latest found an association between ASD and a mother’s exposure to ozone pollution during the third trimester of pregnancy.

Researchers also found that exposure to another type of air pollution, called particulate matter, during an infant’s first year increased the likelihood of the infant later receiving a diagnosis of ASD.

This research appears in the journal Epidemiology.

Other avenues of research on autism include investigations into gene variants that could play a role in the development of ASD.

A recent study analyzed the DNA of 35,584 people worldwide, including 11,986 autistic individuals. The scientists identified variants in 102 genes linked with an increased probability of developing ASD.

The researchers also discovered that 53 of the genes identified were mostly associated with autism and not other developmental conditions.

Expanding the research further, the team found that autistic people who carried the ASD-specific gene variants showed increased intellectual function compared with autistic individuals who did not have the variants.

The genes the scientists identified are mainly expressed in the cerebral cortex, which is responsible for complex behaviors.

These variants may play a role in how the brain neurons connect and also help turn other genes on or off — a possible factor that may contribute to autism.

Biological research has unearthed some interesting findings linking certain types of cell malfunctions to ASD.

Scientists at the Lieber Institute for Brain Development in Baltimore, MD, discovered a decrease in the integrity of myelin, a protective sheath surrounding nerve cells in the brain, in mice with a syndromic form of ASD.

The study, published in Nature Neuroscience, showed a gene variant-based malfunction in oligodendrocytes, which are cells that produce myelin.

This malfunction may lead to insufficient myelin production in the nerve cells and disrupt nerve communication in the brain, impairing brain development.

Using mouse models, researchers are now investigating treatments that could increase the myelination in the brain to see whether this improves ASD-associated behaviors that individuals may find challenging.

The gastrointestinal, or gut, microbiome is another area of interest to researchers looking for factors that contribute to autism.

Several studies have established a link between imbalances in the gut biome and ASD. There is also growing evidence that balancing the populations of gut microbes can help correct these disparities and improve some of the unwanted symptoms and behaviors linked to autism.

One 2017 study, published in the journal Microbiome, investigated whether microbiota transfer therapy (MTT) in autistic children improved gut microbiota diversity and symptoms associated with autism.

Investigators found that, after the MTT treatment, participants experienced more gut bacterial diversity.

Participants treated with MTT also showed a decrease in gastrointestinal (GI) symptoms, as well as improvements in language, social interaction, and behavioral symptoms.

In a 2-year follow-up investigation, researchers found that the participants who received MTT treatment still experienced fewer GI issues and a continued improvement of autism-related symptoms.

Scientists have also recently discovered a possible connection between genes and the gut microbiome.

A study published earlier this month found that mice that lack CNTNAP2, a gene linked to autism, have an unusual population of microbes in their intestines. They also displayed some social behaviors similar to those seen in some autistic people.

When the mice were treated with Lactobacillus reuteri, a common bacterium missing from their microbiome, and a strain of gut bacteria commonly found in wild-type mice, their social behaviors improved.

Autism can be challenging to detect, especially in very young children. Research has shown early diagnosis and treatment interventions can lead to better long-term outcomes for autistic people.

Because of this, the scientific community is working toward finding innovative diagnostic methods that can help detect this neurotype much earlier.

Hearing tests may be one such diagnostic tool. Researchers from Harvard Medical School in Boston, MA, and the University of Miami analyzed data from auditory brainstem response (ABR) hearing tests routinely given to infants shortly after birth in the state of Florida.

The team then matched the data to the Florida Department of Education records of those children who later received a diagnosis of a developmental condition.

Results showed that infants who later received an ASD diagnosis had slower brain responses to sounds during their ABR tests conducted at birth.

This study appears in the journal Autism Research.

The investigators hope to conduct more studies to determine whether the ABR test could help recognize autism at an early age.

Further advancements in recognizing autism include new research into biomarkers.

When analyzing data from the Children’s Autism Metabolome Project (CAMP), a team of researchers found metabotypes associated with autism in 357 children aged 18–48 months.

After optimizing these and previously discovered metabotypes into screening tests, the research team detected autism in 53% of the participants in the CAMP study.

Study author Elizabeth L. R. Donley of Stemina Biomarker Discovery in Madison, WI, told MNT:

“Our approach to understanding the biology of autism is going to revolutionize how we diagnose and treat autism. Autism is diagnosed through behavioral assessment, but there are underlying biological reasons for the disruptions in neurodevelopment that result in the behaviors of autism.”

Donley said the differences her team has identified in the metabolism of autistic children can provide insight into more specific treatment options where necessary.

“The first metabolic subtypes we published from our clinical study, the [CAMP], may be addressable with a supplement. The biology of other subtypes may be targets for drugs or new indications for existing drugs,” Donley explained.

She added, “[O]ur approach identifies where dysregulation is occurring in the biology of the child so that therapies that address this biology can be prioritized rather than just trying anything and everything with no precision.”

The research team has already validated the first three of five planned panels that can identify subtypes of metabolism associated with autism. They expect to validate the remaining panels this year and begin the first clinical study of a paired therapy.

With the prevalence of autism on the rise, scientists persevere to uncover what factors are associated with this neurotype.

Their hope is that, once they identify causative factors, researchers could then develop screening tests for earlier detection and more targeted treatments for symptoms and health conditions related to autism.

At the same time, nonprofit advocacy organizations run by autistic people, such as the Autistic Self Advocacy Network (ASAN), warn against regarding autism itself as something to be “treated” or “cured.”

The ASAN states that “[m]ost self-advocates agree that autism does not need to be cured. Instead of wasting time and money on something that is not possible and that autistic people do not want, we should focus on supporting autistic people to live good lives.”

“The most important thing,” the ASAN adds, “is that any therapy should help autistic people get what we want and need, not what other people think we need. Good therapies focus on helping us figure out our goals, and work with us to achieve them.”

The CRISPR War Raging Inside Bacteria

(Inside Science) -- The CRISPR-Cas system is a highly accurate gene-editing tool that genetic engineers have adopted from bacteria. The engineers use it to create genetically modified organisms and even treat genetic disease. But humans are not the first to adapt this system for their own ends -- inside bacteria, CRISPR has been co-opted into the ongoing fight between plasmids, free-floating rings of DNA within a bacterial cell.

The CRISPR-Cas system evolved in bacteria and another type of single-celled organism, called archaea, as an adaptive immune system to fight against invaders like viruses. It's similar to how our antibodies protect us. It records the genetic signature of previous invaders and uses it to recognize new ones, then chops the invader's genome to pieces. The system commonly used in laboratories by genetic engineers is called CRISPR-Cas9, but there are dozens of other variants.
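The record-and-recognize idea described above can be sketched as a toy model. This is a deliberately simplified illustration, not a description of the real molecular machinery: the class name `ToyCrisprArray`, the fixed spacer length, exact-match recognition, and the example sequences are all assumptions made for this example. Real systems rely on Cas proteins, guide RNAs, and additional sequence signals that are not modeled here.

```python
from typing import List


class ToyCrisprArray:
    """Toy model: store short fragments ("spacers") copied from past invaders
    and use them to recognise and cut sequences seen again later."""

    def __init__(self, spacer_length: int = 8):
        self.spacer_length = spacer_length
        self.spacers: List[str] = []

    def acquire(self, invader: str, offset: int = 0) -> str:
        """Record a spacer-length fragment of the invader's sequence."""
        fragment = invader[offset:offset + self.spacer_length]
        if len(fragment) < self.spacer_length:
            raise ValueError("invader too short to sample a spacer")
        if fragment not in self.spacers:
            self.spacers.append(fragment)
        return fragment

    def recognises(self, sequence: str) -> bool:
        """True if any stored spacer occurs in the sequence."""
        return any(spacer in sequence for spacer in self.spacers)

    def cleave(self, sequence: str) -> List[str]:
        """Cut the sequence where the first matching spacer begins.
        Returns the sequence unchanged (as one piece) if nothing matches."""
        for spacer in self.spacers:
            position = sequence.find(spacer)
            if position != -1:
                return [sequence[:position], sequence[position:]]
        return [sequence]


# Hypothetical sequences for illustration.
array = ToyCrisprArray(spacer_length=8)
virus = "ATGCCGTAAGCTTGACCTGA"
stranger = "TTTTGGGGCCCCAAAATTTT"

array.acquire(virus, offset=4)      # first encounter: remember "CGTAAGCT"
print(array.recognises(virus))      # True
print(array.recognises(stranger))   # False
print(array.cleave(virus))          # ['ATGC', 'CGTAAGCTTGACCTGA']
```

The point of the sketch is the order of events: a fragment is stored on first contact, and only sequences containing that fragment are cut afterwards.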

One of those variants, known as type IV, has been largely ignored by scientists, says Rafael Pinilla-Redondo, a microbiologist at the University of Copenhagen, because it is rare and is missing some of the key components that would make it interesting and useful to genetic engineers. But Pinilla-Redondo and a few others are fascinated by it, because it's found not in the main genome of bacteria, but on plasmids, which are independent scraps of DNA that act a little like parasites, existing inside the bacteria and using the bacterial cellular machinery to replicate and spread.

"That was a sign for me to dig deeper," he said.

Pinilla-Redondo and his colleagues mined databases for everything they could find about the genetic sequences of type IV CRISPR. They identified several new subtypes and reconstructed the system's evolutionary history. And they found something intriguing.

"We found that in contrast to almost all the other CRISPR systems, which target viruses, type IV targets other plasmids," said Pinilla-Redondo. "We propose that it is used in plasmid-versus-plasmid warfare within the bacteria."

While plasmids sometimes bring benefits to their host bacteria -- they are one of the ways that bacteria can acquire resistance to antibiotics, for example -- they are still using the bacteria's resources. Too many plasmids in one bacterium could overtax its cellular machinery and eventually kill it.

"Everything in nature is competing for limited resources and space," said Pinilla-Redondo. "The biggest competitor for plasmids are other plasmids, so they needed to find ways to eliminate the competition."

Other researchers have gone further than Pinilla-Redondo's database-trawling work. Ryan Jackson, a biochemist at Utah State University, in Logan, was the first to demonstrate directly in experiments in bacteria that the type IV system can defend against and clear other plasmids from the host. But there are still a lot of unanswered questions, he says.

For one thing, the type IV system is weird. It doesn't have a nuclease -- the protein that actually cuts up DNA. So it's not clear how it actually attacks opposing plasmids. It does have a helicase -- a protein that unwinds DNA -- which seems to be important. So Jackson and Pinilla-Redondo both speculate that maybe unwinding the DNA makes it more likely to break naturally, or perhaps the helicase protein just gets in the way of the DNA replication machinery and stops it dead. But there is no good evidence yet.

Secondly, type IV CRISPRs lack the machinery used to pick up and store pieces of the invading genome that act like mugshots to identify future invaders. Instead, it seems to borrow some of the host bacterium's CRISPR machinery to do this, but the exact mechanism is unclear.

"We don't really know what's going on," said Jackson. "It's a huge mystery."

Viruses have also been shown to use CRISPR to knock down cellular defenses, and megaviruses also have their own CRISPR systems to attack cells and other viruses. It seems CRISPR is a popular and effective weapon in the microbial world. "There is a lot of selective pressure in this arms race," said Jackson. "Plasmids, viruses and bacteria all want to find something to give them an edge."

The type IV system could prove useful for humans as well. Because plasmids are one of the main ways bacteria pick up and share antibiotic resistance genes, it is possible that researchers could find a way to use it to fight the spread of resistance, by attacking the plasmids that carry it.

"We know these systems evolved to fight other plasmids, so they are probably quite good at it," said Pinilla-Redondo. "Whether they are any better than other ways of fighting resistance, we will have to find out."

Complete genome sequence of the facultative anaerobic magnetotactic bacterium Magnetospirillum sp. strain AMB-1

Magnetospirillum sp. strain AMB-1 is a Gram-negative alpha-proteobacterium that synthesizes nano-sized magnetites, referred to as magnetosomes, aligned intracellularly in a chain. The potential of this nano-sized material is growing and will be applicable to broad research areas. It has been expected that genome analysis would elucidate the mechanism of magnetosome formation by magnetic bacteria. Here we describe the genome of Magnetospirillum sp. AMB-1 wild type, which consists of a single circular chromosome of 4,967,148 bp. For identification of genes required for magnetosome formation, transposon mutagenesis and determination of magnetosome membrane proteins were performed. Analysis of a non-magnetic transposon mutant library focused on three unknown genes from 2,752 unknown genes and three genes from 205 signal transduction genes. Partial proteome analysis of the magnetosome membrane revealed that the membrane contains numerous oxidation/reduction proteins and a signal response regulator that may function in magnetotaxis. Thus, oxidation/reduction proteins and elaborate multidomain signaling proteins were analyzed. This comprehensive genome analysis will enable resolution of the mechanisms of magnetosome formation and provide a template to determine how magnetic bacteria maintain a species-specific, nano-sized, magnetic single domain and paramagnetic morphology.