Africa has four language superfamilies, of which Khoisan is one and the other three are Niger-Kordofanian (also known as Niger-Congo), Nilo-Saharan and Afro-Asiatic. The Niger-Kordofanian languages, the most widespread, were carried from western to eastern Africa and then south by the Bantu expansion, a great stream of migrations from the proto-Bantu homeland in western Africa that began in about 1000 BC and reached southern Africa a thousand years later. Afro-Asiatic languages are spoken in a broad belt across northern Africa, and the Nilo-Saharan speakers are sandwiched between Afro-Asiatic to the north and Niger-Kordofanian to the south.
Genetics generally correlates with language family, except in the case of populations that have switched languages; the pygmies now speak Niger-Kordofanian languages, and the Luo of Kenya, whose genetics place them with Niger-Kordofanian speakers, now speak a Nilo-Saharan language.
The Tishkoff team surveyed African Americans from Chicago, Baltimore, Pittsburgh and North Carolina and found that 71% of their genomes, on average, matched the genetics of Niger-Kordofanian speakers, 8% matched that of other African populations and 13% were European. These percentages varied greatly from one individual to another.
The origin of a species can often be located by surveying the genetic diversity in its members and seeing where diversity is highest. This is because the founding population will have had the longest time to accumulate the mutations that generate diversity, and the groups that migrate away will carry with them only a sample of the original mutations. (Other forces, like natural selection, reduce diversity by eliminating harmful mutations and sweeping away others when a beneficial mutation is favored.) On the basis of the new African and other genomic data, the origin of the modern human migration lies in southwestern Africa, near the border of Namibia and Angola, in a region that is the current homeland of the San click-speakers. The finding is not definitive, because the distribution of ancient populations may have been rather different from that of today. Nonetheless, the fact that human genetics points to a single origin confirms that today’s races are all mere variations on the same theme.
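The logic of looking for the region of highest diversity can be illustrated with a rough calculation. The sketch below is not taken from the surveys described here; it assumes a standard textbook idealization in which each migration away from the homeland passes through a single generation of only a few founders, which scales expected heterozygosity (one common measure of diversity) by 1 - 1/(2N) for N diploid founders. The function name, the starting diversity of 0.30 and the founder count of 10 are all illustrative assumptions.

```python
def heterozygosity_after_founder_events(h0, founders, steps):
    """Expected heterozygosity after 0..steps successive founder events.

    Assumption: each founder event is one generation in which only
    `founders` diploid individuals reproduce, so expected heterozygosity
    is multiplied by (1 - 1/(2 * founders)) each time.
    """
    factor = 1.0 - 1.0 / (2.0 * founders)
    return [h0 * factor ** k for k in range(steps + 1)]


# Illustrative numbers only: starting diversity 0.30, 10 founders per event.
diversity = heterozygosity_after_founder_events(h0=0.30, founders=10, steps=5)
for step, h in enumerate(diversity):
    print(f"founder events from origin: {step}  expected heterozygosity: {h:.4f}")
```

Under these assumptions diversity falls steadily with each step away from the starting population, which is why the most diverse region is a natural candidate for the origin.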
Fingerprints of Selection in the Human Genome
Both repeated DNA units and SNPs, the two kinds of DNA marker used by the surveys described above, lie for the most part outside genes and have little or no effect on a person’s physical makeup. They are what geneticists call neutral variations, meaning that they are ignored by natural selection. What then is it that makes human populations differ from one another?
Natural selection is the major shaper of differences, especially in large societies. In small societies, genetic drift—the luck of the draw as to which alleles make it into the next generation—can be a significant influence. But natural selection, often in concert with drift, is a major force over the long run. With the advent of fast methods of genome sequencing, geneticists have at last begun to delineate the fingerprints of natural selection in remodeling the human genome. These fingerprints are both recent and regional, meaning that they differ from one race to another.
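The contrast between small and large societies can be made concrete with a rough calculation. The sketch below assumes the standard textbook idealization of drift (a Wright-Fisher population, which the text does not name), in which an allele at frequency p changes each generation by a random amount with standard deviation sqrt(p(1-p)/2N) for N diploid individuals. The function name and the two population sizes are illustrative assumptions.

```python
import math


def drift_std_per_generation(p, n_individuals):
    """Standard deviation of the one-generation change in allele frequency
    under pure drift, assuming a Wright-Fisher population of diploids."""
    return math.sqrt(p * (1.0 - p) / (2.0 * n_individuals))


# Illustrative sizes: a band of 50 people versus a society of 5,000.
small_band = drift_std_per_generation(0.5, 50)
large_society = drift_std_per_generation(0.5, 5000)
print(f"small band:    {small_band:.4f}")
print(f"large society: {large_society:.4f}")
```

With these numbers the random fluctuation per generation is ten times larger in the small band, which is one way of seeing why drift matters most in small populations.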
The regional nature of selection was first made evident in a genomewide scan undertaken by Jonathan Pritchard, a population geneticist at the University of Chicago, in 2006. He looked for genes under selection in the three major races—Africans, East Asians and Europeans (or more exactly Caucasians, but European genetics are at present much better understood, so European populations are the usual subjects of study). Copious genetic data had been collected on each race as part of the HapMap, a project undertaken by the National Institutes of Health to explore the genetic roots of common disease. In each race Pritchard found about 200 genetic regions that showed a characteristic signature of having been under selection (206 in Africans, 185 in East Asians and 188 in Europeans). But in each race, a largely different set of genes was under selection, with only quite minor overlaps.10
The evidence of natural selection at work on a gene is that the percentage of the population that carries the favored allele of the gene has increased. But though alleles under selection become more common, they rarely displace all the other alleles of the gene in question by attaining a frequency of 100%. Were this to happen often in a population, races could be distinguished on the basis of which alleles they carried, which is generally not the case. In practice, the intensity of selection often relaxes as an allele rises in frequency, because the needed trait is well on the way to being attained.
Geneticists have several tests for whether a gene has been a recent target for natural selection. Many such tests, including the one devised by Pritchard, rest on the fact that as the favored allele of a gene sweeps through a population, the amount of genetic diversity in and around the gene is reduced in the population as a whole. This is so because increasing numbers of people now carry the same sequence of DNA units at that site, those of the favored allele. So the result of such a sweep is that DNA differences between members of a population are reduced in the region of the genome affected by the sweep. The concept of using sweeps as signatures of natural selection is discussed further below.
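The loss of diversity around a favored allele can be shown with a toy example. The sketch below is not Pritchard's test; it simply measures diversity as the average number of DNA differences between pairs of sequences, before and after one sequence has become common. The five-letter sequences are invented for illustration.

```python
from itertools import combinations


def mean_pairwise_differences(haplotypes):
    """Average number of differing positions over all pairs of sequences."""
    pairs = list(combinations(haplotypes, 2))
    total = sum(sum(a != b for a, b in zip(h1, h2)) for h1, h2 in pairs)
    return total / len(pairs)


# Invented sequences: four different versions of a short stretch of DNA.
before_sweep = ["AACGT", "ATCGA", "TACCT", "AAGGT"]
# After a partial sweep, three of the four copies carry the favored version.
after_sweep = ["ATCGA", "ATCGA", "ATCGA", "AAGGT"]

diversity_before = mean_pairwise_differences(before_sweep)
diversity_after = mean_pairwise_differences(after_sweep)
print(diversity_before, diversity_after)
```

In this toy case the average difference between two sequences drops from 2.5 to 1.5 once the favored version predominates. Real tests work on far longer stretches of DNA and must allow for other causes of low diversity.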
Figure 4.1. Regions of the genome that are highly selected in the three major races. ASN = East Asian, a sample of Chinese and Japanese. YRI = Yoruba, a West African people. CEU = European.
FROM JONATHAN PRITCHARD, PLOS BIOLOGY 4(2006):446–58.
Other researchers too have found that in doing genome scans for the fingerprints of natural selection, each major race or continental population has its own distinctive set of sites where selection has occurred.
These sites of selection are often very large and contain many genes, making it hard or impossible to decide which specific gene was the target of natural selection. In a new approach, which takes advantage of the many whole genomes that have now been decoded, Pardis Sabeti of Harvard and colleagues have defined 412 regions under selection in Africans, Europeans and East Asians. The regions are so small that most contain one or no genes. Those without genes presumably contain a control element, meaning a stretch of DNA that regulates some nearby gene.11
Of the 412 regions of the human genome shown to be under selection, 140 were under selection just in Europeans, 140 in East Asians and 132 in Africans.12 The absence of any overlap, meaning genes selected in two or more populations, as was found by Pritchard, is due to the Sabeti team’s genome scanning method, which depended in part on looking for sites at which the three races differed.
Each gene under selection will eventually tell a fascinating story about some historical stress to which the population was exposed and then adapted. A case in point is the analysis of the EDAR-V370A allele which, as described in the previous chapter, is the cause of thick hair and other traits in East Asians. But those narratives are for the moment inaccessible. The exploration of the human genome is so much at its beginning that the precise function of most genes is unknown.
Even though the exact tasks of most genes are still uncertain, the general roles of most genes can be inferred by comparing the DNA sequence of any unknown gene with those of known genes recorded in genomic data banks. The known genes are grouped into general functional categories, like brain genes or genes involved in metabolism, and since function is related to structure, the genes in each category have a characteristic sequence of DNA units. By comparing the DNA sequence of any new gene with the data bank sequences, the gene can be assigned to a general functional category. The genes Pritchard identified as shaped by natural selection included genes for fertilization and reproduction, genes for skin color, genes for skeletal development and genes for brain function. In the brain function category, four genes were under selection in Africans and two each in East Asians and Europeans. What these genes do within the brain is largely unknown. But the findings establish the obvious truth that brain genes do not lie in some special category exempt from natural selection. They are as much under evolutionary pressure as any other category of gene.
Population geneticists have developed several different kinds of tests to see if natural selection has influenced the DNA sequence of a gene. All these tests are statistical, and many depend on the disturbance in gene frequencies that is caused as a favored gene sweeps through a population. Natural selection cannot pick out single genes or even single mutations in DNA. Rather, it depends on the process called recombination, in which the mother’s and father’s genomes are shuffled prior to creating eggs and sperm.
In the egg-making or sperm-making cells, the two sets of chromosomes that a person has inherited, one from their mother and one from their father, are lined up side by side, and the cell then forces them to exchange large sections of DNA. These new composite chromosomes, consisting of some sections from the father’s genome and some from the mother’s, are what is passed on to the next generation.
The swapped sections, or blocks, may be 500,000 DNA units in length, long enough to carry several genes. So a gene with a beneficial mutation will be inherited along with the whole block of DNA in which it is embedded. It’s because beneficial genes lie in such a large block that the effect of natural selection on the genome can be detected—the favored blocks sweep out large regions of the genome as they spread through a population.
Generation by generation, the block of DNA with the favored version of a gene gets to be carried by more and more people. Eventually, the new allele may sweep through the entire population, in which case geneticists say it has gone to fixation. But most sweeps do not carry an allele to fixation because, as already noted, the intensity of selection on a beneficial allele relaxes as the trait is molded toward its most efficient form.
Whether a sweep is complete or partial, the favored blocks of DNA eventually get whittled down over the generations, because the cuts that generate them are not always made in the same places on the chromosome. After just 30,000 years or so, according to one calculation, the blocks get too short to be detectable. This means that most genomewide scans for selection are looking at events that occurred just a few thousand years ago, very recently in human evolution.
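How fast the blocks shrink can be sketched with a back-of-envelope formula. This is not the calculation referred to above. It assumes a uniform recombination rate of about one crossover per 100 million DNA units per generation and 25 years per generation, both round figures chosen for illustration. Under those assumptions, the expected distance from a favored mutation to the nearest cut on either side is 1/(rate x generations), so the expected block is roughly twice that.

```python
def expected_block_length(years, rate_per_bp=1e-8, years_per_generation=25):
    """Rough expected length (in DNA units) of the unbroken block around a
    favored mutation after `years` of recombination.

    Assumptions: uniform recombination rate per DNA unit per generation,
    fixed generation time, and a block bounded by the nearest cut on
    each side of the mutation.
    """
    generations = years / years_per_generation
    return 2.0 / (rate_per_bp * generations)


for years in (5_000, 10_000, 30_000):
    print(f"{years:>6} years: about {expected_block_length(years):,.0f} DNA units")
```

The block shrinks in inverse proportion to elapsed time, so a block that has survived 30,000 years is about a sixth the length of one that has survived 5,000. Whether a block of a given length is still detectable depends on the statistical test, which this sketch does not model.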
Biologists have long had to depend on the evidence from fossils to judge the speed of evolution. But fossils capture just the bones of an animal. And since the skeletal structure of a species changes only slowly, evolution has long seemed a glacially slow and plodding process.
With the ability to decode DNA sequences, biologists can examine the raw programming of evolutionary change and track every gene in a species’ repertoire. It’s now clear that evolution is no sluggard. There are already clear examples of human evolutionary change within the past few thousand years, such as Tibetans’ adaptation to high altitude, starting from just 3,000 years ago. Of course, every gene in the human genome has been intensely shaped by natural selection at one time or another. But with most genes, the selection was accomplished eons before humans or even primates had evolved. The fingerprints of these ancient selection events have long since faded from sight. The type of selection picked up by most genome scans is very recent selection, meaning within the past 5,000 to 30,000 years or so, but fortunately this is a period of great interest for understanding human evolution.
More than 20 scans for selection have now been performed on the human genome. They do not all mark the same regions as being under selection, but that is not surprising, since the authors use different kinds of tests and different statistical methods, which are in any case imprecise. But if one takes just the regions marked by any two of the scans, then 722 regions, containing some 2,465 genes, have been under recent pressure of natural selection, according to an estimate by Joshua M. Akey of the University of Washington. This amounts to 14% of the genome.13
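The idea of keeping only regions marked by at least two scans can be sketched as a simple tally. The example below is not Akey's procedure, which worked with genomic coordinates. It treats each region as a label, and the scan contents and region names are hypothetical.

```python
from collections import Counter


def regions_in_at_least(scans, minimum=2):
    """Return the regions reported by at least `minimum` of the scans."""
    counts = Counter(region for scan in scans for region in set(scan))
    return sorted(region for region, n in counts.items() if n >= minimum)


# Hypothetical results from three scans; region names are placeholders.
scans = [
    {"region_A", "region_B", "region_C"},
    {"region_B", "region_D"},
    {"region_C", "region_B", "region_E"},
]
shared = regions_in_at_least(scans)
print(shared)
```

Here only region_B and region_C are reported by two or more scans. Requiring agreement between independent scans is a way of discounting the regions that any single imprecise method may have flagged by chance.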
That so much of the genome has been under natural selection strong enough to be detectable shows how intense human evolution must have been in the past few thousand years. A principal driver of evolutionary change would have been the need to adapt to a wide range of new environments. In proof of that point, some 80% of the 722 regions under selection are instances of local adaptation, meaning that they occur in one of the three main races but not in the other two.
The genes under selection affect a large number of biological traits, prominent among them being skin color, diet, bone and hair structure, resistance to disease and brain function.
A similar finding emerged from a particularly comprehensive genome scan conducted by Mark Stoneking and colleagues. Stoneking, a population geneticist at the Max Planck Institute for Evolutionary Anthropology in Leipzig, is known for having developed an ingenious way of estimating when humans first started to wear clothes. The body louse, which lives only in clothes, evolved from the head louse, which lives on hair. Stoneking realized that a date for the first tight-fitting clothes could be derived by using genetic methods to date the birth of the body louse lineage—about 72,000 years ago.14
In his genome survey, Stoneking found many genes under selection that affected people’s interaction with their environment, such as genes involved in metabolizing certain classes of food and genes that mediate resistance to pathogens. Among the genes under selection he also found several that were involved in aspects of the nervous system, such as cognition and sensory perception.
The genes of the nervous system have been under selection for the same reason as the other genes—to help people adapt to local circumstances. Changes in social behavior may well have been foremost, given that it is largely through their society that people interact with their environment. Signals of selection in brain genes “may be related to how different human groups interact behaviorally with their environment and/or with other human groups,” Stoneking and colleagues wrote.15
Another regional trend indicated by the genome scans is that there seem to be more genes under selection in the genomes of East Asians and Europeans than in those of Africans. Not all genome scans have reported such a finding—the Pritchard scan described above did not—and African populations have been poorly sampled so far. But in a subsequent scan, Pritchard and others did find evidence for more sweeps outside Africa.
“A plausible explanation is that humans experienced many novel selective pressures as they spread out of Africa into new habitats and cooler climates,” they wrote. “Hence there may simply have been more sustained selective pressures on non-Africans for novel phenotypes.”16 Phenotype refers to the physical trait or organism produced by the DNA, as contrasted with the DNA itself, which is known as the genotype. One obvious example of a novel phenotype needed outside Africa is that of skin color. Africans have retained the default dark skin of the ancestral human population, whereas East Asians and Europeans, descendants of populations who adapted to extreme northern latitudes, have evolved pale skin.
Both within Africa and in the world outside, social structure underwent a radical transition as populations began to grow after the beginning of agriculture some 10,000 years ago. Independently on all three continents, people’s social behaviors started to adapt to the requirements of living in settled societies that were larger and more complex than those of the hunter-gatherer band. The signature of such social changes may be written in the genome, perhaps in some of the brain genes already known to be under selection. The MAO-A gene, which influences aggression and antisocial behavior, is one behavioral gene that, as mentioned in the previous chapter, is known to vary between races and ethnic groups, and many more will doubtless come to light.
Hard Sweeps and Soft Sweeps
Textbooks about evolution discuss favorable alleles that sweep through a population and become universal. There are many ancient alleles that have probably become fixed in this way. All humans, at least compared with chimpanzees, carry the same form of the FOXP2 gene, which is a critical contributor to the faculty of speech. A variation called the Duffy null allele has become almost universal among Africans because it was an excellent defense against an ancient form of malaria. A gene called DARC (an acronym for Duffy antigen receptor for chemokines) produces a protein that sits on the surface of red blood cells. Its role is to convey messages from local hormones (chemokines) to the interior of the cell. A species of malarial parasite known as Plasmodium vivax, once endemic in parts of Africa, learned how to use the DARC protein to gain entry into red blood cells. A mutated version of the DARC gene, the Duffy null allele, then became widespread because it denies the parasite access to the blood cells in which it feeds and thus provides a highly effective defense. Almost everyone in Africa carries the Duffy null allele of DARC, and almost no one outside does.17
Many other mutations have arisen to protect people against current strains of malaria, such as those that cause sickle-cell anemia and the thalassemias. Sickle-cell anemia occurs with high frequency in Africa, and beta-thalassemia is common in the Mediterranean, but neither has attained the universality of the Duffy null allele within a population. Another widespread but fairly exclusive allele is associated with skin color. This is an allele of KITLG (an acronym for KIT ligand gene) which leads to lighter skin. Some 86% of Europeans and East Asians carry the skin lightening allele of KITLG. This allele evolved because of a mutation in the ancestral, skin-darkening version of KITLG, which is carried by almost all Africans.18 A skin-lightening allele of another gene, called SLC24A5, has swept almost completely through Europeans.
But the number of such genes, in which one allele has gone to fixation in one race and a different allele in another, is extremely small and in no way sufficient to account for differences between populations. Pritchard found no cases of an allele going to fixation among the Yoruba, a large African tribe in Nigeria. This has led him and other geneticists to conclude that complete sweeps have been much rarer in human evolution than supposed.19