CARTA: Ancient DNA and Human Evolution – The Landscape of Archaic Ancestry in Modern Humans

– [Announcer] This UCSD-TV program is
presented by University of California Television. Like what you learn? Visit our
website, or follow us on Facebook and Twitter to keep up with the latest
programs.
♪ [music] ♪
– [Narrator] We are the paradoxical ape:
bipedal, naked, large-brained, long the master of fire, tools and language, but
still trying to understand ourselves. Aware that death is inevitable, yet filled
with optimism. We grow up slowly. We hand down knowledge. We empathize and deceive.
We shape the future from our shared understanding of the past. CARTA brings
together experts from diverse disciplines to exchange insights on who we are and how
we got here. An exploration made possible by the generosity of humans like you.
♪ [music] ♪
– [Sriram] Thanks so much for the
invitation, it’s a great pleasure to be here. Today I’m going to be talking about
the legacy of archaic admixture in present day humans. And so, to set the stage here,
based on genetic analysis of present day populations we’ve built up a fairly good
broad outline of the history of modern humans. For example, we know that modern
humans evolved in Africa and about 100,000 years ago there was a dispersal outside of
Africa to the rest of the world. We also know, based on archeological evidence,
that during this dispersal modern humans overlapped with another archaic human
population, the Neandertal. And so, this begs the question whether these two
populations that overlapped potentially met and interbred. So several studies
tried to answer this hypothesis, but in the last five years there have been some
major breakthroughs that allow us to get whole genomes from ancient
human populations. In particular, we have genomes from the Neandertals and their
sister group, the Denisovans. We’ll hear a little more about the technical challenges
that enabled the breakthrough in the later talks. But, what I’d like to focus on
today is the fact that once we have these ancient whole genome sequences, we can
pretty much definitively answer the question of whether there was
interbreeding between these two groups. And so, a number of studies have shown
that there was indeed interbreeding, or gene flow. For example, we now know that
non-African populations trace about 2% of their genetic ancestry back to the
Neandertals. We also know that there was Denisovan gene flow into some present day
human populations. For example, populations from Island Oceania have about
3% to 6% of their genetic ancestry tracing back to the Denisovans. This is over and
above the 2% that they inherit from the Neandertals. So, the big picture is all
non-Africans today carry a small amount of archaic ancestry. At first blush this
might seem like a small contribution, but when we examine this further, one thing
that we need to keep in mind is the Denisovans and the Neandertals were highly
diverged from modern human populations at the time of admixture. We think they were
separated by at least a couple of hundreds of thousands of years. And as a result,
these populations accumulated novel mutations that were never seen in the
modern human population. These novel mutations entered modern humans through
the admixture process. We estimate that about 10% of the SNPs at the time of
admixture, in non-Africans, could have been of Neandertal origin. So, a
hypothesis is that these archaic admixtures were having a potentially large
impact on human biology. So as one concrete example of the potential impact
of archaic admixture: we were involved in a genome wide association study, a study
that aims to determine genetic variants associated with Type 2 diabetes in
Mexican-Americans. In this study, we determined a novel variant found in this
gene called SLC16A11. We found that this gene had a unique geographic distribution.
When we looked at the variant in this gene that confers increased risk for Type 2
diabetes, we found that the risk-increasing variant is found at
appreciable frequency in the Americas, is essentially absent in Africa and is
present at low frequency everywhere else. What we found is that this risk-increasing
variant at this gene matches the genomic sequence found in one of the Neandertal
genomes that had been sequenced. So, what we determined was this was a variant that
had introgressed into modern human populations from Neandertals. Now, a
number of other studies have looked at specific genetic positions, genetic loci,
and have documented substantial contributions of archaic ancestries at
these loci. Here is a not very representative list. But, what we wanted
to do is go beyond single loci analysis and ask, “What does the distribution of
Neandertal ancestry look like if we were to look across the genome?” So we wanted to
go from single loci to a genome wide assessment. To do this, we need to build a
map of archaic local ancestry. So, what do we mean by that? Basically, what we want
to do is to go along an individual person’s genome and label the positions
where they carry archaic ancestry. Here is a cartoon that depicts what’s going on and
why we need to do this. We have a modern human genome admixing with an archaic
human genome. If you look at the descendants of this interbreeding because
of the way the genome is transmitted in every generation, it gets broken down by a
process called recombination. And as a result, if you look at the descendant of
this interbreeding event, this descendant’s genome is going to have this
mosaic pattern where some portions of the genome trace their ancestry to the modern
human and others trace their ancestry to the archaic human population. What we’d
like to be able to do is look at a present day individual’s genome and figure out
which are the portions that are red and which are the portions that are blue. To
be able to do this, we came up with a statistical model and what this model
allows us to do is to infer these locations of archaic ancestry.
Specifically, we’re going to look at a target genome, this could be any genome
that we’re interested in, and we compare this target genome to the genome of the
Neandertal, the Denisovan, as well as to a reference panel of African individuals.
We’re looking at Africans because this is a population that we assume has little
archaic ancestry. It’s a reasonable assumption but not entirely true, but we
work with that. Our goal is to look at this target genome and for every position,
which we call a SNP here, along this target genome, we’d like to label the
ancestry, or more precisely the local ancestry, of this individual. We’d like to
be able to say that this SNP of the target individual is Neandertal in ancestry
whereas this other SNP is modern human in ancestry. I’m going to give you a little
bit of introduction of what goes under the layers of this statistical model that
allows us to make these inferences. The basic idea is we’re going to be looking at
patterns of genetic variation that are informative of archaic ancestry. Here’s an
example where we’re going to be looking at features of genetic variation that are
informative of Neandertal ancestry. Here, we are looking at a single position in the
genome where we’re comparing the target genome to the Neandertal, the Denisovan,
and the Africans. What we see here is a genealogy, or a tree, that relates these
genomes at this position. What we see at this position is that there’s a mutation
that clusters the target and the Neandertal genome to the exclusion of all
the other genomes. When we see this pattern, we’re likely to think that this
is a position that has come into the target genome from Neandertal admixture.
On the other hand, here’s another position where there is a mutation that clusters
the target with the Denisovan and the Africans to the exclusion of the
Neandertals. When we see this, we conclude the opposite that this is a position that
is unlikely to carry Neandertal ancestry. These kinds of features go into the
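As a rough illustration of the two patterns just described, the following Python sketch classifies a single SNP by which genomes share the derived allele. The function name, the 0/1 allele coding, and the rule that one derived allele in the African panel counts as African sharing are all assumptions made for this example. It is not the statistical model from the talk, which combines many such features.

```python
from typing import Sequence


def neandertal_feature(target: int, neandertal: int, denisovan: int,
                       africans: Sequence[int]) -> str:
    """Classify one SNP by its allele-sharing pattern (hypothetical helper).

    Alleles are coded 0 (ancestral) or 1 (derived); this coding is an
    assumption of the sketch, not something stated in the talk.
    """
    # Assumption: a single derived allele in the African panel is enough
    # to say the mutation is shared with Africans.
    african_derived = any(a == 1 for a in africans)

    # Pattern 1: the mutation groups the target with the Neandertal,
    # to the exclusion of the Denisovan and the African panel.
    if target == 1 and neandertal == 1 and denisovan == 0 and not african_derived:
        return "supports_neandertal"

    # Pattern 2: the mutation groups the target with the Denisovan and
    # the Africans, to the exclusion of the Neandertal.
    if target == 1 and neandertal == 0 and denisovan == 1 and african_derived:
        return "against_neandertal"

    # Every other pattern is treated as uninformative in this toy version.
    return "uninformative"


if __name__ == "__main__":
    print(neandertal_feature(1, 1, 0, [0, 0, 0]))
    print(neandertal_feature(1, 0, 1, [1, 0, 1]))
```

A real analysis would feed counts or scores of such patterns into a model rather than using hard labels.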
statistical model which figures out what is an optimal way of combining these
features to get the best possible predictions. Similarly, we can also do a
statistical model that allows us to predict Denisovan ancestries. We applied
an initial version of this statistical model to a data set that comes from what
is called the 1000 Genomes Project. This already gave us some very interesting
insights. But a limitation of that data set was that this data set had a small
sampling of populations from outside of Africa. A second limitation was that it
lacked any individuals who carried Denisovan ancestry, so we could make no
inferences about the Denisovan contributions to modern human populations.
Instead, we applied this method to a new data set. This is a rich
genomic data set that’s called the Simons Genome Diversity Project. It has genome
sequences from over 100 non-African populations and, most importantly, we have
in this data set 20 genomes from Oceania, from individuals who have substantial
Denisovan ancestries. We applied our method to this data set and the first
thing we’d like to be able to conclude is that the method is giving us accurate
inferences about archaic ancestries. One way to do this is to make sure that the
inferences are consistent with everything we know about human history. The way we do
this is we take the inferences that come out of the statistical model, we average
it across an individual’s genome and compute what proportion of an individual’s
genome is archaic in ancestry. In this particular case, we’re asking what
proportion of an individual’s genome is Neandertal in ancestry. What we have here
is a circle for every population and the color is telling us whether you have low
levels of Neandertal ancestry to high levels of Neandertal ancestry. The first
thing we observed is in general, non-Africans carry substantially more
Neandertal ancestry compared to south African hunter-gatherers which are not
included here, but this is what we expect based on our previous demographic models
which tell us that there was Neandertal admixture into non-Africans after they
split from Africans. We also observe that when you look at Eastern non-African
populations, they carry more Neandertal ancestry compared to west Eurasians which
has also been a previous observation consistent with the literature. Now, we
can do the analogous thing with Denisovan ancestry and what we see here is that
Oceanian populations have substantially more Denisovan ancestry compared to
Mainland Eurasian populations. Further, amongst the Mainland Eurasians, there
seems to be more Denisovan ancestry in East Asians compared to West Eurasians.
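The averaging step behind these per-population maps can be sketched in a few lines of Python. This is a minimal sketch that assumes the model's output is one archaic-ancestry probability per SNP. The function names and the toy numbers are invented for illustration and are not data from the study.

```python
from statistics import fmean
from typing import Dict, List


def archaic_proportion(snp_probabilities: List[float]) -> float:
    """Average per-SNP archaic-ancestry probabilities over one genome."""
    if not snp_probabilities:
        raise ValueError("need at least one SNP")
    return fmean(snp_probabilities)


def population_proportions(
    individuals: Dict[str, List[List[float]]]
) -> Dict[str, float]:
    """Average the per-individual proportions within each population.

    `individuals` maps a population label to a list of genomes, each
    genome being a list of per-SNP probabilities (hypothetical layout).
    """
    return {
        population: fmean([archaic_proportion(genome) for genome in genomes])
        for population, genomes in individuals.items()
    }


if __name__ == "__main__":
    # Toy values only, chosen to make the arithmetic easy to follow.
    toy = {
        "pop_A": [[0.0, 0.0, 1.0, 0.0], [0.0, 1.0, 1.0, 0.0]],
        "pop_B": [[0.0, 0.0, 0.0, 0.0]],
    }
    print(population_proportions(toy))
```

Each population value would then set the color of one circle on the map.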
Again, all of these are consistent with previous results. Now, when we look at the
data further, there was one element of surprise and a novel result. The novel
result is that several populations in South and East Asia tend to have an excess
of Denisovan ancestry that had not been observed before. These are populations
like the Sherpa, which is a population from Nepal, the Tibetans, and Bengali, a
population from East India. It turns out that this is a trace amount of Denisovan
ancestry that they carry. We estimate that it’s about 5 parts in 1,000, which is why
you need these sensitive statistical methods to see these Denisovan
contributions. Now, one question we were interested in is: “Can this excess of
Denisovan ancestry in South Asian populations be explained by a mixture
between Eastern non-Africans who have more Denisovan ancestry with West Eurasians who
have less Denisovan ancestry?” To see this, we plotted the Denisovan ancestry as
a function of what proportion of your genome is related to non-West Eurasians.
What we see is generally as you get closer to the non-West Eurasians, you have more
Denisovan ancestry, but when you look at South Asian populations, you have
systematically more than can be explained by this model. What this is telling us is
one of several things: a model that is consistent with this observation, though
not the only model, is that there were actually three Denisovan introgression
events in the history of modern human populations: one, in the history of the
Papuans or the Oceanians; the second, in the history of the East Asians; and the
third, in the history of South Asian populations. Now, we decided to zoom in and
instead of looking at how archaic ancestry changes across individuals, we’d like to
look at how archaic ancestry changes as we move along the genomes. Here, as we move
along the circle, we’re moving along the genome looking at different chromosomes.
The outermost circle is telling us what is the Denisovan ancestry proportion in the
Oceanians and each of the inner circles are telling us what is the Neandertal
ancestry proportion in different continental populations. The key
observation is the colors along these circles tell us which are positions in the
genome where there are detectable proportions of archaic ancestry. The key
takeaway from this figure is the archaic ancestry doesn’t seem to be randomly
scattered along an individual’s genome. There are certain hotspots where there is
an elevated proportion of archaic ancestry, so we call them “peaks,” and
then there are certain other positions in the genome where nobody seems to be
carrying archaic ancestry, which we term “deserts.” This was another surprise and
we’d like to figure out what is going on in these peaks and deserts of archaic
ancestry. So, we looked at one of the most extreme examples of a peak. This is a
locus that overlaps a gene called basonuclin 2 and in the 1000 Genomes
European population, we find that about 60% of individuals, European individuals,
today carry the Neandertal variant of the allele. This needs to be contrasted with
the 2% who would have carried it 50,000 years ago. Essentially, the Neandertal
variant has increased from 2% to 60% over the last 50,000 years. This is not an
isolated example and we find on the order of 200 loci with elevated Neandertal
ancestries in the different non-African populations and about 50 loci with
elevated Denisovan ancestry in the Oceanian populations. These are all
potential candidates for what we call “archaic adaptive introgression.” Putting it
simply, these are places in the genome where the archaic allele conferred an
adaptive benefit in the modern human population which is why it rose up in
frequencies. So, we’d like to understand what might be driving this increase in
frequencies at these positions. That turns out to be a really challenging problem.
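To get a feel for how strong selection would need to be, one can do a back-of-the-envelope calculation for the rise from 2% to 60% mentioned earlier. The Python sketch below assumes a simple deterministic model in which the log-odds of the allele frequency grow linearly by s per generation. It also assumes a generation time of 29 years and treats both percentages as allele frequencies. None of these assumptions come from the talk, and the model ignores drift, dominance and changes in selection over time.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a frequency strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def implied_selection_coefficient(p_start: float, p_end: float, years: float,
                                  years_per_generation: float = 29.0) -> float:
    """Selection coefficient implied by a deterministic logistic rise.

    Assumes logit(p_t) = logit(p_0) + s * t, with t in generations.
    The 29-year generation time is an assumed default, not a value
    given in the talk.
    """
    generations = years / years_per_generation
    return (logit(p_end) - logit(p_start)) / generations


if __name__ == "__main__":
    s = implied_selection_coefficient(0.02, 0.60, 50_000)
    print(f"implied s per generation: {s:.4f}")
```

Under these assumptions the implied advantage is about 0.0025 per generation, which shows that even a small per-generation benefit can produce a large frequency change over 50,000 years.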
One way we try to address that is by looking at sets of genes that are known to
be associated with certain functions or certain biological processes. And we
asked, “Are these sets of genes harboring an excess of archaic ancestry much more
than we’d expect?” We find several sets of genes that show an elevated proportion of
archaic ancestry much more than we’d expect. For example, genes that are
involved in keratin filament formation (keratin is a protein that is found in hair
and skin) are enriched for Neandertal ancestry across all the non-African
populations. Similarly, we find that genes involved in trace amine receptors, which
are important for olfaction, tend to have
elevated proportions of Denisovan ancestry. This, again, allows us to narrow
down what the selection pressures might be, but still, we are quite some ways away
from figuring out what the exact sequence of processes were that drove these archaic
variants to high frequencies. Next, we turn our attention to deserts. These are
large regions, tens of millions of bases long, where we cannot detect either
Neandertal or Denisovan ancestry. Even more impressively, there are four such
regions in the genome that are deserts for both Neandertal and Denisovan ancestry.
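One simple way to look for such shared deserts is to scan windowed ancestry estimates for long runs where both maps are empty. The Python sketch below assumes contiguous, sorted windows on a single chromosome. The 10-megabase minimum length, the zero threshold, and the tuple layout are assumptions for this example, not the criteria used in the study.

```python
from typing import List, Tuple

# (start, end, neandertal_fraction, denisovan_fraction) for one window;
# this layout is a hypothetical convenience for the sketch.
Window = Tuple[int, int, float, float]


def find_shared_deserts(windows: List[Window],
                        min_length: int = 10_000_000,
                        max_fraction: float = 0.0) -> List[Tuple[int, int]]:
    """Return (start, end) runs where both ancestry fractions stay at or
    below `max_fraction` for at least `min_length` bases.

    Assumes windows are sorted, contiguous, and on one chromosome.
    """
    deserts: List[Tuple[int, int]] = []
    run_start = None
    run_end = None
    for start, end, neandertal, denisovan in windows:
        if neandertal <= max_fraction and denisovan <= max_fraction:
            if run_start is None:
                run_start = start  # open a new candidate run
            run_end = end          # extend the current run
        else:
            # The run is broken; keep it only if it is long enough.
            if run_start is not None and run_end - run_start >= min_length:
                deserts.append((run_start, run_end))
            run_start = None
    # Close a run that reaches the end of the chromosome.
    if run_start is not None and run_end - run_start >= min_length:
        deserts.append((run_start, run_end))
    return deserts


if __name__ == "__main__":
    toy = [
        (0, 5_000_000, 0.0, 0.0),
        (5_000_000, 10_000_000, 0.0, 0.0),
        (10_000_000, 15_000_000, 0.01, 0.0),
        (15_000_000, 20_000_000, 0.0, 0.0),
    ]
    print(find_shared_deserts(toy))
```

In the toy input only the first run qualifies; the final empty window is too short to count as a desert.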
Here is one example: this is a desert on chromosome seven, and it contains a number
of genes within this region, but one gene that caught our eye because of the prior
work associated with it, is a gene called FOXP2. This is a gene that’s shown to be
important for speech and language. So, a possibility here is that these deserts of
archaic ancestry are places in the genome that are resistant to archaic
introgression and a reason for that is that these are places that are important
for the modern human phenotype. The challenge, again, is that these are large
regions of the genome and trying to localize what might the changes be that
make them resistant to introgression is actually quite challenging. We decided to
look at this in a slightly more quantitative manner. The way we did this
is we chopped up the genome according to a measure of the selective strength in that
region of the genome and we asked how does the archaic ancestry change in different
portions of the genome. The x-axis, as we go from right to left, is going in
directions of stronger selective constraint. What we find is the archaic
ancestry, whether we’re looking at Neandertals or the Denisovans, decreases
as we move towards the strongly selectively constrained regions of the genome.
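The binning described here can be sketched as follows in Python. The sketch assumes each genomic region comes with a single constraint score, with higher meaning stronger constraint, and a single archaic-ancestry fraction. The score, the equal-size bins, and the function name are illustrative assumptions rather than the measure used in the talk.

```python
from statistics import fmean
from typing import List, Tuple


def ancestry_by_constraint(regions: List[Tuple[float, float]],
                           n_bins: int = 5) -> List[float]:
    """Mean archaic-ancestry fraction per constraint bin.

    `regions` holds (constraint_score, archaic_fraction) pairs. Regions
    are sorted by score and split into `n_bins` roughly equal groups,
    ordered from weakest to strongest constraint (assumed convention).
    """
    if n_bins < 1 or len(regions) < n_bins:
        raise ValueError("need at least one region per bin")
    ordered = sorted(regions, key=lambda region: region[0])
    means = []
    for i in range(n_bins):
        low = i * len(ordered) // n_bins
        high = (i + 1) * len(ordered) // n_bins
        means.append(fmean([fraction for _, fraction in ordered[low:high]]))
    return means


if __name__ == "__main__":
    # Toy numbers only: ancestry drops as constraint rises.
    toy = [(0.1, 0.03), (0.2, 0.03), (0.8, 0.01), (0.9, 0.01)]
    print(ancestry_by_constraint(toy, n_bins=2))
```

A downward trend across the returned list corresponds to the decrease in archaic ancestry described for the more constrained regions.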
This is consistent with the observation that there has been purifying selection to
remove archaic alleles and there are several models that have been proposed:
one of these is that these archaic alleles are deleterious and they’ve been purged
from the human population; another one that we’ve also proposed is one of hybrid
sterility where these populations have diverged and have accumulated genetic
incompatibilities that are not tolerated on each other’s genetic background. I’m
just going to conclude very quickly. I’ve talked about statistical models for
inferring maps of archaic ancestry and so by combining these sensitive statistical
models with the rich ancient and modern genomic data sets, we can make some very
fine scale inferences. The kinds of inferences lead us to conclude that there
is a lot of complexity in the demographic histories of these populations; we’ll
learn more about it in later talks. When we look at the variation along the genome,
clearly this is affected by selection but there’s also demography at play here. A
major challenge for us is to separate out demography and selection and different
kinds of selection from each other. With that, I’d like to acknowledge my
colleagues at Harvard, colleagues at the Max Planck, and the members of the
Neandertal Genome Analysis Consortium for comments and criticism at different stages
of the work. Thanks. ♪ [music] ♪

