The Genomic History of the Bronze Age Southern Levant – Paper review

This week, the first Bronze and Iron Age genomes from Israel were published (Link to paper). This is a very exciting dataset. Of course, this excitement was tainted by the usual propaganda that has surrounded Jewish genomic analysis since 1998. So, let’s see what was actually done in the study.

Abstract

We report genome-wide DNA data for 73 individuals from five archaeological sites across the Bronze and Iron Ages Southern Levant. These individuals, who share the ‘‘Canaanite’’ material culture, can be modeled as descending from two sources: (1) earlier local Neolithic populations and (2) populations related to the Chalcolithic Zagros or the Bronze Age Caucasus. The non-local contribution increased over time, as evinced by three outliers who can be modeled as descendants of recent migrants. We show evidence that different ‘‘Canaanite’’ groups genetically resemble each other more than other populations. We find that Levant-related modern populations typically have substantial ancestry coming from populations related to the Chalcolithic Zagros and the Bronze Age Southern Levant. These groups also harbor ancestry from sources we cannot fully model with the available data, highlighting the critical role of post-Bronze-Age migrations into the region over the past 3,000 years. 

Agranat-Tamir et al. 2020. Cell.

What was done? Israel Finkelstein, one of the greatest Israeli archeologists of all time, has been digging in Tel Megiddo for years. If you haven’t done it yet, I suggest that you read his most famous book: The Bible Unearthed. I have read all his books and most of his papers, and it is fascinating work. As often happens when you dig in archeological sites, Finkelstein found bones, but whose bones are they? We don’t know. This is, of course, just one small and minor detail. It is so tiny and insignificant that it seems unimportant, which is probably why the authors of the Cell paper ignore it. Here is the most they could say about it:

In much of the Late Bronze Age, the region was ruled by imperial Egypt, although in later phases of the Iron Age, it was controlled by the Mesopotamian-centered empires of Assyria and Babylonia. Archaeological and historical research has documented major changes during the Bronze and Iron Ages, such as the cultural influence of the northern (Caucasian) populations related to the Kura-Araxes tradition during the Early Bronze Age (Greenberg and Goren, 2009) and effects from the ‘‘Sea Peoples’’ (such as Philistines) from the west in the beginning of the Iron Age (Yasur-Landau, 2010).

That’s a very generic historical description, which doesn’t answer the question: WHO ARE THE PEOPLE BURIED IN TEL MEGIDDO and YEHUD? How can we say anything about them if we cannot answer this question? This is one of the real questions, the hard ones, so naturally, it is ignored. Israel Finkelstein couldn’t answer this question for the same reasons that archeologists of the Levant have always struggled with problems of origin (difficulties in finding cultural culprits, distinguishing between authentic and imported goods, the unimaginative nature of Levantine artifacts, etc.). Overall, it is complicated to look at a find and say, “Aha, I know exactly where it is from.” Compare that, for example, to Finkelstein’s other work that found vanilla at the same site (of potential South Asian origin).

You would think that geneticists, having a large amount of genomic data from all over the world, would be able to answer this question, and they potentially can. Still, in this paper, they don’t do so because A) they chose not to and B) there are many other reasons why such analyses would yield incorrect results (inferior bioinformatic tools, lack of “anchors,” inability to deal with human migration over time and space, and a preference for tools and models that tell you what you want to hear). Instead, the bioinformatic analysis is DESCRIPTIVE rather than INVESTIGATIVE.

The analyses. Let us explore the analysis to see if they can tell us anything meaningful.

[Figure: PCA and ADMIXTURE plots]

The first analysis is PCA and, like all PCAs, it is nonsense where anything can mean anything if you select the populations correctly (the grey “unimportant” part). The second part is an unsupervised ADMIXTURE analysis with 6 splits, where almost all the individuals look the same. What does it mean? Nothing. If you compare a lot of Africans, Europeans, and Asians, then all Africans will look the same. So what? So nothing, but it’s colorful and makes a nice plot. Don’t take my word for it; here is what the authors had to say about these fascinating plots:

The ADMIXTURE results are qualitatively consistent with the principal component analysis (PCA), suggesting that all individuals but the outliers from Megiddo and the Ashkelon IA1 population have similar ancestry (Figure 1C)

Was that supposed to be a surprise? Everyone knows that the two methods perform similarly. Alexander et al. (2009), the authors of the ADMIXTURE tool, already wrote that:

ADMIXTURE performs as well as EIGENSTRAT [PCA] at statistically correcting for population structure.

So what exactly is the big revelation in reporting that the tools perform similarly? And that’s it, that’s the summary of the results of this section. All this work and all these figures for nothing. So, for now, let us agree that there is very little that we can infer from this about the ancestry of the ancient or modern-day people.

To be fair, Agranat-Tamir et al. didn’t invent anything. Almost all the paleo-genomic papers are written in a similar way, which is also why most of them are a big waste of time.

Next, the authors find relatives in the data and show that the individuals from the two sites are genomically close, as expected. They argue (unconvincingly) that there were changes in the genomic makeup of individuals (i.e., their ancestry changed) between the Bronze and Iron Ages by showing an increase in the fraction of Iranian Chalcolithic ancestry over time across multiple samples. I have no doubt that ancestry changes over time, but this figure doesn’t show that well, given that most of the signal is in the middle. Why did the authors choose Iranian Chalcolithic? Why not Anatolian? Why not Armenian? Where is the discovery process? What is the importance of this ancestry as opposed to Greek and Anatolian ancestries, known to have a strong influence on this region?

[Figure: Iranian Chalcolithic ancestry over time]

Are modern-day Jews descendants of the Israelites? Now we get to the cheesy “Further Change in Levantine Populations Since the Bronze Age” section. How much ancestry do we derive from these people?

We attempted to model groups that have a tradition of descent from ancient people in the region (Jews) as well as Levantine Arabic-speakers as mixtures of various ancient source populations… For this, we generated present-day populations as a mixture of two closely related ancient populations with and without a third, more distant, population.

David Reich is an expert at taking very complex demographic processes and reducing them to 2-3 variables that he can easily deal with. It was excusable 10 years ago; it is completely intolerable now.

The following populations were used as a baseline:

1) Megiddo_MLBA (the largest group) as a representative of the Middle-to-Late Bronze Age component;

2) Iran_ChL as a representative of the Zagros and the Caucasus;

3) Present-day Somalis as representatives of an Eastern African source (in the absence of genetic data on ancient populations from the region);

4) Europe_LNBA as a representative of ancient Europeans from the Late Neolithic and Bronze Age

When you model populations using 4 ancient populations you will always get one or more of these ancient populations, you know that, right? And if your test population includes ancestry outside of these 4 groups, you’ll still get only those 4 groups.
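The point can be sketched with a toy simulation. This is not the authors' LINADMIX or PHCP code; it is a minimal illustration that assumes made-up allele frequencies, a hypothetical `fit_mixture` helper, and a simple grid search over mixture proportions. One target is a true mixture of the sources in the panel, the other draws half of its ancestry from a source that is missing from the panel. In both cases the fit returns non-negative proportions that sum to 1.

```python
import itertools
import numpy as np

def fit_mixture(target, sources, step=0.05):
    """Grid-search proportions (non-negative, summing to 1) that best
    reproduce `target` as a weighted average of the rows of `sources`."""
    steps = int(round(1 / step))
    best_w, best_err = None, np.inf
    for combo in itertools.product(range(steps + 1), repeat=len(sources) - 1):
        if sum(combo) > steps:
            continue  # proportions would exceed 100%
        # The last source takes whatever proportion is left over.
        w = np.array(combo + (steps - sum(combo),)) / steps
        err = float(np.sum((w @ sources - target) ** 2))
        if err < best_err:
            best_w, best_err = w, err
    return best_w, best_err

# Hypothetical allele frequencies: 4 source populations, 500 markers.
rng = np.random.default_rng(0)
freqs = rng.uniform(0.05, 0.95, size=(4, 500))
panel, missing = freqs[:3], freqs[3]  # the model only knows 3 of the 4

inside = 0.6 * panel[0] + 0.4 * panel[1]   # fully explained by the panel
outside = 0.5 * panel[0] + 0.5 * missing   # half from the missing source

w_in, err_in = fit_mixture(inside, panel)
w_out, err_out = fit_mixture(outside, panel)
print("in-panel target :", w_in, "residual:", err_in)
print("out-panel target:", w_out, "residual:", err_out)
```

In this sketch the second fit still reports a clean set of proportions. Only the residual hints that something is missing, and a bar plot of the proportions does not show the residual.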

Let’s think of it in a different way: describe the color of the sky at 14:00 anywhere in the world but England. You cannot use the color blue; you must use pink, yellow, and greenyellow. According to this model, the color of the sky is greenyellow, because the color code of greenyellow is [173, 255, 47] (where each of the RGB primary colors ranges from 0 to 255). But we all know that is wrong: the color of the sky is actually blue, and we simply didn’t include blue in the model. You are very welcome to send your paper to Cell and tell them that you have strong evidence that the color of the sky is greenyellow. This is what happens when you force (or torture) the data to produce the results you want.

Shocked? This is one of the oldest tricks in the “Judaeans wannabes” book. You’ve read one, you’ve read them all. You know how Behar and colleagues came up with the four “Matrilineal Ancestry of Ashkenazi Jewry” concept, right? They stopped counting after the first four, which amounted to 40% of the data. Although their Table 1 clearly shows the wide range of all haplogroups, they had no problem concluding that:

In total, we have identified four Ashkenazi founding lineages, three within Hg K and one in Hg N1b, deriving from only four ancestral women and accounting for fully 40% of the mtDNAs of the current Ashkenazi population (∼8,000,000 people).

From Behar et al. 2006, AJHG.

The next thing you know, FTDNA made a fortune from selling Jewishness tests with Behar as their consultant. The 40% figure became “nearly half,” and the percentages later disappeared as the “four mothers” element took over. No, they didn’t report their conflicts of interest. Ever. Although only an idiot would claim that a haplogroup is associated with a single person, this claim was touted endlessly.

Let’s get back to the Agranat-Tamir et al. study. Now you are in a better position to guess what you get when you combine an ancient population of unknown origins collected from two very small sites, Iranians (who supposedly represent a region far vaster than Iran), Somalis (as an African outgroup, although there is no Asian one), and Europeans (again, supposedly representing all of Europe). It turns out that forcing the data in such a manner yields quite exciting results: It obviously and very clearly tells you that God promised the land to the Jews, and that everyone else should be kicked out or something to that effect. Kidding… It gives you these two bar plots.

Here, the 4 baseline populations became 3 to blur the high Tel Megiddo ancestry of “Arabic speakers”. In short, we are back to the racial description of populations as Africans, Europeans, and Asians, where each continental population (or race) is very, very poorly represented.


Never heard of LINADMIX or PHCP? No surprise, they were both introduced in this paper. We have no idea how they perform or how accurate they are in practice. The brief simulation section in the supplementary materials shows that these tools were calibrated on exactly the populations they aim to study. This is wrong. The training should have been done on an independent dataset. We also don’t know why the authors didn’t use supervised ADMIXTURE, which is what they needed to do here. Why reinvent the wheel twice? Another important question is whether these methods are sample-independent, i.e., whether they give the same result for a test sample whether it is tested alone or with 1000 other individuals. So far, all the methods used by the authors were sample-dependent.
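To make sample dependence concrete, here is a minimal sketch using PCA, one of the methods discussed above, on simulated genotypes. It says nothing about how LINADMIX or PHCP behave; the three populations, their allele frequencies, and the helper names `pc_scores` and `draw` are all assumptions made for illustration. The same test individual is projected twice: once alongside 40 other individuals, and once after 200 individuals from a third population are added to the analysis.

```python
import numpy as np

def pc_scores(genotypes, n_components=2):
    """Principal component scores from the SVD of the mean-centered matrix."""
    centered = genotypes - genotypes.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

rng = np.random.default_rng(1)
n_markers = 1000
# Hypothetical allele frequencies for three unrelated populations.
freq_a, freq_b, freq_c = rng.uniform(0.1, 0.9, size=(3, n_markers))

def draw(freq, n):
    """Simulate n diploid genotypes (0, 1, or 2 copies per marker)."""
    return rng.binomial(2, freq, size=(n, n_markers)).astype(float)

test_sample = draw(freq_a, 1)  # the individual we care about (row 0)
small_panel = np.vstack([test_sample, draw(freq_a, 20), draw(freq_b, 20)])
large_panel = np.vstack([small_panel, draw(freq_c, 200)])

# Same genotype, two analyses: only the company it keeps has changed.
score_small = pc_scores(small_panel)[0]
score_large = pc_scores(large_panel)[0]
print("PC scores with 40 others :", np.abs(score_small))
print("PC scores with 240 others:", np.abs(score_large))
```

Absolute values are printed because the sign of a principal component is arbitrary. The test individual's genotype is identical in both runs, yet its coordinates move once the third population enters the analysis, which is what sample dependence means here.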

Also, why use only 4 “regional” populations? It is not that there are no other gene pools or that it is difficult to find them. In our 2008 study by Esposito et al. we identified 8 such gene pools from all over the world. GPS Origins (COI: I consult DDC) uses 41. The same technique can be used to identify all Bronze Age gene pools. Limiting a study to three racial groups is very problematic and uninformative.

What to make of those results? Population genomics is a comparative science. The more data you have, the more sense you can make. I have the genomes of >1000 modern-day populations and >3000 ancient genomes. Reich (senior author) has several times that data, probably close to 3000 modern-day populations and >8000 ancient populations. When Reich shows you a figure with a couple of grey dots and 17 modern-day populations for an analysis, which, as you recall, was designed to produce results only for 4 very unevenly-sized populations, you should rightfully ask yourselves what’s going on. Or you can pretend that the Oracle told you that Neo is the one and that you can see the Matrix. It’s really up to you.

Last words. I commend the authors for carrying out the sequencing effort and publishing the dataset, and I hope they will consider this as constructive criticism to improve their work. I have no doubt that the dataset they produced will be invaluable for future studies.


4 Responses to The Genomic History of the Bronze Age Southern Levant – Paper review

  1. Froy says:

    Will all these shortcomings be explained in detail and published in a scientific paper?

  2. zaid almasri says:

    Can you produce the figures/graphs with more modern populations, and more evenly sized ones?
