In the previous post I discussed the GPS Origins results of an Ashkenazic Jew. As more and more people take the GPS Origins ancestry test that my lab built (disclosure: I consult for DDC) and make their results public, we have a chance to glimpse what these results look like.
As with all tools, GPS Origins worked extremely well in the lab, but reality proved more challenging. For example, GPS Origins breaks the DNA’s origin into two and then calculates the migration route of each origin, but what if someone comes from an extremely diverse background with multiple origins? In this case GPS Origins needs to make a choice, and that choice reflects the two most dominant ancestries. And what if something went wrong during the genotyping process (as commonly happens), particularly for users who come from 23andme, FTDNA, ancestry.com, or Genographic? We somehow have to model that noise. These are only two of the difficulties we encountered. But, like good experimentalists, we tested the tool in “field conditions,” improved the model when new issues were found, and moved back and forth like that until we were happy with the performance.
As of now, GPS Origins reconstructs the migration route with 90% accuracy and can predict the exact country of the two ancestors with nearly 85% accuracy. Of the people predicted incorrectly, 95% would find themselves within 250 km of the correct country. Lab conditions, of course. Recall that in reality, GPS Origins predictions range from 2000 BC to the present (though most predictions fall within the past 1000 years). The countries you will see on the map had very different borders back then, and very different people.
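To see how the two accuracy figures combine, here is a small back-of-the-envelope calculation. It assumes the 250 km figure means "within 250 km of the correct country" and treats "nearly 85%" as exactly 0.85; the variable names are mine and purely illustrative.

```python
# Lab-condition figures quoted above, rounded for illustration.
country_accuracy = 0.85   # share of predictions that hit the exact country
near_miss_share = 0.95    # share of the misses assumed to land within 250 km

missed = 1 - country_accuracy
correct_or_near = country_accuracy + missed * near_miss_share
far_miss = missed * (1 - near_miss_share)

print(f"exact country or within 250 km: {correct_or_near:.2%}")
print(f"more than 250 km off:           {far_miss:.2%}")
```

Under these assumptions, fewer than one prediction in a hundred lands far from the right country, again in lab conditions.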
The following GPS Origins results were made publicly available by four people who also shared what they know of their ancestry. This is most helpful for evaluating GPS Origins’ results, but we must remember that GPS Origins provides the origin of your DNA, which may or may not correspond to where your grandparents came from. For example, I live in England, but my DNA is not from here. Before we get into the results I would like to thank those people for their bravery (it’s not easy taking a DNA test that may challenge what you thought about your ancestry!). Let’s take a look at the results.
Case number 1. English-Irish descent
The first participant is of English and Irish origins (claimed to go back 500 years) and was estimated by other tools to be roughly half and half. This part is well reflected in the blue migration route, with ancestors who finally arrived in England between 211 and 1950 AD (the huge gap in dates is the result of the ongoing English-Irish mixtures that kept “resetting” the clock. GPS Origins simply cannot catch a break so it reports the whole period). The migration line begins in the historical region of the Roman Empire. The Romans left England between the 4th and 5th centuries, but left some of their genes behind. The red line represents a clear Nordic ancestry dated to 659-1366 AD.
Why the Nordic ancestry? Nordic/Norwegian tribes, like the Vikings, are some of the groups that contributed genes to the modern-day English. This line probably complements the Irish ancestry of the participant. But why do the lines not converge in England/Ireland, and show this separation instead? Warning: this explanation will make much more sense to Americans. Recall the recent elections, where essentially all polls predicted Hillary’s victory but were eventually wrong (who can forget, right?). One of the reasons for this failure is the attempt to call very close battles. Models based on probability did slightly better when they estimated the chances of Trump’s victory at x%. GPS Origins works in a similar way: yes, it sees that the person has two ancestors from England, but it also assumes that the data may be off due to genotyping errors and other issues. So it goes through a heuristic step where it mutates the data to simulate the error and runs the model multiple times. The best solution is the one that is presented. In this case, the participant gets to see their more ancestral origins.
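To give a feel for the "mutate the data and run the model multiple times" step, here is a toy sketch. It is not the GPS Origins code: the reference panel, the three-number profiles, the Gaussian noise, and the "closest match wins" rule are all assumptions made up for illustration.

```python
import random

# Hypothetical reference panel: population -> made-up ancestry proportions.
REFERENCE = {
    "England": [0.60, 0.30, 0.10],
    "Norway": [0.70, 0.20, 0.10],
    "Italy": [0.30, 0.50, 0.20],
}

def distance(a, b):
    """Euclidean distance between two proportion vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def nearest_population(profile):
    """Return (population, distance) for the closest reference entry."""
    return min(
        ((name, distance(profile, ref)) for name, ref in REFERENCE.items()),
        key=lambda pair: pair[1],
    )

def perturb(profile, noise, rng):
    """Simulate genotyping error: add Gaussian noise, then renormalise."""
    noisy = [max(0.0, p + rng.gauss(0.0, noise)) for p in profile]
    total = sum(noisy)
    return [p / total for p in noisy] if total > 0 else list(profile)

def best_of_perturbed_runs(profile, runs=200, noise=0.03, seed=0):
    """Run the toy model on the original and on noisy copies; keep the best fit."""
    rng = random.Random(seed)
    candidates = [nearest_population(profile)]
    for _ in range(runs):
        candidates.append(nearest_population(perturb(profile, noise, rng)))
    return min(candidates, key=lambda pair: pair[1])

participant = [0.64, 0.27, 0.09]  # invented profile, sitting between two populations
population, fit = best_of_perturbed_runs(participant)
print(population, round(fit, 4))
```

The point of the sketch is only that a profile sitting close to two populations can be assigned to either one once noise is taken into account, which is why the reported origin is not always the one you expect.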
Case number 2. Mixed European origin
The second participant has a known 3/16 German and 1/16 Irish ancestry, with the remainder being colonial American, and presumably English. The migration stories show that their ancestors came from Greece prior to 696 AD, and from Russia prior to 1935 AD. If both routes converge, is that the sign of true love? Dare to hope.
As with all cases of people with multiple ancestries, GPS Origins needs to make a choice because it is limited to two migration routes (I know, we are working on this…). Here, GPS Origins picks Germany as the major and most dominant ancestry, which was also highly common in colonial America.
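The two-route limit can be pictured with a few lines of code. This is a minimal sketch with invented numbers: the ancestry labels and signal strengths are hypothetical, and GPS Origins works on DNA rather than on a ready-made table like this one.

```python
def two_dominant(signals):
    """Return the two strongest (ancestry, share) pairs, strongest first."""
    ranked = sorted(signals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:2]

# Hypothetical signal strengths for a person with four ancestries.
signals = {
    "ancestry_A": 0.41,
    "ancestry_B": 0.33,
    "ancestry_C": 0.18,
    "ancestry_D": 0.08,
}

top_two = two_dominant(signals)
dropped = 1 - sum(share for _, share in top_two)

print(top_two)
print(f"signal not shown on the map: {dropped:.0%}")
```

In this made-up case about a quarter of the signal never makes it onto the map, which is the price of showing only two routes.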
Case number 3. East Asian descent
Let’s move away from Europe and see what happens in the Far East. Here, the participant calculated their ancestry to be roughly 5/8 Thai (but mixed) and 3/8 Southern Chinese. Both migration routes start in Kyrgyzstan, but why? This is the point to note that the migration routes are calculated independently of each other. We don’t try to make them match, intersect, run away from each other, chase each other, or whatever scenario you can think of. When the migration routes meet, it is because people’s movements met. When they cross, it is because gene flow has occurred, and when they start from the same place, well, it is because a lot of people left that place. Why?
It is difficult to say. GPS Origins can only date the last point (I know, we are working on this one too…). GPS Origins dated the arrival of the northern ancestors in northern China to between 1183 AD and 1617 AD, and that of the southern ancestors to between 1150 AD and 1751 AD. These dates are very similar (and they are also calculated independently). This is probably a good time to recall your lessons on Chinese history, start reading about Chinese history, or just press the blue icon, which gives you an accurate overview of the events that led people to move in/out of the region and may have prompted these outgoing migrations.
The Northern Chinese ancestry does not end up in south China because the southern signal was not strong enough. This may be due to the mixed ancestry or to ancestors who avoided gene exchange with southern populations. The blue ancestry misses Thailand by a hair, but is still within the expected range of error.
Case number 4. Turkish descent
The final results are of a Turkish participant. The paternal grandfather was born in Bulgaria and the paternal grandmother was born in Romania, but both claim to be Turks. The maternal grandmother is from Siverek (south east Turkey). The maternal grandfather is from Samsun (north coast of Turkey). Here is what GPS Origins reported. Those large circles represent uncertainty and they appear if you zoom in on the results. Turkey, like the rest of the Near East, is an extremely heterogeneous place, and most of these points, particularly the red one, are along the Silk Roads, so we can expect heavy DNA traffic in this area.
Let us assume that the red line represents the maternal line (we are working on telling which is which…). The mixed ancestry can be due to the region or the northern and southern grandparents, or most likely both. Therefore, between north and south Turkey GPS Origins picks the middle way (if both are pulling with equal strength). The blue line may represent the paternal line, which starts from the Caucasus, crosses Turkey and then moves a bit out to Crete, a jumping-off point to the Balkans, where the paternal grandparents were eventually born (without mixing with the local populations). It is difficult to say whether this represents a slight shift in the ancestry towards Greek, because the island is very close to Turkey and was under Turkish influence for extended time periods. So let’s look at the dates; this is what they are there for.
The blue migration route indicates that the ancestors came from the Caucasus prior to 1244 AD. During this time many tribes left the region and moved out south. These ancestors passed through Turkey and ended up in Crete sometime between 1244 AD and 1557 AD. The red migration route shows a more ancient ancestry in Turkey that moved eastwards between 1037 AD and 1527 AD. I suspect that this line would be split in the next GPS Origins, which would show four migration routes.
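The "middle way" remark about the red line can be pictured as a weighted average of two locations. The sketch below is only an illustration and not the GPS Origins algorithm: the coordinates for Samsun and Siverek are approximate, the weights are invented, and averaging latitude and longitude directly is a simplification that is only reasonable over short distances.

```python
def weighted_midpoint(points):
    """Weighted average of (lat, lon, weight) triples."""
    total = sum(weight for _, _, weight in points)
    lat = sum(la * weight for la, _, weight in points) / total
    lon = sum(lo * weight for _, lo, weight in points) / total
    return lat, lon

# Approximate coordinates (latitude, longitude) of the two maternal hometowns.
SAMSUN = (41.29, 36.33)    # north coast
SIVEREK = (37.75, 39.32)   # south east

# Equal pull: the prediction lands halfway between the two towns.
equal = weighted_midpoint([(*SAMSUN, 1.0), (*SIVEREK, 1.0)])

# Stronger northern pull (hypothetical 3:1): the prediction drifts towards Samsun.
skewed = weighted_midpoint([(*SAMSUN, 3.0), (*SIVEREK, 1.0)])

print("equal pull:   ", tuple(round(v, 3) for v in equal))
print("northern pull:", tuple(round(v, 3) for v in skewed))
```

With equal weights the point falls in central Anatolia, between the two hometowns rather than on either of them.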
I am happy to hear your thoughts and comments.