Monday, April 23, 2012

Eurogenes ADMIXTURE utilities at Gedmatch


Update 27/11/2013: I've made the new K13 the default Eurogenes admix test at GEDmatch. It seems to hit the spot for most people. See here.

Update 11/10/2012: The Jtest and EUtest are now available at GEDmatch, and these have superseded most of my tests offered there, except those that include Amerindian reference samples. Also, the Jtest and EUtest come with a variety of "Oracle" ethnic matching tools. For more info see here.

...

Update 07/05/2012: Eurogenes' Gedmatch ADMIXTURE ancestry test guide.

...

Gedmatch now features a series of ancestry tests based on my experiments with the ADMIXTURE software. The K9 to K12 are basically part of a single package, and specifically designed for people with the majority of their recent ancestry from north of the Alps and Carpathians. However, the K9 will be useful for many other individuals too. Then there's the K12b, the K13, and a new version of my Hunter-Gatherer vs. Farmer test. The latter two should work well for people from all over the globe.

The K9 to K13, K12b and Hunter-Gatherer vs. Farmer tests don't suffer from the usual "calculator effect", whereby project members get somewhat different scores from non-members of exactly the same ancestry. In other words, all results are directly comparable, and I encourage all users to swap notes vigorously. However, it's important to understand that you won't see dramatic differences in the levels of the intra-North European components. That's because they're largely based on very similar allele frequencies, and separated by low Fst (genetic) distances. Indeed, note the progression from one North European cluster at K9, to four at K12 (North Sea, South Baltic, Volga-Ural and Western European). Generally speaking, the four northern clusters at K12 can be thought of as a subset of the K9 cluster.

Thus, if you're looking for a fairly clean cut summation of your ancestry, then use the results from K9 or the Hunter-Gatherer vs. Farmer tests, where genetic differentiation between the clusters is greatest. But I think the K10 to K13 are more useful as chromosome paintings, and that's precisely because they're not as clean cut. As a result, much of your genome won't be covered in just one color, but in a mosaic of colors. Studying these patterns at local level, and cross checking the information with other results, like Ancestry Finder data from 23andMe, might be a useful way of pinpointing segments from very specific parts of Europe.

Below is my K12 chromosome painting. Yes, it's kind of noisy, but in fact, it's also very informative. I can spot some very close correlations with my Ancestry Finder results. It's not only matching single segments, but whole clusters of segments. Actually, here's an idea: download all the paintings, from K9 to K12, and then view them like an animation using the default Windows image viewer. It's interesting to see how the North Euro cluster breaks up into four different clusters.



43 comments:

Ziemowit said...

Polako, I am testing Gedmatch and comparing results with previous Eurogenes runs.
For K=10,11,12 my new Gedmatch results are very different from the old K=10,11,12 Eurogenes runs. Only the K=12b results are similar. Here are some results:

https://docs.google.com/spreadsheet/pub?key=0AkbFGFGkvhh9dHRoY3ZBUlZ0NmpPbjBCMXFkdG9ZU2c&output=html

I really don’t know what to think.

EastPole

Davidski said...

^ They're not the same tests though. They rely on different clusters, based on very different allele frequencies. Even the names of many of the clusters aren't the same.

It's actually not possible to reproduce the results from ADMIXTURE at intra-North European level for people who aren't included in the original runs, so I stopped trying.

The results from the above tests aren't as drastic (ie. one sided) with their overall classifications, but they're still very accurate.

Eduardo Pinto said...

Hello David

Could you please post the spreadsheets. I need references for comparison.

Davidski said...

^ Yes, I will, but it might take a bit of time. I'll actually post a comprehensive guide to all the Ks.

Eduardo Pinto said...

Thank you David!
BTW... I know it is not within the aim of your project, but it would also be nice see an intra-south European ADMIXTURE run in a near future.
I'm sure it would be much easier to achieve than a northern one, as there are many more genetic substructures in southern Europe.

jackson_montgomery_devoni said...

Awesome stuff yet again David!...One quick question though...Is the North European component in the K9 run of this analysis/calculator the same or similar to the North European hunter-gatherer (+Neolithic admixture) component from your K8 run of ''So who's the most European of us all?'' analysis?

Ronald Gillesapie said...

Ronald Gillespie. When I use these new programs to me I see migrations. I have been studying this stuff for a year and I can see my findings in these new programs.

Davidski said...

^ Jackson,

Yes, those clusters are very similar. But this K9 shows a somewhat higher (5-10%) North Euro score.

jackson_montgomery_devoni said...

Ahhh right on cool stuff thanks David.

@Ronald Gillespie,

It sounds like you have some interesting observations about these ADMIXTURE components and migrations within Europe. Would you care to share any of your thoughts or findings about this topic?

EastPole said...

There is an interesting blog entry by Razib Khan:
http://blogs.discovermagazine.com/gnxp/2012/04/one-baby-alone-on-a-pca-island/


Razib writes:
“When I read ADMIXTURE bar plots I try hard (and do not always succeed) to remember that they are telling with excellent precision relative relationships, but they are not telling me absolute truths”.

I also keep on forgetting that ADMIXTURE results are relative. Although I am getting different proportions of different populations in different ADMIXTURE runs, my position on PCA or MDS maps probably does not change much.

The problem is that because all those population proportions are relative and not quite real, it is difficult to use them for deducing the history of migrations within Europe. Much other data has to be used for that.

Davidski said...

^ I was skeptical for a long time that ADMIXTURE results could be signals of ancient population movements. But my dataset is now very comprehensive, as far as West Eurasia is concerned anyway, and I've run hundreds of comparisons with it. The output is always showing the same general patterns, and those patterns match what we know about different Neolithic waves from archeology. So I think the most robust clusters are indeed signals of ancient migrations.

But yes, the scores are always relative. It doesn't really matter what they are in absolute terms, but it's essential that they're correct in relative terms. The K12b still isn't correct in that context, due to the "calculator effect", but I'll sort that out shortly.

Tom Moffatt said...

Can you tell me what defines paleo-mediterranean as a grouping? Is there a blog post with the definitions as the groups are being used? Thanks.

Davidski said...

^ I don't have Paleo-Mediterranean in my tests.

Horsley said...

What do the drop down tabs that say 1E-5 etc. mean, and which one is best to use for the most accurate representation of recent ancestry?

Davidski said...

I'm getting solid results at all the settings (from 1E-2 to 1E-7), and they really don't differ that much. But maybe that's just me?

~Elizabeth~ said...

Mine differ slightly from 1E-2 thru 1E-7 but now that Gedmatch.com says 1E-7 is the most accurate I use that one.

Thanks so much for creating the Eurogenes Project and for adding it to Gedmatch. I've also used the Dienekes/Dodecad and MDLP links at Gedmatch.

Lerbea said...

Sorry I'm so new with questions. On the Eurogenes K9 Admixture, is there any way to tell whether the North Amerindian proportion in the results comes from maternal or paternal inheritance?

Matthew said...

The 1E-5 is scientific calculator notation for 1x10^-5, or one part in 100,000. These admixtures are calculated by fitting a model to the DNA data. In the case of 1E-5, when the error improves by less than one part in 100,000 from one step to the next, the calculation stops and the results are displayed. 1E-6 means one part in 1 million and so forth.

A smaller threshold (a larger x in 1E-x) will give you a more confident result. This is to say that 1E-6 gives you more confident values than 1E-5. It also takes longer to run the calculation. The nature of fitting data to models means that the results have natural uncertainties. If the improvement in the results is within the natural uncertainties in the data, then you waste your time by running a more stringent calculation.
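
The stopping rule described above can be sketched in a few lines of Python. This is only an illustration of the general idea (keep iterating until the fit stops improving by more than a chosen threshold); it is not GEDmatch's or ADMIXTURE's actual code. The function name `fit_until_converged`, the toy `toy_update` and `toy_error` functions, and the target value are all made up for the example.

```python
def fit_until_converged(update, error, start, tolerance=1e-5, max_iter=10_000):
    """Repeat `update` until the error improves by less than `tolerance`.

    Hypothetical helper for illustration only.
    """
    estimate = start
    current_error = error(estimate)
    iterations = 0
    while iterations < max_iter:
        candidate = update(estimate)
        candidate_error = error(candidate)
        # Lower error means a better fit, so improvement is the drop in error.
        improvement = current_error - candidate_error
        estimate, current_error = candidate, candidate_error
        iterations += 1
        if improvement < tolerance:
            break  # further steps would change the fit by less than the threshold
    return estimate, iterations


# Toy problem (an assumption, not real genetic data): recover a proportion of 0.25.
TARGET = 0.25


def toy_update(q):
    # Each step moves the estimate halfway towards the target.
    return q + 0.5 * (TARGET - q)


def toy_error(q):
    # Squared distance from the target.
    return (q - TARGET) ** 2


loose_estimate, loose_iterations = fit_until_converged(
    toy_update, toy_error, start=0.0, tolerance=1e-2)
strict_estimate, strict_iterations = fit_until_converged(
    toy_update, toy_error, start=0.0, tolerance=1e-7)

print("1E-2:", loose_estimate, "after", loose_iterations, "iterations")
print("1E-7:", strict_estimate, "after", strict_iterations, "iterations")
```

In this toy setup the stricter threshold runs more iterations and lands closer to the target, at the cost of extra work, which is the trade-off described above.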

villandra said...

Could you possibly give us the studies or study sites, i.e. locations, that the genetic data used to construct the components is based on? I'm seeing that I'm Volga-Ural one minute and North European another, and that North European is Latvian or something of the sort, and that eastern Finland, Finland and Latvia run together in some odd way. I'd like to be able to make more specific sense of the data.

Thanks!

villandra said...

I need to know more specifically relative to WHAT.

Maiysa said...

Hello, I'm new to this and I can't begin to understand my test results for this. I have a very French great grandfather who could not speak English, but only Western Euro shows anything close to that. Is there a reason for this that I am not understanding? Also, some ancestry is showing up that completely shocked me and that I had no idea about: Baloch. Is this completely accurate or is it more according to percentage? Also, when it says Mediterranean, does that mean recent, or is it from very distant relations thousands of years ago? We do have 2 adoptions with both great grandparents, so I guess anything is possible. Sorry for so many questions.
Sincerely,
Maiysa

ironhide781 said...

I tried using the Eurogenes admixture proportions tool on GedMatch.com to confirm the sliver of African DNA (0.1%) in my grandmother's 23andMe results (gedmatch kit #M153907). The tool shows the African DNA in the graphs, but not in the percentage breakdowns. Do you know why this would happen? I have screenshots on my blog at http://brandtgibson.blogspot.com.

Thanks,

Brandt Gibson
Edgewood, WA

Chad Rohlfsen said...

Davidski,
I have a question about the K11 and K12 tests. I am mostly of British and Irish descent as far as I know. There is some Scandinavian and continental stuff in there as well. I am getting 17% Baltic and 10% Uralic on the K11, and 13% Baltic and 8% Uralic on the K12. Are these typical of northwest Europe or is there something more eastern in my ancestry? I'm curious because my grandfather (supposedly 1/2 Danish, 1/2 Irish/British) is rather eastern looking. He could easily pass for someone in western Siberia/Northeast Europe, or half Native American. Thanks!

Davidski said...

If by Uralic you mean Volga-Ural, then yes, such scores are typical of Northern Europe. Also, "eastern" facial traits are common across Northern Europe, and like the alleles which peak in the Volga-Ural today, they're usually of ancient origin, often going back to the Mesolithic.

To see whether you have any recent ancestry from the Volga-Ural or Northern Fennoscandia, run the EUtest and look at the Oracle results.

Chad Rohlfsen said...

The EUtest has me at 13% Baltic and 12% East European. My top Oracle result is central and west German, with north and south Swedish and north and south Finnish there as well. I have almost no known German in my ancestry. Can I be pulled there by having a grandpa with a lot of Baltic and Volga-Ural from his father?

Davidski said...

Yes, your inflated proportion of Baltic ancestry will pull you towards or even into Germany.

Chad Rohlfsen said...

Interesting. I'm going to try and talk my dad into the 23andMe testing to test his proportions. Thanks for making sense of this. Is there a way to post my grandpa's pic here to show what I'm talking about? He certainly doesn't look like a "typical" Dane/British person.

Davidski said...

If you like, you can upload it to imageshack or somewhere similar and then post the direct link to it in your comment.

But I've been to Denmark, Sweden and eastern Norway (Oslo) and I have to say that exceedingly eastern-like facial traits are not uncommon in that part of the world. They're also found regularly in the UK and Ireland. Like this from Denmark...

http://i.imgur.com/jaMBhxw.jpg

Chad Rohlfsen said...

That guy is fairly close to my grandpa's face. My grandpa is a little more eastern with almost black hair, brown eyes and skin.

Chad Rohlfsen said...

I'd say he's almost identical to Coon's Ladogan type. The young guy with the slicked back hair.

Alex Hernandez said...

How about creating a utility for GEDmatch which calculates the % of Neanderthal and Denisovan using the autosomal data?

Ervin L Horton Jr said...

Hi, I just ran my Eurogenes Jtest model with the results below. FYI, the raw data used for this is from my DNA test at 23andMe. Regarding the Ashkenazi percentage: does the low percentage actually mean I may have Ashkenazi ancestry, or is it considered too low according to most tests out there? I actually have an older sister who has based her whole life on saying that we are of Jewish heritage, but all the other DNA tests I have taken do not show enough of a percentage to say that we are. I am still waiting on my mtDNA full sequence results from FTDNA and just ordered an mtDNA test for my mom. All of her maternal ancestors are from Hungary, and being Jewish had never been mentioned in their family. It is no big deal if we are or are not, but I would like to know for sure. Also, I had mentioned to my sister that she should be tested, but as far as she is concerned her mind is already made up. Thank you.
Population
SOUTH_BALTIC 15.91%
EAST_EURO 8.54%
NORTH-CENTRAL_EURO 27.74%
ATLANTIC 20.75%
WEST_MED 11.57%
ASHKENAZI 3.36%
EAST_MED 6.24%
WEST_ASIAN 4.19%
MIDDLE_EASTERN -
SOUTH_ASIAN 0.97%
EAST_AFRICAN 0.06%
EAST_ASIAN -
SIBERIAN 0.50%
WEST_AFRICAN 0.18%

reg said...

I'm new to the DNA search, and I have a naive question. If you simplify the situation and limit it to one Native American ancestor, can you roughly guess that if you're 25% on a chromosome then the source relative would be at least 2 generations back? If so, does it follow that 12.5% would be 3 back and so on?

Davidski said...

Each chromosome is of a different size, so it's better to try this sort of thing based on all the 22 autosomal chromosomes (ie. your genome-wide result). But even then you can only get a rough idea of how far back the admixture entered your pedigree. Another way of doing it is to measure the size of the exotic half-segment in centimorgans. From memory, anything over 7 cM is fairly recent, like within the past 250 years.

http://en.wikipedia.org/wiki/Centimorgan
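
The rule of thumb in the question above (25% suggests roughly two generations back, 12.5% roughly three) follows from each generation halving, on average, the share inherited from any single ancestor. Here is a minimal Python sketch, assuming exactly one admixed ancestor and average inheritance. The function names are made up for illustration, and real genome-wide results scatter around these values because of recombination, so treat the output as a rough guide only.

```python
import math


def expected_fraction(generations_back: int) -> float:
    """Average genome-wide share from one ancestor (1 = parent, 2 = grandparent)."""
    return 0.5 ** generations_back


def estimate_generations(fraction: float) -> float:
    """Rough number of generations back for a single fully admixed ancestor."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be between 0 (exclusive) and 1 (inclusive)")
    return -math.log2(fraction)


for share in (0.25, 0.125, 0.0625):
    print(f"{share:.2%} -> about {estimate_generations(share):.1f} generations back")
```

The estimate only makes sense for the genome-wide figure, not for a single chromosome, and it breaks down if the admixture comes from more than one ancestor.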

RC Caudill said...

Hey all,
I am new to the wonderful and exciting field of DNA genealogy, although not new to genealogy. But I have fallen further in love with the subject since my DNA test, and even further after seeing this site. WOW is all I can say. There is so much research that can be done using these tools, and I would like to say thank you to each and every person who has donated and continues to donate their time to GEDmatch. I will do all I can do to keep it going and will be donating as well.
Moving on, as I said I am new to this. I received my autosomal DNA test results back from Ancestry 3 days ago. I knew the results I received could not have been correct by a long shot. I grew up being taught that I was part Cherokee (1/8th), and it is very obvious in the maternal side of my family and myself. However, my Ancestry DNA results were as follows:

British Isles 76%
Scandinavian 19%
Uncertain 5%

I knew it just couldn't be right. So luckily I found out that Ancestry's results are very incorrect most of the time and I stumbled upon gedmatch. I used the Eurogenes K9 Model and received the following results:

North Euro: 63.78%
South Asian: 2.31%
Caucasus: 6.78%
Southwest Asian: 1.32%
Mediterranean: 24.84%
West African: 0.97%

These results seemed much more legible. And I now know that my "Cherokee" was in fact a mixture of Mediterranean/Mid Eastern. But without gedmatch I never would have known that.
So my question is how do these results stack up to my Ancestry results in terms of, how recent is it? Ancestry explains that their autosomal test goes back from about a few hundred to one thousand years? Are these results basically the same thing? And from the article David states that the Eurogenes K9 model is a fairly clean cut sum of one's ancestry, so is there really a way of determining how accurate it actually is? I undoubtedly know it is much much more so than Ancestry but would it be worth it to transfer my results to Family Tree DNA as well? Or are these basically the same as I would get there? Thanks!

Davidski said...

You know what, your West African admix is a bit higher than I'd expect for a European. Are you sure the "Cherokee" wasn't someone who was actually part African-American?

The Caucasus, Southwest Asian and Mediterranean percentages look pretty standard actually. They're real, but ancient.

In any case, I just sent a new test to John at GEDmatch. It's an upgraded EUtest with 15 clusters, including an Amerindian one, and a comprehensive population averages sheet for the Oracles, including North Amerindians. So if that doesn't catch the Cherokee admixture, then nothing will. It'll probably go up at GEDmatch next week sometime.

Onur said...

You know what, your West African admix is a bit higher than I'd expect for a European. Are you sure the "Cherokee" wasn't someone who was actually part African-American?

David, I think you've hit the nail on the head. Most claims of descent from Native Americans (especially from popular groups such as the Cherokee) among White Americans may actually be attempts to mask African descent, as descent from Africans was a taboo subject for a long time among White Americans, and people with such heritage frequently tried to hide it by concocting origin myths involving descent from Native American groups such as the Cherokee. As generations passed, such family myths would come to be perceived as real, as all knowledge of descent from Africans would be forgotten.

RC Caudill said...

Awesome, I can't wait to use the new EUtest and find out what comes up! I have researched the "tri racial" theory behind Melungeons in the Appalachians. But I know that in a lot of cases people of mixed African and white ancestry did hide their true ancestry and just say that they were Cherokee, especially in the south and during those time periods.
But the thing is I was said to have Cherokee on my father's side as well, with my great grandmother supposedly being 1/4th. So I probably couldn't tell which side the West African actually was on without a maternal and paternal ancestry test? I have seen pictures of my great grandpa on my mom's side. Honestly he doesn't look like an African and white mix; he looks more Arabic or Mid Eastern, but that is just going off of pictures. However, on my father's side I can see more of the standard African traits, like very thick hair and wide noses. But again I am not sure about that either.

RC Caudill said...

I think that your theory now is even more solid David after I used the Eurogenes Hunter Gatherer vs. Farmer model, the African admixture is Bantu Farmer 0.30% and Pygmy Hunter Gatherer 0.82%. After I used the Eurogenes K9b the African admixture was Sub Saharan 1.13% and South African 0.77%. That is very interesting.

Davidski said...

Try the new K15, and let me know how the oracles interpret your minor Sub-Saharan admix.

http://bga101.blogspot.com.au/2013/10/eurogenes-k15-now-at-gedmatch.html

RC Caudill said...

(I am assuming you do mean the EUtest V2 K15) (Sorry I am a newb lol) Results are:
Sub Saharan 0.97%

RC Caudill said...

Okay yea I see it in your link ;) just making sure
