McDonald Ancestry Analysis II

When my sister got her 23andMe results, we sent them over to Doug McDonald. I was expecting something close to my own results, but hers were radically different:

This one is different it says 37% Druze, 4% Bushman or Pygmy, the rest North India. It is complicated enough that the program refuses to generate a spot on the map. The chromosome painting looks quite reasonable for that assignment.
I am including several plots .. these show just how odd this is.

Here are the PCA plots that Doug sent. My sister is shown by the crosshairs.

Think of this as two-dimensional projections of a multidimensional space and you’ll notice that my sister is not close to any of the reference groups.
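To make the projection idea concrete, here is a minimal sketch with made-up 3-D coordinates (the group names and numbers are hypothetical, not read off Doug's plots). It shows how a point can sit right on top of a reference group in one 2-D view and still be far from every group in the full space.

```python
import math
from itertools import combinations

# Made-up 3-D coordinates for illustration only; not taken from the plots.
reference_centroids = {
    "Group A": (0.0, 0.0, 0.0),
    "Group B": (4.0, 0.0, 0.0),
    "Group C": (0.0, 4.0, 0.0),
}
test_person = (0.2, 0.1, 3.0)


def distance(p, q, axes):
    """Euclidean distance between p and q using only the given axes."""
    return math.sqrt(sum((p[i] - q[i]) ** 2 for i in axes))


def nearest_group(point, centroids, axes):
    """Return (distance, name) of the closest centroid on the given axes."""
    return min((distance(point, c, axes), name) for name, c in centroids.items())


# Each pair of axes is one 2-D plot; (0, 1, 2) is the full space.
for axes in list(combinations(range(3), 2)) + [(0, 1, 2)]:
    d, name = nearest_group(test_person, reference_centroids, axes)
    print(f"axes {axes}: nearest is {name} at distance {d:.2f}")
```

On the first two axes the test point looks like a member of Group A, but the third axis pulls it well away, which is why it helps to look at several projections before deciding someone belongs to a cluster.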

You can see her 3-D position (“Test Person”) in the animation below.

Her chromosome painting, a similar concept to 23andMe’s ancestry painting, shows which reference population each chromosome segment most resembles. As you can see, there are a few chromosomes that have almost no “South Asian” segments.

I was very surprised by my sister’s results, especially the 4% Bushman/Pygmy. I expected some East African admixture because of our Egyptian ancestry, but no Pygmy. I also expected some Middle Eastern contribution (10-20%), but Druze at 37% is just too high. So I asked Doug McDonald to redo my ancestry analysis with the new version of his software.

Here’s what he told me:

It says you are half North India, 3% Bushman or Pygmy, and the rest Iranian, OR 80% Sindhi, 2% Bushman or Pygmy, the rest being Bedouin.

The spot on the map is far SW Pakistan.

The Pygmy is clearly a mistake!

The Pygmy is definitely a mistake. Pygmies are a very distinctive population, and because genetic diversity is very high in Africa, the continent where humanity originated, African reference populations can sometimes produce weird results. These analyses basically try to fit your genetic data to the data samples of the reference populations. That’s one reason Punjabis come out as Sindhi or Pathan: there are no Punjabis in the HapMap or HGDP reference data, so the program picks the closest populations it has.
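A toy version of this fitting idea, as a rough sketch only: it assumes each reference population is summarized by allele frequencies at a handful of markers and searches for the mixture percentages that best match a sample. The frequencies, the `blend` and `fit` helpers, and the least-squares criterion are all assumptions for illustration; this is not Doug McDonald's program.

```python
# Hypothetical allele frequencies at five markers; real analyses use
# hundreds of thousands of SNPs and more careful statistics.
reference = {
    "Sindhi":  [0.10, 0.80, 0.30, 0.60, 0.50],
    "Bedouin": [0.40, 0.20, 0.70, 0.10, 0.90],
    "Pygmy":   [0.90, 0.50, 0.10, 0.90, 0.20],
}


def blend(weights):
    """Expected frequencies for a mixture; weights are whole percentages."""
    n = len(next(iter(reference.values())))
    return [
        sum(w / 100 * reference[pop][i] for pop, w in weights.items())
        for i in range(n)
    ]


def fit(sample):
    """Grid-search the percentages (summing to 100) with least squared error."""
    pops = list(reference)
    best = None
    for a in range(101):
        for b in range(101 - a):
            weights = dict(zip(pops, (a, b, 100 - a - b)))
            error = sum((x - y) ** 2 for x, y in zip(blend(weights), sample))
            if best is None or error < best[1]:
                best = (weights, error)
    return best


# A sample that really is a mix of two reference groups is recovered cleanly.
mixed = blend({"Sindhi": 80, "Bedouin": 20, "Pygmy": 0})
print(fit(mixed))

# A sample from a population missing from the panel still gets percentages,
# just with a worse fit (a larger error).
unsampled = [0.5, 0.5, 0.5, 0.5, 0.5]
print(fit(unsampled))
```

The second call is the Punjabi situation in miniature: the search always returns some combination of the populations it knows about, and small slices of a very distinctive group can show up simply because they reduce the error a little.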

Here are my PCA plots:

And here is my chromosome painting:

Doug McDonald Ancestry Analysis

As I noted last time, I needed some help ascertaining my genetic ancestry. Fortunately, there are people willing to do that sort of analysis for you. One of them is Doug McDonald. So I sent him my data, and within an hour I had an analysis.

The PCA plot below shows me as a large cross in relation to different reference populations (such as Europeans, Africans, and East Asians).

Doug McDonald Ancestry Plot for me

Here’s what he said:

We also do quantitative tests. These come in three flavors, first without South Asia (represented by Pakistan) and the Mideast, second with South Asia, and finally with all three, as comparison panels.

The typical random error in the data (standard deviation) is 1%, meaning that numbers less than about 2% are not highly significant. There are also systematic errors. In particular, there is cross-coupling of values for Europe, the Mideast and S. Asia. For example, on the middle panel, a pure, northwestern European measures about 9% S. Asian, and on the third panel they typically measure 4.5% Mideastern and 8% S. Asian. Actual people from South Asia or the Mideast always test at least 15% European.
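As a rough way to apply these caveats, the sketch below encodes the figures quoted above: a 1% standard deviation, the roughly 2% cutoff, and the third-panel readings of a pure northwestern European. The function names and the example inputs are hypothetical, and subtracting the baseline is only a naive way of eyeballing the cross-coupling, not a correction Doug applies.

```python
RANDOM_SD = 1.0  # percent, the typical random error quoted above

# Third-panel readings for a pure northwestern European, from the quote.
NW_EUROPEAN_BASELINE = {"Mideast": 4.5, "South Asia": 8.0}


def above_noise(pct, sd=RANDOM_SD, k=2.0):
    """True if a percentage is at least k standard deviations above zero."""
    return pct >= k * sd


def excess_over_baseline(component, pct):
    """Naive reading: how far a value sits above the European baseline."""
    return pct - NW_EUROPEAN_BASELINE.get(component, 0.0)


# Hypothetical example values, chosen only to exercise the functions.
for component, pct in [("America", 1.4), ("Mideast", 11.0)]:
    print(component, above_noise(pct), excess_over_baseline(component, pct))
```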

His first panel:

Europe 71.1%
East Asia 12.3%
Africa 8.2%
Oceania 4.7%
America 3.3%

When South Asia is added:

South Asia 48.7%
Europe 36.8%
Africa 5.8%
East Asia 5.0%
Oceania 2.4%
America 1.2%

And finally when Middle East is added to the list:

South Asia 46.9%
Europe 29.0%
Mideast 11.0%
East Asia 5.1%
Africa 4.3%
Oceania 2.3%
America 1.4%
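To see how the numbers move as comparison groups are added, here is a small sketch that stores the three panels above and tracks one component across them. The panel labels and the `track` helper are my own naming; the percentages are the ones listed above.

```python
# The three panels as listed above (values in percent).
panels = {
    "without South Asia and Mideast": {
        "Europe": 71.1, "East Asia": 12.3, "Africa": 8.2,
        "Oceania": 4.7, "America": 3.3,
    },
    "with South Asia": {
        "South Asia": 48.7, "Europe": 36.8, "Africa": 5.8,
        "East Asia": 5.0, "Oceania": 2.4, "America": 1.2,
    },
    "with South Asia and Mideast": {
        "South Asia": 46.9, "Europe": 29.0, "Mideast": 11.0,
        "East Asia": 5.1, "Africa": 4.3, "Oceania": 2.3, "America": 1.4,
    },
}


def track(component):
    """Values of one component across the panels (None where it is absent)."""
    return [panel.get(component) for panel in panels.values()]


# Panel totals, rounded to one decimal place.
totals = {name: round(sum(panel.values()), 1) for name, panel in panels.items()}

for name, total in totals.items():
    print(f"{name}: total {total}%")
for component in ("Europe", "Africa", "East Asia"):
    print(component, track(component))
```

Most of the “Europe” in the first panel moves to South Asia once that group is available, and Africa shrinks from 8.2% to 4.3% but does not disappear. The first two panels total slightly under 100%, presumably because of rounding.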

And here is his analysis of these results:

This is basically a person from somewhere in the region from, say, Iraq to Pakistan, with a substantial African contribution. The East Asian is probably not real. The African could be a few percent direct recent admixture, or it could be an input from a previously mixed population like the Makrani of Pakistan. My techniques can’t tell them apart.

The most interesting thing here for me was the African percentage.