Harappa Ancestry Project

I have recently become interested in (some would say obsessed with) genetics. I wrote about getting my DNA test done, and there’s a lot more about my own results that I plan to bore you with.

One fun application of genetic testing is inferring ancestry: Which ancestral group are you descended from? Can we estimate the admixture of the different population groups you are descended from?

Most DNA testing companies provide information about ancestry, and genetic genealogy has taken off. With several genome databases (HapMap, HGDP, etc.) and software (like plink, admixture, Structure) publicly available, the days of the genome bloggers are here. And I am trying to be the latest one.

In starting this project, I have been inspired by the Dodecad Ancestry Project by Dienekes Pontikos and the Eurogenes Ancestry Project by David Wesolowski. The catalyst for this project was my friend Razib, whom I bug whenever I need to talk genetics.

What is the Harappa Ancestry Project?
It is a project to analyze (autosomal) genetic data of participants of South Asian origin for the purpose of providing detailed ancestry information. So the focus of the project is on South Asians: Indians, Pakistanis, Bangladeshis and Sri Lankans.

The project will collect 23andMe raw genetic data from participants to better understand the ancestral relationships among different South Asian ethnicities.

I have named it after Harappa, an archaeological site of the Indus Valley Civilization in Punjab, Pakistan.

People of South Asian origin, or from neighboring countries, are eligible to participate. I am accepting samples from people whose country of origin is one of the following:

  • Afghanistan
  • Bangladesh
  • Bhutan
  • Burma
  • India
  • Iran
  • Maldives
  • Nepal
  • Pakistan
  • Sri Lanka
  • Tibet

Right now, I am accepting raw data samples only from people who have tested with 23andMe.

Please do not send samples from close relatives. I define close relatives as 2nd cousins or closer. If you have data from yourself and your parents, it might be better to send the samples from your parents (assuming they are not related to each other) and not send your own sample.

If you are unsure whether you are eligible to participate, please email me (harappa@zackvision.com) to ask before sending your raw data.

What to send?
Please send your "All DNA" raw data text file (zipped is better) downloaded from 23andMe to harappa@zackvision.com, along with ancestral background information about you and all four of your grandparents. Background information would include where they were born, their mother tongue, the caste/community to which they belonged, etc. Please provide as much ancestry information as possible and try to be specific. In particular, do include information about any ancestry from outside South Asia.
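
If you want to sanity-check and zip your file before emailing it, here is a minimal Python sketch. It assumes the usual 23andMe raw data layout: comment lines starting with "#", followed by tab-separated rows of rsid, chromosome, position and genotype. The function names and the file name genome.txt are only illustrative; they are not part of any 23andMe tool or of this project's pipeline.

```python
import io
import zipfile
from pathlib import Path


def read_genotypes(text):
    """Parse raw-data text into (rsid, chromosome, position, genotype) tuples.

    Assumes four tab-separated fields per data line and '#' comment lines.
    """
    rows = []
    for line in io.StringIO(text):
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the commented header
        fields = line.split("\t")
        if len(fields) != 4:
            raise ValueError(f"unexpected line: {line!r}")
        rsid, chrom, pos, genotype = fields
        rows.append((rsid, chrom, int(pos), genotype))
    return rows


def zip_raw_data(txt_path, zip_path):
    """Write txt_path into a compressed zip archive and return zip_path."""
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # Store only the file name, not the full directory path.
        zf.write(txt_path, arcname=Path(txt_path).name)
    return zip_path
```

For example, read_genotypes(Path("genome.txt").read_text()) should return one tuple per SNP, and zip_raw_data("genome.txt", "genome.zip") produces the archive to attach.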

Data Privacy
The raw genetic data and ancestry information that you send me will not be shared with anyone.

Your data will be used only for ancestry analysis. No analysis of physical or health/medical traits will be performed.

Individual ancestry results published on this blog will be identified only by an ID of the form HRPnnnn, known only to you and me.
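
To illustrate the ID format, here is a small Python sketch. It assumes that "nnnn" means a four-digit, zero-padded sequence number starting at 1; that reading, and the function name make_project_id, are assumptions for illustration rather than a description of how IDs are actually assigned.

```python
def make_project_id(n):
    """Format a sequence number as HRPnnnn (assumed: four digits, zero-padded)."""
    if not 1 <= n <= 9999:
        raise ValueError("sequence number must be between 1 and 9999")
    return f"HRP{n:04d}"
```

Under these assumptions, the first participant would appear in published results as HRP0001, and the mapping from ID to person would stay private.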

What do you get?
All results of ancestry analysis (individual and group) will be posted on this blog under the Harappa Ancestry Project category. This will include admixture analysis as well as clustering into population groups etc.

I suggest you read Dienekes’ analysis of South Asians for an idea of what to expect.

You can access all blog posts related to this project from the Harappa Ancestry Project link on the navigation menu on every page of my website. You can also subscribe to the project feed.

By Zack

Dad, gadget guy, bookworm, political animal, global nomad, cyclist, hiker, tennis player, photographer


    1. Aaron: Right now, my focus is on South Asia.

      Indonesia by itself could be a fair amount of work because of the large number of islands likely resulting in a very diverse population.

      You are welcome to follow my blog/project in case I open it up to other groups in the future.

  1. Unfortunately, 23andMe doesn’t deliver to India, Pakistan and the periphery itself, thus reducing the number of potential samples you could have had for your project. I myself am a genetics enthusiast from India, and there are tons more people who are highly interested in taking the 23andMe test.

    Regardless, I will be following this project closely. It is likely to be a highly interesting analysis, especially considering that South Asians are a rather under-studied and under-sampled population.

    1. Yes, Vasishta, 23andme doesn’t deliver there. I am hoping there might be enough people of South Asian origin in Europe and North America who can participate right now.

      BTW, if you (or someone else in India/Pakistan) are very interested, there is a somewhat risky way: ask friends or family in the US to order it and then ship it to you.

      1. Don’t worry, there are quite a few Indians who’re waiting to submit their Raw Data to you upon receiving their 23andMe results. Most of them got their kits during the initial Holiday sales that took place in December ’10.

        As for me getting a 23andMe kit, I’ll take the risk when there’s a $99 sale. INR 10,337 ($228) is way too much to put at stake with the delivery or airport authorities (in case I have it brought by a relative in his/her baggage).

        Btw, as an aside, I wanted to ask you – Would you be aware of any restrictions against carrying empty 23andMe DNA kits in your baggage at Airports? Will the Airport authorities make you empty the liquid in the kit, regardless of the fact that it is empty and very small in quantity? I was initially planning on asking a relative to have it delivered to his residence which is in the US (NJ) and then consequently bring it to me in India. Will he face any problem at the Security checks? Also, with regards to Airport authorities, what would be the case for “used” (i.e containing spit) DNA kits entering the US in your baggage?

        I am considering ordering a 23andMe kit IF there is no risk with the Airport authorities. I thought asking you would help.

        1. Good to hear that there are people waiting to participate. I am hearing that 23andme version 3 results are going to start getting posted tonight or tomorrow.

          I believe that taking the unused kit to India might not be an issue, though I have no idea what the Customs rules are. However, I am not sure about the return journey with the spit.

          I have read about people mailing the kits internationally to countries not served by 23andme. So it is doable, at least for some countries. Maybe you could check any forum where people who have tried this might post?

  2. Hey Zack,

    I’m actually a bioinformatician by profession and my degrees are in bioinformatics. I’m interested in what you are doing and I’d be happy to lend a helping hand where I can. Let me know if you need help interpreting data, writing scripts, or some sort of consultation. I’m working 80 hour weeks at the moment, but will help where I can. 😀
