Aryan Debate: Do the recent genetic studies validate Aryan invasion theory?

Last September, the release of the genetic paper ‘An Ancient Harappan Genome Lacks Ancestry from Steppe Pastoralists or Iranian Farmers’ by Shinde et al, a study of the ancient remains of a female individual from the Rakhigarhi site of the Harappan civilization, put the Aryan invasion/immigration theory (AIT) back in the news, with heated arguments all over the internet. Another paper, ‘The formation of human populations in South and Central Asia’ by Narasimhan et al, which studies ancient remains from Central Asia and the Indian subcontinent, was released alongside it. Both papers endorsed the Kurgan theory of Indo-European expansions, which holds that Indo-Aryan (i.e. early Vedic Sanskrit) speakers migrated from their original homeland in the Eurasian steppes into India during the late/post-Harappan era, around 2000-1500 BCE. In this article I discuss a few crucial issues with the AIT scenario proposed by the two studies, and in particular how the Kurgan theory is far from sufficient to explain the Indo-Aryanization of northern India.

Gandhara (Swat valley) grave culture and supposed Aryan entry

To begin with, Narasimhan et al state that the steppe ancestry in the late Bronze Age to Iron Age Gandhara/Swat valley region (which by the Iron Age was clearly part of the Vedic realm) was mostly derived through females, and that the paternal Y-DNA haplogroup R1a is rare in that region. On the other hand, the paper says that modern Indians mostly have steppe admixture through the paternal line.

“In the Late Bronze Age and Iron Age individuals of the Swat Valley, we detect a significantly lower proportion of Steppe admixture on the Y chromosome (only 5% of the 44 Y chromosomes of the R1a-Z93 subtype that occurs at 100% frequency in the Central_Steppe_MLBA males) compared with ~20% on the autosomes (Z = −3.9 for a deficiency from males under the simplifying assumption that all the Y chromosomes are unrelated to each other since admixture and thus are statistically independent), documenting how Steppe ancestry was incorporated into these groups largely through females (Fig. 4). However, sex bias varied in different parts of South Asia, as in present-day South Asians we observe a reverse pattern of excess Central_Steppe_MLBA–related ancestry on the Y chromosome compared with the autosomes (Z = 2.7 for an excess from males) (13, 56) (Fig. 4). Thus, the introduction of lineages from Steppe pastoralists into the ancestors of present-day South Asians was mediated mostly by males.”

  • Narasimhan et al, p. 11
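
To get a rough feel for the gap between the two Swat valley figures quoted above, here is a minimal Python sketch. It assumes a plain binomial model in which each of the 44 Y chromosomes is independently steppe-derived with probability equal to the ~20% autosomal fraction. The function name `naive_sex_bias_z` is hypothetical, and this is not the procedure used by Narasimhan et al, so it does not reproduce their Z = −3.9 (under these simplified assumptions the value comes out at roughly −2.5). It only illustrates that about 2 R1a-Z93 lineages were observed where roughly 9 would be expected.

```python
import math


def naive_sex_bias_z(n_y, y_fraction, autosomal_fraction):
    """Naive Z-score: observed steppe Y-chromosome count versus the
    binomial expectation implied by the autosomal steppe fraction.

    Assumption (not from the paper): every Y chromosome is an
    independent draw with success probability `autosomal_fraction`.
    """
    observed = n_y * y_fraction              # about 2.2 of 44
    expected = n_y * autosomal_fraction      # about 8.8 of 44
    std_dev = math.sqrt(n_y * autosomal_fraction * (1 - autosomal_fraction))
    return (observed - expected) / std_dev


# Figures as quoted from Narasimhan et al, p. 11
z_swat = naive_sex_bias_z(n_y=44, y_fraction=0.05, autosomal_fraction=0.20)
print(f"naive Z = {z_swat:.2f}")
```

A negative value means fewer steppe Y chromosomes than the autosomal share predicts, which is the direction of the female-mediated bias the paper reports for the Swat valley.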

So who brought this male-mediated steppe admixture into India, if not the Swat valley people? The Gandhara grave culture of the Swat valley is considered the gateway of the Indo-Aryans into India as per the steppe theory. Should we infer that Indo-Aryanization in India was in fact driven by females, with local men mixing with incoming steppe women? That is unlikely, considering the patriarchal and patrilineal traditions of the Vedic Aryans and other Indo-Europeans. Nor could steppe ancestry have spread into India in later times, as the steppe people had by then acquired East Asian ancestry, which is hardly found among Indians.

“This East Asian–related admixture is also seen in later groups with known cultural impacts on South Asia, including Huns, Kushans, and Sakas, and is hardly present in the two primary ancestral populations of South Asia, suggesting that the Steppe ancestry widespread in South Asia derived from pre–Iron Age Central Asians.”

  • Narasimhan et al, p. 5.

“We also observe individuals from Steppe sites (Krasnoyarsk) dated to between ~1700 and 1500 BCE that derive up to ~25% ancestry from a source related to East Asians (well-modeled as ESHG), with the remainder best modeled as Western_Steppe_MLBA. By the Late Bronze Age, ESHG-related admixture became ubiquitous, as documented by our time transect from Kazakhstan and ancient DNA data from the Iron Age and from later periods in Turan and the Central Steppe, including Scythians, Sarmatians, Kushans, and Huns (29, 52). Thus, these first millennium BCE to first millennium CE archaeological cultures with documented cultural and political impacts on South Asia cannot be important sources for the Steppe pastoralist–related ancestry widespread in South Asia today (because present-day South Asians have too little East Asian–related ancestry to be consistent with deriving from these groups), providing an example of how genetic data can rule out scenarios that are plausible on the basis of the archaeological and historical evidence alone (13) (fig. S52).”

  • Narasimhan et al, p. 7

Thus, if a movement from the steppes into India did happen, it was before 1700-1500 BCE, when East Asian admixture became visible on the steppes. But for this same period, the ancient people of the Swat valley, who are considered early Indo-Aryans, show very few cases of paternal steppe lineages and only around 20% autosomal steppe admixture. This ~20% figure indicates that the movement from the steppes, which was moreover female-mediated, was small, and not a large-scale male-dominated invasion or population replacement. It is unlikely that such small waves of female immigrants managed to Indo-Aryanize northern India.

BMAC and the case of Aryan acculturation

As per the steppe theory, the Oxus civilization, or Bactria-Margiana Archaeological Complex (BMAC), of Central Asia acted as a proxy for the Aryan entry into the Swat valley and further into India.

The summary on p. 1 of Narasimhan et al states the following:

“The main population of the BMAC carried no ancestry from Steppe pastoralists and did not contribute substantially to later South Asians. However, Steppe pastoralist ancestry appeared in outlier individuals at BMAC sites by the turn of the second millennium BCE around the same time as it appeared on the southern Steppe.”

On p. 4 the paper further adds:

“Specifically, our analyses reject the BMAC and the people who lived before them in Turan as plausible major sources of ancestry for diverse ancient and modern South Asians by showing that their ratio of Anatolian farmer–related to Iranian farmer–related ancestry is too high for them to be a plausible source for South Asians [P < 0.0001, χ² test; (13)] (figs. S50 and S51). A previous study (30) fit a model in which a population from Copper Age Turan was used as a source of the Iranian farmer–related ancestry in present-day South Asians, thus raising the possibility that the people of the BMAC whom the authors correctly hypothesized were primarily derived from the groups that preceded them in Turan were a major source population for South Asians. However, that study only had access to two samples from this period compared with the 36 we analyze in this study, and it lacked ancient DNA from individuals from the BMAC period or from any ancient South Asians. With additional samples, we have the resolution to show that none of the large number of Bronze and Copper Age populations from Turan for which we have ancient DNA fit as a source for the Iranian farmer–related ancestry in South Asia.”

In short, the paper rejects the Bronze Age population of the BMAC in Central Asia as a major source of ancestry in India. This goes against the mainstream ‘Kulturkugel’ version of the Kurgan theory of Indo-Aryan expansion into India, which holds that the incoming Indo-Aryans from the steppes took over the BMAC culture, and that it was the Aryanized BMAC people who then moved south into India, having adopted new elements such as the sacred Soma drink, camels and an unidentified language substratum. Mainstream Indo-Europeanists devised this model precisely because there is no archaeological trace in India of steppe cultural traits such as large-scale Kurgan burials.

Noted Indo-Europeanist David W. Anthony, who champions the Kurgan theory, states the following in his book:

“The Mitanni dynasts came from the same ethnolinguistic population as the more famous Old Indic-speakers who simultaneously pushed eastward into the Punjab, where, according to many Vedic scholars, the Rig Veda was compiled about 1500-1300 BCE. Both groups probably originated in the hybrid cultures of the Andronovo/ Tazabagyab/ coarse-incised-ware type in Bactria and Margiana.

The language of the Rig Veda contained many traces of its syncretic origins. The deity name Indra and the drug-deity name Soma, the two central elements of the religion of the Rig Veda, were non-Indo-Iranian words borrowed in the contact zone. Many of the qualities of the Indo-Iranian god of might/victory, Verethraghna, were transferred to the adopted god Indra, who became the central deity of the developing Old Indic culture. Indra was the subject of 250 hymns, a quarter of the Rig Veda. He was associated more than any other deity with Soma, a stimulant drug (perhaps derived from Ephedra) probably borrowed from the BMAC religion. His rise to prominence was a peculiar trait of the Old Indic speakers. Indra was regarded in later Avestan Iranian texts as a minor demon. Iranian dialects probably developed in the northern steppes among Andronovo and Srubnaya people who had kept their distance from the southern civilizations. Old Indic languages and rituals developed in the contact zone of Central Asia.”

  • The Horse, the Wheel, and Language p. 454

He adds:

“The BMAC fortresses and cities are an excellent source for the vocabulary related to irrigation agriculture, bricks, camels, and donkeys; and the phonology of the religious terms is the same, so probably came from the same source. The religious loans suggest a close cultural relationship between some people who spoke common Indo-Iranian and the occupants of the BMAC fortresses.”

  • The Horse, the Wheel, and Language p. 455
Map showing supposed Indo-European and Indo-Aryan expansions from the steppes as per the Kurgan model.

So, if the Aryans from the steppe took over the BMAC culture, it is very likely that they also mixed genetically with the BMAC population while adopting its material culture. What existed between the BMAC and the steppe groups was not mere interaction or trade contact; it was a case of assimilation and syncretic acculturation. In fact, the study also reports movement in the opposite direction, from Central Asia into the steppes, as BMAC or Iranian farmer-related ancestry shows up on the Kazakh steppe.

“The BMAC-related admixture in Kazakhstan documents northward gene flow onto the Steppe and confirms the Inner Asian Mountain Corridor as a conduit for movement of people.”

  • Narasimhan et al, p. 6

“In the Central Steppe (present-day Kazakhstan), an individual from one site dated to between 2800 and 2500 BCE, and individuals from three sites dated to between ~1600 and 1500 BCE, show significant admixture from Iranian farmer–related populations that is well-fitted by the main BMAC cluster, demonstrating northward gene flow from Turan into the Steppe at approximately the same time as the southward movement of Central_Steppe_MLBA-related ancestry through Turan to South Asia. Thus, the archaeologically documented spread of material culture and technology both north and south along the Inner Asian Mountain Corridor (3, 49, 50, 51), which began as early as the middle of the third millennium BCE, was associated with substantial movements of people.”

  • Narasimhan et al, p. 7

Most importantly, if the steppe people expanded south from the BMAC into India, it is very likely that they would also have brought BMAC-specific ancestry into India. But Narasimhan et al clearly reject the BMAC as a major source of ancestry among Indians.

Harappan ancestry is higher among north Indian Indo-Aryan speakers than among south Indian Dravidian speakers

Finally, Narasimhan et al state the following about Indian ancestry on p. 5:

“An ancestry gradient of which the Indus Periphery Cline individuals were a part played a pivotal role in the formation of both the two proximal sources of ancestry in South Asia: a minimum of ~55% Indus Periphery Cline ancestry for the ASI and ~70% for the ANI. Today there are groups in South Asia with very similar ancestry to the statistically reconstructed ASI, suggesting that they have essentially direct descendants today. Much of the formation of both the ASI and ANI occurred in the second millennium BCE. Thus, the events that formed both the ASI and ANI overlapped the time of the decline of the IVC.”

So the Ancestral North Indians (ANI) carried more Indus Periphery/Harappan ancestry than the Ancestral South Indians (ASI). This implies that modern north Indian Indo-Aryan speakers have more Harappan ancestry than south Indian Dravidian speakers, which goes against the Dravidianist theory that the Harappan civilization was inhabited by Dravidians and that its prime language was Dravidian.

Also, since at least ~70% of ANI ancestry is Indus Periphery/Harappan, it would seem that at most around 30% is from steppe groups. This steppe share is quite low compared to the predominant Harappan ancestry among the ANI, and would have come from the steppes in small waves rather than through any large-scale invasion and population replacement.
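
The arithmetic behind these shares can be written out as a short Python sketch. It assumes only the two minimum figures quoted above (~70% Indus Periphery Cline ancestry for the ANI and ~55% for the ASI). The helper name `max_other_ancestry` is hypothetical, and because the paper gives minimums, the remainders it prints are upper bounds rather than point estimates.

```python
def max_other_ancestry(min_indus_periphery):
    """Upper bound on the share of ancestry that is NOT Indus Periphery
    Cline, given a lower bound on the Indus Periphery Cline share."""
    if not 0.0 <= min_indus_periphery <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    return 1.0 - min_indus_periphery


# Minimum Indus Periphery Cline shares quoted from Narasimhan et al, p. 5
minimums = {"ANI": 0.70, "ASI": 0.55}

for group, floor in minimums.items():
    ceiling = max_other_ancestry(floor)
    print(f"{group}: at least {floor:.0%} Indus Periphery Cline, "
          f"at most {ceiling:.0%} from other sources")
```

For the ANI the remainder is the steppe-related share discussed here, so on these figures it is capped at about 30%.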

So how did these small waves of Indo-Aryan migrants manage to change the entire cultural zone of northern India within a few centuries? Historically, steppe and Central Asian tribes such as the Shakas, the Kushanas/Yuezhi and the Hephthalites did invade India and managed to establish empires there. However, they were absorbed into Indian culture through the adoption of Buddhism and Hinduism. We also have the case of the Mitanni elites in West Asia, who had Aryan names and worshipped Vedic gods but could not completely Aryanize the native Hurrian-speaking population of the region. Instead, they were eventually absorbed into the native population after their defeat by the Assyrians.

The same fate would have awaited the small waves of steppe immigrants who would have come into India during the late Bronze Age. The Harappan civilization was the largest civilization of its time, covering an area as large as ancient Egypt and Mesopotamia put together. With large cities such as Rakhigarhi, Mohenjo-Daro, Dholavira and Harappa, it would presumably have hosted a correspondingly large population. It is quite unlikely that the language(s) spoken across such a large civilization vanished completely without leaving any linguistic traces.

Moreover, even according to the Indologist Michael Witzel, who endorses the steppe Kurgan theory, only around 4% of the early Vedic Sanskrit vocabulary consists of non-Indo-Aryan loan words.

“Some 4% of the words in the Rgvedic hymns that are composed in an archaic, poetic, hieratic form of Vedic, clearly are of non-IE, non-Indo-Aryan origin. In other words, they stem from pre-IA substrate(s).”

  • Linguistic Evidence for Cultural Exchange in Prehistoric Western Central Asia by Michael Witzel p.4.

If the steppe people who account for about 30% of ANI ancestry came into India in small waves, one would expect them to have absorbed a much larger linguistic substratum from the Harappans, who make up about 70% of ANI ancestry. It is thus a big mystery how these small waves of immigrants managed to change the entire cultural sphere of northern India even after mixing heavily with the Harappans, and without any trace of a massive substratum in their language. Also, as mentioned above, the Swat valley samples show that even around the late Bronze Age/early Iron Age steppe admixture was very low, lower than 30%, and that it came through the female line rather than through any male-dominated expansion.

The case of Rakhigarhi DNA

We now move on to the other paper, by Shinde et al, which was released along with the paper by Narasimhan et al. Based on the DNA sample of a single female from Rakhigarhi, the paper highlights that the Harappans had diverged from Iranians 12,000 years ago and that agriculture was a native development in India. The paper also states that the Harappans had no steppe-derived ancestry. Based on this, many reports suggested that the Harappans did not speak Indo-European languages, which would have spread from the steppes in later times. However, the absence of steppe ancestry does not by itself mean that the Harappans spoke a non-Indo-European language. In fact, an ancient DNA study of the Anatolian region suggests that the Anatolian Indo-Europeans did not have much steppe ancestry, yet they still spoke Indo-European languages. To quote from that paper:

“Our results indicate that the early spread of IE languages into Anatolia was not associated with any large-scale steppe-related migration, as previously suggested. Additionally, and in agreement with the later historical record of the region, we find no correlation between genetic ancestry and exclusive ethnic or political identities among the populations of Bronze Age Central Anatolia, as has previously been hypothesized.”

  • The first horse herders and the impact of early Bronze Age steppe expansions into Asia by Damgaard et al

The Anatolian branch is said to be among the first to split from reconstructed Proto-Indo-European, the ancestral Indo-European language. Still, its speakers did not show much affinity to the population of the steppes. The case of the Harappans could have been similar: they could have received Indo-Aryan or even other early Indo-European languages from a source other than the late Bronze Age steppe cultures, perhaps through contacts with West and Central Asia, which are well attested during the Harappan era. Also, the centum features of the Bangani language spoken in Uttarakhand, which set it apart from the satem features of other Indo-Iranian languages, indicate that a different branch of Indo-Iranian, or perhaps even of other Indo-European languages, existed in northern India long ago, and that it could have been thoroughly Indo-Aryanized in later periods.

Also, as highlighted earlier from Narasimhan et al, there are hints of BMAC or Iranian farmer-related movement from the south into the Kazakh steppes in the north. It is possible that this movement brought Indo-European languages into the steppes, if one takes the view that the BMAC or other Iranian farmer-related groups were Indo-European speakers. But this will remain speculation until more evidence is available.


To sum up, how exactly the Indo-Aryanization of India happened will remain unclear unless we get more ancient DNA samples, especially from male individuals, from various Harappan sites and from the Gangetic regions as well. Until detailed studies are conducted on them, we can only speculate about these ancient events. Thus, far from validating the AIT, these two papers, Narasimhan et al and Shinde et al, leave many issues unresolved.

In my opinion, based on the genetic data currently available, we can now safely reject the Kulturkugel BMAC proxy theory and the theory of a large-scale male-dominated invasion into the Swat valley, both of which were modelled by Indo-Europeanists and Indologists to explain Aryan expansion into India. Also, since Indus Periphery/Harappan ancestry peaks overwhelmingly in the ANI group, whose closest descendants are north Indian Indo-Aryan speakers, it is likely that the Indus Periphery/Harappan population represented some early Indo-Aryan or perhaps other related groups.

Disclaimer: The opinions expressed within this article are the personal opinions of the author. IndiaFacts does not assume any responsibility or liability for the accuracy, completeness, suitability, or validity of any information in this article.