Proto-Indo-European language

Proto-Indo-European (PIE) is the linguistic reconstruction of the common ancestor of the Indo-European languages. Since no written record of this language exists, what is known of it has been reconstructed through historical linguistics, by comparing the languages descended from it and extrapolating back from their properties.

Proto-Indo-European is thought to have been spoken as a single language (before divergence began) around 3500 BC, though estimates vary by more than a thousand years. The original speakers may have originated in the Pontic–Caspian steppe of Eastern Europe, north of the Black Sea. As Proto-Indo-European speakers became isolated from each other through migration, the language spoken by the separate groups diverged into numerous subfamilies. Today, there are about 445 living Indo-European (IE) languages descended from PIE; the ten with the most native speakers are, in descending order, Spanish, English, Hindi, Portuguese, Bengali, Russian, Punjabi, German, French and Marathi.

PIE is thought to have had a complex system of morphology that included inflectional suffixes as well as ablaut (vowel alternations, for example sing, sang, sung). Nouns and verbs had complex systems of declension and conjugation respectively.

Attempts have been made to compose example texts, which are educated guesses at best: Calvert Watkins observed in 1969 that in spite of its 150 years' history, comparative linguistics is not in the position to reconstruct a single well-formed sentence in PIE. The king and the god is a short dialogue loosely based on the "king Harishcandra" episode of Aitareya Brahmana.

Development of the theory

Classification of Indo-European languages. Red: Extinct languages. White: categories or unattested proto-languages. Left half: centum languages; right half: satem languages

There is no direct evidence of PIE. It has been reconstructed from its present-day descendants using the comparative method.[1]

The comparative method is based on the Neogrammarian rule that the Indo-European sound laws apply without exception. The method compares languages and applies the sound laws to find a common ancestor. For example, compare the pairs of words in Italian and English: piede and foot, padre and father, pesce and fish. Since the initial consonants correspond consistently, and far too frequently for the correspondence to be coincidental, the languages can be assumed to stem from a common parent.[2]
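The counting step behind this argument can be illustrated with a minimal Python sketch. It uses only the three Italian and English pairs quoted above and, as a simplifying assumption, treats the first letter of each spelling as a stand-in for the initial consonant sound; the names COGNATE_PAIRS and initial_correspondences are illustrative only. Real comparative work relies on phonetic transcriptions and far larger word lists.

```python
from collections import Counter

# Cognate pairs quoted in the text above, as (Italian, English).
COGNATE_PAIRS = [
    ("piede", "foot"),
    ("padre", "father"),
    ("pesce", "fish"),
]


def initial_correspondences(pairs):
    """Count how often each pair of word-initial letters co-occurs.

    Assumption: the first letter of the spelling approximates the
    initial consonant sound, which holds for these examples only.
    """
    return Counter((left[0], right[0]) for left, right in pairs)


counts = initial_correspondences(COGNATE_PAIRS)
for (italian, english), n in counts.most_common():
    print(f"Italian {italian}- : English {english}- ({n} pairs)")
```

For these three pairs the only correspondence found is Italian p- against English f-, occurring in all three. A pattern that keeps recurring as the word list grows is the kind of evidence that points to common descent rather than chance.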

Indo-European studies are generally considered to have begun with William Jones, an Anglo-Welsh philologist and puisne judge in Bengal, who postulated the common ancestry of Sanskrit, Latin, and Greek.[3] Although his name is closely associated with this observation, he was not the first to make it. In the 1500s, European visitors to the subcontinent became aware of similarities between Indo-Iranian languages and European languages,[4] and as early as 1653 Marcus Zuerius van Boxhorn had published a proposal for a proto-language ("Scythian") for the following language families: Germanic, Romance, Greek, Baltic, Slavic, Celtic and Iranian.[5] In a memoir sent to the Académie des Inscriptions et Belles-Lettres in 1767, Gaston-Laurent Coeurdoux, a French Jesuit who spent all his life in India, had specifically demonstrated the existing analogy between Sanskrit and European languages.[6]

In many ways Jones' work was less accurate than his predecessors', as he erroneously included Egyptian, Japanese and Chinese in the Indo-European languages, while omitting Hindi.

In 1818, Rasmus Christian Rask elaborated the set of correspondences to include other Indo-European languages, such as Sanskrit and Greek, and the full range of consonants involved. In 1816 Franz Bopp published On the System of Conjugation in Sanskrit in which he investigated a common origin of Sanskrit, Persian, Greek, Latin, and German. In 1833 he began publishing the Comparative Grammar of Sanskrit, Zend, Greek, Latin, Lithuanian, Old Slavic, Gothic, and German.[7]

In 1822, Jacob Grimm formulated what is now known as Grimm's law as a general rule in his Deutsche Grammatik. Grimm showed correlations between the Germanic and other Indo-European languages, and showed that sound change affects an entire language systematically, and not just some words.[8] The Neogrammarians proposed that sound laws have no exceptions, as shown in Verner's law, published in 1876, which resolved apparent exceptions to Grimm’s law by exploring the role that accent (stress) played in language change.[9]

August Schleicher's A Compendium of the Comparative Grammar of the Indo-European, Sanskrit, Greek and Latin Languages (1874–77, an English translation of the German original of 1861–62) was an early attempt to reconstruct the Proto-Indo-European language.[10]

By the early 1900s, well-defined descriptions of PIE had been developed that are still accepted today. The largest developments since then were the discovery of the Anatolian and Tocharian languages and the acceptance of the laryngeal theory. This theory aims to produce greater regularity in the linguistic reconstruction of Proto-Indo-European phonology than in the reconstruction produced by the comparative method.

Julius Pokorny's Indogermanisches etymologisches Wörterbuch ("Indo-European Etymological Dictionary", 1959) gave a detailed, though conservative, overview of the lexical knowledge accumulated up until that time. Kuryłowicz's 1956 Apophonie gave a better understanding of the Indo-European ablaut. From the 1960s, knowledge of Anatolian became certain enough to establish its relationship to PIE.

Historical and geographical setting

There are several competing hypotheses about when, where, and by whom PIE was spoken. In the most popular[11][12] model, the Kurgan hypothesis, first put forward by Marija Gimbutas, the original speakers of PIE were the people of the Kurgan cultures of the Pontic–Caspian steppe north of the Black Sea.[13][14]

According to the theory, PIE became widespread because its speakers were able to migrate across a large area of Europe and Asia, aided by technologies including the domestication of the horse, herding, and the use of wheeled vehicles.

The people of these cultures were nomadic pastoralists, who, according to the model, by the early 3rd millennium BC had expanded throughout the Pontic–Caspian steppe and into Eastern Europe.[15]

Mainstream linguistic estimates of the time between PIE and the earliest attested texts (c. nineteenth century BC; see Kültepe texts) range around 1,500 to 2,500 years.

Other theories include the Anatolian hypothesis,[16] the Armenian hypothesis, the Paleolithic Continuity Theory, and the indigenous Aryans theory.

Subfamilies (clades)

The following are listed by their theoretical glottochronological development:[16][17][18]

Clade | Description | Modern descendants
Proto-Anatolian | All now extinct, the best attested being the Hittite language. | None
Proto-Tocharian | An extinct branch known from manuscripts dating from the 6th to the 8th century AD, which were found in northwest China. | None
Proto-Italic | This included many languages, but only descendants of Latin survive. | Portuguese and Galician, Spanish, Catalan, French, Italian, Romanian, Aromanian, Rhaeto-Romance
Proto-Celtic | The ancestor language of all known Celtic languages. These languages were once spoken across Europe, but modern Celtic languages are mostly confined to the north-western edge of Europe. | Irish, Scottish Gaelic, Welsh, Breton, Cornish, Manx
Proto-Germanic | The reconstructed proto-language of the Germanic languages. It developed into three branches: West Germanic, East Germanic (now extinct), and North Germanic. | English, German, Afrikaans, Dutch, Norwegian, Danish, Swedish, Frisian, Icelandic, Faroese
Proto-Balto-Slavic | Branched into the Baltic languages and the Slavic languages. | Baltic: Latvian and Lithuanian; Slavic: Russian, Ukrainian, Belarusian, Polish, Czech, Slovak, Serbo-Croatian, Bulgarian, Slovenian, Macedonian
Proto-Indo-Iranian | Branched into the Indo-Aryan, Iranian and Nuristani languages. | Nuristani; Indic: Hindustani, Bengali, Punjabi, Dardic; Iranic: Persian, Pashto, Balochi, Kurdish, Zaza
Proto-Armenian | | Eastern Armenian, Western Armenian
Proto-Greek | | Modern Greek, Romeyka, Tsakonian

Other possible groupings include: Italo-Celtic, Graeco-Aryan, Graeco-Armenian, Graeco-Phrygian, Daco-Thracian, and Thraco-Illyrian.

Marginally attested languages

The Lusitanian language is a marginally attested language found in the area of modern Portugal.

The Paleo-Balkan languages, which occur in or near the Balkan peninsula, do not appear to be members of any of the subfamilies of PIE but are so poorly attested that proper classification of them is not possible.


Proto-Indo-European phonology has been reconstructed in some detail. Notable features of the most widely accepted (but not uncontroversial) reconstruction include three series of stop consonants reconstructed as voiceless, voiced, and breathy voiced; sonorant consonants that could be used syllabically; three so-called laryngeal consonants, whose exact pronunciation is not well established but which are believed to have existed, in part because of their visible effects on adjacent sounds; the fricative /s/; and a five-vowel system of which /e/ and /o/ were the most frequently occurring vowels.

The Proto-Indo-European accent is reconstructed today as having had variable lexical stress, which could appear on any syllable and whose position often varied among different members of a paradigm (e.g. between singular and plural of a verbal paradigm). Stressed syllables received a higher pitch; it is therefore often said that PIE had pitch accent. The location of the stress is associated with ablaut variations, especially between normal-grade vowels (/e/ and /o/) and zero-grade (i.e. lack of a vowel), but is not entirely predictable from them.

The accent is best preserved in Vedic Sanskrit and (in the case of nouns) Ancient Greek, and indirectly attested in a number of phenomena in other IE languages. To account for mismatches between the accent of Vedic Sanskrit and Ancient Greek, as well as a few other phenomena, a few historical linguists prefer to reconstruct PIE as a tone language where each morpheme had an inherent tone; the sequence of tones in a word then evolved, according to that hypothesis, into the placement of lexical stress in different ways in different IE branches.



PIE was a fusional language, in which the grammatical relationships between words were signaled through inflectional morphemes (usually endings). Proto-Indo-European roots are basic morphemes carrying a lexical meaning. By addition of suffixes, roots form word stems, and by addition of desinences (usually endings), these stems form inflected nouns or verbs.


The Indo-European ablaut is the variation in vowels which occurred both within inflectional morphology (different grammatical forms of a noun or verb) and derivational morphology between, for example, a verb and an associated verbal noun. Originally, all categories were distinguished both by ablaut and different endings, but the loss of endings in some later Indo-European languages has led them to use ablaut alone to distinguish grammatical categories, as in the Modern English words sing, sang, sung.


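Proto-Indo-European nouns were declined for eight or nine cases: nominative, accusative, vocative, genitive, dative, instrumental, ablative, locative, and possibly an allative (directive).[19]

There were three grammatical genders: masculine, feminine, and neuter.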


Proto-Indo-European pronouns are difficult to reconstruct owing to their variety in later languages. PIE had personal pronouns in the first and second grammatical person, but not the third person, where demonstrative pronouns were used instead. The personal pronouns had their own unique forms and endings, and some had two distinct stems; this is most obvious in the first person singular, where the two stems are still preserved in English I and me. There were also two varieties for the accusative, genitive and dative cases, a stressed and an enclitic form.[20]

Personal pronouns[20]
Case | First person singular | First person plural | Second person singular | Second person plural
Nominative | *h₁eǵ(oH/Hom) | *wei | *tuH | *yuH
Accusative | *h₁mé, *h₁me | *nsmé, *nōs | *twé | *usmé, *wōs
Genitive | *h₁méne, *h₁moi | *ns(er)o-, *nos | *tewe, *toi | *yus(er)o-, *wos
Dative | *h₁méǵʰio, *h₁moi | *nsmei, *ns | *tébʰio, *toi | *usmei
Instrumental | *h₁moí | *nsmoí | *toí | *usmoí
Ablative | *h₁med | *nsmed | *tued | *usmed
Locative | *h₁moí | *nsmi | *toí | *usmi


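Proto-Indo-European verbs were complex and, like the nouns, exhibited a system of ablaut. The most basic categorization for the Indo-European verb was grammatical aspect. Verbs were classed as stative (depicting a state of being), imperfective (depicting ongoing, habitual or repeated action), or perfective (depicting a completed action or an action viewed as a whole).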

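Verbs had at least four grammatical moods: indicative, imperative, subjunctive, and optative.

Verbs had two grammatical voices: active and mediopassive.

Verbs had three grammatical persons: first, second, and third.

Verbs had three grammatical numbers: singular, dual, and plural.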

Verbs were also marked by a highly developed system of participles, one for each combination of tense and voice, and an assorted array of verbal nouns and adjectival formations.

The following table shows two possible reconstructions of the PIE verb endings. Sihler's reconstruction largely represents the current consensus among Indo-Europeanists, while Beekes' is a radical rethinking of thematic verbs; although not widely accepted, it is included to show an example of more far-reaching recent research.

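Sihler (1995)[21]
Number | Person | Athematic | Thematic
Singular | 1st | *-mi | *-oh₂
Singular | 2nd | *-si | *-esi
Singular | 3rd | *-ti | *-eti/-ei
Dual | 1st | *-wos | *-owos
Dual | 2nd | *-th₁es | *-eth₁es
Dual | 3rd | *-tes | *-etes
Plural | 1st | *-mos | *-omos
Plural | 2nd | *-te | *-ete
Plural | 3rd | *-nti | *-onti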
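
How such endings combine with a stem can be shown with a short Python sketch, offered as an illustration only. It copies the thematic endings from Sihler's column above (taking *-eti where the table gives two variants) and attaches them to the root *bʰer- "carry", a commonly cited example that is not taken from the table and is assumed here. Accent marks are omitted, and athematic verbs are left out because their stems alternate through ablaut, so plain concatenation would not model them. The names THEMATIC_ENDINGS and thematic_paradigm are hypothetical.

```python
# Thematic present endings as listed in the Sihler (1995) table above.
# Where the table gives "*-eti/-ei", the first variant is used.
THEMATIC_ENDINGS = {
    ("singular", 1): "-oh₂",
    ("singular", 2): "-esi",
    ("singular", 3): "-eti",
    ("dual", 1): "-owos",
    ("dual", 2): "-eth₁es",
    ("dual", 3): "-etes",
    ("plural", 1): "-omos",
    ("plural", 2): "-ete",
    ("plural", 3): "-onti",
}


def thematic_paradigm(root):
    """Attach each thematic ending to a root written like '*bʰer-'.

    Simplification: the root is treated as invariant and accent is
    not marked, so the output is a schematic paradigm only.
    """
    bare = root.strip("*-")  # drop the asterisk and the trailing hyphen
    return {
        slot: "*" + bare + ending.lstrip("-")
        for slot, ending in THEMATIC_ENDINGS.items()
    }


# Assumed example root (not from the table): *bʰer- "carry".
paradigm = thematic_paradigm("*bʰer-")
for (number, person), form in paradigm.items():
    print(f"{number:8} {person}  {form}")
```

The sketch keeps the endings in a dictionary keyed by number and person so that each cell of the table maps to exactly one form, which mirrors the layout of the table rather than any particular theory of how the endings arose.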


Proto-Indo-European numerals are generally reconstructed as follows:

one *Hoi-no-/*Hoi-wo-/*Hoi-k(ʷ)o-; *sem-
two *d(u)wo-
three *trei- (full grade), *tri- (zero grade)
four *kʷetwor- (o-grade), *kʷetur- (zero grade) (see also the kʷetwóres rule)
five *penkʷe
six *s(w)eḱs; originally perhaps *weḱs
seven *septm̥
eight *oḱtō, *oḱtou or *h₃eḱtō, *h₃eḱtou
nine *(h₁)newn̥
ten *deḱm̥(t)

Rather than specifically 100, *ḱm̥tóm may originally have meant "a large number".[22]


Proto-Indo-European particles could be used both as adverbs and postpositions, like *upo "under, below". The postpositions became prepositions in most daughter languages. Other reconstructible particles include negators (*ne, *mē), conjunctions (*kʷe "and", *wē "or" and others) and an interjection (*wai!, an expression of woe or agony).


The syntax of the older Indo-European languages has been studied in earnest since at least the late nineteenth century, by such scholars as Hermann Hirt and Berthold Delbrück. In the second half of the twentieth century, interest in the topic increased and led to reconstructions of Proto-Indo-European syntax.[23]

Since all the early attested IE languages were inflectional, PIE is thought to have relied largely on morphological markers, rather than word order, to signal syntactic relationships within sentences.[24] Still, a default (unmarked) word order is thought to have existed in PIE. This was reconstructed by Jacob Wackernagel as being subject–verb–object (SVO), based on evidence in Vedic Sanskrit, and the SVO hypothesis still has some adherents, but as of 2015 the "broad consensus" among PIE scholars is that PIE would have been a subject–object–verb (SOV) language.[25]

The SOV default word order with other orders used to express emphasis (e.g., verb–subject–object to emphasize the verb) is attested in Old Indic, Old Iranian, Old Latin and Hittite, while traces of it can be found in the enclitic personal pronouns of the Tocharian languages.[24] A shift from OV to VO order is posited to have occurred in late PIE, since many of the descendant languages have this order: modern Greek, Romance and Albanian prefer SVO, Insular Celtic has VSO as the default order, and even the Anatolian languages show some signs of this word order shift.[26] The inconsistent order preference in Baltic, Slavic and Germanic can be attributed to contact with outside OV languages.[26]

Relationships to other language families

Many higher-level relationships between Proto-Indo-European and other language families have been hypothesized, but these are highly controversial.


  1. "linguistics - The comparative method | science". Retrieved 2016-07-27.
  2. "Comparative linguistics". Encyclopedia Britannica. Retrieved 27 August 2016.
  3. "Sir William Jones | British orientalist and jurist". Retrieved 2016-09-03.
  4. Auroux, Sylvain (2000). History of the Language Sciences. Berlin, New York: Walter de Gruyter. p. 1156. ISBN 3-11-016735-2.
  5. Roger Blench Archaeology and Language: methods and issues. In: A Companion To Archaeology. J. Bintliff ed. 52–74. Oxford: Basil Blackwell, 2004.
  6. Wheeler, Kip. "The Sanskrit Connection: Keeping Up With the Joneses". Dr.Wheeler's Website. Retrieved 16 April 2013.
  7. "Franz Bopp | German philologist". Retrieved 2016-08-26.
  8. "Grimm's law | linguistics". Retrieved 2016-08-26.
  9. "Neogrammarian | German scholar". Retrieved 2016-08-26.
  10. "August Schleicher | German linguist". Retrieved 2016-08-26.
  11. Anthony, David W; Ringe, Don (2015). "The Indo-European Homeland from Linguistic and Archaeological Perspectives". Annual Review of Linguistics (1): 199–219.
  12. Mallory, J. P. (1991). In Search of the Indo-Europeans. Thames & Hudson. p. 185. ISBN 978-0500276167.
  13. Anthony, David W. (2007). The horse, the wheel, and language : how bronze-age riders from the Eurasian steppes shaped the modern world (8th reprint. ed.). Princeton, N.J.: Princeton University Press. ISBN 0-691-05887-3.
  14. Balter, Michael (13 February 2015). "Mysterious Indo-European homeland may have been in the steppes of Ukraine and Russia". American Association for the Advancement of Science. Retrieved 2015-02-17.
  15. Gimbutas, Marija (1985). "Primary and Secondary Homeland of the Indo-Europeans: comments on Gamkrelidze-Ivanov articles". Journal of Indo-European Studies (Spring - summer).
  16. Bouckaert, Remco; Lemey, P.; Dunn, M.; Greenhill, S. J.; Alekseyenko, A. V.; Drummond, A. J.; Gray, R. D.; Suchard, M. A.; et al. (24 August 2012), "Mapping the Origins and Expansion of the Indo-European Language Family", Science, 337 (6097): 957–960, Bibcode:2012Sci...337..957B, doi:10.1126/science.1219669, PMC 4112997, PMID 22923579
  17. Blažek, Václav. "On the internal classification of Indo-European languages: survey" (PDF). Retrieved 30 July 2016.
  18. Gray, Russell D; Atkinson, Quentin D (27 November 2003), "Language-tree divergence times support the Anatolian theory of Indo-European origin" (PDF), Nature, NZ: Auckland, 426 (6965): 435–39, Bibcode:2003Natur.426..435G, doi:10.1038/nature02029, PMID 14647380
  19. Fortson, Benjamin (2004). Indo-European language and culture : an introduction. Malden (USA): Blackwell. p. 102. ISBN 1-4051-0316-7.
  20. Beekes, Robert; Gabriner, Paul (1995). Comparative Indo-European linguistics : an introduction. Amsterdam: J. Benjamins Publishing Company. pp. 147, 212–217, 233, 243. ISBN 978-1556195044.
  21. Sihler, Andrew L. (1995). New comparative grammar of Greek and Latin. New York u.a.: Oxford Univ. Press. ISBN 0-19-508345-8.
  22. Lehmann, Winfred P. (1993), Theoretical Bases of Indo-European Linguistics, London: Routledge, pp. 252–55, ISBN 0-415-08201-3
  23. Kulikov, Leonid; Lavidas, Nikolaos, eds. (2015). "Preface". Proto-Indo-European Syntax and its Development. John Benjamins.
  24. Mallory, J. P.; Adams, Douglas Q., eds. (1997). "Proto-Indo-European". Encyclopedia of Indo-European Culture. Taylor & Francis. p. 463.
  25. Hock, Hans Henrich (2015). "Proto-Indo-European verb-finality: Reconstruction, typology, validation". In Kulikov, Leonid; Lavidas, Nikolaos. Proto-Indo-European Syntax and its Development. John Benjamins.
  26. Lehmann, Winfred P. (1974). Proto-Indo-European Syntax. University of Texas Press. p. 250.

This article is issued from Wikipedia - version of the 11/28/2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.