Tree model of historical linguistics. The proto-languages stand at the branch points, or nodes: 15, 6, 20 and 7. The leaf languages, or end points, are 2, 5, 9 and 31. The root language is 15. By convention, the Proto-languages are named Proto-5-9, Proto-2-5-9 and Proto-31, or Common 5-9, etc. The overall Ursprache has a proto name reflecting the ordinary name of the entire family, such as Germanic, Italic, etc. The links between nodes indicate descent or genetic descent. All the languages in the tree are related. Nodes 6 and 20 are the daughters of 15, their parent. Nodes 6 and 20 are cognates or sister languages, etc. The leaf languages must be attested by some sort of documentation, even a lexical list of a few words. All the proto-languages are hypothetical, or reconstructed languages; however sometimes documentation is found that supports their former existence.

A proto-language in the tree model of historical linguistics is a language – usually hypothetical or reconstructed, and unattested – from which a number of attested, or documented, known languages are believed to have descended by evolution, or slow modification of the proto-language into languages that form a language family.

In the strict sense, a proto-language is the latest common ancestor of a language family (the stage immediately before divergence into the attested idioms began) and thus corresponds to the most recent common ancestor in biology, although the term is often used more loosely. Moreover, a group of idioms (such as a dialect cluster) that are not considered separate languages, for whatever reason, can also be described as descending from a unitary proto-language.
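The strict definition can be illustrated with a small sketch in Python. The parent links below are an assumption inferred from the node numbers and proto-language names given in the figure caption (node 7 as Proto-5-9, node 6 as Proto-2-5-9, node 20 as Proto-31, node 15 as the root); the function names are hypothetical and chosen only for illustration.

```python
# Parent links for the example tree. This topology is an assumption
# inferred from the figure caption, not taken from the figure itself.
PARENT = {6: 15, 20: 15, 2: 6, 7: 6, 5: 7, 9: 7, 31: 20}


def ancestors(node):
    """Return the chain from `node` up to the root, starting with `node`."""
    chain = [node]
    while node in PARENT:
        node = PARENT[node]
        chain.append(node)
    return chain


def proto_language(languages):
    """Return the latest common ancestor of the given languages."""
    languages = list(languages)
    if not languages:
        raise ValueError("at least one language is required")
    # Nodes that lie on every language's path to the root.
    common = set(ancestors(languages[0]))
    for language in languages[1:]:
        common &= set(ancestors(language))
    # Walking upward, the first shared node is the latest common ancestor.
    for node in ancestors(languages[0]):
        if node in common:
            return node
    raise ValueError("the languages do not share an ancestor")


print(proto_language([5, 9]))     # node 7, i.e. Proto-5-9
print(proto_language([2, 5, 9]))  # node 6, i.e. Proto-2-5-9
print(proto_language([5, 31]))    # node 15, the root
```

Under this reading, node 15 is an ancestor of languages 5 and 9, but only node 7 is their proto-language in the strict sense, because it is the latest node from which both descend.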

Occasionally, the German term Ursprache (from Ur- "primordial" and Sprache "language", pronounced [ˈʔuːɐ.ʃpʁaː.xə]) is used instead.

Definition and verification

Typically, the proto-language is not known directly. It is by definition a linguistic reconstruction formulated by applying the comparative method to a group of languages featuring similar characteristics.[1] The tree is a statement of similarity and a hypothesis that the similarity results from descent from a common language.

The comparative method, a process of deduction, begins from a set of characteristics, or characters, found in the attested languages. If the entire set can be accounted for by descent from the proto-language, which must contain the proto-forms of them all, the tree, or phylogeny, is regarded as a complete explanation and, by Occam's razor, is given credibility. More recently, such a tree has been termed "perfect" and the characters labelled "compatible".

No trees but the smallest branches are ever found to be perfect, in part because languages also evolve through horizontal transfer with their neighbours. Typically, credibility is given to the hypotheses of highest compatibility. The differences in compatibility must be explained by various applications of the wave model. The level of completeness of the reconstruction achieved varies, depending on how complete the evidence is from the descendant languages and on the formulation of the characters by the linguists working on it. Not all characters are suitable for the comparative method. For example, lexical items that are loans from a different language do not reflect the phylogeny to be tested, and if used will detract from the compatibility. Getting the right dataset for the comparative method is a major task in historical linguistics.
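The notion of compatibility can be sketched in Python under one common formalization, which is an assumption here rather than something the text specifies: each character is binary, its ancestral state is known, and it is represented by the set of languages showing the innovated state. Under that assumption, two characters fit on the same tree exactly when their sets are nested or disjoint, and a set of such characters admits a perfect tree when every pair is compatible. The character names and data are hypothetical, using the leaf numbers from the example tree.

```python
from itertools import combinations


def compatible(a, b):
    """Two innovation sets fit one tree if they are nested or disjoint."""
    return a <= b or b <= a or not (a & b)


def incompatible_pairs(characters):
    """List the pairs of character names that cannot share a tree."""
    return [
        (name1, name2)
        for (name1, set1), (name2, set2) in combinations(characters.items(), 2)
        if not compatible(set1, set2)
    ]


# Hypothetical data: each character maps to the languages that show it.
CHARACTERS = {
    "innovation_A": {5, 9},     # shared by 5 and 9 only
    "innovation_B": {2, 5, 9},  # shared by 2, 5 and 9
    "loanword_C": {9, 31},      # borrowed between neighbours 9 and 31
}

print(incompatible_pairs(CHARACTERS))

# Dropping the loanword leaves a fully compatible character set.
inherited = {k: v for k, v in CHARACTERS.items() if k != "loanword_C"}
print(incompatible_pairs(inherited))
```

In this toy dataset the borrowed item conflicts with both inherited innovations, which is the sense in which loans detract from compatibility and why they are excluded from the dataset.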

Some universally accepted proto-languages are Proto-Indo-European, Proto-Uralic, and Proto-Dravidian.

In a few fortuitous instances, which have been used to verify the method and the model (and probably ultimately inspired it), a literary history exists from as early as a few millennia ago, allowing the descent to be traced in detail. The early daughter languages, and even the proto-language itself, may be attested in surviving texts. For example, Latin is the proto-language of the Romance language family, which includes such modern languages as French, Italian, Portuguese, Romanian, Catalan and Spanish. Likewise, Proto-Norse, the ancestor of the modern Scandinavian languages, is attested, albeit in fragmentary form, in inscriptions written in the Elder Futhark. Although there are no very early Indo-Aryan inscriptions, the Indo-Aryan languages of modern India all go back to Vedic Sanskrit (or dialects very closely related to it), which has been preserved in texts accurately handed down by parallel oral and written traditions for many centuries.

The first person to offer systematic reconstructions of an unattested proto-language was August Schleicher; he did so for Proto-Indo-European in 1861.[2]

Proto-X vs. Pre-X

Normally, the term "Proto-X" refers to the last common ancestor of a group of languages, occasionally attested but most commonly reconstructed through the comparative method. An earlier stage of the same language, reconstructed through the method of internal reconstruction, is termed "Pre-X". This terminology is used, for example, in the case of Proto-Indo-European and Pre-Indo-European; likewise for Proto-Germanic and Pre-Germanic. Because Pre-X is sometimes also used for a postulated substratum (for example, Pre-Germanic can also refer to a hypothetical Germanic substratum), the more precise term Pre-Proto-X is sometimes used for a stage older than the last common ancestor of the attested language varieties employed in the reconstruction.

When multiple historical stages of a single language exist, the oldest attested stage is normally termed "Old X" (e.g. "Old English", "Old Korean"). For an earlier, hypothetical stage, reconstructed through the method of internal reconstruction, terminology differs, with some authors using "Proto-X" (e.g. "Proto-English") and others using "Pre-X" (e.g. "Pre-English"). The case of Irish is somewhat different; the language known as Old Irish is the language in which the first significant texts are known, but an older stage named Primitive Irish is also attested, though much more sparsely. This is similar to the situation of Old Norse and Proto-Norse, where both are attested but the latter only fragmentarily.


There are no objective criteria for the evaluation of different reconstruction systems yielding different proto-languages. Many researchers concerned with linguistic reconstruction agree that the traditional comparative method is an "intuitive undertaking".[3]

Researchers' bias, rooted in their accumulated implicit knowledge, can also lead to erroneous assumptions and to their generalization. Kortlandt (1993) lists several cases where such general assumptions concerning "the nature of language" turned out to hinder research in historical linguistics. Linguists base their judgments on the kinds of change they consider "natural" for a language, and "As a result, our reconstructions tend to have a strong bias toward the average language type known to the investigator."[4]

The advent of the wave model raised new issues in the domain of linguistic reconstruction, causing the reevaluation of old reconstruction systems and depriving the proto-language of its "uniform character". This is evident in Karl Brugmann's skepticism that the reconstruction systems could ever reflect a linguistic reality.[5] Ferdinand de Saussure went further, completely rejecting a positive specification of the sound values of reconstruction systems.[6]

In general, the issue of the nature of the proto-language remains unsolved, with linguists taking either the realist or the abstractionist position. Even widely studied proto-languages, such as Proto-Indo-European, have been criticized as typological outliers with respect to their reconstructed phonemic inventory. Alternatives such as the glottalic theory, despite representing a typologically less rare system, have not gained wider acceptance, with some researchers even suggesting the use of indexes to represent the disputed series of plosives. At the other end of the spectrum, Pulgram (1959:424) suggests that Proto-Indo-European reconstructions are just "a set of reconstructed formulae not representative of any reality". In the same vein, Julius Pokorny in his study on Indo-European claims that the linguistic term IE parent language is merely an abstraction that does not exist in reality, and that it should be understood as consisting of dialects possibly dating back to the Paleolithic, during which these formed the linguistic structure of the IE language group.[7] In his view, Indo-European is solely a system of isoglosses that held together dialects spoken by various tribes, from which the historically attested Indo-European languages emerged.[7]



  1. Koerner, E. F. K. (1999), Linguistic historiography: projects & prospects, Amsterdam studies in the theory and history of linguistic science; Ser. 3, Studies in the history of the language sciences, Amsterdam [u.a.]: J. Benjamins, p. 109. "First, the historical linguist does not reconstruct a language (or part of the language) but a model which represents or is intended to represent the underlying system or systems of such a language."
  2. Lehmann 1993, p. 26.
  3. Schwink, Frederick W.: Linguistic Typology, Universality and the Realism of Reconstruction, Washington 1994. "Part of the process of “becoming” a competent Indo-Europeanist has always been recognized as coming to grasp “intuitively” concepts and types of changes in language so as to be able to pick and choose between alternative explanations for the history and development of specific features of the reconstructed language and its offspring."
  4. Kortlandt (1993:9)
  5. Brugmann & Delbrück:25
  6. Saussure (1969:303)
  7. Pokorny (1953:79–80)


This article is issued from Wikipedia - version of the 9/7/2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.