Romanization of Persian
Persian alphabet |
---|
ا ب پ ت ث ج چ ح خ د ذ ر ز ژ س ش ص ض ط ظ ع غ ف ق ک گ ل م ن و ه ی |
Perso-Arabic script |
Romanization of Persian or Latinization of Persian is the representation of the Persian language (Farsi, Dari and Tajik) with the Latin script. Several different romanization schemes exist, each with its own set of rules driven by its own set of ideological goals.
Romanization paradigms
Because the Perso-Arabic script is an abjad writing system (with a consonant-heavy inventory of letters), many distinct words in standard Persian can have identical spellings, with widely varying pronunciations that differ in their (unwritten) vowel sounds. Thus a romanization paradigm can follow either transliteration (which mirrors spelling and orthography) or transcription (which mirrors pronunciation and phonology).
The Latin script today serves as a second script in much of the world, as city and street signs and Internet addresses in most countries show. At the same time, efforts to teach reading and writing in Persian to the millions of young Iranians living abroad have mostly proved unsuccessful, owing to the lack of daily contact with the Persian script. Using the Latin script in parallel with the Persian script has been proposed as a way out of this dilemma.
Transliteration
Transliteration (in the strict sense) attempts to be a complete representation of the original writing, so that an informed reader should be able to reconstruct the original spelling of unknown transliterated words. Transliterations of Persian are used to represent individual Persian words or short quotations, in scholarly texts in English or other languages that do not use the Arabic alphabet.
A transliteration will still have separate representations for different consonants of the Persian alphabet that are pronounced identically in Persian. Therefore, transliterations of Persian are often based on transliterations of Arabic.[1] The representation of the vowels of the Perso-Arabic alphabet is also complex, and transliterations are based on the written form.
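The reversibility requirement can be illustrated with a small sketch. It uses the DMG (1969) values for three letters that are all pronounced /s/, as listed in the comparison table of this article; the `PHONEMIC` mapping and the helper names are hypothetical and exist only for contrast.

```python
# DMG (1969) values for three letters that are all pronounced /s/ in Persian,
# taken from the comparison table in this article.
DMG = {"ث": "s̱", "س": "s", "ص": "ṣ"}

# Hypothetical pronunciation-based rendering, for contrast only:
# all three letters collapse to the same Latin letter.
PHONEMIC = {"ث": "s", "س": "s", "ص": "s"}


def is_reversible(mapping):
    """Return True if no two source letters share the same Latin output."""
    return len(set(mapping.values())) == len(mapping)


def invert(mapping):
    """Build the Latin -> Persian lookup (only meaningful if reversible)."""
    return {latin: letter for letter, latin in mapping.items()}


print(is_reversible(DMG))       # True
print(is_reversible(PHONEMIC))  # False
```

Because the DMG values are pairwise distinct, `invert(DMG)` recovers the original letter from each Latin form, which is what lets an informed reader reconstruct the spelling.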
Transliterations commonly used in the English-speaking world include BGN/PCGN romanization and ALA-LC Romanization.
Non-academic English-language quotation of Persian words usually uses a simplification of one of the strict transliteration schemes (typically omitting diacritical marks) and/or unsystematic choices of spellings meant to guide English speakers using English spelling rules towards an approximation of the Persian sounds.
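A minimal sketch of the "omit the diacritical marks" simplification, assuming the strict form is available as a Unicode string. The function name `strip_diacritics` and the sample spelling are illustrative only; the approach relies on Python's standard `unicodedata` module.

```python
import unicodedata


def strip_diacritics(text):
    """Drop combining marks (dots, macrons, carons) from a romanized string."""
    # NFD splits a letter such as "ā" into "a" plus a combining macron.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only characters that are not combining marks.
    kept = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", kept)


print(strip_diacritics("Ḥāfeẓ"))  # Hafez
```

Standalone signs such as ʿ and ʾ are modifier letters rather than combining marks, so this sketch leaves them in place; a simplified scheme would have to drop or replace them separately.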
Transcription
Transcriptions of Persian attempt to straightforwardly represent Persian phonology in the Latin script, without requiring a close or reversible correspondence with the Perso-Arabic script, and also without requiring a close correspondence to English phonetic values of Roman letters.
Until recently, there was no official standard for transcribing Persian in the Latin script. All ministries and government agencies have now been instructed to implement a standard named the Persian Romanization System.[2][3] Its rules are intended to be general enough that any Persian text can be written with it. On the basis of these rules, a project called Alefbā-ye 2om (literally "2nd alphabet") has developed infrastructure tools for spell checking (including an extension for OpenOffice and add-ons for Mozilla Firefox and Thunderbird),[4][5] transcription of texts (including a Google Chrome extension),[6] translation of words, and language teaching.[7] A few books have been transcribed with these tools, including the Divan of Hafez and the Rubaiyat of Omar Khayyam.[8][9]
Main romanization schemes
- DMG (1969), a strict scientific system by the German Oriental Society (Deutsche Morgenländische Gesellschaft). It corresponds to Deutsches Institut für Normung standard DIN 31635.[10]
- ALA-LC (1997), the ALA-LC romanization.[11]
- BGN/PCGN (1958), the BGN/PCGN romanization.[12]
- EI (1960), the system used in early editions of Encyclopædia Iranica.[10]
- EI (2012), its contemporary modification.[13]
- UN (1967), the Iranian national system (1966), that was approved by the UNGEGN in 1967.[14][15]
- UN (2012), its contemporary modification.[14][16]
Comparison table
The UN (2012) scheme is of most interest, since it is the only one that was set by the Government of Iran and approved by the United Nations. It is now officially used in Iran. Specifically, the dictionary is used.[17]
Unicode | Persian letter | IPA | DMG (1969) | ALA-LC (1997) | BGN/PCGN (1958) | EI (1960) | EI (2012) | UN (1967) | UN (2012) | ||
---|---|---|---|---|---|---|---|---|---|---|---|
U+0627 | ا | ʔ, ∅[lower-alpha 1] | ʾ, —[lower-alpha 2] | ’, —[lower-alpha 2] | ʾ | ||||||
U+0628 | ب | b | b | ||||||||
U+067E | پ | p | p | ||||||||
U+062A | ت | t | t | ||||||||
U+062B | ث | s | s̱ | s̱ | s̄ | t͟h | ṯ | s̄ | s | ||
U+062C | ج | dʒ | ǧj | j | j | d͟j | j | j | |||
U+0686 | چ | tʃ | č | ch | ch | č | č | ch | č | ||
U+062D | ح | h | ḥ | ḥ | ḩ/ḥ[lower-alpha 3] | ḥ | ḥ | ḩ | h | ||
U+062E | خ | x | ḫ | kh | kh | k͟h | ḵ | kh | x | ||
U+062F | د | d | d | ||||||||
U+0630 | ذ | z | ẕ | ẕ | z̄ | d͟h | ḏ | z̄ | z | ||
U+0631 | ر | r | r | ||||||||
U+0632 | ز | z | z | ||||||||
U+0698 | ژ | ʒ | ž | zh | zh | z͟h | ž | zh | ž | ||
U+0633 | س | s | s | ||||||||
U+0634 | ش | ʃ | š | sh | sh | s͟h | š | sh | š | ||
U+0635 | ص | s | ṣ | ṣ | ş/ṣ[lower-alpha 3] | ṣ | ṣ | ş | s | ||
U+0636 | ض | z | ż | z̤ | ẕ | ḍ | ż | ẕ | z | ||
U+0637 | ط | t | ṭ | ṭ | ţ/ṭ[lower-alpha 3] | ṭ | ṭ | ţ | t | ||
U+0638 | ظ | z | ẓ | ẓ | z̧/ẓ[lower-alpha 3] | ẓ | ẓ | z̧ | z | ||
U+0639 | ع | ∅ | ʿ | ‘ | ’[lower-alpha 2] | ‘ | ‘ | ʿ | ʿ | ||
U+063A | غ | ɣ | ġ | gh | gh | g͟h | ḡ | gh | q | ||
U+0641 | ف | f | f | ||||||||
U+0642 | ق | ɢ~ɣ | q | ḳ | ḳ | q | |||||
U+06A9 | ک | k | k | ||||||||
U+06AF | گ | ɡ | g | ||||||||
U+0644 | ل | l | l | ||||||||
U+0645 | م | m | m | ||||||||
U+0646 | ن | n | n | ||||||||
U+0648 | و | v~w[lower-alpha 1][lower-alpha 4] | v | v, w[lower-alpha 5] | v | ||||||
U+0647 | ه | h[lower-alpha 1] | h | h | h[lower-alpha 6] | h | h | h[lower-alpha 6] | h[lower-alpha 6] | ||
U+0629 | ة | ∅, t | — | h[lower-alpha 7] | — | t[lower-alpha 8] | h[lower-alpha 7] | — | — | ||
U+06CC | ی | j[lower-alpha 1] | y | ||||||||
U+0621 | ء | ʔ, ∅ | ʾ | ’ | ʾ | ||||||
U+0624 | ؤ | ʔ, ∅ | ʾ | ’ | ʾ | ||||||
U+0626 | ئ | ʔ, ∅ | ʾ | ’ | ʾ |
Unicode | Final | Medial | Initial | Isolated | IPA | DMG (1969) | ALA-LC (1997) | BGN/PCGN (1958) | EI (2012) | UN (1967) | UN (2012) |
---|---|---|---|---|---|---|---|---|---|---|---|
U+064E | ◌َ | ◌َ | اَ | ◌َ | æ | a | a | a | a | a | a |
U+064F | ◌ُ | ◌ُ | اُ | ◌ُ | o | o | o | o | u | o | o |
U+0648 U+064F | ◌ﻮَ | ◌ﻮَ | — | ◌وَ | o[lower-alpha 10] | o | o | o | u | o | o |
U+0650 | ◌ِ | ◌ِ | اِ | ◌ِ | e | e | i | e | e | e | e |
U+064E U+0627 | ◌َا | ◌َا | أ | ◌َا | ɑː~ɒː | ā | ā | ā | ā | ā | ā |
U+0622 | ◌ﺂ | ◌ﺂ | آ | ◌آ | ɑː~ɒː | ā, ʾā[lower-alpha 11] | ā, ’ā[lower-alpha 11] | ā | ā | ā | ā |
U+064E U+06CC | ◌َﯽ | — | — | ◌َی | ɑː~ɒː | ā | á | á | ā | á | ā |
U+06CC U+0670 | ◌ﯽٰ | — | — | ◌یٰ | ɑː~ɒː | ā | á | á | ā | ā | ā |
U+064F U+0648 | ◌ُﻮ | ◌ُﻮ | اُو | ◌ُو | uː, oː[lower-alpha 5] | ū | ū | ū | u, ō[lower-alpha 5] | ū | u |
U+0650 U+06CC | ◌ِﯽ | ◌ِﯿ | اِﯾ | ◌ِی | iː, eː[lower-alpha 5] | ī | ī | ī | i, ē[lower-alpha 5] | ī | i |
U+064E U+0648 | ◌َﻮ | ◌َﻮ | اَو | ◌َو | ow~aw[lower-alpha 5] | au | aw | ow | ow, aw[lower-alpha 5] | ow | ow |
U+064E U+06CC | ◌َﯽ | ◌َﯿ | اَﯾ | ◌َی | ej~aj[lower-alpha 5] | ai | ay | ey | ey, ay[lower-alpha 5] | ey | ey |
U+064E U+06CC | ◌ﯽ | — | — | ◌ی | –e, –je | –e, –ye | –i, –yi | –e, –ye | –e, –ye | –e, –ye | –e, –ye |
U+06C0 | ◌ﮥ | — | — | ◌ﮤ | –je | –ye | –’i | –ye | –ye | –ye | –ye |
Notes:
- [lower-alpha 1] Used as a vowel as well.
- [lower-alpha 2] Not transliterated at the beginning of words.
- [lower-alpha 3] Dot below may be equally used instead of cedilla.
- [lower-alpha 4] At the beginning of words the combination ⟨خو⟩ was pronounced /xw/ or /xʷ/ in Classical Persian. In modern varieties the glide /ʷ/ has been lost, though the spelling has not been changed. It may be still heard in Dari as a relict pronunciation. The combination /xʷa/ was changed to /xo/ (see below).
- [lower-alpha 5] In Dari.
- [lower-alpha 6] Not transliterated at the end of words.
- [lower-alpha 7] In the combination ⟨یة⟩ at the end of words.
- [lower-alpha 8] When used instead of ⟨ت⟩ at the end of words.
- [lower-alpha 9] Diacritical signs (harakat) are rarely written.
- [lower-alpha 10] After ⟨خ⟩ from the earlier /xʷa/. Often transliterated as xwa or xva. E.g. خور /xor/ "sun" was /xʷar/ in Classical Persian.
- [lower-alpha 11] After vowels.
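The consonant table above can be read as a lookup keyed by letter and scheme. The following is a minimal sketch under that reading, not an implementation of any of the standards: it covers only a handful of rows copied from the table, and the names `SCHEMES`, `VARYING`, `SHARED`, and `romanize_letters` are hypothetical.

```python
# Column order of the schemes in the comparison table.
SCHEMES = ("DMG (1969)", "ALA-LC (1997)", "BGN/PCGN (1958)",
           "EI (1960)", "EI (2012)", "UN (1967)", "UN (2012)")

# A few rows that list one value per scheme, copied from the table.
VARYING = {
    "چ": ("č", "ch", "ch", "č", "č", "ch", "č"),
    "خ": ("ḫ", "kh", "kh", "k͟h", "ḵ", "kh", "x"),
    "ژ": ("ž", "zh", "zh", "z͟h", "ž", "zh", "ž"),
    "ش": ("š", "sh", "sh", "s͟h", "š", "sh", "š"),
    "غ": ("ġ", "gh", "gh", "g͟h", "ḡ", "gh", "q"),
}

# A few rows whose single value is shared by every scheme.
SHARED = {"ب": "b", "پ": "p", "ت": "t", "د": "d", "ر": "r",
          "ف": "f", "ل": "l", "م": "m", "ن": "n"}


def romanize_letters(text, scheme):
    """Map each covered letter to its value in the given scheme.

    Letters outside this partial table are passed through unchanged.
    """
    column = SCHEMES.index(scheme)
    out = []
    for ch in text:
        if ch in VARYING:
            out.append(VARYING[ch][column])
        else:
            out.append(SHARED.get(ch, ch))
    return "".join(out)


print(romanize_letters("شب", "DMG (1969)"))       # šb
print(romanize_letters("شب", "BGN/PCGN (1958)"))  # shb
```

The output is only a consonant skeleton: short vowels are normally unwritten in the Perso-Arabic script, so a letter-by-letter lookup cannot supply them without additional information about the word.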
Pre-Islamic period
In the pre-Islamic period Old and Middle Persian employed various scripts including Old Persian cuneiform, Pahlavi and Avestan scripts. For each period there are established transcriptions and transliterations by prominent linguists.[13][18][19][20][21]
IPA | Old Persian[lower-roman 1][lower-roman 2] | Middle Persian (Pahlavi)[lower-roman 1] | Avestan[lower-roman 1] |
---|---|---|---|
Consonants | |||
p | p | ||
f | f | ||
b | b | ||
β~ʋ~w | — | β | β/w |
t | t | t, t̰ | |
θ | θ/ϑ | ||
d | d | ||
ð | — | (δ) | δ |
θr | ç/ϑʳ | θʳ/ϑʳ | |
s | s | ||
z | z | ||
ʃ | š | š, š́, ṣ̌ | |
ʒ | ž | ||
c~tʃ | c/č | ||
ɟ~dʒ | j/ǰ | ||
k | k | ||
x | x | x, x́ | |
xʷ | xʷ/xᵛ | ||
g | g | g, ġ | |
ɣ | ɣ/γ | ||
h | h | ||
m | m | m, m̨ | |
ŋ | — | ŋ, ŋʷ | |
ŋʲ | — | ŋ́ | |
n | n | n, ń, ṇ | |
r | r | ||
l | l | ||
w~ʋ~v | v | w | v |
j | y | y, ẏ | |
Vowels | |||
Short | |||
a | a | ||
ã | — | ą, ą̇ | |
ə | — | ə | |
e | — | (e) | e |
i | i | ||
o | — | (o) | o |
u | u | ||
Long | |||
aː | ā | ||
ɑː~ɒː | — | å/ā̊ | |
əː | — | ə̄ |
eː | — | ē |
iː | ī | ||
oː | — | ō | |
uː | ū |
Notes:
Other romanization schemes
Bahá'í Persian romanization
Bahá'ís use a system standardized by Shoghi Effendi, which he initiated in a general letter on March 12, 1923.[22] The Bahá'í transliteration scheme was based on a standard adopted by the Tenth International Congress of Orientalists which took place in Geneva in September 1894. Shoghi Effendi changed some details of the Congress's system, most notably in the use of digraphs in certain cases (e.g. sh instead of š), and in incorporating the solar letters when writing the definite article al- (Arabic: ال) according to pronunciation (e.g. ar-Rahim, as-Saddiq, instead of al-Rahim, al-Saddiq).
A detailed introduction to the Bahá'í Persian romanization can usually be found at the back of a Bahá'í scripture.
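The treatment of the definite article described above can be sketched as a small rewriting rule. This is an illustration under stated assumptions, not the Bahá'í system itself: it assumes plain (non-underlined) digraphs, uses the standard list of Arabic sun letters in romanized form, and the name `assimilate_article` is hypothetical.

```python
import unicodedata

# Romanized onsets of the Arabic sun (solar) letters. Digraphs come first
# so that "sh" is matched before "s" (assumption: plain digraphs).
_RAW_ONSETS = ("th", "dh", "sh", "t", "d", "r", "z", "s",
               "ṣ", "ḍ", "ṭ", "ẓ", "l", "n")
SUN_ONSETS = tuple(unicodedata.normalize("NFC", o) for o in _RAW_ONSETS)


def assimilate_article(name):
    """Rewrite "al-X" according to pronunciation when X starts with a sun letter."""
    name = unicodedata.normalize("NFC", name)
    if not name.lower().startswith("al-"):
        return name
    rest = name[3:]
    for onset in SUN_ONSETS:
        if rest.lower().startswith(onset):
            # Keep the original case of the article's first letter.
            return name[0] + onset + "-" + rest
    return name  # moon letter: the article stays "al-"


print(assimilate_article("al-Rahim"))   # ar-Rahim
print(assimilate_article("al-Saddiq"))  # as-Saddiq
```

Names that begin with a non-solar letter, such as one starting with q or k, are returned unchanged by this sketch.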
ASCII Internet romanizations
It is common to write Persian using only the letters of the English alphabet, especially when commenting on weblogs or when sending SMS messages from cellphones.
Tajik Latin alphabet
The Tajik language, or Tajik Persian, is a variety of the Persian language. It was written in the Tajik SSR in a standardized Latin script from 1926 until the late 1930s, when the script was officially changed to Cyrillic. In addition, Tajik phonology differs slightly from that of Persian in Iran. As a result of these two factors, romanization schemes for the Tajik Cyrillic script follow rather different principles.[23]
A a | B ʙ | C c | Ç ç | D d | E e | F f | G g | Ƣ ƣ | H h | I i | Ī ī |
/a/ | /b/ | /tʃ/ | /dʒ/ | /d/ | /e/ | /f/ | /ɡ/ | /ʁ/ | /h/ | /i/ | /ˈi/ |
J j | K k | L l | M m | N n | O o | P p | Q q | R r | S s | Ş ş | T t |
/j/ | /k/ | /l/ | /m/ | /n/ | /o/ | /p/ | /q/ | /ɾ/ | /s/ | /ʃ/ | /t/ |
U u | Ū ū | V v | X x | Z z | Ƶ ƶ | ' | |||||
/u/ | /ɵ/ | /v/ | /χ/ | /z/ | /ʒ/ | /ʔ/ |
See also
- Persian alphabet
- Persian language tutorial books for beginners
- Persian phonology
- Romanization of Arabic
References
- [1] Joachim, Martin D. (1993). Languages of the world: cataloging issues and problems. New York: Haworth Press. p. 137. ISBN 1560245204.
- [2] http://geonames.ncc.org.ir/HomePage.aspx?TabID=4583&Site=geonames.ncc.org&Lang=en-US
- [3] http://www.eki.ee/wgrs/rom1_fa.htm
- [4] OpenOffice Spell Checker Dictionary. Based on Alefbā-ye 2om.
- [5] Mozilla Spell Checker Dictionary. Based on Alefbā-ye 2om.
- [6] Virastlive. A Google Chrome extension for transcription of Persian texts based on Alefbā-ye 2om.
- [7] Alefbā-ye 2om. A parallel script for the Persian language.
- [8] Divan of Hafez. A transcription based on Alefbā-ye 2om.
- [9] Rubaiyat of Omar Khayyam. A transcription based on Alefbā-ye 2om.
- [10] Pedersen, Thomas T. "Persian (Farsi)" (PDF). http://transliteration.eki.ee/.
- [11] "Persian" (PDF). The Library of Congress.
- [12] "Romanization system for Persian (Dari and Farsi). BGN/PCGN 1958 System" (PDF).
- [13] "Transliteration". Encyclopædia Iranica.
- [14] "Persian" (PDF). UNGEGN.
- [15] Toponymic Guidelines for map and other editors – Revised edition 1998. Working Paper No. 41. Submitted by the Islamic Republic of Iran. UNGEGN, 20th session. New York, 17–28 January 2000.
- [16] New Persian Romanization System. E/CONF.101/118/Rev.1*. Tenth United Nations Conference on the Standardization of Geographical Names. New York, 31 July – 9 August 2012.
- [17] http://www.eki.ee/wgrs/rom1_fa.htm
- [18] Bartholomae, Christian (1904). Altiranisches Wörterbuch. Strassburg. p. XXIII.
- [19] Kent, Roland G. (1950). Old Persian. New Haven, Connecticut. pp. 12–13.
- [20] MacKenzie, D. N. (1971). "Transcription". A Concise Pahlavi Dictionary. London.
- [21] Hoffmann, Karl; Forssman, Bernhard (1996). Avestische Laut- und Flexionslehre. Innsbruck. pp. 41–44. ISBN 3-85124-652-7.
- [22] Effendi, Shoghi (1974). Bahá'í Administration. Wilmette, Illinois, USA: Bahá'í Publishing Trust. p. 43. ISBN 0-87743-166-3.
- [23] Pedersen, Thomas T. "Tajik" (PDF). http://transliteration.eki.ee/.
External links
- Comparison of DMG, UN, ALA-LC, BGN/PCGN, EI, ISO 233-3 transliterations
- Transliteration of Non-Roman Scripts
- A PARALLEL SCRIPT FOR THE PERSIAN LANGUAGE
- Virastlive, a Google Chrome transcription extension for Persian texts based on Alefbā-ye 2om
- Iranian Committee on the Standardization of Geographical Names (ICSGN)