ISO 11940: difference between revisions

* <big>{{lang|th|เชียงใหม่}}</big> is transliterated as '''''echīyngıh̄m̀''''' under ISO 11940 (cf. ''Chiang Mai'' under the [[Sistem Treuzskrivañ Boutin Taiek Real|Royal Thai General System of Transcription]])
 
==Variations==
===Causes===
The standard specifies the order in which the accents should be typed, but not all input systems record accents in the order in which they are typed. Unicode specifies two normalised forms for letters with multiple accents, and transliterated text is highly likely to be stored in one of these forms. This complicates automatic back-transliteration. As Unicode-compliant processes must handle such variations correctly, the transliterations on this page have been chosen for ease of display; present-day rendering systems may display equivalent forms differently.
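
The effect of normalisation on typing order can be illustrated with a minimal Python sketch using the standard <code>unicodedata</code> module. The helper name <code>same_after_normalisation</code> and the sample letter sequences are assumptions made for illustration; they are not taken from the standard.

```python
import unicodedata

MACRON = "\u0304"     # COMBINING MACRON, canonical combining class 230 (above)
GRAVE = "\u0300"      # COMBINING GRAVE ACCENT, class 230 (above)
DOT_BELOW = "\u0323"  # COMBINING DOT BELOW, class 220 (below)

# The same letter with the same two marks, typed in two different orders.
typed_above_first = "h" + MACRON + DOT_BELOW
typed_below_first = "h" + DOT_BELOW + MACRON


def same_after_normalisation(a: str, b: str, form: str = "NFC") -> bool:
    """Return True if the two strings are identical once normalised."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# Marks of different combining classes are reordered by normalisation,
# so the typing order is lost.
print(typed_above_first == typed_below_first)                          # raw code points differ
print(same_after_normalisation(typed_above_first, typed_below_first))  # equal once normalised

# Marks of the same combining class are never reordered,
# so their relative order survives normalisation.
print(same_after_normalisation("h" + MACRON + GRAVE, "h" + GRAVE + MACRON))
```

A back-transliteration tool would therefore need to accept any canonically equivalent ordering of a mark above and a mark below, while still treating the relative order of two marks above as significant.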
 
Many fonts display novel combinations of consonants and accents badly. For example, the Institute of the Estonian Language publishes on the web an explanation of the application of the standard to Thai, and with one exception this seems to comply with the standard. The exception is that, except for the macron, accents over consonants are actually offset to the right, giving the impression that they have been entered as the corresponding non-combining characters. The standard specifies the transliterations in codepoints, but someone working from this free explanation could easily deduce that the spacing forms of the tone accents should be used.
===ICU (CLDR 1.4.1)===
The [[Kedrannoù Etrebroadel evit Unicode|International Components for Unicode (ICU)]] implementation, recorded in Version 1.4.1 of the Common Locale Data Repository sponsored by [[Unicode]], uses a prime instead of a horn in the transliteration of consonants. This affects the transliteration of ฅ kho khon, ฒ tho phuthao and ษ so bo ruesi. ฏ to patak is also transliterated differently, as ''t̩'' rather than ''ṭ''.
 
This implementation transliterates ำ as ''ả'' instead of ''å'' to avoid ambiguity with the hypothetical Thai script sequence ะํ (sara a, nikkhahit). The ICU implementation transliterates ฺ phinthu as ˌ instead of ̥ to avoid problems with Unicode normalisation. This has the side effect of improving legibility when applied to an underdotted consonant.
 
The ICU implementation transliterates ฯ paiyannoi as ''‡'' (double dagger) and angkhankhu as ''||'' (two ASCII vertical bars). As the ICU implementation uses Unicode, it cannot reliably distinguish angkhandiao from paiyannoi without a semantic analysis, and makes no such attempt.
 
The character sequencing of the ICU implementation is different. It transposes preposed vowels with the following consonant, and processes the marks on a consonant in the order in which they are stored in memory. Most Thai input methods ensure that the marks are stored in bottom-to-top order.
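
The transposition step described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the ICU rule set: the function name <code>transpose_preposed_vowels</code> is hypothetical, the preposed vowels are taken to be U+0E40 to U+0E44, and the consonants are taken to be U+0E01 to U+0E2E.

```python
# Assumed character classes (illustrative, not copied from ICU or the standard).
PREPOSED_VOWELS = {chr(cp) for cp in range(0x0E40, 0x0E45)}  # เ แ โ ใ ไ


def is_thai_consonant(ch: str) -> bool:
    """True for ก (U+0E01) through ฮ (U+0E2E)."""
    return 0x0E01 <= ord(ch) <= 0x0E2E


def transpose_preposed_vowels(text: str) -> str:
    """Swap each preposed vowel with the consonant that follows it."""
    chars = list(text)
    i = 0
    while i < len(chars) - 1:
        if chars[i] in PREPOSED_VOWELS and is_thai_consonant(chars[i + 1]):
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip past the pair just swapped
        else:
            i += 1
    return "".join(chars)


# The vowel ไ now follows the consonant ท, matching the order "th" + vowel.
print(transpose_preposed_vowels("ไทย"))
```

After this reordering the Thai text can be transliterated strictly left to right, which is why the ICU output places the vowel after its consonant.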
 
For example, under this implementation {{lang|th|ภาษาไทย}} transliterates to p̣hās̄ʹāthịy and {{lang|th|เชียงใหม่}} to cheīyngh̄ım̀.
 
Finally, this implementation generates transliterations in [[Unicode normalization|Unicode Normalisation]] Form C (NFC).
 
== See also ==