Commit 724ffa0

Update README
1 parent 5ea375b commit 724ffa0

1 file changed: +21 -21 lines changed

README.md

@@ -2,7 +2,7 @@
 [![Build Status](http://ci.sparse.tech/api/badges/sparsetech/translit-scala/status.svg)](http://ci.sparse.tech/sparsetech/translit-scala)
 [![Maven Central](https://img.shields.io/maven-central/v/tech.sparse/translit-scala_2.12.svg)](http://search.maven.org/#search%7Cga%7C1%7Cg%3A%22tech.sparse%22%20AND%20a%3A%22translit-scala_2.12%22)
 
-translit-scala is a transliteration library for Scala and Scala.js. It implements transliteration rules for Slavic languages. It supports converting texts from the Latin to the Cyrillic alphabet.
+translit-scala is a transliteration library for Scala and Scala.js. It implements transliteration rules for Slavic languages. It supports converting texts from the Latin to the Cyrillic alphabet and vice-versa.
 
 ## Compatibility
 | Back end | Scala versions |
@@ -52,7 +52,7 @@ We decompose letters in their Latin transliteration more consistently than Natio
 * Volodymyr (Володимир)
 * blyz'ko (близько)
 
-The Latin letter *y* is also the phonetic basis of four letters in the Slavic alphabet: я, є, ї, ю. They get transliterated accordingly:
+The Latin letter *y* is also the phonetic basis of four letters (iotated vowels) in the Ukrainian alphabet: я, є, ї, ю. They get transliterated accordingly:
 
 * ya → я
 * ye → є
@@ -63,20 +63,23 @@ Unlike National 2010, we always use the same transliteration regardless of the p
 
 The accented counterpart of и is й and is represented by a separate letter, *j*.
 
-*Example:* Zhurs'kyj (Згурський)
+*Example:* Zgurs'kyj (Згурський)
 
 #### Soft Signs and Apostrophes
 The second change to National 2010 is that we try to restore soft signs and apostrophes:
 
 * Ukrayins'kyj (Український), malen'kyj (маленький)
 * m'yaso (м'ясо), matir'yu (матір'ю)
 
+In National 2010, *g* is mapped to *ґ* which is phonetically accurate, though the letter is fairly uncommon in Ukrainian. Therefore, *ґ* is represented by *g'*.
+
 This feature is experimental and can be disabled by setting `apostrophes` to `false`.
 
 #### Convenience mappings
 Another modification was to provide the following mappings:
 
 * c → ц
+* h → х
 * q → щ
 * w → ш
 * x → ж
@@ -91,9 +94,7 @@ Note that these mappings are phonetically inaccurate. However, using them still
 * Another advantage is the proximity on the English keyboard layout:
 * *q* and *w* are located next to each other; *ш* and *щ* characters are phonetically close
 * *z* and *x* are located next to each other; *з* and *ж* characters are phonetically close
-
-#### Precedence
-The replacement patterns are applied sequentially by traversing the input character-by-character. In some cases, a rule spanning multiple characters should not be applied. An example is the word: схильність. The transliteration of *сх* corresponds to two separate letters *s* and *h*, which would map to *ш*. To prevent this, one can place a vertical bar between the two characters. The full transliteration then looks as follows: *s|hyl'nist*
+* *h* is mapped to *х* since it is a common letter, *kh* is only needed in case *h* is ambiguous
 
 ## Russian
 The Russian rules are similar to the Ukrainian ones.
@@ -105,12 +106,6 @@ Some differences are:
 * Soft sign: *'* for ь
 * Hard sign: *`* for ъ
 
-### Precedence
-As with the Ukrainian rules, a vertical bar can be placed to avoid certain rules from being applied.
-
-* красивые: krasivy|e
-* сходить: s|hodit
-
 ### Mapping
 | Latin | Cyrillic |
 |-------|----------|
@@ -141,7 +136,7 @@ As with the Ukrainian rules, a vertical bar can be placed to avoid certain rules
 | y | ы |
 | z | з |
 | ' | ь |
-| " | ъ |
+| \` | ъ |
 | ch | ч |
 | sh | ш |
 | ya | я |
@@ -151,15 +146,20 @@ As with the Ukrainian rules, a vertical bar can be placed to avoid certain rules
 | yu | ю |
 | shch | щ |
 
-#### Examples
-| Russian | Transliterated |
-|---------|----------------|
-| Привет | Privet |
-| Съел | S"el |
-| Щётка | Shchyotka |
-| Льдина | L'dina |
+### Examples
+| Russian | Transliterated |
+|----------|----------------|
+| Привет | Privet |
+| Съел | S\`el |
+| Щётка | Shchyotka |
+| Льдина | L'dina |
+| красивые | krasivye |
+| сходить | skhodit' |
+
+## Internals
+The replacement patterns are applied sequentially by traversing the input character-by-character. The functions `latinToCyrillicIncremental` and `cyrillicToLatinIncremental` take the left context which is needed for some rules. The result indicates the number of characters to remove and a replacement string.
 
 ## Credits
 The rules and examples were adapted from the following libraries:
 
 * [translit-english-ukrainian](https://github.com/MarkovSergii/translit-english-ukrainian)
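
The Internals paragraph added in this commit says that each incremental step receives the left context and returns a number of characters to remove plus a replacement string. The following is a minimal, self-contained sketch of that idea in Scala. It is not the library's code: the object `IncrementalSketch`, the functions `step` and `convert`, and the small rule tables are hypothetical names chosen for illustration, and the real signatures of `latinToCyrillicIncremental` and `cyrillicToLatinIncremental` may differ.

```scala
object IncrementalSketch {
  // Toy subset of single-letter rules (assumption, not the library's tables).
  private val single: Map[Char, String] = Map(
    'a' -> "а", 'c' -> "ц", 'h' -> "х", 's' -> "с",
    'u' -> "у", 'y' -> "ы", 'z' -> "з"
  )

  // Toy subset of two-letter rules that need one character of left context.
  private val digraphs: Map[String, String] = Map(
    "ch" -> "ч", "sh" -> "ш", "ya" -> "я", "yu" -> "ю"
  )

  /** Hypothetical step function.
    * `left` is the Latin input consumed so far, `next` is the incoming character.
    * Returns (number of already-emitted characters to remove, replacement string).
    */
  def step(left: String, next: Char): (Int, String) =
    left.lastOption.flatMap(prev => digraphs.get(s"$prev$next")) match {
      // The previous letter was emitted on its own; retract it and emit the digraph.
      case Some(cyrillic) => (1, cyrillic)
      // No multi-letter rule applies; map the letter or pass it through unchanged.
      case None => (0, single.getOrElse(next, next.toString))
    }

  /** Drives `step` over the whole input, one character at a time. */
  def convert(latin: String): String =
    latin.indices.foldLeft("") { (out, i) =>
      val (remove, replacement) = step(latin.take(i), latin(i))
      out.dropRight(remove) + replacement
    }
}
```

With these toy rules, `IncrementalSketch.step("s", 'h')` yields `(1, "ш")`: the previously emitted *с* is retracted and replaced by *ш*. Returning a removal count rather than rewriting the whole output lets a caller, such as a text field that transliterates while the user types, patch only the tail of what is already displayed.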
