How transliteration works
Transliteration lets a search find attribute values when the query is entered in transliterated (Latin) form. It works for simple string attributes and array attributes. Searching for the attribute value in Cyrillic continues to work as well.
How transliteration is performed:
Filtering is performed by the analysis-icu plugin for OpenSearch, which uses the ICU library (International Components for Unicode, icu.unicode.org).
The following text transformation rule is used (the rule syntax is described at https://unicode-org.github.io/icu/userguide/transforms/general/): Any-Latin; NFD; [:Nonspacing Mark:] Remove; NFC
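For orientation, the sketch below shows how a rule like this is typically wired into OpenSearch index settings through the `icu_transform` token filter of the analysis-icu plugin. The filter and analyzer names (`translit_filter`, `translit_analyzer`), the `standard` tokenizer, and the `lowercase` filter are assumptions made for illustration; the product's actual analyzer configuration is not shown in this document.

```python
import json

# The transform rule quoted in the text above.
ICU_RULE = "Any-Latin; NFD; [:Nonspacing Mark:] Remove; NFC"

# Hypothetical names: "translit_filter" and "translit_analyzer" are placeholders,
# and the tokenizer/lowercase choices are assumptions, not the product's settings.
index_settings = {
    "settings": {
        "analysis": {
            "filter": {
                "translit_filter": {
                    "type": "icu_transform",  # provided by the analysis-icu plugin
                    "id": ICU_RULE,
                }
            },
            "analyzer": {
                "translit_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "translit_filter"],
                }
            },
        }
    }
}

# Print the JSON body that would be sent when creating an index.
print(json.dumps(index_settings, ensure_ascii=False, indent=2))
```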
You can check how a text is transliterated here: https://icu4c-demos.unicode.org/icu-bin/translit
Insert the rule given above into the Compound 1 field.
Enter the text you want to check in the Input field.
Click Transform – the transliterated text will be displayed in the Output 1 field.
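The last three stages of the rule (NFD; [:Nonspacing Mark:] Remove; NFC) can also be reproduced locally with Python's standard library. This is only a sketch of those stages: it does not perform the Any-Latin step, which needs ICU itself, and the function name `strip_nonspacing_marks` and the sample strings are illustrative.

```python
import unicodedata


def strip_nonspacing_marks(text: str) -> str:
    """Apply NFD, drop nonspacing marks (category Mn), then recompose with NFC."""
    decomposed = unicodedata.normalize("NFD", text)
    without_marks = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    return unicodedata.normalize("NFC", without_marks)


# Accented Latin letters lose their diacritics; plain letters are unchanged.
print(strip_nonspacing_marks("čaj"))    # caj
print(strip_nonspacing_marks("žuk"))    # zuk
print(strip_nonspacing_marks("plain"))  # plain
```

This is why the result contains only unaccented Latin letters: any diacritics that appear after conversion to Latin script are removed by the mark-stripping stage.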
Cyrillic to Latin transliteration table
| Cyrillic | Latin | Cyrillic | Latin | Cyrillic | Latin |
|---|---|---|---|---|---|
| а | a | м | m | щ | s |
| б | b | н | n | ъ | " |
| в | v | о | o | ы | y |
| г | g | п | p | ь | ' |
| д | d | р | r | э | e |
| е | e | с | s | ю | u |
| ё | e | т | t | я | a |
| ж | z | у | u |  |  |
| з | z | ф | f |  |  |
| и | i | х | h |  |  |
| й | j | ц | c |  |  |
| к | k | ч | c |  |  |
| л | l | ш | s |  |  |
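The table can be turned into a small lookup for quick local checks. The sketch below is an approximation built only from the table: it covers lowercase letters, passes every other character through unchanged, and does not call ICU. The names `TABLE` and `transliterate` are illustrative.

```python
import unicodedata

# Lowercase Cyrillic letters and their Latin counterparts, taken from the table.
CYRILLIC = "абвгдеёжзийклмнопрстуфхцчшщъыьэюя"
LATIN = "abvgdeezzijklmnoprstufhccss\"y'eua"

# NFC keeps "ё" and "й" as single code points so both strings have equal length.
TABLE = str.maketrans(unicodedata.normalize("NFC", CYRILLIC), LATIN)


def transliterate(text: str) -> str:
    """Replace each lowercase Cyrillic letter with its Latin value from the table."""
    return unicodedata.normalize("NFC", text).translate(TABLE)


print(transliterate("москва"))  # moskva
print(transliterate("щука"))    # suka

# A Latin query matches a stored Cyrillic value through its transliterated form.
stored_value = "москва"
query = "moskva"
print(transliterate(stored_value) == query)  # True
```

Note that several Cyrillic letters share one Latin letter (for example ш and щ both become s), so different original values can have the same transliterated form.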
Choosing the sorting method
When sorting by an attribute that has transliteration enabled, the original value is used by default. Sorting by the transliterated value can be selected instead in the attribute settings in the data model.
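The difference between the two sorting methods can be seen on a small list. The sketch below uses Python's `sorted` and a lookup built from the table above; it is illustrative only, and the exact ordering produced by the search index may differ in details such as case handling.

```python
import unicodedata

# Lowercase Cyrillic letters and their Latin counterparts, taken from the table.
CYRILLIC = "абвгдеёжзийклмнопрстуфхцчшщъыьэюя"
LATIN = "abvgdeezzijklmnoprstufhccss\"y'eua"
TABLE = str.maketrans(unicodedata.normalize("NFC", CYRILLIC), LATIN)

values = ["яблоко", "банан", "арбуз"]

# Default: sort by the original value.
by_original = sorted(values)

# Optional: sort by the transliterated value ("abloko", "banan", "arbuz").
by_transliterated = sorted(values, key=lambda v: v.translate(TABLE))

print(by_original)        # ['арбуз', 'банан', 'яблоко']
print(by_transliterated)  # ['яблоко', 'арбуз', 'банан']
```

Because я is transliterated as a, a value that starts with я moves from the end of the list to the beginning when sorting by the transliterated value.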
Impact on matching
The original (not transliterated) attribute values are used for matching.
Impact on attribute uniqueness
To check the uniqueness of attributes, the original (not transliterated) attribute values are used.
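A short sketch of why this matters: two different original values can share one transliterated form, yet they remain distinct for the uniqueness check. The mapping below is only an excerpt of the table covering the letters used in the example, and `duplicates` is a hypothetical helper, not part of the product.

```python
# Excerpt of the transliteration table: only the letters used below.
EXCERPT = str.maketrans({"ш": "s", "щ": "s", "и": "i", "т": "t"})


def duplicates(values):
    """Return the values that repeat an earlier value in the list."""
    seen, repeated = set(), []
    for value in values:
        if value in seen:
            repeated.append(value)
        seen.add(value)
    return repeated


names = ["шит", "щит"]

# Uniqueness is checked on the original values: no conflict.
print(duplicates(names))  # []

# The transliterated forms coincide, but they are not used for the check.
print(duplicates([n.translate(EXCERPT) for n in names]))  # ['sit']
```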
Enabling the transliteration support option on existing attributes
When this option is enabled on existing attributes, a reindexing operation with data cleaning and mapping update must be performed on the affected registries/directories. Which ones are affected depends on the attribute whose option was changed: for a registry attribute, that registry; for a directory attribute, that directory; for a link attribute, the registries at both ends of the link; for an attribute of a nested object, all registries that use this nested object.
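The rule for choosing what to reindex can be written as a small decision function. This is a sketch of the rule as stated above, not product code: the function `targets_to_reindex`, its parameters, and the sample names (`person`, `company`, `order`) are all hypothetical.

```python
def targets_to_reindex(kind, owner=None, link_ends=(), used_by=()):
    """Return the registries/directories to reindex after the option changes.

    kind      -- "registry", "directory", "link", or "nested" (hypothetical labels)
    owner     -- registry or directory that owns the attribute
    link_ends -- registries at both ends of the link (for a link attribute)
    used_by   -- registries that use the nested object (for a nested attribute)
    """
    if kind in ("registry", "directory"):
        return {owner}
    if kind == "link":
        return set(link_ends)
    if kind == "nested":
        return set(used_by)
    raise ValueError(f"unknown attribute kind: {kind!r}")


print(targets_to_reindex("registry", owner="person"))
print(sorted(targets_to_reindex("link", link_ends=("person", "company"))))
print(sorted(targets_to_reindex("nested", used_by=["person", "company", "order"])))
```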