Romanization system of the dataset
#1, opened by Comet0322
Hello, I am curious about which Romanization system is used for Manchu in your dataset. I use the Möllendorff system, but I found that characters like ū, š, and ž cannot be tokenized properly.
Abkai Latin transliteration was used. Please refer to our paper for more details.
https://arxiv.org/pdf/2311.17492
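For anyone hitting the same tokenization problem, a quick diagnostic along these lines may help confirm which characters in your input are the likely culprits. This is only a sketch: the function name `non_ascii_report` is hypothetical, and it assumes the issue comes from characters outside ASCII (such as the Möllendorff diacritics) that a tokenizer trained on a different transliteration never saw. It normalizes to NFC first, since ū can also be encoded as `u` plus a combining macron. The actual Möllendorff-to-Abkai character mapping is not reproduced here and should be taken from the paper.

```python
import unicodedata


def non_ascii_report(text):
    """List each distinct non-ASCII character in `text` after NFC normalization.

    Returns (character, code point, Unicode name) tuples sorted by code point.
    Hypothetical helper for illustration; not part of the dataset's tooling.
    """
    # NFC merges "u" + combining macron (U+0304) into the single character ū.
    normalized = unicodedata.normalize("NFC", text)
    found = sorted({ch for ch in normalized if ord(ch) > 127})
    return [(ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")) for ch in found]


if __name__ == "__main__":
    # The three characters mentioned in the question above.
    for ch, code_point, name in non_ascii_report("ū š ž"):
        print(ch, code_point, name)
```

Any character reported here would need to be converted to its counterpart in the dataset's transliteration before tokenizing.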
Thank you. I will check it out.