Publications
Google Scholar | DBLP | Semantic Scholar
Journal
- Mao, J., Ding, C., Kaing, H., Tanaka, H., Utiyama, M., & Matsumoto, T. (2025). Data Augmentation for Low-Resource Languages in Multilingual Dependency Parsing. Journal of Natural Language Processing, 32(1), 219–251. [Paper]
- Kaing, H., Ding, C., Utiyama, M., Sumita, E., Sudoh, K., & Nakamura, S. (2021). Constituency Parsing by Cross-Lingual Delexicalization. IEEE Access, 9, 141571–141578. [Paper]
- Kaing, H., Ding, C., Utiyama, M., Sumita, E., Sam, S., Seng, S., Sudoh, K., & Nakamura, S. (2021). Towards tokenization and part-of-speech tagging for Khmer: Data and discussion. Transactions on Asian and Low-Resource Language Information Processing (TALLIP), 20(6), 1–16. [Paper]
- Kann, B., Chay-intr, T., Kaing, H., & Theeramunkong, T. (2019). Khmer Treebank Construction via Interactive Tree Visualization. International Journal of Information Technology and Electrical Engineering (IJITEE), 3(3), 67–74. [Paper]
International Conference (Peer-Reviewed)
- Dabre, R., Kaing, H., & Song, H. (2025). BYTF: How Good Are Byte Level N-Gram F-Scores for Automatic Machine Translation Evaluation? MT Summit. (to appear)
- Tran, V.-H., Dabre, R., Kaing, H., Song, H., Tanaka, H., & Utiyama, M. (2025). Exploiting Word Sense Disambiguation in Large Language Models for Machine Translation. Proceedings of the First Workshop on Language Models for Low-Resource Languages, 135–144. [Paper]
- Kaing, H., Dabre, R., Song, H., Tran, V.-H., Tanaka, H., & Utiyama, M. (2025). PrahokBART: A Pre-trained Sequence-to-Sequence Model for Khmer Natural Language Generation. Proceedings of the 31st International Conference on Computational Linguistics, 1309–1322. [Paper]
- Joshi, A., Kanojia, D., Lent, H., Kaing, H., & Song, H. (2025). Connecting Ideas in ‘Lower-Resource’ Scenarios: NLP for National Varieties, Creoles and Other Low-resource Scenarios. Proceedings of the 2025 International Conference on Computational Linguistics (COLING 2025). [Paper] [Slides] (tutorial)
- Mao, J., Ding, C., Kaing, H., Tanaka, H., Utiyama, M., & Matsumoto, T. (2024). Overcoming Early Saturation on Low-Resource Languages in Multilingual Dependency Parsing. Proceedings of the Joint Workshop on Multiword Expressions and Universal Dependencies (MWE-UD) @ LREC-COLING 2024, 63–69. [Paper]
- Song, H., Kaing, H., & Dabre, R. (2024). Linguistically Motivated Neural Machine Translation. The 25th Annual Conference of the European Association for Machine Translation (EAMT 2024). [Slides] (tutorial)
- Kaing, H., Ding, C., Tanaka, H., & Utiyama, M. (2024). Robust Neural Machine Translation for Abugidas by Glyph Perturbation. Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 2: Short Papers), 311–318. [Paper]
- Mao, J., Ding, C., Kaing, H., Tanaka, H., Utiyama, M., & Matsumoto, T. (2023). Improving Zero-Shot Dependency Parsing by Unsupervised Learning. Proceedings of the 37th Pacific Asia Conference on Language, Information and Computation (PACLIC), 217–226. [Paper]
- Kaing, H., Ding, C., Sudoh, K., Utiyama, M., Sumita, E., & Nakamura, S. (2021). Multi-Source Cross-Lingual Constituency Parsing. Proceedings of the 18th International Conference on Natural Language Processing (ICON), 341–346. [Paper]
- Mon, A. M., Ding, C., Kaing, H., Soe, K. M., Utiyama, M., & Sumita, E. (2020). A Myanmar (Burmese)-English named entity transliteration dictionary. Proceedings of the Twelfth Language Resources and Evaluation Conference (LREC), 2980–2983. [Paper]
- Marie, B., Kaing, H., Mon, A. M., Ding, C., Fujita, A., Utiyama, M., & Sumita, E. (2019). Supervised and unsupervised machine translation for Myanmar-English and Khmer-English. Proceedings of the 6th Workshop on Asian Translation (WAT), 68–75. [Paper]
- Besacier, L., Lecouteux, B., Luong, N.-Q., Kaing, H., & Salah, M. H. (2014). Word confidence estimation for speech translation. International Workshop on Spoken Language Translation (IWSLT). [Paper]
Domestic Conference (Non Peer-Reviewed)
- Kaing, H., Song, H., Ding, C., Mao, J., Tanaka, H., & Utiyama, M. (2025). Towards Scene Text Translation for Complex Writing Systems. 言語処理学会 第31回年次大会.
- Kaing, H., Ding, C., Song, H., Mao, J., Tanaka, H., & Utiyama, M. (2024). Robust Neural Machine Translation for Abugidas by Glyph Perturbation. 言語処理学会 第30回年次大会.
- Kaing, H., Ding, C., Utiyama, M., Sumita, E., & Chea, V. (2016). Improving English-to-Khmer statistical machine translation using part-of-speech information. Proc. of KNLP.
- Kaing, H., Kak, S., Chea, V., & Sam, S. (2017). OCR Post-Processing for Khmer language: Error detection using Conditional Random Field. Proc. of ONA.