SeaLLMs 3: Open Foundation and Chat Multilingual Large Language Models for Southeast Asian Languages
Abstract
Large Language Models (LLMs) have shown remarkable abilities across various tasks, yet their development has predominantly centered on high-resource languages like English and Chinese, leaving low-resource languages underserved. To address this disparity, we present SeaLLMs 3, the latest iteration of the SeaLLMs model family, tailored for Southeast Asian languages. This region, characterized by its rich linguistic diversity, has lacked adequate language technology support. SeaLLMs 3 aims to bridge this gap by covering a comprehensive range of languages spoken in this region, including English, Chinese, Indonesian, Vietnamese, Thai, Tagalog, Malay, Burmese, Khmer, Lao, Tamil, and Javanese. Leveraging efficient language enhancement techniques and a specially constructed instruction tuning dataset, SeaLLMs 3 significantly reduces training costs while maintaining high performance and versatility. Our model excels in tasks such as world knowledge, mathematical reasoning, translation, and instruction following, achieving state-of-the-art performance among similarly sized models. Additionally, we prioritize safety and reliability by addressing both general and culture-specific considerations and incorporate mechanisms to reduce hallucinations. This work underscores the importance of inclusive AI, showing that advanced LLM capabilities can benefit underserved linguistic and cultural communities.
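For readers who want to try the chat model, below is a minimal sketch of loading a SeaLLMs 3 checkpoint with the Hugging Face `transformers` library and prompting it in one of the covered languages. The repository ID `SeaLLMs/SeaLLMs-v3-7B-Chat` and the Vietnamese example prompt are assumptions for illustration only; check the SeaLLMs organization page for the exact model names and recommended settings.

```python
# Minimal sketch: load a SeaLLMs 3 chat checkpoint and run a multilingual prompt.
# NOTE: the model ID below is an assumption -- verify it on the Hugging Face Hub.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "SeaLLMs/SeaLLMs-v3-7B-Chat"  # hypothetical repository ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# A Vietnamese prompt, illustrating the multilingual chat capability.
messages = [
    {"role": "user", "content": "Xin chào! Bạn có thể giới thiệu về Việt Nam không?"}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```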
Community
Awesome work! It's great to see LLMs embracing diverse languages. Does DAMO have plans to develop models for other languages as well?
Hey Adina! Yes, we are always interested in developing models for more languages, especially those underrepresented ones. Do you have any suggestions?
Very impressive work!
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- LLMs Beyond English: Scaling the Multilingual Capability of LLMs with Cross-Lingual Feedback (2024)
- YuLan: An Open-source Large Language Model (2024)
- Adapting LLMs to Hebrew: Unveiling DictaLM 2.0 with Enhanced Vocabulary and Instruction Capabilities (2024)
- IrokoBench: A New Benchmark for African Languages in the Age of Large Language Models (2024)
- Raw Text is All you Need: Knowledge-intensive Multi-turn Instruction Tuning for Large Language Model (2024)
Models citing this paper: 8
Datasets citing this paper: 0