jackJessada SuperkingbasSKB committed on
Commit 78e9114 • 1 Parent(s): 701e828

update readme.md (#2)


- update readme.md (a4de955b458a2110d5ea06509a1b12db0dccdf5c)


Co-authored-by: Pakawat Phasook <SuperkingbasSKB@users.noreply.huggingface.co>

Files changed (1)
1. README.md +8 -8
README.md CHANGED
@@ -16,17 +16,17 @@ tags:
 - code
 - legal
 ---
-# OpenThaiLLM-DoodNiLT-Instruct: Thai & Chinese Large Language Model (Instruct)
-**OpenThaiLLM-DoodNiLT-Instruct** is a 7-billion-parameter instruct model designed for the Thai 🇹🇭 & Chinese 🇨🇳 languages.
-It demonstrates competitive performance with GPT-3.5-turbo and llama-3-typhoon-v1.5-8b-instruct, and is optimized for application use cases, Retrieval-Augmented Generation (RAG),
-constrained generation, and reasoning tasks. It is based on Qwen2-7B.
+# OpenThaiLLM-Prebuilt: Thai, Chinese & English Large Language Model
+**OpenThaiLLM-Prebuilt** is a 7-billion-parameter instruct model designed for the Thai 🇹🇭 & Chinese 🇨🇳 languages.
+It demonstrates strong results on Thai and multilingual benchmarks, and is optimized for application use cases: Retrieval-Augmented Generation (RAG), web deployment,
+constrained generation, and reasoning tasks. It is based on Qwen2.5-7B.
 ## Introduction
 
-Qwen2 is the new series of Qwen large language models. For Qwen2, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters, including a Mixture-of-Experts model. This repo contains the instruction-tuned 7B Qwen2 model.
+Qwen2.5 is the new series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters, including a Mixture-of-Experts model. This repo contains the instruction-tuned 7B Qwen2.5 model.
 
 Compared with state-of-the-art open-source language models, including the previously released Qwen1.5, Qwen2 has generally surpassed most open-source models and demonstrated competitiveness against proprietary models across a series of benchmarks targeting language understanding, language generation, multilingual capability, coding, mathematics, reasoning, etc.
 
-Qwen2-7B-Instruct supports a context length of up to 131,072 tokens, enabling the processing of extensive inputs. Please refer to [this section](#processing-long-texts) for detailed instructions on how to deploy Qwen2 for handling long texts.
+Qwen2.5-7B-Instruct supports a context length of up to 131,072 tokens, enabling the processing of extensive inputs. Please refer to [this section](#processing-long-texts) for detailed instructions on how to deploy Qwen2.5 for handling long texts.
 
 For more details, please refer to our [blog](https://qwenlm.github.io/blog/qwen2/), [GitHub](https://github.com/QwenLM/Qwen2), and [Documentation](https://qwen.readthedocs.io/en/latest/).
 <br>
@@ -81,9 +81,9 @@ print(response)
 ```
 
 ## Evaluation Performance Few-shot (5-shot)
-| Model | ONET | IC | TGAT | TPAT-1 | A-Level | Average (ThaiExam) | M3Exam (1 shot) | MMLU |
-| :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
-| DoodNiLT-7B | **0.5185** | **0.6421** | **0.6461** | **0.4224** | **0.3937** | **0.5245** | **0.5355** | 0.6644 |
+| Model | ONET | IC | TGAT | TPAT-1 | A-Level | Average (ThaiExam) | MMLU | M3Exam (1 shot) | M6Exam (5-shot) |
+| :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
+| OpenThaiLLM-Prebuilt-7B | **0.5493** | **0.6315** | **0.6307** | **0.4655** | **0.37** | **0.5294** | **0.7054** | **0.5705** | **0.596** |
 | llama-3-typhoon-v1.5-8b | 0.3765 | 0.3473 | 0.5538 | 0.4137 | 0.2913 | 0.3965 | 0.4312 | 0.6451 |
 | OpenThaiGPT-1.0.0-7B | 0.3086 | 0.3052 | 0.4153 | 0.3017 | 0.2755 | 0.3213 | 0.255 | 0.3512 |
 | Meta-Llama-3.1-8B | 0.3641 | 0.2631 | 0.2769 | 0.3793 | 0.1811 | 0.2929 | 0.4239 | 0.6591 |
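The README's quickstart block is elided from this diff (only its closing fence and `print(response)` appear as context above), but it follows the standard Qwen2.5 chat-template flow. For reference, a minimal inference sketch in that style; the repo id is a placeholder, since the diff does not show the actual one:

```python
# Minimal sketch: chat inference with a Qwen2.5-based instruct model.
# ASSUMPTION: "nectec/OpenThaiLLM-Prebuilt-7B" is a placeholder repo id;
# substitute the model's actual Hugging Face id.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nectec/OpenThaiLLM-Prebuilt-7B"

model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "สวัสดีครับ ช่วยแนะนำประเทศไทยหน่อย"},  # "Hello, please introduce Thailand"
]
# Qwen-style chat models ship a chat template with the tokenizer;
# it renders the role structure the instruct tuning expects.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated = model.generate(**inputs, max_new_tokens=256)
# Drop the prompt tokens before decoding so only the reply is printed.
output_ids = generated[0][inputs.input_ids.shape[1]:]
response = tokenizer.decode(output_ids, skip_special_tokens=True)
print(response)
```

Prompts should go through `apply_chat_template` rather than raw string concatenation, so the special tokens and role markers match what the model saw during instruction tuning.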
 
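On the 131,072-token context mentioned above: the Qwen2 documentation's recipe for inputs beyond 32,768 tokens is to enable YaRN rope scaling in the checkpoint's `config.json`. A sketch of that edit, assuming the same recipe carries over to this Qwen2.5-based model (verify the values against the linked documentation and model card):

```python
# Sketch: enable YaRN rope scaling for long inputs (>32,768 tokens) by
# editing the checkpoint's config.json, following the Qwen2 long-text recipe.
# ASSUMPTION: these values mirror the Qwen2-7B-Instruct docs; confirm they
# apply to this model before deploying.
import json

with open("config.json") as f:
    config = json.load(f)

config["rope_scaling"] = {
    "type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
}

with open("config.json", "w") as f:
    json.dump(config, f, indent=2)
```

Note that static YaRN scaling applies to all inputs, so it can degrade quality on short texts; the Qwen docs suggest enabling it only when long contexts are actually needed.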
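On the 5-shot numbers in the table: few-shot evaluation prepends worked exemplars to each test question and parses the model's next answer. A minimal sketch of how such a prompt is typically assembled; the exemplars are hypothetical placeholders, not actual ONET/IC/TGAT/TPAT-1/A-Level items:

```python
# Sketch: assembling an n-shot prompt of the kind used for ThaiExam-style
# multiple-choice benchmarks. ASSUMPTION: exemplars below are invented
# placeholders for illustration only.
EXEMPLARS = [
    ("Question: 2 + 3 = ?\nChoices: A. 4  B. 5  C. 6", "B"),
    # ...four more (question, answer) pairs would make this a 5-shot prompt
]

def build_few_shot_prompt(exemplars, test_question):
    """Concatenate solved exemplars, then the unsolved test question."""
    parts = [f"{q}\nAnswer: {a}" for q, a in exemplars]
    parts.append(f"{test_question}\nAnswer:")
    return "\n\n".join(parts)

prompt = build_few_shot_prompt(
    EXEMPLARS, "Question: 7 - 4 = ?\nChoices: A. 2  B. 3  C. 4"
)
print(prompt)  # the model's next token(s) after "Answer:" are scored as its choice
```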