---
license: llama2
datasets:
- yentinglin/traditional_mandarin_instructions
language:
- zh
widget:
 - text: "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: 你好,請問你可以幫我寫一封推薦信嗎? ASSISTANT:"
library_name: transformers
pipeline_tag: text-generation
---
<img src="https://cdn-uploads.huggingface.co/production/uploads/5df9c78eda6d0311fd3d541f/CmusIT5OlSXvFrbTJ7l-C.png" alt="Taiwan LLM Logo" width="800" style="margin-left:auto; margin-right:auto; display:block"/>

# 🌟 Check out the [Taiwan-LLM Demo Chat-UI](http://www.twllm.com) 🌟

# Model Card for Taiwan LLM 13B v0.0 chat

Taiwan LLM is a language model tailored for Traditional Chinese, with a focus on the linguistic and cultural contexts of Taiwan.
Built on a large base model, it is enriched with diverse Taiwanese textual sources and refined through supervised fine-tuning.
The model is designed for language understanding and generation that aligns closely with Taiwan's cultural nuances.
It shows improved performance on benchmarks such as TC-Eval, reflecting its contextual comprehension and cultural relevance.
For details on Taiwan LLM's development and features, refer to our [technical report](https://github.com/MiuLab/Taiwan-LLaMa/blob/main/twllm_paper.pdf).


## Model description

- **Model type:** A 13B parameter GPT-like model fine-tuned on a mix of publicly available, synthetic datasets.
- **Language(s) (NLP):** Primarily Traditional Chinese (zh-tw)
- **Finetuned from model:** [meta-llama/Llama-2-13b-chat-hf](https://huggingface.co/meta-llama/Llama-2-13b-chat-hf)

### Model Sources

- **Repository:** https://github.com/MiuLab/Taiwan-LLaMa
- **Demo:** https://twllm.com/

## Performance


![image/png](https://cdn-uploads.huggingface.co/production/uploads/5df9c78eda6d0311fd3d541f/HTwIzw6RDha2-PhuWqSuI.png)

## Intended uses

Here's how you can run the model using the `pipeline()` function from 🤗 Transformers:

```python
# pip install "transformers>=4.34"  (quoted so the shell does not treat >= as a redirect)
# pip install accelerate

import torch
from transformers import pipeline

pipe = pipeline("text-generation", model="yentinglin/Taiwan-LLaMa-v0.0", torch_dtype=torch.bfloat16, device_map="auto")

# We use the tokenizer's chat template to format each message - see https://huggingface.co/docs/transformers/main/en/chat_templating
messages = [
    {
        "role": "system",
        "content": "你是一個人工智慧助理",
    },
    {"role": "user", "content": "東北季風如何影響台灣氣候?"},
]
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```
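The widget entry in this card's metadata uses a plain-text prompt layout: a fixed system sentence followed by `USER: ... ASSISTANT:`. The sketch below rebuilds that single-turn layout by hand, which can be useful when calling the model without `apply_chat_template`. The helper name `build_prompt` is hypothetical, and whether the tokenizer's chat template produces exactly this string is an assumption that has not been verified here.

```python
# System sentence copied from the widget text in this card's metadata.
SYSTEM_PROMPT = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)


def build_prompt(user_message: str, system_prompt: str = SYSTEM_PROMPT) -> str:
    """Return a single-turn prompt in the 'USER: ... ASSISTANT:' layout.

    Hypothetical helper: only the single-turn layout shown in the widget
    example is reproduced; multi-turn formatting is not covered.
    """
    return f"{system_prompt} USER: {user_message.strip()} ASSISTANT:"


if __name__ == "__main__":
    # The resulting string can be passed to pipe(...) in place of `prompt`.
    print(build_prompt("東北季風如何影響台灣氣候?"))
```

The returned string ends with `ASSISTANT:` so that generation continues with the assistant's reply.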

### Training hyperparameters

![image/png](https://cdn-uploads.huggingface.co/production/uploads/5df9c78eda6d0311fd3d541f/MdvHwdUvH-c926qyRAw7K.png)

![image/png](https://cdn-uploads.huggingface.co/production/uploads/5df9c78eda6d0311fd3d541f/kKpkvxDzOEyiAoTqmzRYO.png)


![image/png](https://cdn-uploads.huggingface.co/production/uploads/5df9c78eda6d0311fd3d541f/FsnlJ_fkRxf7fn5RKZnjE.png)

The following hyperparameters were used during training:
- learning_rate: 5e-05
- distributed_type: multi-GPU
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.03
- num_epochs: 5.0
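
To make the scheduler settings above concrete, here is a minimal sketch of how the learning rate could evolve under a cosine schedule with a peak of 5e-05 and a warmup ratio of 0.03. It assumes linear warmup from zero and decay to zero, which is a common convention but is not stated in this card. The function name `cosine_lr` and the total step count are hypothetical.

```python
import math


def cosine_lr(step: int, total_steps: int,
              base_lr: float = 5e-5, warmup_ratio: float = 0.03) -> float:
    """Learning rate at `step` for linear warmup followed by cosine decay.

    Assumptions (not confirmed by the card): warmup is linear from 0,
    and the cosine curve decays to 0 at `total_steps`.
    """
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    total = 1000  # hypothetical number of optimizer steps
    for s in (0, 15, 30, 500, 1000):
        print(s, cosine_lr(s, total))
```

With a warmup ratio of 0.03, only the first 3% of steps are spent ramping up; the rest of training follows the cosine curve.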

## Citation

If you find Taiwan LLM useful in your work, please cite it with:

```
@misc{lin2023taiwan,
      title={Taiwan LLM: Bridging the Linguistic Divide with a Culturally Aligned Language Model}, 
      author={Yen-Ting Lin and Yun-Nung Chen},
      year={2023},
      eprint={2311.17487},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```