---
license: llama3.1
library_name: transformers
pipeline_tag: text-generation
base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
language:
- en
- zh
tags:
- llama-factory
- orpo
---

![](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ)

# QuantFactory/Llama3.1-8B-Chinese-Chat-GGUF

This is a quantized version of [shenzhi-wang/Llama3.1-8B-Chinese-Chat](https://huggingface.co/shenzhi-wang/Llama3.1-8B-Chinese-Chat), created using llama.cpp.

# Original Model Card

> [!CAUTION]
> For optimal performance, we refrain from fine-tuning the model's identity. Thus, inquiries such as "Who are you" or "Who developed you" may yield random responses that are not necessarily accurate.

> [!IMPORTANT]
> If you enjoy our model, please **give it a star on our Hugging Face repo** and kindly [**cite our model**](https://huggingface.co/shenzhi-wang/Llama3.1-8B-Chinese-Chat#citation). Your support means a lot to us. Thank you!

# Updates

- 🚀🚀🚀 [July 24, 2024] We now introduce [shenzhi-wang/Llama3.1-8B-Chinese-Chat](https://huggingface.co/shenzhi-wang/Llama3.1-8B-Chinese-Chat)! The training dataset contains >100K preference pairs, and the model exhibits significant enhancements, especially in **roleplay**, **function calling**, and **math** capabilities!
- 🔥 We provide the official **q4_k_m, q8_0, and f16 GGUF** versions of Llama3.1-8B-Chinese-Chat-**v2.1** at https://huggingface.co/shenzhi-wang/Llama3.1-8B-Chinese-Chat/tree/main/gguf!

# Model Summary

Llama3.1-8B-Chinese-Chat is an instruction-tuned language model built upon Meta-Llama-3.1-8B-Instruct for Chinese & English users, with abilities such as roleplaying and tool use.

Developers: [Shenzhi Wang](https://shenzhi-wang.netlify.app)\*, [Yaowei Zheng](https://github.com/hiyouga)\*, Guoyin Wang (in.ai), Shiji Song, Gao Huang. (\*: Equal Contribution)

- License: [Llama-3.1 License](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B/blob/main/LICENSE)
- Base Model: Meta-Llama-3.1-8B-Instruct
- Model Size: 8.03B
- Context length: 128K (reported by the [Meta-Llama-3.1-8B-Instruct model](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct); untested for our Chinese model)

# 1. Introduction

This is the first model specifically fine-tuned for Chinese & English users based on the [Meta-Llama-3.1-8B-Instruct model](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct). The fine-tuning algorithm used is ORPO [1].

[1] Hong, Jiwoo, Noah Lee, and James Thorne. "Reference-free Monolithic Preference Optimization with Odds Ratio." arXiv preprint arXiv:2403.07691 (2024).

Training framework: [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory).

Training details:

- epochs: 3
- learning rate: 3e-6
- learning rate scheduler type: cosine
- warmup ratio: 0.1
- cutoff length (i.e., context length): 8192
- ORPO beta (i.e., $\lambda$ in the ORPO paper): 0.05
- global batch size: 128
- fine-tuning type: full parameters
- optimizer: paged_adamw_32bit
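
The ORPO objective adds an odds-ratio preference penalty, weighted by the beta above (0.05), to the usual supervised loss on the chosen response. A minimal numeric sketch of that penalty, assuming sequence-level average log-probabilities (the values below are purely illustrative, not from training):

```python
import math

def orpo_penalty(logp_chosen: float, logp_rejected: float) -> float:
    """Odds-ratio term: -log sigmoid(log(odds(chosen) / odds(rejected)))."""
    def log_odds(logp: float) -> float:
        p = math.exp(logp)
        return logp - math.log(1.0 - p)  # log(p / (1 - p))

    z = log_odds(logp_chosen) - log_odds(logp_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-z)))  # -log sigmoid(z)

lam = 0.05                 # the "orpo beta" listed above
nll_chosen = 1.2           # hypothetical average NLL of the chosen response
penalty = orpo_penalty(math.log(0.4), math.log(0.1))
loss = nll_chosen + lam * penalty
```

When the chosen and rejected responses are equally likely the penalty is log 2, and it shrinks toward 0 as the model prefers the chosen response more strongly.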

# 2. Usage

## 2.1 Usage of Our BF16 Model

1. Upgrade the `transformers` package to a version that supports Llama3.1 models; the version we are using is `4.43.0`.

2. Use the following Python script to download our BF16 model:

```python
from huggingface_hub import snapshot_download

# Download our BF16 model, skipping the GGUF files.
snapshot_download(repo_id="shenzhi-wang/Llama3.1-8B-Chinese-Chat", ignore_patterns=["*.gguf"])
```

3. Inference with the BF16 model:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "/Your/Local/Path/to/Llama3.1-8B-Chinese-Chat"
dtype = torch.bfloat16

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="cuda",
    torch_dtype=dtype,
)

chat = [
    # "Write a poem about machine learning."
    {"role": "user", "content": "写一首关于机器学习的诗。"},
]
input_ids = tokenizer.apply_chat_template(
    chat, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(
    input_ids,
    max_new_tokens=8192,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
# Strip the prompt tokens and decode only the generated response.
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
```
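
For intuition, `apply_chat_template` expands the message list into Llama 3.1's prompt format before tokenization. A rough sketch of the rendered string, assuming this model keeps the stock Llama 3.1 chat template (this is an approximation for illustration, not the tokenizer's actual template source):

```python
def render_llama31_prompt(messages: list[dict]) -> str:
    """Approximate the Llama 3.1 chat template as a plain string."""
    parts = ["<|begin_of_text|>"]
    for m in messages:
        # Each turn: role header, blank line, content, end-of-turn token.
        parts.append(
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n{m['content']}<|eot_id|>"
        )
    # add_generation_prompt=True appends an open assistant header,
    # so the model continues from there.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = render_llama31_prompt([{"role": "user", "content": "Hello"}])
```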

## 2.2 Usage of Our GGUF Models

1. Download our GGUF models from the [gguf_models folder](https://huggingface.co/shenzhi-wang/Llama3.1-8B-Chinese-Chat/tree/main/gguf);
2. Use the GGUF models with [LM Studio](https://lmstudio.ai/);
3. Alternatively, follow the instructions at https://github.com/ggerganov/llama.cpp/tree/master#usage to use the GGUF models.

# Citation

If our Llama3.1-8B-Chinese-Chat is helpful, please kindly cite it as:

```
@misc{shenzhi_wang_2024,
  author    = {Wang, Shenzhi and Zheng, Yaowei and Wang, Guoyin and Song, Shiji and Huang, Gao},
  title     = {Llama3.1-8B-Chinese-Chat},
  year      = 2024,
  url       = {https://huggingface.co/shenzhi-wang/Llama3.1-8B-Chinese-Chat},
  doi       = {10.57967/hf/2779},
  publisher = {Hugging Face}
}
```