---
language:
- en
license: llama3
library_name: transformers
tags:
- axolotl
- finetune
- facebook
- meta
- pytorch
- llama
- llama-3
- chatml
base_model: meta-llama/Meta-Llama-3-70B-Instruct
datasets:
- MaziyarPanahi/truthy-dpo-v0.1-axolotl
model_name: Llama-3-70B-Instruct-v0.1
pipeline_tag: text-generation
license_name: llama3
license_link: LICENSE
inference: false
model_creator: MaziyarPanahi
quantized_by: MaziyarPanahi
---

<img src="./llama-3-merges.webp" alt="Llama-3 DPO Logo" width="500" style="margin-left: auto; margin-right: auto; display: block;"/>

# MaziyarPanahi/Llama-3-70B-Instruct-v0.1

This model is a fine-tune of the `meta-llama/Meta-Llama-3-70B-Instruct` model. This version adds `<|im_start|>` and `<|im_end|>` as special tokens, so the ChatML markers are encoded as single tokens instead of taking up several tokens per turn.
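
As a quick sanity check (a minimal sketch; it assumes the published tokenizer ships with these added tokens), you can verify that each ChatML marker maps to a single token ID:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("MaziyarPanahi/Llama-3-70B-Instruct-v0.1")

# Each marker should encode to exactly one token ID if the special tokens were added.
for marker in ("<|im_start|>", "<|im_end|>"):
    ids = tokenizer.encode(marker, add_special_tokens=False)
    print(marker, ids)
```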

# ⚡ Quantized GGUF

All GGUF models are available here: [MaziyarPanahi/Llama-3-70B-Instruct-v0.1-GGUF](https://huggingface.co/MaziyarPanahi/Llama-3-70B-Instruct-v0.1-GGUF)
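
If you want a quantized build, here is a minimal sketch for fetching one of the GGUF files with `huggingface_hub`; the filename below is a guess at a typical quant name, so check the GGUF repo for the files it actually contains:

```python
from huggingface_hub import hf_hub_download

# Hypothetical filename: browse the GGUF repo to find the real quant variants.
gguf_path = hf_hub_download(
    repo_id="MaziyarPanahi/Llama-3-70B-Instruct-v0.1-GGUF",
    filename="Llama-3-70B-Instruct-v0.1.Q4_K_M.gguf",
)
print(gguf_path)
```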

# 🏆 [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)

Coming soon.

# Prompt Template

This model uses the `ChatML` prompt template:

```
<|im_start|>system
{System}
<|im_end|>
<|im_start|>user
{User}
<|im_end|>
<|im_start|>assistant
{Assistant}
```
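
For reference, the same layout can be produced from a message list with `tokenizer.apply_chat_template` (a sketch assuming the published tokenizer carries this ChatML chat template):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("MaziyarPanahi/Llama-3-70B-Instruct-v0.1")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]

# add_generation_prompt=True appends the opening `<|im_start|>assistant` header.
print(tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))
```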

# How to use

You can use this model by passing `MaziyarPanahi/Llama-3-70B-Instruct-v0.1` as the model name to Hugging Face's `transformers` library.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer, pipeline
import torch

model_id = "MaziyarPanahi/Llama-3-70B-Instruct-v0.1"

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
    # attn_implementation="flash_attention_2"
)

tokenizer = AutoTokenizer.from_pretrained(
    model_id,
    trust_remote_code=True
)

# Stream generated tokens to stdout as they arrive.
streamer = TextStreamer(tokenizer)

# Named `pipe` so it does not shadow the imported `pipeline` function.
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    streamer=streamer
)

# Then you can use the pipeline to generate text.

messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]

# Render the conversation with the model's chat template and append the assistant header.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

# Stop on either the regular EOS token or Llama-3's end-of-turn token.
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = pipe(
    prompt,
    max_new_tokens=2048,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.95,
)
# Print only the new completion, without echoing the prompt.
print(outputs[0]["generated_text"][len(prompt):])
```
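
In bfloat16, the 70B weights alone occupy roughly 140 GB, so they will not fit on a single consumer GPU. A common alternative is 4-bit loading with `bitsandbytes`; this is a minimal sketch of that option (it assumes `bitsandbytes` is installed), not part of the original recipe:

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
import torch

# 4-bit NF4 quantization: roughly a quarter of the bfloat16 memory footprint.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "MaziyarPanahi/Llama-3-70B-Instruct-v0.1",
    quantization_config=bnb_config,
    device_map="auto",
)
```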