asyafiqe committed on
Commit
c95d99a
1 Parent(s): cc338fc

Upload 21 files

Merak-7B-v2.ggmlv3.q2_K.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:61df1704d3931a41c39f4cae1fd61b098c2e743b199457b39543a953282e6276
3
+ size 2866807424
Merak-7B-v2.ggmlv3.q3_K.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:dd685a94e7eb2c675a2cc7dd551e9a27f19972a34cf4edea5102d847296c3b93
3
+ size 3282248320
Merak-7B-v2.ggmlv3.q3_K_L.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:c7e34d9e2a45eb437c2dcffaf8b2e9f083de7f3df990c381eff3972008b8dfdb
3
+ size 3596821120
Merak-7B-v2.ggmlv3.q3_K_M.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:dd685a94e7eb2c675a2cc7dd551e9a27f19972a34cf4edea5102d847296c3b93
3
+ size 3282248320
Merak-7B-v2.ggmlv3.q3_K_S.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ddc5dbd282f3421010d2ae25bb7592690d0d652fa58152a3fd610937ffd343ed
3
+ size 2948014720
Merak-7B-v2.ggmlv3.q4_0.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ef369ae99188fa31efa8ce8064dfb1f2569bc2f0b7d88df007954006e23d4c73
3
+ size 3825517184
Merak-7B-v2.ggmlv3.q4_1.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:2302c2d294ff540f8478c03b7dd3fe89ef2ff92c2dcab2b3be1225c9bafe90fc
3
+ size 4238459520
Merak-7B-v2.ggmlv3.q4_K.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a6f7833e2105ff45600090c6e1b151e12f31edf4e661690e69d2465920aaa3c7
3
+ size 4080714368
Merak-7B-v2.ggmlv3.q4_K_M.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a6f7833e2105ff45600090c6e1b151e12f31edf4e661690e69d2465920aaa3c7
3
+ size 4080714368
Merak-7B-v2.ggmlv3.q4_K_S.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:302fa7eb5f309f29e518749a3f212a9899c2718b5989e1806c1bd5b4922882b2
3
+ size 3825517184
Merak-7B-v2.ggmlv3.q5_0.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:30235f60a776ab67bab45bed8f4b59ac245060ba458a4a1c208636023764cdd7
3
+ size 4651401856
Merak-7B-v2.ggmlv3.q5_1.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a6da242005ae7e49fb1b673440d7487b06250872e9e39aa8f59502d2a19ebba6
3
+ size 5064344192
Merak-7B-v2.ggmlv3.q5_K.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d385dbd36822e630d2a283c8a45e18657cfa8cb689fda83173798c3bc1bc51a0
3
+ size 4782867072
Merak-7B-v2.ggmlv3.q5_K_M.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d385dbd36822e630d2a283c8a45e18657cfa8cb689fda83173798c3bc1bc51a0
3
+ size 4782867072
Merak-7B-v2.ggmlv3.q5_K_S.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:efd2ccf1ec630b9697d1d9cec399856f21653803f01954e7fead471a82ef2f8f
3
+ size 4651401856
Merak-7B-v2.ggmlv3.q6_K.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:daf9bd82304f1aad497c94b21ef18c5ff1b3e78060183fddbc5e56f0473d0c04
3
+ size 5528904320
Merak-7B-v2.ggmlv3.q8_0.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:be25a29674d376ec78da75ffad692920bd13f1d213b5e9ebba41ef7e366329ad
3
+ size 7129055872
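Each `.bin` entry above is a Git LFS pointer (spec version, SHA-256 `oid`, byte `size`) rather than the model file itself. The following is a minimal sketch of how a downloaded file could be checked against its pointer. The helper names `parse_lfs_pointer` and `verify_file` are hypothetical and not part of any tool mentioned in this commit.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid field has the form '<algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def verify_file(path: Path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file's size and digest both match the pointer."""
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Read in chunks so multi-gigabyte files are never held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer for Merak-7B-v2.ggmlv3.q2_K.bin, copied from the diff above.
Q2_K_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:61df1704d3931a41c39f4cae1fd61b098c2e743b199457b39543a953282e6276
size 2866807424
"""

# Example call, assuming the file has been downloaded to the working directory:
# verify_file(Path("Merak-7B-v2.ggmlv3.q2_K.bin"), parse_lfs_pointer(Q2_K_POINTER))
```

Comparing the pointers also shows which uploads are byte-identical: in this commit `q3_K` and `q3_K_M` share one `oid`, as do `q4_K` and `q4_K_M`, and `q5_K` and `q5_K_M`.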
Notice ADDED
@@ -0,0 +1 @@
1
+ Llama 2 is licensed under the LLAMA 2 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.
README.md CHANGED
@@ -1,3 +1,229 @@
1
  ---
2
  license: llama2
3
  ---
1
  ---
2
  license: llama2
3
+ model_type: llama
4
+ inference: false
5
+ datasets:
6
+ - wikipedia
7
+ language:
8
+ - id
9
+ - en
10
+ pipeline_tag: text-generation
11
+ tags:
12
+ - facebook
13
+ - meta
14
+ - pytorch
15
+ - llama
16
+ - llama-2
17
  ---
18
+ # MERAK-7B-V2 GGML
19
+ README adapted from [TheBloke](https://huggingface.co/TheBloke)
20
+
21
+ These files are GGML format model files for [MERAK-7B-V2](https://huggingface.co/Ichsan2895/Merak-7B-v2).
22
+
23
+ GGML files are for CPU + GPU inference using [llama.cpp](https://github.com/ggerganov/llama.cpp) and libraries and UIs which support this format, such as:
24
+ * [KoboldCpp](https://github.com/LostRuins/koboldcpp), a powerful GGML web UI with full GPU acceleration out of the box. Especially good for storytelling.
25
+ * [LoLLMS Web UI](https://github.com/ParisNeo/lollms-webui), a great web UI with GPU acceleration via the c_transformers backend.
26
+ * [LM Studio](https://lmstudio.ai/), a fully featured local GUI. Supports full GPU acceleration on macOS. Also supports Windows, without GPU acceleration.
27
+ * [text-generation-webui](https://github.com/oobabooga/text-generation-webui), the most popular web UI. Requires extra steps to enable GPU acceleration via the llama.cpp backend.
28
+ * [ctransformers](https://github.com/marella/ctransformers), a Python library with LangChain support and an OpenAI-compatible API server.
29
+ * [llama-cpp-python](https://github.com/abetlen/llama-cpp-python), a Python library with an OpenAI-compatible API server.
30
+
31
+ <!-- compatibility_ggml start -->
32
+ ## Compatibility
33
+
34
+ ### Original llama.cpp quant methods: `q4_0, q4_1, q5_0, q5_1, q8_0`
35
+
36
+ These are guaranteed to be compatible with any UIs, tools and libraries released since late May 2023. They may be phased out soon, as they are largely superseded by the new k-quant methods.
37
+
38
+ ### New k-quant methods: `q2_K, q3_K_S, q3_K_M, q3_K_L, q4_K_S, q4_K_M, q5_K_S, q5_K_M, q6_K`
39
+
40
+ These new quantisation methods are compatible with llama.cpp as of June 6th, 2023, commit `2d43387`.
41
+
42
+ They are now also compatible with recent releases of text-generation-webui, KoboldCpp, llama-cpp-python, ctransformers, rustformers and most others. For compatibility with other tools and libraries, please check their documentation.
43
+
44
+ ## Explanation of the new k-quant methods
45
+ <details>
46
+ <summary>Click to see details</summary>
47
+
48
+ The new methods available are:
49
+ * GGML_TYPE_Q2_K - "type-1" 2-bit quantization in super-blocks containing 16 blocks, each block having 16 weights. Block scales and mins are quantized with 4 bits. This ends up effectively using 2.5625 bits per weight (bpw).
50
+ * GGML_TYPE_Q3_K - "type-0" 3-bit quantization in super-blocks containing 16 blocks, each block having 16 weights. Scales are quantized with 6 bits. This ends up using 3.4375 bpw.
51
+ * GGML_TYPE_Q4_K - "type-1" 4-bit quantization in super-blocks containing 8 blocks, each block having 32 weights. Scales and mins are quantized with 6 bits. This ends up using 4.5 bpw.
52
+ * GGML_TYPE_Q5_K - "type-1" 5-bit quantization. Same super-block structure as GGML_TYPE_Q4_K, resulting in 5.5 bpw.
53
+ * GGML_TYPE_Q6_K - "type-0" 6-bit quantization. Super-blocks with 16 blocks, each block having 16 weights. Scales are quantized with 8 bits. This ends up using 6.5625 bpw.
54
+ * GGML_TYPE_Q8_K - "type-0" 8-bit quantization. Only used for quantizing intermediate results. The difference from the existing Q8_0 is that the block size is 256. All 2-6 bit dot products are implemented for this quantization type.
55
+
56
+ Refer to the Provided Files table below to see what files use which methods, and how.
57
+ </details>
58
+ <!-- compatibility_ggml end -->
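The bits-per-weight figures quoted for the k-quant types can be reproduced with simple arithmetic. The sketch below rests on one assumption that the text above does not state: each super-block also stores one 16-bit constant per scale kind (one for "type-0", two for "type-1", covering scale and min). Under that assumption the 3- to 6-bit figures come out exactly. Q2_K is left out because this accounting does not reproduce the quoted 2.5625 figure.

```python
from fractions import Fraction

SUPER_BLOCK_WEIGHTS = 256  # 16 blocks x 16 weights, or 8 blocks x 32 weights
FP16_BITS = 16             # assumed width of each per-super-block constant


def bits_per_weight(quant_bits: int, blocks: int, scale_bits: int, has_min: bool) -> Fraction:
    """Effective bits per weight for one k-quant super-block."""
    values_per_block = 2 if has_min else 1        # scale, plus min for "type-1"
    payload = SUPER_BLOCK_WEIGHTS * quant_bits    # the quantized weights themselves
    block_meta = blocks * scale_bits * values_per_block
    super_meta = FP16_BITS * values_per_block     # assumption, see the lead-in
    return Fraction(payload + block_meta + super_meta, SUPER_BLOCK_WEIGHTS)


# Layouts taken from the list above.
LAYOUTS = {
    "Q3_K": dict(quant_bits=3, blocks=16, scale_bits=6, has_min=False),
    "Q4_K": dict(quant_bits=4, blocks=8, scale_bits=6, has_min=True),
    "Q5_K": dict(quant_bits=5, blocks=8, scale_bits=6, has_min=True),
    "Q6_K": dict(quant_bits=6, blocks=16, scale_bits=8, has_min=False),
}

for name, layout in LAYOUTS.items():
    print(name, float(bits_per_weight(**layout)))
```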
59
+
60
+ ## Provided files
61
+ | Name | Quant method | Bits | Use case |
62
+ | ---- | ---- | ---- | ----- |
63
+ | Merak-7B-v2.ggmlv3.q2_K.bin | q2_K | 2 | New k-quant method. Uses GGML_TYPE_Q4_K for the attention.wv and feed_forward.w2 tensors, GGML_TYPE_Q2_K for the other tensors. |
64
+ | Merak-7B-v2.ggmlv3.q3_K_L.bin | q3_K_L | 3 | New k-quant method. Uses GGML_TYPE_Q5_K for the attention.wv, attention.wo, and feed_forward.w2 tensors, else GGML_TYPE_Q3_K |
65
+ | Merak-7B-v2.ggmlv3.q3_K_M.bin | q3_K_M | 3 | New k-quant method. Uses GGML_TYPE_Q4_K for the attention.wv, attention.wo, and feed_forward.w2 tensors, else GGML_TYPE_Q3_K |
66
+ | Merak-7B-v2.ggmlv3.q3_K_S.bin | q3_K_S | 3 | New k-quant method. Uses GGML_TYPE_Q3_K for all tensors |
67
+ | Merak-7B-v2.ggmlv3.q4_0.bin | q4_0 | 4 | Original quant method, 4-bit. |
68
+ | Merak-7B-v2.ggmlv3.q4_1.bin | q4_1 | 4 | Original quant method, 4-bit. Higher accuracy than q4_0 but not as high as q5_0. However, it has quicker inference than the q5 models. |
69
+ | Merak-7B-v2.ggmlv3.q4_K_M.bin | q4_K_M | 4 | New k-quant method. Uses GGML_TYPE_Q6_K for half of the attention.wv and feed_forward.w2 tensors, else GGML_TYPE_Q4_K |
70
+ | Merak-7B-v2.ggmlv3.q4_K_S.bin | q4_K_S | 4 | New k-quant method. Uses GGML_TYPE_Q4_K for all tensors |
71
+ | Merak-7B-v2.ggmlv3.q5_0.bin | q5_0 | 5 | Original quant method, 5-bit. Higher accuracy, higher resource usage and slower inference. |
72
+ | Merak-7B-v2.ggmlv3.q5_1.bin | q5_1 | 5 | Original quant method, 5-bit. Even higher accuracy, resource usage and slower inference. |
73
+ | Merak-7B-v2.ggmlv3.q5_K_M.bin | q5_K_M | 5 | New k-quant method. Uses GGML_TYPE_Q6_K for half of the attention.wv and feed_forward.w2 tensors, else GGML_TYPE_Q5_K |
74
+ | Merak-7B-v2.ggmlv3.q5_K_S.bin | q5_K_S | 5 | New k-quant method. Uses GGML_TYPE_Q5_K for all tensors |
75
+ | Merak-7B-v2.ggmlv3.q6_K.bin | q6_K | 6 | New k-quant method. Uses GGML_TYPE_Q6_K (6-bit quantization) for all tensors |
76
+ | Merak-7B-v2.ggmlv3.q8_0.bin | q8_0 | 8 | Original quant method, 8-bit. Almost indistinguishable from float16. High resource use and slow. Not recommended for most users. |
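
The table gives no file sizes, but the Git LFS pointers in this commit do. As a rough sketch, the sizes can be used to pick the largest quantisation that fits a memory budget. The `headroom` fraction reserved for context and runtime overhead is an assumed placeholder, not a measured value, and `largest_fitting` is a hypothetical helper.

```python
# File sizes in bytes, copied from the LFS pointers in this commit.
FILE_SIZES = {
    "q2_K": 2866807424,
    "q3_K_S": 2948014720,
    "q3_K_M": 3282248320,
    "q3_K_L": 3596821120,
    "q4_0": 3825517184,
    "q4_K_S": 3825517184,
    "q4_K_M": 4080714368,
    "q4_1": 4238459520,
    "q5_0": 4651401856,
    "q5_K_S": 4651401856,
    "q5_K_M": 4782867072,
    "q5_1": 5064344192,
    "q6_K": 5528904320,
    "q8_0": 7129055872,
}


def largest_fitting(budget_bytes: int, headroom: float = 0.2):
    """Return (quant, size) for the largest file within the budget, or None.

    `headroom` is the fraction of the budget kept free for the context
    buffers and the runtime itself (an assumed value, tune as needed).
    """
    limit = budget_bytes * (1.0 - headroom)
    fitting = [(size, name) for name, size in FILE_SIZES.items() if size <= limit]
    if not fitting:
        return None
    size, name = max(fitting)
    return name, size


print(largest_fitting(6 * 1024**3))   # budget of 6 GiB
print(largest_fitting(2 * 1024**3))   # budget too small for any file
```

File size is only a proxy for memory use; the accuracy and speed trade-offs in the table still apply when two files are close in size.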
77
+
78
+
79
+ ## How to run in `text-generation-webui`
80
+
81
+ Further instructions here: [text-generation-webui/docs/llama.cpp-models.md](https://github.com/oobabooga/text-generation-webui/blob/main/docs/llama.cpp-models.md).
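
Outside a web UI, the same files can be loaded from Python with llama-cpp-python, one of the libraries listed earlier. This is a minimal sketch, not a tested recipe. It assumes a llama-cpp-python release that still reads GGML v3 files (the format predates GGUF), a model file already downloaded to the working directory, and illustrative values for `n_ctx`, `n_gpu_layers`, and the stop sequence. The prompt markers are the ones used in the original model card's example.

```python
def build_prompt(question: str) -> str:
    """Wrap a question in the prompt markers from the original model card."""
    return f"<|prompt|>{question}\n<|answer|>"


def answer(model_path: str, question: str, max_tokens: int = 200) -> str:
    """Run one completion against a local GGML file (illustrative helper)."""
    # Imported lazily so build_prompt works without the library installed.
    from llama_cpp import Llama

    llm = Llama(
        model_path=model_path,
        n_ctx=2048,        # assumed context length for this sketch
        n_gpu_layers=0,    # CPU only; raise to offload layers to a GPU
    )
    result = llm(
        build_prompt(question),
        max_tokens=max_tokens,
        temperature=0.3,
        repeat_penalty=1.2,
        stop=["<|prompt|>"],  # assumed stop marker, not confirmed by the source
    )
    return result["choices"][0]["text"].strip()


# Example call, assuming the q4_K_M file sits next to the script:
# print(answer("Merak-7B-v2.ggmlv3.q4_K_M.bin",
#              "Siapa penulis naskah proklamasi kemerdekaan Indonesia?"))
```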
82
+
83
+ # Original model card: 6TH PROTOTYPE OF MERAK-7B-V2!
84
+
85
+ Merak-7B is a large language model for the Indonesian language.
85
+
86
+ This model is based on Meta Llama-2-7B-Chat-HF and fine-tuned on a selection of Indonesian Wikipedia articles that I cleaned beforehand.
87
+
88
+ By leveraging QLoRA (QLoRA: Efficient Finetuning of Quantized LLMs), Merak-7B is able to run with 16 GB of VRAM.
89
+
90
+ Licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), Merak-7B empowers AI enthusiasts and researchers alike.
91
+
92
+ Big thanks to all my friends and communities who helped build our first model. Feel free to ask me about the model, and please share the news on your social media.
94
+
95
+ ## HOW TO USE
96
+ ### Installation
97
+ Please make sure you have a CUDA driver, Python 3.10, and PyTorch 2 installed on your system. Then install these libraries from a terminal:
98
+ ```
99
+ pip install bitsandbytes==0.39.1
100
+ pip install transformers==4.31.0
101
+ pip install peft==0.4.0
102
+ pip install accelerate==0.20.3
103
+ pip install einops==0.6.1 scipy sentencepiece datasets
104
+ ```
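
Because the versions above are pinned exactly, it can help to confirm what is actually installed before loading the model. The sketch below uses only the standard library. `find_mismatches` is a hypothetical helper, and the optional `lookup` argument exists so the comparison can be exercised without the packages present.

```python
from importlib import metadata

# Pins copied from the pip commands above.
PINNED = {
    "bitsandbytes": "0.39.1",
    "transformers": "4.31.0",
    "peft": "0.4.0",
    "accelerate": "0.20.3",
    "einops": "0.6.1",
}


def installed_version(name: str):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned: dict, lookup=installed_version) -> dict:
    """Return {package: (expected, found)} for every pin that is not met."""
    mismatches = {}
    for name, expected in pinned.items():
        found = lookup(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


for package, (expected, found) in find_mismatches(PINNED).items():
    print(f"{package}: expected {expected}, found {found}")
```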
105
+ ### Using bitsandbytes (runs on a GPU with >= 10 GB VRAM)
106
+ [![Open in Google Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1Cl1tO1QIYNWHR8K-nQe6xIaUvaLwxXCq?usp=sharing)
+ ```
+ import torch
+ from transformers import AutoTokenizer, AutoConfig, AutoModelForCausalLM, BitsAndBytesConfig, LlamaTokenizer
+ from peft import PeftModel, PeftConfig
+
+ model_id = "Ichsan2895/Merak-7B-v2"
+ config = AutoConfig.from_pretrained(model_id)
+
+ # Load the weights in 4-bit NF4 with double quantization; compute in bfloat16.
+ BNB_CONFIG = BitsAndBytesConfig(load_in_4bit=True,
+                                 bnb_4bit_compute_dtype=torch.bfloat16,
+                                 bnb_4bit_use_double_quant=True,
+                                 bnb_4bit_quant_type="nf4",
+                                 )
+
+ model = AutoModelForCausalLM.from_pretrained(model_id,
+                                              quantization_config=BNB_CONFIG,
+                                              device_map="auto",
+                                              trust_remote_code=True)
+
+ tokenizer = LlamaTokenizer.from_pretrained(model_id)
+
+ def generate_response(question: str) -> str:
+     # Wrap the question in the prompt markers the model was fine-tuned with.
+     prompt = f"<|prompt|>{question}\n<|answer|>".strip()
+
+     encoding = tokenizer(prompt, return_tensors='pt').to("cuda")
+     with torch.inference_mode():
+         outputs = model.generate(input_ids=encoding.input_ids,
+                                  attention_mask=encoding.attention_mask,
+                                  eos_token_id=tokenizer.pad_token_id,
+                                  do_sample=False,
+                                  num_beams=2,
+                                  temperature=0.3,  # ignored while do_sample=False
+                                  repetition_penalty=1.2,
+                                  max_length=200)
+
+     response = tokenizer.decode(outputs[0], skip_special_tokens=True)
+
+     # Keep only the text that follows the answer marker.
+     assistant_start = "<|answer|>"
+     response_start = response.find(assistant_start)
+     return response[response_start + len(assistant_start) :].strip()
+
+ prompt = "Siapa penulis naskah proklamasi kemerdekaan Indonesia?"
+ print(generate_response(prompt))
+ ```
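
One detail in the post-processing is worth a note: `str.find` returns `-1` when the `<|answer|>` marker is missing from the decoded text, so the slice in `generate_response` would silently drop the first few characters of the output. A small sketch of a more defensive variant follows. `extract_answer` is a hypothetical helper, not part of the original model card.

```python
def extract_answer(decoded: str, marker: str = "<|answer|>") -> str:
    """Return the text after the first marker, or the whole text if absent."""
    _, found, tail = decoded.partition(marker)
    # partition() yields an empty separator when the marker does not occur.
    return tail.strip() if found else decoded.strip()


print(extract_answer("<|prompt|>Siapa?\n<|answer|> Soekarno dan Hatta "))
print(extract_answer("teks tanpa penanda"))
```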
151
+
152
+
153
+ ### In my experience, answers are better without bitsandbytes 4-bit quantization, at the cost of higher VRAM use
154
+ [![Open in Google Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1uUaeI4-Zzuk0m9Xjg1Dw45YZs402EgWz?usp=sharing)
+ ```
+ import torch
+ from transformers import AutoTokenizer, AutoConfig, AutoModelForCausalLM, BitsAndBytesConfig, LlamaTokenizer
+ from peft import PeftModel, PeftConfig
+
+ model_id = "Ichsan2895/Merak-7B-v2"
+ config = AutoConfig.from_pretrained(model_id)
+ # No quantization config here: the weights are loaded without 4-bit quantization.
+ model = AutoModelForCausalLM.from_pretrained(model_id,
+                                              device_map="auto",
+                                              trust_remote_code=True)
+
+ tokenizer = LlamaTokenizer.from_pretrained(model_id)
+
+ def generate_response(question: str) -> str:
+     # Wrap the question in the prompt markers the model was fine-tuned with.
+     prompt = f"<|prompt|>{question}\n<|answer|>".strip()
+
+     encoding = tokenizer(prompt, return_tensors='pt').to("cuda")
+     with torch.inference_mode():
+         outputs = model.generate(input_ids=encoding.input_ids,
+                                  attention_mask=encoding.attention_mask,
+                                  eos_token_id=tokenizer.pad_token_id,
+                                  do_sample=False,
+                                  num_beams=2,
+                                  temperature=0.3,  # ignored while do_sample=False
+                                  repetition_penalty=1.2,
+                                  max_length=200)
+
+     response = tokenizer.decode(outputs[0], skip_special_tokens=True)
+
+     # Keep only the text that follows the answer marker.
+     assistant_start = "<|answer|>"
+     response_start = response.find(assistant_start)
+     return response[response_start + len(assistant_start) :].strip()
+
+ prompt = "Siapa penulis naskah proklamasi kemerdekaan Indonesia?"
+ print(generate_response(prompt))
+ ```
191
+
192
+ ## CHANGELOG
193
+ **v1** = The first Merak-7B model. We selected and cleaned about 200k Indonesian Wikipedia articles.
194
+ **v2** = A fine-tuned version of the first Merak-7B model. We fine-tuned again on the same Indonesian Wikipedia articles, but changed the prompt style of the questions.
195
+
196
+ ## CITATION
197
+ ```
198
+ @article{touvron2023llama2,
199
+ author = {Touvron, Hugo and others},
200
+ title = {Llama 2: Open Foundation and Fine-Tuned Chat Models},
201
+ journal = {arXiv preprint arXiv:2307.09288},
202
+ year = {2023}
203
+ }
204
+
205
+ @ONLINE{wikidump,
206
+ author = "Wikimedia Foundation",
207
+ title = "Wikimedia Downloads",
208
+ url = "https://dumps.wikimedia.org"
209
+ }
210
+
211
+ @inproceedings{wolf-etal-2020-transformers,
212
+ title = "Transformers: State-of-the-Art Natural Language Processing",
213
+ author = "Thomas Wolf and Lysandre Debut and Victor Sanh and Julien Chaumond and Clement Delangue and Anthony Moi and Pierric Cistac and Tim Rault and Rémi Louf and Morgan Funtowicz and Joe Davison and Sam Shleifer and Patrick von Platen and Clara Ma and Yacine Jernite and Julien Plu and Canwen Xu and Teven Le Scao and Sylvain Gugger and Mariama Drame and Quentin Lhoest and Alexander M. Rush",
214
+ booktitle = "Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations",
215
+ month = oct,
216
+ year = "2020",
217
+ address = "Online",
218
+ publisher = "Association for Computational Linguistics",
219
+ url = "https://www.aclweb.org/anthology/2020.emnlp-demos.6",
220
+ pages = "38--45"
221
+ }
222
+
223
+ @article{dettmers2023qlora,
224
+ title = {QLoRA: Efficient Finetuning of Quantized LLMs},
225
+ author = {Dettmers, Tim and Pagnoni, Artidoro and Holtzman, Ari and Zettlemoyer, Luke},
226
+ journal = {arXiv preprint arXiv:2305.14314},
227
+ year = {2023}
228
+ }
229
+ ```
USE_POLICY.md ADDED
@@ -0,0 +1,50 @@
1
+ # Llama 2 Acceptable Use Policy
2
+
3
+ Meta is committed to promoting safe and fair use of its tools and features, including Llama 2. If you access or use Llama 2, you agree to this Acceptable Use Policy (“Policy”). The most recent copy of this policy can be found at [ai.meta.com/llama/use-policy](http://ai.meta.com/llama/use-policy).
4
+
5
+ ## Prohibited Uses
6
+ We want everyone to use Llama 2 safely and responsibly. You agree you will not use, or allow others to use, Llama 2 to:
7
+
8
+ 1. Violate the law or others’ rights, including to:
9
+ 1. Engage in, promote, generate, contribute to, encourage, plan, incite, or further illegal or unlawful activity or content, such as:
10
+ 1. Violence or terrorism
11
+ 2. Exploitation or harm to children, including the solicitation, creation, acquisition, or dissemination of child exploitative content or failure to report Child Sexual Abuse Material
12
+ 3. Human trafficking, exploitation, and sexual violence
13
+ 4. The illegal distribution of information or materials to minors, including obscene materials, or failure to employ legally required age-gating in connection with such information or materials.
14
+ 5. Sexual solicitation
15
+ 6. Any other criminal activity
16
+ 2. Engage in, promote, incite, or facilitate the harassment, abuse, threatening, or bullying of individuals or groups of individuals
17
+ 3. Engage in, promote, incite, or facilitate discrimination or other unlawful or harmful conduct in the provision of employment, employment benefits, credit, housing, other economic benefits, or other essential goods and services
18
+ 4. Engage in the unauthorized or unlicensed practice of any profession including, but not limited to, financial, legal, medical/health, or related professional practices
19
+ 5. Collect, process, disclose, generate, or infer health, demographic, or other sensitive personal or private information about individuals without rights and consents required by applicable laws
20
+ 6. Engage in or facilitate any action or generate any content that infringes, misappropriates, or otherwise violates any third-party rights, including the outputs or results of any products or services using the Llama 2 Materials
21
+ 7. Create, generate, or facilitate the creation of malicious code, malware, computer viruses or do anything else that could disable, overburden, interfere with or impair the proper working, integrity, operation or appearance of a website or computer system
22
+
23
+
24
+
25
+ 2. Engage in, promote, incite, facilitate, or assist in the planning or development of activities that present a risk of death or bodily harm to individuals, including use of Llama 2 related to the following:
26
+ 1. Military, warfare, nuclear industries or applications, espionage, use for materials or activities that are subject to the International Traffic Arms Regulations (ITAR) maintained by the United States Department of State
27
+ 2. Guns and illegal weapons (including weapon development)
28
+ 3. Illegal drugs and regulated/controlled substances
29
+ 4. Operation of critical infrastructure, transportation technologies, or heavy machinery
30
+ 5. Self-harm or harm to others, including suicide, cutting, and eating disorders
31
+ 6. Any content intended to incite or promote violence, abuse, or any infliction of bodily harm to an individual
32
+
33
+
34
+
35
+ 3. Intentionally deceive or mislead others, including use of Llama 2 related to the following:
36
+ 1. Generating, promoting, or furthering fraud or the creation or promotion of disinformation
37
+ 2. Generating, promoting, or furthering defamatory content, including the creation of defamatory statements, images, or other content
38
+ 3. Generating, promoting, or further distributing spam
39
+ 4. Impersonating another individual without consent, authorization, or legal right
40
+ 5. Representing that the use of Llama 2 or outputs are human-generated
41
+ 6. Generating or facilitating false online engagement, including fake reviews and other means of fake online engagement
42
+ 4. Fail to appropriately disclose to end users any known dangers of your AI system
43
+
44
+ Please report any violation of this Policy, software “bug,” or other problems that could lead to a violation of this Policy through one of the following means:
45
+
46
+ * Reporting issues with the model: [github.com/facebookresearch/llama](http://github.com/facebookresearch/llama)
47
+ * Reporting risky content generated by the model: [developers.facebook.com/llama_output_feedback](http://developers.facebook.com/llama_output_feedback)
48
+ * Reporting bugs and security concerns: [facebook.com/whitehat/info](http://facebook.com/whitehat/info)
49
+ * Reporting violations of the Acceptable Use Policy or unlicensed uses of Llama: [LlamaUseReport@meta.com](mailto:LlamaUseReport@meta.com)
50
+
config.json ADDED
@@ -0,0 +1,25 @@
1
+ {
2
+ "_name_or_path": "meta-llama/Llama-2-7b-chat-hf",
3
+ "architectures": [
4
+ "LlamaForCausalLM"
5
+ ],
6
+ "bos_token_id": 1,
7
+ "eos_token_id": 2,
8
+ "hidden_act": "silu",
9
+ "hidden_size": 4096,
10
+ "initializer_range": 0.02,
11
+ "intermediate_size": 11008,
12
+ "max_position_embeddings": 4096,
13
+ "model_type": "llama",
14
+ "num_attention_heads": 32,
15
+ "num_hidden_layers": 32,
16
+ "num_key_value_heads": 32,
17
+ "pretraining_tp": 1,
18
+ "rms_norm_eps": 1e-06,
19
+ "rope_scaling": null,
20
+ "tie_word_embeddings": false,
21
+ "torch_dtype": "float16",
22
+ "transformers_version": "4.32.0.dev0",
23
+ "use_cache": true,
24
+ "vocab_size": 32000
25
+ }
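
The values in `config.json` are enough to estimate the model's parameter count and its unquantised size. The sketch below assumes the standard Llama layout as implemented in `transformers` (`LlamaForCausalLM`): bias-free query/key/value/output projections, a three-matrix gated MLP, two RMSNorm weights per layer, and one final norm. `count_parameters` is a hypothetical helper.

```python
# Relevant fields copied from config.json above.
CONFIG = {
    "hidden_size": 4096,
    "intermediate_size": 11008,
    "num_attention_heads": 32,
    "num_hidden_layers": 32,
    "num_key_value_heads": 32,
    "vocab_size": 32000,
    "tie_word_embeddings": False,
}


def count_parameters(cfg: dict) -> int:
    """Estimate the parameter count of a Llama-style decoder from its config."""
    hidden = cfg["hidden_size"]
    # Key/value projections shrink when fewer KV heads than attention heads are used.
    kv_dim = hidden * cfg["num_key_value_heads"] // cfg["num_attention_heads"]
    attention = 2 * hidden * hidden + 2 * hidden * kv_dim  # q, o and k, v projections
    mlp = 3 * hidden * cfg["intermediate_size"]            # gate, up, down projections
    norms = 2 * hidden                                     # two RMSNorm weights per layer
    per_layer = attention + mlp + norms

    embeddings = cfg["vocab_size"] * hidden
    lm_head = 0 if cfg["tie_word_embeddings"] else embeddings
    final_norm = hidden
    return cfg["num_hidden_layers"] * per_layer + embeddings + lm_head + final_norm


total = count_parameters(CONFIG)
print(f"parameters: {total:,}")
print(f"float16 size: {total * 2 / 1024**3:.2f} GiB")
```

At two bytes per parameter this comes to roughly 12.5 GiB of weights in `float16`, which gives a sense of why the unquantised model needs more VRAM than the 4-bit variants and why the GGML files above are so much smaller.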