---
language:
- en
- ko
pipeline_tag: text-generation
tags:
- facebook
- meta
- pytorch
- llama
- llama-3
- llama-3-ko
license: llama3
license_name: llama3
license_link: https://llama.meta.com/llama3/license
---

- The original model is [beomi/Llama-3-Open-Ko-8B](https://huggingface.co/beomi/Llama-3-Open-Ko-8B).
- Quantized with [llama.cpp](https://github.com/ggerganov/llama.cpp).
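
A typical llama.cpp quantization workflow looks roughly like this (a sketch only: the script and binary names have changed across llama.cpp versions, and the paths here are illustrative, not the exact commands used for this release):

```shell
# Convert the Hugging Face checkpoint to GGUF (FP16), then quantize to Q8_0.
python convert-hf-to-gguf.py ../Llama-3-Open-Ko-8B --outfile Llama-3-Open-Ko-8B-F16.gguf
./quantize Llama-3-Open-Ko-8B-F16.gguf Llama-3-Open-Ko-8B-Q8_0.gguf Q8_0
```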

## Ollama

Save the following as `Modelfile`:

```
FROM Llama-3-Open-Ko-8B-Q8_0.gguf

TEMPLATE """{{- if .System }}
<s>{{ .System }}</s>
{{- end }}
<s>Human:
{{ .Prompt }}</s>
<s>Assistant:
"""

SYSTEM """A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions."""

PARAMETER temperature 0
PARAMETER num_predict 3000
PARAMETER num_ctx 4096
PARAMETER stop <s>
PARAMETER stop </s>
```

Then build and run the model with `ollama create llama-3-open-ko -f Modelfile` and `ollama run llama-3-open-ko`.
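
For reference, the Go template above renders prompts roughly as in the Python mirror below (an approximation: `render_prompt` is a hypothetical helper, and the `{{-` markers in Go templates trim the preceding whitespace):

```python
def render_prompt(prompt, system=None):
    # Approximate Python mirror of the Ollama TEMPLATE above.
    # The "{{-" markers in the Go template trim preceding whitespace,
    # so the system block appears only when a system message is set.
    parts = []
    if system:
        parts.append(f"<s>{system}</s>")
    parts.append(f"<s>Human:\n{prompt}</s>")
    parts.append("<s>Assistant:\n")
    return "\n".join(parts)

print(render_prompt("안녕하세요?", system="You are a helpful assistant."))
```

The two `PARAMETER stop` lines make generation halt when the model emits the `<s>`/`</s>` turn markers.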

> Update @ 2024.04.24: Released the Llama-3-Open-Ko-8B model & [Llama-3-Open-Ko-8B-Instruct-preview](https://huggingface.co/beomi/Llama-3-Open-Ko-8B-Instruct-preview)

## Model Details

**Llama-3-Open-Ko-8B**

Llama-3-Open-Ko-8B is a continued-pretraining language model based on Llama-3-8B. It was trained on more than 60GB of deduplicated text sourced from publicly available resources. With the new Llama-3 tokenizer, the model was pretrained on more than 17.7B tokens, slightly more than the number processed by the Korean tokenizer of Llama-2. Training was conducted on a TPUv5e-256, supported by Google's TRC program.

**Llama-3-Open-Ko-8B-Instruct-preview**

The instruction model, Llama-3-Open-Ko-8B-Instruct-preview, incorporates ideas from the [Chat Vector paper](https://arxiv.org/abs/2310.04799). It is a preview and has not been fine-tuned on any Korean instruction set, which makes it a strong starting point for developing new chat and instruct models.

**Meta Llama-3**

Developed and released by Meta, the Meta Llama 3 family of large language models (LLMs) is optimized for dialogue use cases and excels across common industry benchmarks, emphasizing helpfulness and safety.

**Model Developers**: Junbum Lee (Beomi)

**Variations**: Llama-3-Open-Ko is available in one configuration: 8B.

**Input/Output**: Models accept text input and generate text and code.

**Model Architecture**: Llama 3 uses an optimized transformer architecture.

| | Training Data | Params | Context length | GQA | Token count | Knowledge cutoff |
|---|---|---|---|---|---|---|
| Llama-3-Open-Ko | Same as Open-Solar-Ko dataset | 8B | 8k | Yes | 17.7B+ | Jun, 2023 |

*The dataset list is available [here](https://huggingface.co/beomi/OPEN-SOLAR-KO-10.7B/tree/main/corpus).*

## Intended Use

**Commercial and Research Applications**: Llama 3 is designed for use in English, tailored for assistant-like chat in its instruction-tuned models, while the pretrained models are versatile across various natural language generation tasks.

**Out-of-scope**: Any use that violates applicable laws, regulations, or the Acceptable Use Policy and Llama 3 Community License is prohibited.

### Responsibility & Safety

Meta's commitment to Responsible AI includes steps to limit misuse and harm while supporting the open-source community. Developers are encouraged to implement safety best practices and use resources such as [Meta Llama Guard 2](https://llama.meta.com/purple-llama/) and [Code Shield](https://llama.meta.com/purple-llama/) to tailor safety to their specific use cases.

#### Responsible Release

Following a rigorous process to guard against misuse, we ensure all safety and ethical guidelines are adhered to, as detailed in our [Responsible Use Guide](https://llama.meta.com/responsible-use-guide/).

## Ethical Considerations and Limitations

Llama 3 is built on the principles of openness, inclusivity, and helpfulness, designed to be accessible and valuable across diverse backgrounds and use cases. Developers should undertake thorough safety testing and tuning for specific applications before deployment.

## Citation Instructions

**Llama-3-Open-Ko**

```
@article{llama3openko,
  title={Llama-3-Open-Ko},
  author={L, Junbum},
  year={2024},
  url={https://huggingface.co/beomi/Llama-3-Open-Ko-8B}
}
```

**Original Llama-3**

```
@article{llama3modelcard,
  title={Llama 3 Model Card},
  author={AI@Meta},
  year={2024},
  url={https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md}
}
```