---
datasets:
- danielpark/gorani-100k-llama2-13b-instruct
language:
- en
library_name: bitsandbytes, transformers, peft, accelerate, datasets, deepspeed, trl
pipeline_tag: text-generation
---
# Sample weight
Sample eval in [Open LLM Leaderboard](https://huggingface.co/datasets/open-llm-leaderboard/details_danielpark__gorani-100k-llama2-13b-instruct)
## GORANI 100k
- LFM: [llama2-13b-chat](https://huggingface.co/meta-llama/Llama-2-13b-chat-hf)
- Model: [danielpark/gorani-100k-llama2-13b-instruct](https://huggingface.co/danielpark/gorani-100k-llama2-13b-instruct)
- Dataset: [danielpark/gorani-100k](https://huggingface.co/danielpark/gorani-100k)
- **License**: This model is licensed under Meta's [LLaMA2 license](https://github.com/facebookresearch/llama/blob/main/LICENSE). You may not use it commercially, and you must adhere to the licenses of the included datasets. I therefore apply the most restrictive of those licenses. Please refrain from commercial use under any circumstances until an official license is issued.
<br>
KORANI is derived from GORANI, a project within llama2 that experiments with the distribution of appropriate datasets to transfer or distill knowledge based on English datasets. Officially, it's called Grid Of Ranvier Node In llama2 (GORANI), based on the biological term Ranvier Node, and aims to explore the optimal dataset for transferring knowledge in various languages and specific domains. Due to strict licensing issues with English datasets, GORANI is primarily for research purposes. Therefore, we are refining and training a commercially usable Korean dataset on top of llama2, based on the experimental results of the GORANI project, and this project is named KORANI (Korean GORANI).
- I have conducted preliminary experiments with several techniques: RoPE scaling, Attention Sinks, Flash Attention 1 and 2, SWA (Sliding Window Attention), and GQA (Grouped Query Attention).
- Please do not use the current model weights, as they are not the official model weights.
- The most stringent non-commercial use license (CC-BY-NC-4.0) among the licenses of the datasets used for training is also applied to the model weights.
- On 2023-11-12, it was decided that all projects would be kept private. It may be released in a non-public model format on cloud platforms by 2024.
<br>
## Template
Instruction model
```
### System:
{{ System prompt }}
### User:
{{ New user input }}
### Assistant:
{Assistant}
```
Chat model
```
<s>[INST] <<SYS>>
{{ System prompt }}
<</SYS>>
{{ New user message }} [/INST]
```
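As a minimal sketch of how the two layouts above could be filled in, the helpers below build each prompt as a plain string. The function names are hypothetical and not part of this repository, and `<s>` is written out literally as in the template; many tokenizers add the BOS token themselves, in which case the literal would be dropped.

```python
def format_instruct_prompt(system: str, user: str) -> str:
    """Instruction-model layout. The prompt stops after '### Assistant:'
    so the model generates the answer from that point."""
    return f"### System:\n{system}\n### User:\n{user}\n### Assistant:\n"


def format_chat_prompt(system: str, user: str) -> str:
    """Chat-model layout for a single turn, copied from the template above."""
    return f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n{user} [/INST]"


if __name__ == "__main__":
    print(format_instruct_prompt("You are a helpful assistant.", "Say hello."))
    print(format_chat_prompt("You are a helpful assistant.", "Say hello."))
```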
<details>
<summary> Templates </summary>
#### Instruct model template
For safety, I used the default system message from Llama-2. If a dataset specifies its own system message, I use that instead.
```
### System:
{{ System prompt }}
### User:
{{ New user input }}
### Input:
{{ Optional additional user input }}
### Assistant:
{{ New assistant answer }}
```
needs to be converted to
```
### System:
{{ System prompt }}
### User:
{{ New user input }}
### Assistant:
{Assistant}
```
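The conversion above could look roughly like the sketch below. How the optional `### Input:` text is folded in is not stated here, so appending it to the user text on a new line is an assumption, as are the function and parameter names. The fallback to a default system message follows the note at the top of this section; the default text itself is passed in by the caller.

```python
from typing import Optional


def to_instruct_sample(
    user: str,
    assistant: str,
    default_system: str,
    system: Optional[str] = None,
    extra_input: Optional[str] = None,
) -> str:
    """Collapse a System/User/Input/Assistant record into System/User/Assistant."""
    # Use the dataset's own system message when present, else the default.
    system_text = system.strip() if system and system.strip() else default_system.strip()
    user_text = user.strip()
    # Assumption: the optional input is appended to the user text.
    if extra_input and extra_input.strip():
        user_text = f"{user_text}\n{extra_input.strip()}"
    return (
        f"### System:\n{system_text}\n"
        f"### User:\n{user_text}\n"
        f"### Assistant:\n{assistant.strip()}"
    )
```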
#### Chat model template
```
# https://github.com/huggingface/blog/blob/main/llama2.md#how-to-prompt-llama-2
<s>[INST] <<SYS>>
{{ System prompt }}
<</SYS>>
{{ New user message }} [/INST]
```
```
<s>[INST] {{ User input }} [/INST] {{ New assistant answer }} </s>[INST] {{ New user input }} [/INST]
```
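A multi-turn prompt in this layout can be assembled with a small helper like the one below. This is a sketch that follows the strings exactly as written in this section (a single leading `<s>`, and `</s>` directly followed by the next `[INST]`); the Hugging Face post linked above starts each later turn with a fresh `<s>`, so adjust if you follow that version. The function name and signature are hypothetical.

```python
from typing import List, Optional, Tuple


def build_chat_prompt(
    history: List[Tuple[str, str]],
    new_user_input: str,
    system: Optional[str] = None,
) -> str:
    """history holds completed (user, assistant) turns, oldest first."""
    user_messages = [user for user, _ in history] + [new_user_input]
    parts = []
    for i, user in enumerate(user_messages):
        # The system prompt is wrapped into the first user message only.
        if i == 0 and system:
            user = f"<<SYS>>\n{system}\n<</SYS>>\n{user}"
        segment = f"[INST] {user} [/INST]"
        # Completed turns carry the assistant answer and the end-of-sequence marker.
        if i < len(history):
            segment += f" {history[i][1]} </s>"
        parts.append(segment)
    return "<s>" + "".join(parts)
```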
```
# Named as the Llama 2 default, but this wording matches the system prompt in Mistral's guardrailing docs
default_llama2_system_prompt = "Always assist with care, respect, and truth. Respond with utmost utility yet securely. Avoid harmful, unethical, prejudiced, or negative content. Ensure replies promote fairness and positivity."
# https://docs.mistral.ai/usage/guardrailing
more_safety_guardrail = """
You're given a list of moderation categories as below:
- illegal: Illegal activity.
- child abuse: child sexual abuse material or any content that exploits or harms children.
- hate violence harassment: Generation of hateful, harassing, or violent content: content that expresses, incites, or promotes hate based on identity, content that intends to harass, threaten, or bully an individual, content that promotes or glorifies violence or celebrates the suffering or humiliation of others.
- malware: Generation of malware: content that attempts to generate code that is designed to disrupt, damage, or gain unauthorized access to a computer system.
- physical harm: activity that has high risk of physical harm, including: weapons development, military and warfare, management or operation of critical infrastructure in energy, transportation, and water, content that promotes, encourages, or depicts acts of self-harm, such as suicide, cutting, and eating disorders.
- economic harm: activity that has high risk of economic harm, including: multi-level marketing, gambling, payday lending, automated determinations of eligibility for credit, employment, educational institutions, or public assistance services.
- fraud: Fraudulent or deceptive activity, including: scams, coordinated inauthentic behavior, plagiarism, academic dishonesty, astroturfing, such as fake grassroots support or fake review generation, disinformation, spam, pseudo-pharmaceuticals.
- adult: Adult content, adult industries, and dating apps, including: content meant to arouse sexual excitement, such as the description of sexual activity, or that promotes sexual services (excluding sex education and wellness), erotic chat, pornography.
- political: Political campaigning or lobbying, by: generating high volumes of campaign materials, generating campaign materials personalized to or targeted at specific demographics, building conversational or interactive systems such as chatbots that provide information about campaigns or engage in political advocacy or lobbying, building products for political campaigning or lobbying purposes.
- privacy: Activity that violates people's privacy, including: tracking or monitoring an individual without their consent, facial recognition of private individuals, classifying individuals based on protected characteristics, using biometrics for identification or assessment, unlawful collection or disclosure of personal identifiable information or educational, financial, or other protected records.
- unqualified law: Engaging in the unauthorized practice of law, or offering tailored legal advice without a qualified person reviewing the information.
- unqualified financial: Offering tailored financial advice without a qualified person reviewing the information.
- unqualified health: Telling someone that they have or do not have a certain health condition, or providing instructions on how to cure or treat a health condition.
Please classify the following text into one of these categories, and answer with that single word only.
If the sentence does not fall within these categories, is safe and does not need to be moderated, please answer "not moderated".
"""
```
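Because the moderation prompt asks for a single category label, the reply can be checked against the category list before it is used. The sketch below is one possible way to do that; the category names are copied from the prompt above, while the function name and the normalization rules (trimming quotes, a trailing period, and case) are assumptions.

```python
from typing import Optional

MODERATION_CATEGORIES = (
    "illegal", "child abuse", "hate violence harassment", "malware",
    "physical harm", "economic harm", "fraud", "adult", "political",
    "privacy", "unqualified law", "unqualified financial", "unqualified health",
)
NOT_MODERATED = "not moderated"


def parse_moderation_label(answer: str) -> Optional[str]:
    """Return the normalized label, or None if the reply is not a known label."""
    label = answer.strip().strip("\"'.").lower()
    if label == NOT_MODERATED or label in MODERATION_CATEGORIES:
        return label
    return None
```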
The context dialogue presumably refers to the system prompt. A dialogue with N assistant replies then gives N samples:
```
SYSTEM: Act as if you were Napoleon who likes playing cricket.
USER: <user_msg_1>
ASSISTANT: <reply_1>
```
```
SYSTEM: Act as if you were Napoleon who likes playing cricket.
USER: <user_msg_1>
ASSISTANT: <reply_1>
USER: <user_msg_2>
ASSISTANT: <reply_2>
```
```
SYSTEM: Act as if you were Napoleon who likes playing cricket.
USER: <user_msg_1>
ASSISTANT: <reply_1>
USER: <user_msg_2>
ASSISTANT: <reply_2>
...
USER: <user_msg_N>
ASSISTANT: <reply_N>
```
For each sample S, at training time, we pass the context:
```
SYSTEM: Act as if you were Napoleon who likes playing cricket.
USER: <user_msg_1>
ASSISTANT: <reply_1>
USER: <user_msg_2>
ASSISTANT: <reply_2>
...
USER: <user_msg_S-1>
ASSISTANT: <reply_S-1>
USER: <user_msg_S>
```
We then train only on `ASSISTANT: <reply_S>`: the loss is zeroed out for every token of the context, and cross-entropy is computed between the ground truth (the actual `<reply_S>`) and the predicted tokens P(t>k | t0, ..., tk), where k is the length of the context sequence.
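The per-sample expansion and the masked loss described above can be sketched in plain Python as follows. This illustrates the idea and is not the training code used for this model: the helper names are hypothetical, token ids are plain integers, and `log_probs[i]` is assumed to already be the model's log-distribution for the token at position `i` (in a real causal LM the logits are shifted by one position first). The value `-100` is the default `ignore_index` of PyTorch's cross-entropy loss.

```python
import math
from typing import List, Sequence, Tuple

IGNORE_INDEX = -100  # positions with this label do not contribute to the loss


def expand_dialogue(system: str, turns: Sequence[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Turn an N-turn dialogue into N (context, target) samples.
    Sample S has all earlier turns plus user message S as context."""
    samples = []
    context = f"SYSTEM: {system}\n"
    for user_msg, reply in turns:
        context += f"USER: {user_msg}\n"
        samples.append((context, f"ASSISTANT: {reply}"))
        context += f"ASSISTANT: {reply}\n"
    return samples


def mask_labels(context_ids: List[int], target_ids: List[int]) -> Tuple[List[int], List[int]]:
    """Concatenate context and target; hide every context token from the loss."""
    input_ids = context_ids + target_ids
    labels = [IGNORE_INDEX] * len(context_ids) + list(target_ids)
    return input_ids, labels


def masked_cross_entropy(log_probs: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Mean negative log-likelihood over the unmasked positions only."""
    losses = [-log_probs[i][y] for i, y in enumerate(labels) if y != IGNORE_INDEX]
    if not losses:
        raise ValueError("all positions are masked")
    return sum(losses) / len(losses)


if __name__ == "__main__":
    for ctx, target in expand_dialogue("Act as Napoleon.", [("Hi", "Bonjour"), ("Cricket?", "Oui")]):
        print(repr(ctx), "->", repr(target))
    _, labels = mask_labels([1, 2, 3], [4, 5])
    uniform = [[math.log(1 / 6)] * 6 for _ in labels]
    print(masked_cross_entropy(uniform, labels))
```

Masking the context this way means each reply is learned once, from the sample in which it is the target, and never as part of a later sample's context.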
</details>
## Update
- Since we cannot control resources, we will record the schedule retrospectively.
| Update Schedule | Task Description | Status |
|-----------------|----------------------------|--------|
| 23-10-05 | Completed training - 19.7k 13b weight (specific data)| Done |
| 23-10-06 | Submitted hf model weights (REV 01) | Done |
| 23-10-20 | Q.C | Done |
| 23-11-12 | Changed to a private project. | Kept private |
## Caution
The model weights and dataset have not been properly curated yet and are strictly prohibited for use under any license. In relation to this, the developers do not assume any responsibility, either implicitly or explicitly.
## Revisions
| Revision | Commit Hash | Updated | Train Process | Status |
| ---------------|------------------------------------------------------------|------------|------------------|---------------|
| Revision 01 | [6d30494fa8da84128499d55075eef57094336d03](https://huggingface.co/danielpark/gorani-100k-llama2-13b-instruct/commit/6d30494fa8da84128499d55075eef57094336d03) | 23.10.04 | 19,740/100,000 | On Training |