---
license: other
language:
- ko
pipeline_tag: question-answering
---
# Model Card for jangmin/merged-llama2-7b-chat-hf-food-order-understanding-30K
This model is the result of merging the trained QLoRA adapter [jangmin/qlora-llama2-7b-chat-hf-food-order-understanding-30K](https://huggingface.co/jangmin/qlora-llama2-7b-chat-hf-food-order-understanding-30K) into its base model.
The adapter was trained on top of the foundation model [meta-llama/Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf).
## Model Details
### Model Description
- **Developed by:** [Jangmin Oh](https://huggingface.co/jangmin)
- **Model type:** llama2
- **Language(s) (NLP):** ko
- **License:** You should comply with Meta's Llama license. Please visit: https://ai.meta.com/resources/models-and-libraries/llama-downloads/
- **Finetuned from model:** [meta-llama/Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf)
## Uses
Step 1. Load the model and the tokenizer.
```python
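import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
cache_dir = None  # assumed: None uses the default Hugging Face cache; set a path to override
merged_model_hub_id = 'jangmin/merged-llama2-7b-chat-hf-food-order-understanding-30K'
tokenizer = AutoTokenizer.from_pretrained(merged_model_hub_id)
model = AutoModelForCausalLM.from_pretrained(merged_model_hub_id, device_map="auto", torch_dtype=torch.float16, cache_dir=cache_dir)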
```
Step 2. Prepare auxiliary tools.
```python
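from transformers import pipeline
import pandas as pd

# The prompt stays in Korean because the model was fine-tuned on it. English gloss:
# "Analyze the following order sentence and extract the food name, option name, and quantity."
# "명령" means instruction and "응답" means response.
instruction_prompt_template = """### 다음 주문 문장을 분석하여 음식명, 옵션명, 수량을 추출해줘.
### 명령: {0} ### 응답:
"""

def generate_helper(pipeline, query):
    # Fill the template, decode greedily, and strip the echoed prompt from the output.
    prompt = instruction_prompt_template.format(query)
    out = pipeline(prompt, max_new_tokens=256, do_sample=False, eos_token_id=tokenizer.eos_token_id)
    generated_text = out[0]["generated_text"][len(prompt):]
    return generated_text

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
# `evaluation_queries` is not defined in this card; it is assumed to be a list of order sentences.
stat_dic = pd.DataFrame.from_records([generate_helper(pipe, query) for query in evaluation_queries])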
```
Step 3. Let's rock & roll.
```python
# English gloss: "One tall iced americano, please. One strawberry smoothie, please. Also, one cold brew latte."
print(generate_helper(pipe, "아이스아메리카노 톨사이즈 한잔 하고요. 딸기스무디 한잔 주세요. 또, 콜드브루라떼 하나요."))
```
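The prompt-stripping logic in `generate_helper` can be checked without downloading the model. The sketch below is only an illustration: `fake_pipeline` is a hypothetical stand-in that, like a text-generation pipeline with default settings, returns the prompt followed by the completion, and `TEMPLATE` is a simplified English placeholder rather than the prompt the model was trained on.

```python
# Simplified placeholder template (assumption): not the model's real Korean prompt.
TEMPLATE = "### Instruction: {0} ### Response:\n"


def fake_pipeline(prompt, **kwargs):
    # Hypothetical stub: echoes the prompt and appends a fixed completion,
    # mimicking the shape of a text-generation pipeline's return value.
    return [{"generated_text": prompt + "<completion>"}]


def generate_helper(pipeline, query):
    prompt = TEMPLATE.format(query)
    out = pipeline(prompt, max_new_tokens=256, do_sample=False)
    # The output starts with the prompt, so slice it off by length.
    return out[0]["generated_text"][len(prompt):]


result = generate_helper(fake_pipeline, "one iced americano")
print(result)
```

With the real pipeline, passing `return_full_text=False` should give the same effect as the manual slice, though the slice keeps the behaviour explicit.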
## Bias, Risks, and Limitations
Please refer to [jangmin/qlora-llama2-7b-chat-hf-food-order-understanding-30K](https://huggingface.co/jangmin/qlora-llama2-7b-chat-hf-food-order-understanding-30K) for information about bias, risks, and limitations.
## Training Details
### Training Procedure
Please refer to [jangmin/qlora-llama2-7b-chat-hf-food-order-understanding-30K](https://huggingface.co/jangmin/qlora-llama2-7b-chat-hf-food-order-understanding-30K), which describes the fine-tuning strategy.
### Merging Procedure
To merge the adapter into the pretrained model, I wrote the following code.
Step 1. Initialize.
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig, AutoConfig, pipeline
from peft import PeftModel, PeftConfig, AutoPeftModelForCausalLM
peft_model_id = "jangmin/qlora-llama2-7b-chat-hf-food-order-understanding-30K"
config = PeftConfig.from_pretrained(peft_model_id)
cache_dir = None  # assumed: None uses the default Hugging Face cache; set a path to override
IGNORE_INDEX = -100
DEFAULT_PAD_TOKEN = "[PAD]"
```
Step 2. Load the fine-tuned model and the tokenizer.
```python
device_map = "cpu"
trained_model = AutoPeftModelForCausalLM.from_pretrained(
    peft_model_id,
    low_cpu_mem_usage=True,
    return_dict=True,
    torch_dtype=torch.float16,
    device_map=device_map,
    cache_dir=cache_dir
)
tokenizer = AutoTokenizer.from_pretrained(
    config.base_model_name_or_path,
    padding_side='right',
    tokenizer_type="llama",
    trust_remote_code=True,
    cache_dir=cache_dir
)
```
Step 3. Modify the model and the tokenizer to handle the `PAD` token. (The Llama tokenizer has no pad token by default, so one must be added to the vocabulary.)
```python
from typing import Dict
import transformers

def smart_tokenizer_and_embedding_resize(
    special_tokens_dict: Dict,
    tokenizer: transformers.PreTrainedTokenizer,
    model: transformers.PreTrainedModel,
):
    """Resize tokenizer and embedding.
    Note: This is the unoptimized version that may make your embedding size not be divisible by 64.
    """
    num_new_tokens = tokenizer.add_special_tokens(special_tokens_dict)
    model.resize_token_embeddings(len(tokenizer))
    if num_new_tokens > 0:
        # Initialize the new rows with the mean of the existing input embeddings.
        input_embeddings_data = model.get_input_embeddings().weight.data
        input_embeddings_avg = input_embeddings_data[:-num_new_tokens].mean(
            dim=0, keepdim=True
        )
        input_embeddings_data[-num_new_tokens:] = input_embeddings_avg

# `with_pad_token` is not defined in the original snippet; it is assumed here to be a flag set to True.
with_pad_token = True
if with_pad_token and tokenizer.pad_token is None:
    smart_tokenizer_and_embedding_resize(
        special_tokens_dict=dict(pad_token=DEFAULT_PAD_TOKEN),
        tokenizer=tokenizer,
        model=trained_model,
    )
trained_model.config.pad_token_id = tokenizer.pad_token_id
```
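The mean initialization used for the new `PAD` row can be illustrated without any framework. The following is a minimal sketch on plain Python lists, not the code that was run for this model; `resize_with_mean_init` and the toy matrix are hypothetical names and values chosen for illustration.

```python
def resize_with_mean_init(embeddings, num_new_tokens):
    """Return a copy of `embeddings` with `num_new_tokens` extra rows,
    each set to the column-wise mean of the existing rows."""
    copied = [row[:] for row in embeddings]
    if num_new_tokens <= 0:
        return copied
    dim = len(embeddings[0])
    mean_row = [
        sum(row[j] for row in embeddings) / len(embeddings) for j in range(dim)
    ]
    return copied + [mean_row[:] for _ in range(num_new_tokens)]


# Toy embedding matrix: 3 tokens, 2 dimensions (illustrative values only).
toy = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
resized = resize_with_mean_init(toy, 1)
print(resized[-1])  # [3.0, 4.0]
```

Starting the new row at the mean keeps it in the same range as the trained embeddings, which is presumably why the averaged value is preferred over a random one.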
Step 4. Merge and push to the Hub.
```python
merged_model = trained_model.merge_and_unload()
hub_id = "jangmin/merged-llama2-7b-chat-hf-food-order-understanding-30K"
merged_model.push_to_hub(hub_id, max_shard_size="4GB", safe_serialization=True, commit_message='recommit after pad_token was treated.')
```
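Conceptually, merging a LoRA adapter folds the low-rank update into each adapted base weight: the merged weight is `W + (lora_alpha / r) * B @ A`. The sketch below shows that arithmetic on toy matrices in plain Python. It is an illustration under that assumption, not PEFT's actual implementation, and `matmul`, `merge_lora`, and all values are hypothetical.

```python
def matmul(a, b):
    # Naive matrix product for small nested lists.
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def merge_lora(weight, lora_b, lora_a, lora_alpha, r):
    # Fold the scaled low-rank update B @ A into the base weight.
    scaling = lora_alpha / r
    delta = matmul(lora_b, lora_a)
    return [
        [w + scaling * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


weight = [[1.0, 0.0], [0.0, 1.0]]  # 2x2 base weight
lora_b = [[1.0], [2.0]]            # 2x1 factor (rank r = 1)
lora_a = [[0.5, 0.5]]              # 1x2 factor
merged = merge_lora(weight, lora_b, lora_a, lora_alpha=2, r=1)
print(merged)  # [[2.0, 1.0], [2.0, 3.0]]
```

After the merge, the model no longer needs the adapter weights at inference time, which is why the merged checkpoint can be loaded with `AutoModelForCausalLM` alone.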