kfkas committed
Commit 1122542 • 1 Parent(s): 4f20837

Update README.md

Files changed (1)
  1. README.md +46 -0
README.md CHANGED
@@ -53,6 +53,52 @@ Llama-2-Ko-7b-Chat은 [beomi/llama-2-ko-7b 40B](https://huggingface.co/beomi/lla
 <img src=https://github.com/taemin6697/Paper_Review/assets/96530685/b9a697a2-ef06-4b1c-97e1-e72b20d9a8b5 style="max-width: 700px; width: 100%" />
 ---
 
+ ### Inference
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+
+ def gen(x, model, tokenizer, device):
+     # Wrap the user input in the Korean instruction/response format the model expects
+     # ("Below is an instruction that describes a task. Write a response that appropriately completes the request.")
+     prompt = (
+         f"아래는 작업을 설명하는 명령어입니다. 요청을 적절히 완료하는 응답을 작성하세요.\n\n### 명령어:\n{x}\n\n### 응답:"
+     )
+     len_prompt = len(prompt)
+     gened = model.generate(
+         **tokenizer(prompt, return_tensors="pt", return_token_type_ids=False).to(
+             device
+         ),
+         max_new_tokens=1024,
+         early_stopping=True,
+         do_sample=True,
+         top_k=10,
+         top_p=0.92,
+         no_repeat_ngram_size=3,
+         eos_token_id=2,
+         repetition_penalty=1.2,
+         num_beams=3,
+     )
+     # Strip the echoed prompt so only the generated response is returned
+     return tokenizer.decode(gened[0])[len_prompt:]
+
+
+ def LLM_infer(input):
+     device = (
+         torch.device("cuda:0") if torch.cuda.is_available() else torch.device("cpu")
+     )
+     model_id = "kfkas/legal-llama-2-ko-7b-Chat"
+     # Load the model in fp16 on GPU 0 (device_map={"": 0} assumes a CUDA device is available)
+     model = AutoModelForCausalLM.from_pretrained(
+         model_id, device_map={"": 0}, torch_dtype=torch.float16, low_cpu_mem_usage=True
+     )
+     tokenizer = AutoTokenizer.from_pretrained(model_id)
+     model.eval()
+     model.config.use_cache = True
+     tokenizer.pad_token = tokenizer.eos_token
+     output = gen(input, model=model, tokenizer=tokenizer, device=device)
+
+     return output
+
+
+ if __name__ == "__main__":
+     text = LLM_infer("살인죄를 알려줘")  # "Tell me about the crime of murder"
+     print(text)
+ ```
+
 ## Note for oobabooga/text-generation-webui
 
 Remove the `ValueError` raised in the `load_tokenizer` function (around line 109) in `modules/models.py`.
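
For orientation only, here is a rough sketch of the kind of edit that note describes. The actual body of `load_tokenizer` differs between text-generation-webui versions, so the structure, condition, and message below are placeholders rather than the project's real code; locate the `raise ValueError(...)` near line 109 of your copy of `modules/models.py` and delete or comment it out.

```python
# modules/models.py -- load_tokenizer(), near line 109 in the version this note targets.
# Hypothetical sketch only: the names, condition, and message below are placeholders.

def load_tokenizer(model_name, model):
    tokenizer = None
    # ... version-specific tokenizer loading logic elided ...

    if tokenizer is None:
        # The note above says to remove the ValueError raised around here
        # (e.g. by commenting it out) so that loading can continue with the
        # tokenizer bundled with this model:
        # raise ValueError("Failed to load the tokenizer.")
        pass

    return tokenizer
```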