Text Generation · Adapters · Thai · instruction-finetuning
Buffala-LoRA is a 7B-parameter LLaMA model finetuned to follow instructions. It is trained on the Stanford Alpaca (TH), WikiTH, Pantip, and IAppQ&A datasets and uses the Hugging Face LLaMA implementation. For more information, please visit [the project's website](https://github.com/tloen/alpaca-lora).
## Issues and what's next?
- The model still lacks a significant amount of world knowledge, so it needs to be fine-tuned on larger Thai datasets.
- There is currently no translation prompt; we plan to fine-tune the model on the SCB Thai-English dataset soon.
- The model works well with the LangChain search agent (SerpAPI), which serves as a hotfix for the missing world knowledge; a minimal sketch follows below.

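The sketch below illustrates one way to wire the model into such a search agent. It is not part of the original setup: it assumes the `model` and `tokenizer` have already been loaded as shown in the "How to use" section that follows, that the `serpapi` package and a `SERPAPI_API_KEY` are available, and it uses the classic LangChain agent API, which may differ in newer LangChain releases (some transformers versions also need extra care when wrapping a PEFT model in a pipeline).

```python
from transformers import pipeline
from langchain.llms import HuggingFacePipeline
from langchain.agents import AgentType, initialize_agent, load_tools

# Wrap the already-loaded Buffala model/tokenizer in a text-generation
# pipeline that LangChain can drive.
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=128)
llm = HuggingFacePipeline(pipeline=pipe)

# SerpAPI supplies search results, so the agent can answer questions that
# require world knowledge the model itself lacks.
tools = load_tools(["serpapi"])
agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)

# "Who is the current prime minister of Thailand?"
agent.run("ใครเป็นนายกรัฐมนตรีของประเทศไทยในปัจจุบัน")
```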
## How to use

```python
import torch
from peft import PeftModel
from transformers import GenerationConfig, LlamaForCausalLM, LlamaTokenizer

device = "cuda"

# Load the base LLaMA-7B weights in 8-bit and attach the Buffala LoRA adapter.
tokenizer = LlamaTokenizer.from_pretrained("decapoda-research/llama-7b-hf")
model = LlamaForCausalLM.from_pretrained(
    "decapoda-research/llama-7b-hf",
    load_in_8bit=True,
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(
    model,
    "Thaweewat/thai-buffala-lora-7b-v0-1",
    torch_dtype=torch.float16,
)

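# generate_prompt() is used by evaluate() below but was not defined in the
# original snippet. This definition assumes the standard Alpaca-style template
# used by the alpaca-lora project linked above; adjust it if the adapter was
# trained with a different prompt format.
def generate_prompt(instruction, input=None):
    if input:
        return f"""Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Input:
{input}

### Response:"""
    return f"""Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Response:"""
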
def evaluate(
    instruction,
    input=None,
    temperature=0.1,
    top_p=0.75,
    top_k=40,
    num_beams=4,
    max_new_tokens=128,
    **kwargs,
):
    prompt = generate_prompt(instruction, input)
    inputs = tokenizer(prompt, return_tensors="pt")
    input_ids = inputs["input_ids"].to(device)
    generation_config = GenerationConfig(
        temperature=temperature,
        top_p=top_p,
        top_k=top_k,
        num_beams=num_beams,
        **kwargs,
    )
    with torch.no_grad():
        generation_output = model.generate(
            input_ids=input_ids,
            generation_config=generation_config,
            return_dict_in_generate=True,
            output_scores=True,
            max_new_tokens=max_new_tokens,
        )
    s = generation_output.sequences[0]
    output = tokenizer.decode(s)
    # The decoded sequence includes the prompt, so keep only the generated answer.
    return output.split("### Response:")[1].strip()


# "จงแก้สมการต่อไปนี้ X เท่ากับเท่าไหร่" = "Solve the following equation. What does X equal?"
evaluate(instruction="จงแก้สมการต่อไปนี้ X เท่ากับเท่าไหร่", input="X+Y=15 and Y=7")
# X = 8
```