Ichsan2895 committed
Commit 1ecc53c
1 Parent(s): 98beccf

Update README.md

Files changed (1):
  1. README.md +98 -1
README.md CHANGED
@@ -9,7 +9,7 @@ language:
pipeline_tag: text-generation
---

- # THIS IS 1st PROTOTYPE OF MERAK-7B-v3!
+ # HAPPY TO ANNOUNCE THE RELEASE OF MERAK-7B-V3!

Merak-7B is a Large Language Model for the Indonesian language.

@@ -21,6 +21,103 @@ Merak-7B and all of its derivatives are Licensed under Creative Commons-By Attri

Big thanks to all my friends and the communities that helped build our first model. Feel free to ask me about the model, and please share the news on your social media.

## HOW TO USE
### Installation
Please make sure you have a CUDA driver installed on your system, along with Python 3.10 and PyTorch 2. Then install these libraries in a terminal:
```
pip install bitsandbytes==0.39.1
pip install transformers==4.31.0
pip install peft==0.4.0
pip install accelerate==0.20.3
pip install einops==0.6.1 scipy sentencepiece datasets
```
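
Before downloading the weights, it may be worth confirming that PyTorch can actually see your GPU. This quick check is an extra suggestion, not part of the original steps:
```
import torch

print(torch.__version__)                # expect a 2.x release
assert torch.cuda.is_available(), "No CUDA GPU detected"
print(torch.cuda.get_device_name(0))    # the GPU that will host the model
```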

### Using BitsandBytes: runs on a GPU with >= 10 GB of VRAM
[![Open in Google Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1gCCHo2KvqLr8Sf6aIbE9NLkOpn_w4v94?usp=drive_link)
```
import torch
from transformers import AutoTokenizer, AutoConfig, AutoModelForCausalLM, BitsAndBytesConfig, LlamaTokenizer
from peft import PeftModel, PeftConfig

model_id = "Ichsan2895/Merak-7B-v3"
config = AutoConfig.from_pretrained(model_id)

# 4-bit NF4 quantization with double quantization; compute runs in bfloat16
BNB_CONFIG = BitsAndBytesConfig(load_in_4bit=True,
                                bnb_4bit_compute_dtype=torch.bfloat16,
                                bnb_4bit_use_double_quant=True,
                                bnb_4bit_quant_type="nf4",
                                )

model = AutoModelForCausalLM.from_pretrained(model_id,
                                             quantization_config=BNB_CONFIG,
                                             device_map="auto",
                                             trust_remote_code=True)

tokenizer = LlamaTokenizer.from_pretrained(model_id)

def generate_response(question: str) -> str:
    # Merak expects the <|prompt|> ... <|answer|> template
    prompt = f"<|prompt|>{question}\n<|answer|>".strip()

    encoding = tokenizer(prompt, return_tensors='pt').to("cuda")
    with torch.inference_mode():
        outputs = model.generate(input_ids=encoding.input_ids,
                                 attention_mask=encoding.attention_mask,
                                 eos_token_id=tokenizer.pad_token_id,
                                 do_sample=False,
                                 num_beams=2,
                                 temperature=0.3,
                                 repetition_penalty=1.2,
                                 max_length=200)

    response = tokenizer.decode(outputs[0], skip_special_tokens=True)

    # Keep only the text after the <|answer|> marker
    assistant_start = "<|answer|>"
    response_start = response.find(assistant_start)
    return response[response_start + len(assistant_start):].strip()

prompt = "Siapa penulis naskah proklamasi kemerdekaan Indonesia?"
print(generate_response(prompt))
```
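
If you want to verify the >= 10 GB VRAM figure on your own hardware, PyTorch can report the peak memory allocated during a generation. This is an optional check, and the exact number will vary with your GPU and the output length:
```
torch.cuda.reset_peak_memory_stats()   # start measuring from the current allocation
print(generate_response(prompt))
print(f"Peak VRAM: {torch.cuda.max_memory_allocated() / 1024**3:.1f} GB")
```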

### In my experience, you get better answers without BitsandBytes 4-bit quantization, but it needs more VRAM
[![Open in Google Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1rOeFT9cC2OzlW6CUoEe4rpXn2USI3l-E?usp=drive_link)
```
import torch
from transformers import AutoTokenizer, AutoConfig, AutoModelForCausalLM, BitsAndBytesConfig, LlamaTokenizer
from peft import PeftModel, PeftConfig

model_id = "Ichsan2895/Merak-7B-v3"
config = AutoConfig.from_pretrained(model_id)

# Load the unquantized weights; this needs considerably more VRAM than the 4-bit setup above
model = AutoModelForCausalLM.from_pretrained(model_id,
                                             device_map="auto",
                                             trust_remote_code=True)

tokenizer = LlamaTokenizer.from_pretrained(model_id)

def generate_response(question: str) -> str:
    # Merak expects the <|prompt|> ... <|answer|> template
    prompt = f"<|prompt|>{question}\n<|answer|>".strip()

    encoding = tokenizer(prompt, return_tensors='pt').to("cuda")
    with torch.inference_mode():
        outputs = model.generate(input_ids=encoding.input_ids,
                                 attention_mask=encoding.attention_mask,
                                 eos_token_id=tokenizer.pad_token_id,
                                 do_sample=False,
                                 num_beams=2,
                                 temperature=0.3,
                                 repetition_penalty=1.2,
                                 max_length=200)

    response = tokenizer.decode(outputs[0], skip_special_tokens=True)

    # Keep only the text after the <|answer|> marker
    assistant_start = "<|answer|>"
    response_start = response.find(assistant_start)
    return response[response_start + len(assistant_start):].strip()

prompt = "Siapa penulis naskah proklamasi kemerdekaan Indonesia?"
print(generate_response(prompt))
```
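
If the full-precision weights do not fit on your GPU but you still want to avoid 4-bit quantization, loading the model in half precision is one possible middle ground. This variant is a sketch not covered in the original instructions, and it assumes a GPU with bfloat16 support:
```
# Half-precision weights need roughly half the VRAM of full fp32
model = AutoModelForCausalLM.from_pretrained(model_id,
                                             torch_dtype=torch.bfloat16,
                                             device_map="auto",
                                             trust_remote_code=True)
```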

## CHANGELOG
**v3** = Fine-tuned on [Ichsan2895/OASST_Top1_Indonesian](https://huggingface.co/datasets/Ichsan2895/OASST_Top1_Indonesian) & [Ichsan2895/alpaca-gpt4-indonesian](https://huggingface.co/datasets/Ichsan2895/alpaca-gpt4-indonesian)

**v2** = A finetuned version of the first Merak-7B model, trained again on the same 600k Indonesian Wikipedia articles but with a different prompt style in the questions.