Ichsan2895 committed
Commit 1ecc53c
1 Parent(s): 98beccf

Update README.md

Files changed (1):
  1. README.md +98 -1
README.md CHANGED
@@ -9,7 +9,7 @@ language:
pipeline_tag: text-generation
---

- # THIS IS 1st PROTOTYPE OF MERAK-7B-v3!
+ # HAPPY TO ANNOUNCE THE RELEASE OF MERAK-7B-V3!

Merak-7B is a Large Language Model for the Indonesian language.

@@ -21,6 +21,103 @@ Merak-7B and all of its derivatives are Licensed under Creative Commons-By Attri

Big thanks to all my friends and the communities that helped build our first model. Feel free to ask me about the model, and please share the news on your social media.

## HOW TO USE
### Installation
Please make sure you have a CUDA driver installed on your system, along with Python 3.10 and PyTorch 2. Then install these libraries in a terminal:
```
pip install bitsandbytes==0.39.1
pip install transformers==4.31.0
pip install peft==0.4.0
pip install accelerate==0.20.3
pip install einops==0.6.1 scipy sentencepiece datasets
```
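
Before downloading the weights, it may be worth confirming that PyTorch can actually see your GPU. This quick check is an extra suggestion, not part of the original steps:
```
import torch

print(torch.__version__)                # expect a 2.x release
assert torch.cuda.is_available(), "No CUDA GPU detected"
print(torch.cuda.get_device_name(0))    # the GPU that will host the model
```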

### Using BitsandBytes: runs on a GPU with >= 10 GB of VRAM
[![Open in Google Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1gCCHo2KvqLr8Sf6aIbE9NLkOpn_w4v94?usp=drive_link)
```
import torch
from transformers import AutoTokenizer, AutoConfig, AutoModelForCausalLM, BitsAndBytesConfig, LlamaTokenizer
from peft import PeftModel, PeftConfig

model_id = "Ichsan2895/Merak-7B-v3"
config = AutoConfig.from_pretrained(model_id)

# 4-bit NF4 quantization with double quantization; compute runs in bfloat16
BNB_CONFIG = BitsAndBytesConfig(load_in_4bit=True,
                                bnb_4bit_compute_dtype=torch.bfloat16,
                                bnb_4bit_use_double_quant=True,
                                bnb_4bit_quant_type="nf4",
                                )

model = AutoModelForCausalLM.from_pretrained(model_id,
                                             quantization_config=BNB_CONFIG,
                                             device_map="auto",
                                             trust_remote_code=True)

tokenizer = LlamaTokenizer.from_pretrained(model_id)

def generate_response(question: str) -> str:
    # Merak expects the <|prompt|> ... <|answer|> template
    prompt = f"<|prompt|>{question}\n<|answer|>".strip()

    encoding = tokenizer(prompt, return_tensors='pt').to("cuda")
    with torch.inference_mode():
        outputs = model.generate(input_ids=encoding.input_ids,
                                 attention_mask=encoding.attention_mask,
                                 eos_token_id=tokenizer.pad_token_id,
                                 do_sample=False,
                                 num_beams=2,
                                 temperature=0.3,
                                 repetition_penalty=1.2,
                                 max_length=200)

    response = tokenizer.decode(outputs[0], skip_special_tokens=True)

    # Keep only the text after the <|answer|> marker
    assistant_start = "<|answer|>"
    response_start = response.find(assistant_start)
    return response[response_start + len(assistant_start):].strip()

prompt = "Siapa penulis naskah proklamasi kemerdekaan Indonesia?"
print(generate_response(prompt))
```
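
If you want to verify the >= 10 GB VRAM figure on your own hardware, PyTorch can report the peak memory allocated during a generation. This is an optional check, and the exact number will vary with your GPU and the output length:
```
torch.cuda.reset_peak_memory_stats()   # start measuring from the current allocation
print(generate_response(prompt))
print(f"Peak VRAM: {torch.cuda.max_memory_allocated() / 1024**3:.1f} GB")
```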

### In my experience, you get better answers without BitsandBytes 4-bit quantization, but it needs more VRAM
[![Open in Google Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1rOeFT9cC2OzlW6CUoEe4rpXn2USI3l-E?usp=drive_link)
```
import torch
from transformers import AutoTokenizer, AutoConfig, AutoModelForCausalLM, BitsAndBytesConfig, LlamaTokenizer
from peft import PeftModel, PeftConfig

model_id = "Ichsan2895/Merak-7B-v3"
config = AutoConfig.from_pretrained(model_id)

# Load the unquantized weights; this needs considerably more VRAM than the 4-bit setup above
model = AutoModelForCausalLM.from_pretrained(model_id,
                                             device_map="auto",
                                             trust_remote_code=True)

tokenizer = LlamaTokenizer.from_pretrained(model_id)

def generate_response(question: str) -> str:
    # Merak expects the <|prompt|> ... <|answer|> template
    prompt = f"<|prompt|>{question}\n<|answer|>".strip()

    encoding = tokenizer(prompt, return_tensors='pt').to("cuda")
    with torch.inference_mode():
        outputs = model.generate(input_ids=encoding.input_ids,
                                 attention_mask=encoding.attention_mask,
                                 eos_token_id=tokenizer.pad_token_id,
                                 do_sample=False,
                                 num_beams=2,
                                 temperature=0.3,
                                 repetition_penalty=1.2,
                                 max_length=200)

    response = tokenizer.decode(outputs[0], skip_special_tokens=True)

    # Keep only the text after the <|answer|> marker
    assistant_start = "<|answer|>"
    response_start = response.find(assistant_start)
    return response[response_start + len(assistant_start):].strip()

prompt = "Siapa penulis naskah proklamasi kemerdekaan Indonesia?"
print(generate_response(prompt))
```
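
If the full-precision weights do not fit on your GPU but you still want to avoid 4-bit quantization, loading the model in half precision is one possible middle ground. This variant is a sketch not covered in the original instructions, and it assumes a GPU with bfloat16 support:
```
# Half-precision weights need roughly half the VRAM of full fp32
model = AutoModelForCausalLM.from_pretrained(model_id,
                                             torch_dtype=torch.bfloat16,
                                             device_map="auto",
                                             trust_remote_code=True)
```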

## CHANGELOG
**v3** = Fine-tuned on [Ichsan2895/OASST_Top1_Indonesian](https://huggingface.co/datasets/Ichsan2895/OASST_Top1_Indonesian) & [Ichsan2895/alpaca-gpt4-indonesian](https://huggingface.co/datasets/Ichsan2895/alpaca-gpt4-indonesian)

**v2** = A finetuned version of the first Merak-7B model, trained again on the same 600k Indonesian Wikipedia articles but with a different prompt style in the questions.