AhmedBou committed
Commit eafcbce · verified · 1 Parent(s): 96a4f67

Update README.md

Files changed (1): README.md (+49 -1)
README.md CHANGED
@@ -2,6 +2,7 @@
 base_model: unsloth/meta-llama-3.1-8b-bnb-4bit
 language:
 - en
+- ar
 license: apache-2.0
 tags:
 - text-generation-inference
@@ -9,14 +10,61 @@ tags:
 - unsloth
 - llama
 - trl
+datasets:
+- AhmedBou/Arabic_instruction_dataset_for_llm_ft
 ---
 
 # Uploaded model
 
+For inference using this LoRA adapter, please use the following code:
+
+```python
+# Installs Unsloth, Xformers (Flash Attention) and all other packages!
+!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
+!pip install --no-deps "xformers<0.0.27" "trl<0.9.0" peft accelerate bitsandbytes
+```
+
+```python
+from unsloth import FastLanguageModel
+
+model, tokenizer = FastLanguageModel.from_pretrained(
+    model_name = "AhmedBou/Arabic-Meta-Llama-3.1-8B_LoRA", # the LoRA adapter produced by training
+    max_seq_length = 2048,
+    dtype = None,
+    load_in_4bit = True,
+)
+FastLanguageModel.for_inference(model) # Enable native 2x faster inference
+
+alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
+
+### Instruction:
+{}
+
+### Input:
+{}
+
+### Response:
+{}"""
+
+inputs = tokenizer(
+    [
+        alpaca_prompt.format(
+            "قم بصياغة الجملة الإنجليزية التالية باللغة العربية.", # instruction: "Render the following English sentence in Arabic."
+            "We hope that the last cases will soon be resolved through the mechanisms established for this purpose.", # input
+            "", # output - leave this blank for generation!
+        )
+    ], return_tensors = "pt").to("cuda")
+
+from transformers import TextStreamer
+text_streamer = TextStreamer(tokenizer)
+_ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 128)
+```
+
 - **Developed by:** AhmedBou
 - **License:** apache-2.0
 - **Finetuned from model:** unsloth/meta-llama-3.1-8b-bnb-4bit
 
 This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
 
-[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
+[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
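The Alpaca-style template added in this commit is ordinary Python string formatting, so the prompt assembly can be checked without loading the model. A minimal, model-free sketch (the `build_prompt` helper is illustrative, not part of the repository):

```python
# Sketch of how the Alpaca-style template above is filled (no GPU or model needed).
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

def build_prompt(instruction: str, input_text: str = "") -> str:
    # The response slot is left empty so the model fills it in at generation time.
    return alpaca_prompt.format(instruction, input_text, "")

prompt = build_prompt(
    "قم بصياغة الجملة الإنجليزية التالية باللغة العربية.",  # "Render the following English sentence in Arabic."
    "We hope that the last cases will soon be resolved through the mechanisms established for this purpose.",
)
print(prompt.endswith("### Response:\n"))  # → True
```

Passing the resulting string (wrapped in a list) to `tokenizer(..., return_tensors="pt")` reproduces the `inputs` used in the generation call above.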