AhmedBou
/

Arabic-Meta-Llama-3.1-8B-LoRA

text-generation-inference

Inference Endpoints

Model card Files Files and versions Community

Arabic-Meta-Llama-3.1-8B-LoRA / README.md

AhmedBou's picture

Update README.md

95acb72 verified 5 months ago

|

history blame contribute delete

3.03 kB

	---
	base_model: unsloth/meta-llama-3.1-8b-bnb-4bit
	language:
	- en
	- ar
	license: apache-2.0
	tags:
	- text-generation-inference
	- transformers
	- unsloth
	- llama
	- trl
	datasets:
	- AhmedBou/Arabic_instruction_dataset_for_llm_ft
	---

	A suitable name for this section could be:

	# Model Description

	This model is fine-tuned from LLama 3.1 8B, enhanced for improved capability in the Arabic language.
	It was fine-tuned on 10,000 samples using Alpaca prompt instructions.

	Please refer to this repository when using the model.

	## To perform inference using these LoRA adapters, please use the following code:


	````Python
	# Installs Unsloth, Xformers (Flash Attention) and all other packages!
	!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
	!pip install --no-deps "xformers<0.0.27" "trl<0.9.0" peft accelerate bitsandbytes
	````

	````Python
	from unsloth import FastLanguageModel
	model, tokenizer = FastLanguageModel.from_pretrained(
	model_name = "AhmedBou/Arabic-Meta-Llama-3.1-8B_LoRA", # YOUR MODEL YOU USED FOR TRAINING
	max_seq_length = 2048,
	dtype = None,
	load_in_4bit = True,
	)
	FastLanguageModel.for_inference(model) # Enable native 2x faster inference

	alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

	### Instruction:
	{}

	### Input:
	{}

	### Response:
	{}"""

	inputs = tokenizer(
	[
	alpaca_prompt.format(
	"قم بصياغة الجملة الإنجليزية التالية باللغة العربية.", # instruction
	"We hope that the last cases will soon be resolved through the mechanisms established for this purpose.", # input
	"", # output - leave this blank for generation!
	)
	], return_tensors = "pt").to("cuda")

	from transformers import TextStreamer
	text_streamer = TextStreamer(tokenizer)
	_ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 128)
	````

	````Markdown

	The Outout is:

	<\|begin_of_text\|>Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

	### Instruction:
	قم بصياغة الجملة الإنجليزية التالية باللغة العربية.

	### Input:
	We hope that the last cases will soon be resolved through the mechanisms established for this purpose.

	### Response:
	وأملنا في أن يكون هناك حل سريع للمواد الأخيرة من خلال الآليات المحددة لهذا الغرض.<\|end_of_text\|>

	````

	# Uploaded model
	- Developed by: AhmedBou
	- License: apache-2.0
	- Finetuned from model : unsloth/meta-llama-3.1-8b-bnb-4bit

	This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

	[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)