LloroV3 / README.md

Update README.md

57fd89e verified 6 months ago

6.21 kB

	---
	library_name: transformers
	base_model: codellama/CodeLlama-7b-Instruct-hf
	license: llama2
	datasets:
	- semantixai/LloroV3
	language:
	- pt
	tags:
	- code
	- analytics
	- analise-dados
	- portugues-BR

	co2_eq_emissions:
	emissions: 1320
	source: "Lacoste, Alexandre, et al. “Quantifying the Carbon Emissions of Machine Learning.” ArXiv (Cornell University), 21 Oct. 2019, https://doi.org/10.48550/arxiv.1910.09700."
	training_type: "fine-tuning"
	geographical_location: "Council Bluffs, Iowa, USA."
	hardware_used: "1 A100 40GB GPU"
	---

	Lloro 7B

	<img src="https://cdn-uploads.huggingface.co/production/uploads/653176dc69fffcfe1543860a/h0kNd9OTEu1QdGNjHKXoq.png" width="300" alt="Lloro-7b Logo"/>

	Lloro, developed by Semantix Research Labs , is a language Model that was trained to effectively perform Portuguese Data Analysis in Python. It is a fine-tuned version of codellama/CodeLlama-7b-Instruct-hf, that was trained on synthetic datasets. The fine-tuning process was performed using the QLORA metodology on a GPU A100 with 40 GB of RAM.

	Model description

	Model type: A 7B parameter fine-tuned on synthetic datasets.

	Language(s) (NLP): Primarily Portuguese, but the model is capable to understand English as well

	Finetuned from model: codellama/CodeLlama-7b-Instruct-hf

	What is Lloro's intended use(s)?

	Lloro is built for data analysis in Portuguese contexts .

	Input : Text

	Output : Text (Code)

	Usage

	Using Transformers

	```python
	#Import required libraries
	import torch
	from transformers import (
	AutoModelForCausalLM,
	AutoTokenizer
	)

	#Load Model
	model_name = "semantixai/LloroV2"
	base_model = AutoModelForCausalLM.from_pretrained(
	model_name,
	return_dict=True,
	torch_dtype=torch.float16,
	device_map="auto",
	)

	#Load Tokenizer
	tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)


	#Define Prompt
	user_prompt = "Desenvolva um algoritmo em Python para calcular a média e a mediana dos preços de vendas por tipo de material do produto."
	system = "Provide answers in Python without explanations, only the code"
	prompt_template = f"[INST] <<SYS>>\\n{system}\\n<</SYS>>\\n\\n{user_prompt}[/INST]"

	#Call the model
	input_ids = tokenizer([prompt_template], return_tensors="pt")["input_ids"].to("cuda")


	outputs = base_model.generate(
	input_ids,
	do_sample=True,
	top_p=0.95,
	max_new_tokens=1024,
	temperature=0.1,
	)

	#Decode and retrieve Output
	output_text = tokenizer.batch_decode(outputs, skip_prompt=True, skip_special_tokens=False)
	display(output_text)
	```

	Using an OpenAI compatible inference server (like [vLLM](https://docs.vllm.ai/en/latest/index.html))

	```python
	from openai import OpenAI

	client = OpenAI(
	api_key="EMPTY",
	base_url="http://localhost:8000/v1",
	)
	user_prompt = "Desenvolva um algoritmo em Python para calcular a média e a mediana dos preços de vendas por tipo de material do produto."
	completion = client.chat.completions.create(temperature=0.1,frequency_penalty=0.1,model="semantixai/Lloro",messages=[{"role":"system","content":"Provide answers in Python without explanations, only the code"},{"role":"user","content":user_prompt}])
	```

	Params
	Training Parameters
	\| Params \| Training Data \| Examples \| Tokens \| LR \|
	\|----------------------------------\|-----------------------------------\|---------------------------------\|----------\|--------\|
	\| 7B \| Pairs synthetic instructions/code \| 74222 \| 9 351 532\| 2e-4 \|

	Model Sources

	Test Dataset Repository: <https://huggingface.co/datasets/semantixai/LloroV3>

	Model Dates: Lloro was trained between February 2024 and April 2024.

	Performance
	\| Modelo \| LLM as Judge \| Code Bleu Score \| Rouge-L \| CodeBert- Precision \| CodeBert-Recall \| CodeBert-F1 \| CodeBert-F3 \|
	\|----------------\|--------------\|------------------\|---------\|----------------------\|-----------------\|-------------\|-------------\|
	\| GPT 3.5 \| 94.29% \| 0.3538 \| 0.3756 \| 0.8099 \| 0.8176 \| 0.8128 \| 0.8164 \|
	\| Instruct -Base \| 88.77% \| 0.3666 \| 0.3351 \| 0.8244 \| 0.8025 \| 0.8121 \| 0.8052 \|
	\| Instruct -FT \| 97.95% \| 0.5967 \| 0.6717 \| 0.9090 \| 0.9182 \| 0.9131 \| 0.9171 \|

	Training Infos:
	The following hyperparameters were used during training:

	\| Parameter \| Value \|
	\|---------------------------\|--------------------------\|
	\| learning_rate \| 2e-4 \|
	\| weight_decay \| 0.0001 \|
	\| train_batch_size \| 7 \|
	\| eval_batch_size \| 7 \|
	\| seed \| 42 \|
	\| optimizer \| Adam - paged_adamw_32bit \|
	\| lr_scheduler_type \| cosine \|
	\| lr_scheduler_warmup_ratio \| 0.06 \|
	\| num_epochs \| 4.0 \|

	QLoRA hyperparameters
	The following parameters related with the Quantized Low-Rank Adaptation and Quantization were used during training:

	\| Parameter \| Value \|
	\|------------------\|-----------\|
	\| lora_r \| 64 \|
	\| lora_alpha \| 256 \|
	\| lora_dropout \| 0.1 \|
	\| storage_dtype \| "nf4" \|
	\| compute_dtype \| "bfloat16"\|

	Experiments
	\| Model \| Epochs \| Overfitting \| Final Epochs \| Training Hours \| CO2 Emission (Kg) \|
	\|-----------------------\|--------\|-------------\|--------------\|-----------------\|-------------------\|
	\| Code Llama Instruct \| 1 \| No \| 1 \| 3.01 \| 0.43 \|
	\| Code Llama Instruct \| 4 \| Yes \| 3 \| 9.25 \| 1.32 \|

	Framework versions

	\| Library \| Version \|
	\|---------------\|-----------\|
	\| bitsandbytes \| 0.40.2 \|
	\| Datasets \| 2.14.3 \|
	\| Pytorch \| 2.0.1 \|
	\| Tokenizers \| 0.14.1 \|
	\| Transformers \| 4.34.0 \|