---
library_name: transformers
license: llama3.2
---
# FineLlama-3.2-3B-Instruct-ead
This repository contains a fine-tuned version of Llama-3.2-3B-Instruct, trained specifically to understand and generate EAD (Encoded Archival Description) XML for describing archival records.
## Model Description
* **Base Model**: meta-llama/Llama-3.2-3B-Instruct
* **Training Dataset**: [Geraldine/Ead-Instruct-38k](https://huggingface.co/datasets/Geraldine/Ead-Instruct-38k)
* **Task**: Generation of EAD/XML compliant archival descriptions
* **Training Type**: Instruction fine-tuning with PEFT (Parameter Efficient Fine-Tuning) using LoRA
## Key Features
* Specialized in generating EAD/XML format for archival metadata
* Trained on a 38k-example instruction dataset of EAD/XML samples
* Optimized for archival description tasks
* Memory efficient through 4-bit quantization
## Training Details
### Technical Specifications
* **Quantization**: 4-bit quantization using bitsandbytes
* NF4 quantization type
* Double quantization enabled
* bfloat16 compute dtype
### LoRA Configuration
```
- r: 256
- alpha: 128
- dropout: 0.05
- target modules: all-linear
```
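For reference, here is a minimal sketch of how this configuration maps onto `peft`'s `LoraConfig`; the `task_type` is an assumption, not taken from the training notebook:
```
from peft import LoraConfig

# LoRA setup matching the values above; task_type is assumed
lora_config = LoraConfig(
    r=256,
    lora_alpha=128,
    lora_dropout=0.05,
    target_modules="all-linear",  # adapt every linear projection layer
    task_type="CAUSAL_LM",
)
```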
### Training Parameters
```
- Epochs: 3
- Batch Size: 3
- Gradient Accumulation Steps: 2
- Learning Rate: 2e-4
- Warmup Ratio: 0.03
- Max Sequence Length: 4096
- Scheduler: Constant
```
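These hyperparameters map naturally onto `trl`'s `SFTConfig`. A sketch only, assuming a recent trl release (argument names vary slightly across versions, e.g. `max_seq_length` vs. `max_length`):
```
from trl import SFTConfig

training_args = SFTConfig(
    num_train_epochs=3,
    per_device_train_batch_size=3,
    gradient_accumulation_steps=2,
    learning_rate=2e-4,
    warmup_ratio=0.03,
    max_seq_length=4096,
    lr_scheduler_type="constant",
    optim="adamw_torch_fused",  # fused AdamW, as noted under Training Infrastructure
    bf16=True,                  # or fp16=True, depending on hardware support
)
```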
### Training Infrastructure
* Libraries: transformers, peft, trl
* Mixed Precision: FP16/BF16 (based on hardware support)
* Optimizer: fused AdamW
### Training Notebook
The training notebook is available on [Kaggle](https://www.kaggle.com/code/geraldinegeoffroy/ead-finetune-llama-3-2-3b-instruct).
## Usage
### Installation
```
pip install transformers torch bitsandbytes accelerate peft
```
### Loading the model
```
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
import torch

# Configure 4-bit quantization
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_name = "Geraldine/FineLlama-3.2-3B-Instruct-ead"

# Load model and tokenizer; device_map="auto" places the quantized model on the GPU
# (calling .to("cuda") is not supported for 4-bit bitsandbytes models)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```
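If the repository hosts the LoRA adapter separately rather than merged weights, the adapter can be attached to the quantized base model instead. A hedged sketch, assuming adapter-style weights on top of `meta-llama/Llama-3.2-3B-Instruct` (check the repository files before relying on this path):
```
from peft import PeftModel

# Assumption: only needed if the repo ships LoRA adapter weights;
# if the weights are already merged, the loading code above is sufficient.
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-3B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, model_name)
```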
### Example usage
```
messages = [
    {"role": "system", "content": "You are an expert in EAD/XML generation for archival records metadata."},
    {"role": "user", "content": "Generate a minimal and compliant <eadheader> template with all required EAD/XML tags"},
]

inputs = tokenizer.apply_chat_template(
    messages,
    return_dict=True,
    tokenize=True,
    add_generation_prompt=True,  # must be added for generation
    return_tensors="pt",
).to("cuda")

outputs = model.generate(
    **inputs,
    max_new_tokens=4096,
    pad_token_id=tokenizer.eos_token_id,
    use_cache=True,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
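Generated EAD documents can be long; to watch the XML as it is produced instead of waiting for the full completion, transformers' `TextStreamer` can be attached to `generate` (a minimal sketch, not taken from the training notebook):
```
from transformers import TextStreamer

# Print tokens to stdout as they are generated, skipping the echoed prompt
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
model.generate(
    **inputs,
    max_new_tokens=4096,
    pad_token_id=tokenizer.eos_token_id,
    streamer=streamer,
)
```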
## Limitations
* The model is specifically trained for EAD/XML format and may not perform well on general archival tasks
* Performance depends on the quality and specificity of the input prompts
* Maximum sequence length is limited to 4096 tokens
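Because the prompt and the completion share the 4096-token budget the model was fine-tuned with, it can help to check the prompt length before generating (a sketch, reusing the `inputs` from the usage example above):
```
# Count prompt tokens to see how much of the 4096-token budget remains
prompt_len = inputs["input_ids"].shape[-1]
print(f"Prompt uses {prompt_len} tokens; ~{4096 - prompt_len} remain for generation.")
```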
## Citation
**BibTeX:**
```
@misc{ead-llama,
  author       = {Géraldine Geoffroy},
  title        = {EAD-XML Llama: Fine-tuned Llama Model for Archival Description},
  year         = {2024},
  publisher    = {Hugging Face},
  journal      = {Hugging Face repository},
  howpublished = {\url{https://huggingface.co/Geraldine/FineLlama-3.2-3B-Instruct-ead}}
}
```
## License
This model is subject to the same license as the base Llama model. Please refer to Meta's Llama 3.2 license for usage terms and conditions.