---
library_name: transformers
license: llama3.2
---
# FineLlama-3.2-3B-Instruct-ead
This repository contains a fine-tuned version of Llama-3.2-3B-Instruct, trained specifically to understand and generate EAD (Encoded Archival Description) XML for describing archival records.
## Model Description
* **Base Model**: meta-llama/Llama-3.2-3B-Instruct
* **Training Dataset**: [Geraldine/Ead-Instruct-38k](https://huggingface.co/datasets/Geraldine/Ead-Instruct-38k)
* **Task**: Generation of EAD/XML compliant archival descriptions
* **Training Type**: Instruction fine-tuning with PEFT (Parameter Efficient Fine-Tuning) using LoRA
## Key Features
* Specialized in generating EAD/XML format for archival metadata
* Trained on a 38k-example instruction dataset of EAD/XML samples
* Optimized for archival description tasks
* Memory efficient through 4-bit quantization
## Training Details
### Technical Specifications
* **Quantization**: 4-bit quantization using bitsandbytes
* NF4 quantization type
* Double quantization enabled
* bfloat16 compute dtype
### LoRA Configuration
```
- r: 256
- alpha: 128
- dropout: 0.05
- target modules: all-linear
```
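For reference, here is a minimal sketch of how this configuration maps onto `peft`'s `LoraConfig`; the `task_type` is an assumption, not taken from the training notebook:
```
from peft import LoraConfig

# LoRA setup matching the values above; task_type is assumed
lora_config = LoraConfig(
    r=256,
    lora_alpha=128,
    lora_dropout=0.05,
    target_modules="all-linear",  # adapt every linear projection layer
    task_type="CAUSAL_LM",
)
```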
### Training Parameters
```
- Epochs: 3
- Batch Size: 3
- Gradient Accumulation Steps: 2
- Learning Rate: 2e-4
- Warmup Ratio: 0.03
- Max Sequence Length: 4096
- Scheduler: Constant
```
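These hyperparameters map naturally onto `trl`'s `SFTConfig`. A sketch only, assuming a recent trl release (argument names vary slightly across versions, e.g. `max_seq_length` vs. `max_length`):
```
from trl import SFTConfig

training_args = SFTConfig(
    num_train_epochs=3,
    per_device_train_batch_size=3,
    gradient_accumulation_steps=2,
    learning_rate=2e-4,
    warmup_ratio=0.03,
    max_seq_length=4096,
    lr_scheduler_type="constant",
    optim="adamw_torch_fused",  # fused AdamW, as noted under Training Infrastructure
    bf16=True,                  # or fp16=True, depending on hardware support
)
```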
### Training Infrastructure
* Libraries: transformers, peft, trl
* Mixed Precision: FP16/BF16 (based on hardware support)
* Optimizer: fused AdamW
### Training Notebook
The training notebook is available on [Kaggle](https://www.kaggle.com/code/geraldinegeoffroy/ead-finetune-llama-3-2-3b-instruct).
## Usage
### Installation
```
pip install transformers torch bitsandbytes accelerate peft
```
### Loading the model
```
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
import torch

# Configure 4-bit quantization
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_name = "Geraldine/FineLlama-3.2-3B-Instruct-ead"

# Load model and tokenizer; device_map="auto" places the quantized model on the GPU
# (calling .to("cuda") is not supported for 4-bit bitsandbytes models)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```
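If the repository hosts the LoRA adapter separately rather than merged weights, the adapter can be attached to the quantized base model instead. A hedged sketch, assuming adapter-style weights on top of `meta-llama/Llama-3.2-3B-Instruct` (check the repository files before relying on this path):
```
from peft import PeftModel

# Assumption: only needed if the repo ships LoRA adapter weights;
# if the weights are already merged, the loading code above is sufficient.
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-3B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, model_name)
```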
### Example usage
```
messages = [
    {"role": "system", "content": "You are an expert in EAD/XML generation for archival records metadata."},
    {"role": "user", "content": "Generate a minimal and compliant <eadheader> template with all required EAD/XML tags"},
]

inputs = tokenizer.apply_chat_template(
    messages,
    return_dict=True,
    tokenize=True,
    add_generation_prompt=True,  # must be added for generation
    return_tensors="pt",
).to("cuda")

outputs = model.generate(
    **inputs,
    max_new_tokens=4096,
    pad_token_id=tokenizer.eos_token_id,
    use_cache=True,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
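Generated EAD documents can be long; to watch the XML as it is produced instead of waiting for the full completion, transformers' `TextStreamer` can be attached to `generate` (a minimal sketch, not taken from the training notebook):
```
from transformers import TextStreamer

# Print tokens to stdout as they are generated, skipping the echoed prompt
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
model.generate(
    **inputs,
    max_new_tokens=4096,
    pad_token_id=tokenizer.eos_token_id,
    streamer=streamer,
)
```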
## Limitations
* The model is specifically trained for EAD/XML format and may not perform well on general archival tasks
* Performance depends on the quality and specificity of the input prompts
* Maximum sequence length is limited to 4096 tokens
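Because the prompt and the completion share the 4096-token budget the model was fine-tuned with, it can help to check the prompt length before generating (a sketch, reusing the `inputs` from the usage example above):
```
# Count prompt tokens to see how much of the 4096-token budget remains
prompt_len = inputs["input_ids"].shape[-1]
print(f"Prompt uses {prompt_len} tokens; ~{4096 - prompt_len} remain for generation.")
```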
## Citation
**BibTeX:**
```
@misc{ead-llama,
  author       = {Géraldine Geoffroy},
  title        = {EAD-XML Llama: Fine-tuned Llama Model for Archival Description},
  year         = {2024},
  publisher    = {Hugging Face},
  journal      = {Hugging Face repository},
  howpublished = {\url{https://huggingface.co/Geraldine/FineLlama-3.2-3B-Instruct-ead}}
}
```
## License
This model is subject to the same license as the base Llama model. Please refer to Meta's Llama 3.2 license for usage terms and conditions.