|
--- |
|
library_name: transformers |
|
license: llama3.2 |
|
--- |
|
|
|
# FineLlama-3.2-3B-Instruct-ead |
|
|
|
This repository contains a fine-tuned version of Llama-3.2-3B-Instruct trained specifically to understand and generate EAD (Encoded Archival Description) XML for describing archival records.
|
|
|
## Model Description |
|
|
|
* **Base Model**: meta-llama/Llama-3.2-3B-Instruct |
|
* **Training Dataset**: [Geraldine/Ead-Instruct-38k](https://huggingface.co/datasets/Geraldine/Ead-Instruct-38k) |
|
* **Task**: Generation of EAD/XML-compliant archival descriptions

* **Training Type**: Instruction fine-tuning with PEFT (Parameter-Efficient Fine-Tuning) using LoRA
|
|
|
## Key Features |
|
|
|
* Specialized in generating EAD/XML format for archival metadata |
|
* Trained on a comprehensive dataset of EAD/XML examples |
|
* Optimized for archival description tasks |
|
* Memory-efficient through 4-bit quantization
|
|
|
## Training Details |
|
|
|
### Technical Specifications |
|
|
|
* **Quantization**: 4-bit quantization using bitsandbytes (see the configuration sketch below)

  * NF4 quantization type

  * Double quantization enabled

  * bfloat16 compute dtype
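
In code, these settings correspond to the following bitsandbytes configuration (the same one used for inference in the Usage section below):

```
from transformers import BitsAndBytesConfig
import torch

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # 4-bit quantization
    bnb_4bit_quant_type="nf4",              # NF4 quantization type
    bnb_4bit_use_double_quant=True,         # double quantization enabled
    bnb_4bit_compute_dtype=torch.bfloat16,  # bfloat16 compute dtype
)
```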
|
|
|
### LoRA Configuration |
|
|
|
```
- r: 256
- alpha: 128
- dropout: 0.05
- target modules: all-linear
```
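
As a sketch, the same settings expressed with peft's `LoraConfig` would look roughly like this (the `task_type` value is an assumption consistent with causal-LM instruction tuning; the other values come from the list above):

```
from peft import LoraConfig

lora_config = LoraConfig(
    r=256,                        # LoRA rank
    lora_alpha=128,               # scaling factor alpha
    lora_dropout=0.05,            # dropout applied to LoRA layers
    target_modules="all-linear",  # adapt every linear layer in the model
    task_type="CAUSAL_LM",        # assumption: causal language modeling task
)
```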
|
|
|
### Training parameters |
|
|
|
```
- Epochs: 3
- Batch Size: 3
- Gradient Accumulation Steps: 2
- Learning Rate: 2e-4
- Warmup Ratio: 0.03
- Max Sequence Length: 4096
- Scheduler: Constant
```
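
As a rough sketch, these hyperparameters map onto `transformers.TrainingArguments` along the following lines (`output_dir` is a placeholder, the bf16/fp16 flags depend on hardware support as noted under Training Infrastructure, and the max sequence length is passed to the trainer rather than to `TrainingArguments`; see the next sketch):

```
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="finellama-ead",   # placeholder output path
    num_train_epochs=3,
    per_device_train_batch_size=3,
    gradient_accumulation_steps=2,
    learning_rate=2e-4,
    warmup_ratio=0.03,
    lr_scheduler_type="constant",
    bf16=True,                    # or fp16=True, depending on hardware support
    optim="adamw_torch_fused",    # fused AdamW (see Training Infrastructure)
)
```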
|
|
|
### Training Infrastructure |
|
|
|
* Libraries: transformers, peft, trl |
|
* Mixed Precision: FP16/BF16 (based on hardware support) |
|
* Optimizer: fused AdamW
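
Wiring these pieces together with trl's `SFTTrainer` might look roughly like this. It is a sketch rather than the exact training script: `dataset` stands in for the loaded Ead-Instruct-38k split, and newer trl releases move `max_seq_length` and related options into `SFTConfig`:

```
from trl import SFTTrainer

trainer = SFTTrainer(
    model=model,               # the 4-bit quantized base model
    train_dataset=dataset,     # Geraldine/Ead-Instruct-38k, loaded with datasets
    peft_config=lora_config,   # LoRA settings from above
    args=training_args,        # hyperparameters from above
    max_seq_length=4096,       # matches the training parameters
    tokenizer=tokenizer,
)
trainer.train()
```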
|
|
|
### Training Notebook |
|
|
|
The training notebook is available on [Kaggle](https://www.kaggle.com/code/geraldinegeoffroy/ead-finetune-llama-3-2-3b-instruct).
|
|
|
## Usage |
|
|
|
### Installation |
|
|
|
```
pip install transformers torch bitsandbytes accelerate
```
|
|
|
### Loading the model |
|
|
|
```
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
import torch

# Configure 4-bit quantization
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_name = "Geraldine/FineLlama-3.2-3B-Instruct-ead"

# Load model and tokenizer; device_map handles placement, since
# bitsandbytes-quantized models cannot be moved with .to("cuda")
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    quantization_config=bnb_config,
    device_map="auto",
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
```
|
|
|
### Example usage |
|
|
|
```
messages = [
    {"role": "system", "content": "You are an expert in EAD/XML generation for archival records metadata."},
    {"role": "user", "content": "Generate a minimal and compliant <eadheader> template with all required EAD/XML tags"},
]

inputs = tokenizer.apply_chat_template(
    messages,
    return_dict=True,
    tokenize=True,
    add_generation_prompt=True,  # must be added for generation
    return_tensors="pt",
).to("cuda")

outputs = model.generate(
    **inputs,
    max_new_tokens=4096,
    pad_token_id=tokenizer.eos_token_id,
    use_cache=True,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
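
Because the goal is valid EAD/XML, a quick well-formedness check on the generated fragment can catch truncated or malformed output. The minimal sketch below uses only the standard library and assumes the model returned a bare XML fragment; it checks XML syntax only, not EAD schema compliance:

```
import xml.etree.ElementTree as ET

# Decode only the newly generated tokens, skipping the prompt
generated = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
)

try:
    ET.fromstring(generated)  # raises ParseError on malformed XML
    print("Output is well-formed XML")
except ET.ParseError as err:
    print(f"Output is not well-formed XML: {err}")
```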
|
|
|
## Limitations |
|
|
|
* The model is specifically trained for EAD/XML format and may not perform well on general archival tasks |
|
* Performance depends on the quality and specificity of the input prompts |
|
* Maximum sequence length is limited to 4096 tokens |
|
|
|
## Citation
|
|
|
**BibTeX:** |
|
|
|
```
@misc{ead-llama,
  author = {Géraldine Geoffroy},
  title = {EAD-XML Llama: Fine-tuned Llama Model for Archival Description},
  year = {2024},
  publisher = {Hugging Face},
  journal = {Hugging Face repository},
  howpublished = {\url{https://huggingface.co/Geraldine/FineLlama-3.2-3B-Instruct-ead}}
}
```
|
|
|
## License

This model is subject to the same license as the base Llama 3.2 model. Please refer to Meta's Llama 3.2 Community License for usage terms and conditions.