---
library_name: transformers
license: llama3.2
---

# FineLlama-3.2-3B-Instruct-ead

This repository contains a fine-tuned version of Llama-3.2-3B-Instruct trained specifically to understand and generate EAD (Encoded Archival Description) XML for describing archival records.

## Model Description

* **Base Model**: meta-llama/Llama-3.2-3B-Instruct
* **Training Dataset**: [Geraldine/Ead-Instruct-38k](https://huggingface.co/datasets/Geraldine/Ead-Instruct-38k)
* **Task**: Generation of EAD/XML-compliant archival descriptions
* **Training Type**: Instruction fine-tuning with PEFT (Parameter-Efficient Fine-Tuning) using LoRA

## Key Features

* Specialized in generating EAD/XML format for archival metadata
* Trained on a comprehensive dataset of EAD/XML examples
* Optimized for archival description tasks
* Memory-efficient through 4-bit quantization

## Training Details

### Technical Specifications

* **Quantization**: 4-bit quantization using bitsandbytes

  * NF4 quantization type
  * Double quantization enabled
  * bfloat16 compute dtype
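
The exact setup is in the training notebook linked below; as a rough sketch, loading the base model with this quantization for QLoRA-style training could look like the following (assuming the standard transformers + peft recipe):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import prepare_model_for_kbit_training

# 4-bit NF4 quantization with double quantization and bfloat16 compute dtype
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base_model = "meta-llama/Llama-3.2-3B-Instruct"

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Make the quantized model trainable with adapters (QLoRA-style)
model = prepare_model_for_kbit_training(model)
```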

### LoRA Configuration

```
- r: 256
- alpha: 128
- dropout: 0.05
- target modules: all-linear
```
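
Expressed as a `peft.LoraConfig`, these settings map roughly onto the following sketch (the exact object is defined in the training notebook):

```python
from peft import LoraConfig

# LoRA adapter configuration matching the values above
peft_config = LoraConfig(
    r=256,                        # LoRA rank
    lora_alpha=128,               # scaling factor
    lora_dropout=0.05,
    target_modules="all-linear",  # adapt all linear layers
    task_type="CAUSAL_LM",
)
```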

### Training Parameters

```
- Epochs: 3
- Batch Size: 3
- Gradient Accumulation Steps: 2
- Learning Rate: 2e-4
- Warmup Ratio: 0.03
- Max Sequence Length: 4096
- Scheduler: Constant
```
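
As a rough mapping onto `transformers.TrainingArguments` (a sketch: the maximum sequence length of 4096 is passed to the trl trainer rather than here, and `output_dir` and `logging_steps` are placeholders):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="finellama-3.2-3b-instruct-ead",  # placeholder output path
    num_train_epochs=3,
    per_device_train_batch_size=3,
    gradient_accumulation_steps=2,
    learning_rate=2e-4,
    warmup_ratio=0.03,
    lr_scheduler_type="constant",
    optim="adamw_torch_fused",  # fused AdamW (see infrastructure below)
    bf16=True,                  # use fp16=True instead on hardware without bfloat16
    logging_steps=10,           # placeholder logging interval
)
```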

### Training Infrastructure

* Libraries: transformers, peft, trl
* Mixed Precision: FP16/BF16 (based on hardware support)
* Optimizer: fused AdamW
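
Putting the pieces together with trl's `SFTTrainer`, reusing the objects from the sketches above (keyword names such as `max_seq_length` and `tokenizer` vary between trl releases; see the training notebook for the exact call):

```python
from datasets import load_dataset
from trl import SFTTrainer

# Instruction dataset used for fine-tuning (split name assumed to be "train")
dataset = load_dataset("Geraldine/Ead-Instruct-38k", split="train")

# Quantized base model + LoRA adapters + training arguments from the sketches above.
trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    peft_config=peft_config,
    tokenizer=tokenizer,
    max_seq_length=4096,
)

trainer.train()
```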

### Training Notebook

The training notebook is available on [Kaggle](https://www.kaggle.com/code/geraldinegeoffroy/ead-finetune-llama-3-2-3b-instruct).

## Usage

### Installation

```bash
pip install transformers torch bitsandbytes accelerate
```

### Loading the model

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

# Configure 4-bit quantization (same settings as used for training)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_name = "Geraldine/FineLlama-3.2-3B-Instruct-ead"

# Load model and tokenizer
# Use device_map instead of .to("cuda"): 4-bit models cannot be moved with .to()
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    quantization_config=bnb_config,
    device_map="auto",
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
```

### Example usage

```python
messages = [
    {"role": "system", "content": "You are an expert in EAD/XML generation for archival records metadata."},
    {"role": "user", "content": "Generate a minimal and compliant <eadheader> template with all required EAD/XML tags"},
]

# Build the prompt with the chat template
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,  # required for generation
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=4096,
    pad_token_id=tokenizer.eos_token_id,
    use_cache=True,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Limitations

* The model is specifically trained for EAD/XML format and may not perform well on general archival tasks
* Performance depends on the quality and specificity of the input prompts
* Maximum sequence length is limited to 4096 tokens

## Citation

**BibTeX:**

```
@misc{ead-llama,
  author = {Géraldine Geoffroy},
  title = {EAD-XML LLaMa: Fine-tuned LLaMa Model for Archival Description},
  year = {2024},
  publisher = {HuggingFace},
  journal = {HuggingFace Repository},
  howpublished = {\url{https://huggingface.co/Geraldine/FineLlama-3.2-3B-Instruct-ead}}
}
```

## License

This model is subject to the same license as the base Llama model. Please refer to Meta's Llama 3.2 license for usage terms and conditions.