---
library_name: transformers
license: llama3.2
---

# FineLlama-3.2-3B-Instruct-ead

This repository contains a fine-tuned version of Llama-3.2-3B-Instruct trained specifically to understand and generate EAD (Encoded Archival Description) XML for describing archival records.

## Model Description

* **Base Model**: meta-llama/Llama-3.2-3B-Instruct
* **Training Dataset**: [Geraldine/Ead-Instruct-38k](https://huggingface.co/datasets/Geraldine/Ead-Instruct-38k)
* **Task**: Generation of EAD/XML-compliant archival descriptions
* **Training Type**: Instruction fine-tuning with PEFT (Parameter-Efficient Fine-Tuning) using LoRA

## Key Features

* Specialized in generating EAD/XML for archival metadata
* Trained on a comprehensive dataset of EAD/XML examples
* Optimized for archival description tasks
* Memory-efficient through 4-bit quantization

## Training Details

### Technical Specifications

* **Quantization**: 4-bit quantization using bitsandbytes
  * NF4 quantization type
  * Double quantization enabled
  * bfloat16 compute dtype

### LoRA Configuration

```
- r: 256
- alpha: 128
- dropout: 0.05
- target modules: all-linear
```

### Training Parameters

```
- Epochs: 3
- Batch Size: 3
- Gradient Accumulation Steps: 2
- Learning Rate: 2e-4
- Warmup Ratio: 0.03
- Max Sequence Length: 4096
- Scheduler: Constant
```
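For orientation, the two configuration blocks above translate roughly into the following `peft` and `transformers` objects. This is a minimal sketch under assumed recent library versions, not the exact training script: `output_dir` is a hypothetical placeholder, and the dataset preparation and `trl` trainer wiring live in the Kaggle notebook linked below.

```
from peft import LoraConfig
from transformers import TrainingArguments

# LoRA adapter settings mirroring the values listed above
peft_config = LoraConfig(
    r=256,                        # rank of the low-rank update matrices
    lora_alpha=128,               # scaling factor applied to the updates
    lora_dropout=0.05,
    target_modules="all-linear",  # attach adapters to every linear layer
    task_type="CAUSAL_LM",
)

# Training hyperparameters mirroring the values listed above
training_args = TrainingArguments(
    output_dir="finellama-ead",   # hypothetical path, adjust as needed
    num_train_epochs=3,
    per_device_train_batch_size=3,
    gradient_accumulation_steps=2,
    learning_rate=2e-4,
    warmup_ratio=0.03,
    lr_scheduler_type="constant",
    optim="adamw_torch_fused",    # fused AdamW, as noted below
    bf16=True,                    # or fp16=True on hardware without bfloat16
)

# The 4096-token maximum sequence length is passed to trl's SFTTrainer
# (max_seq_length, or via SFTConfig in recent trl versions) rather than here.
```

Note that with `alpha` set to half of `r`, the effective LoRA scaling factor `alpha / r` works out to 0.5.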
### Training Infrastructure

* Libraries: transformers, peft, trl
* Mixed Precision: FP16/BF16 (depending on hardware support)
* Optimizer: fused AdamW

### Training Notebook

The full training notebook is available on [Kaggle](https://www.kaggle.com/code/geraldinegeoffroy/ead-finetune-llama-3-2-3b-instruct).

## Usage

### Installation

```
pip install transformers torch bitsandbytes accelerate
```

### Loading the model

```
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
import torch

# Configure 4-bit quantization (the same settings used during training)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_name = "Geraldine/FineLlama-3.2-3B-Instruct-ead"

# Load model and tokenizer; device_map places the quantized weights on the GPU.
# (Calling .to("cuda") on a 4-bit bitsandbytes model is not supported.)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```

### Example usage

```
messages = [
    {"role": "system", "content": "You are an expert in EAD/XML generation for archival records metadata."},
    {"role": "user", "content": "Generate a minimal and compliant template with all required EAD/XML tags"},
]

inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,  # must be set for generation
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=4096,
    pad_token_id=tokenizer.eos_token_id,
    use_cache=True,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Limitations

* The model is trained specifically for the EAD/XML format and may not perform well on general archival tasks
* Performance depends on the quality and specificity of the input prompts
* The maximum sequence length is limited to 4096 tokens

## Citation

**BibTeX:**

```
@misc{ead-llama,
  author       = {Géraldine Geoffroy},
  title        = {EAD-XML LLaMa: Fine-tuned LLaMa Model for Archival Description},
  year         = {2024},
  publisher    = {HuggingFace},
  journal      = {HuggingFace Repository},
  howpublished = {\url{https://huggingface.co/Geraldine/FineLlama-3.2-3B-Instruct-ead}}
}
```

## License

This model is subject to the same license as the base Llama model. Please refer to Meta's Llama 3.2 license for usage terms and conditions.