Model Card for Llama 3 Instruct Fine-Tuned on Deutsche Bahn FAQ
Model Overview
Model Name: islam-hajosman/llama3_instruct_fine_tuned_bahn_1k_v1_model
Architecture: Llama 3 Instruct
Quantization: 4-bit NF4 with double quantization
Domain-Specific Fine-Tuning Dataset: islam-hajosman/deutsche_bahn_faq_1k
This model has been fine-tuned to provide improved answers to frequently asked questions (FAQ) from the Deutsche Bahn website. It is part of a Master's thesis project aimed at enhancing the model's domain-specific capabilities.
Fine-Tuning Configuration
Quantization Configuration
import torch
from transformers import BitsAndBytesConfig

# 4-bit NF4 quantization with double quantization; computations run in bfloat16
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16
)
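The card does not show how the base model is loaded with this configuration. Below is a minimal sketch; the base checkpoint ID meta-llama/Meta-Llama-3-8B-Instruct is an assumption, as the exact base model is not stated in the card.

from transformers import AutoModelForCausalLM, AutoTokenizer

base_model_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # assumed base checkpoint

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    quantization_config=quantization_config,  # the BitsAndBytesConfig defined above
    device_map="auto",
)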
LoRA Configuration
from peft import LoraConfig, get_peft_model

# LoRA adapters on all attention and MLP projection matrices
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.0,
    bias="none",
    target_modules=['q_proj', 'k_proj', 'v_proj', 'o_proj', 'gate_proj', 'up_proj', 'down_proj'],
    task_type="CAUSAL_LM"
)
model = get_peft_model(model, lora_config)
Training Arguments
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./llama3_instruct_fine_tuned_bahn_1k_v1_output",
    dataloader_drop_last=False,
    save_strategy="epoch",
    logging_strategy="steps",
    num_train_epochs=30,
    logging_steps=1,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=8,  # effective batch size of 64
    optim="adamw_8bit",
    learning_rate=1e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    bf16=True,
    weight_decay=0.0,
    run_name="llama3_instruct_fine_tuned_bahn_1k_v1_report",
    report_to="wandb"
)
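The card does not show how these arguments are passed to a trainer. The following is a minimal sketch, assuming TRL's SFTTrainer (with a trl version in which dataset_text_field and max_seq_length are still constructor arguments rather than part of SFTConfig) and a hypothetical text column named "text" in the dataset; max_seq_length follows the next section.

from datasets import load_dataset
from trl import SFTTrainer

# Load the fine-tuning dataset from the Hub
dataset = load_dataset("islam-hajosman/deutsche_bahn_faq_1k", split="train")

trainer = SFTTrainer(
    model=model,                  # PEFT-wrapped model from the LoRA configuration above
    args=training_args,
    train_dataset=dataset,
    tokenizer=tokenizer,
    dataset_text_field="text",    # hypothetical column name; adjust to the dataset schema
    max_seq_length=512,           # see the sequence length configuration below
)
trainer.train()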
Sequence Length Configuration
max_seq_length was set to 512, which covers 99.3% of the data. Only 7 entries exceeded this length and were truncated.
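The card does not show how this coverage was measured. A minimal sketch of such a check, reusing the tokenizer loaded in the sketch above and assuming a hypothetical "text" column:

from datasets import load_dataset

dataset = load_dataset("islam-hajosman/deutsche_bahn_faq_1k", split="train")

# Count how many examples exceed the chosen sequence length
max_seq_length = 512
lengths = [len(tokenizer(example["text"])["input_ids"]) for example in dataset]
too_long = sum(1 for n in lengths if n > max_seq_length)
coverage = 100 * (1 - too_long / len(lengths))
print(f"{too_long} of {len(lengths)} examples exceed {max_seq_length} tokens ({coverage:.1f}% covered)")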
Hardware Used
- GPU: 1x H100 (80 GB PCIe)
- CPU: 26 cores
- RAM: 205.4 GB
- Storage: 1.1 TB SSD
- Cost: $2.50 per hour
Training Summary
- Total trainable parameters: 0.915% of the 8B base-model parameters (a verification sketch follows this list)
- LoRA adapter size: 4.37 GB
- Training time and cost: roughly 50 minutes, about $2
- Steps per epoch: 16 (1,024 samples, per-device batch size 8, gradient accumulation 8 → effective batch size 64)
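The trainable-parameter fraction reported above can be checked directly on the PEFT-wrapped model using PEFT's built-in helper; a minimal sketch:

# Prints the number of trainable parameters, the total parameter count, and the trainable percentage.
# Note: on a 4-bit quantized base model the reported total can deviate from the nominal 8B,
# because packed 4-bit weights are counted in their storage shape.
model.print_trainable_parameters()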
Performance Metrics
- Training completed after 30 epochs (global_step=480 = 30 epochs × 16 steps per epoch)
- Final training loss: 0.2841
- Train runtime: 3,012.8 s (≈ 50 minutes)
- Throughput: 10.20 train samples/s, 0.159 train steps/s
- Total FLOPs: 3.87 × 10^17
Weights & Biases Tracking
Training was logged to Weights & Biases (report_to="wandb") under the run name llama3_instruct_fine_tuned_bahn_1k_v1_report.
Usage
To use this model, load it from Hugging Face under the model ID islam-hajosman/llama3_instruct_fine_tuned_bahn_1k_v1_model. The model is optimized for providing domain-specific answers to Deutsche Bahn FAQ questions.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("islam-hajosman/llama3_instruct_fine_tuned_bahn_1k_v1_model")
model = AutoModelForCausalLM.from_pretrained("islam-hajosman/llama3_instruct_fine_tuned_bahn_1k_v1_model")

input_text = "Ihre Frage hier"  # "Your question here"
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)  # cap the length of the generated answer
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
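If GPU memory is limited, the model can instead be loaded with the same 4-bit NF4 quantization used during fine-tuning. A minimal sketch; device_map="auto" assumes the accelerate package is installed.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "islam-hajosman/llama3_instruct_fine_tuned_bahn_1k_v1_model"

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quantization_config,
    device_map="auto",
)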