---
base_model:
- meta-llama/Llama-2-7b-hf
tags:
- generated_from_trainer
metrics:
- accuracy
- f1
model-index:
- name: Llama-2-7b-hf-IDMGSP
  results: []
license: mit
datasets:
- tum-nlp/IDMGSP
language:
- da
library_name: transformers
---

# Llama-2-7b-hf-IDMGSP

This model is a LoRA adapter for [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf), fine-tuned on the [tum-nlp/IDMGSP](https://huggingface.co/datasets/tum-nlp/IDMGSP) dataset.
It achieves the following results on the evaluation split:
- Loss: 0.1450
- Accuracy: 0.9759036144578314
- F1: 0.9758125472411187

## Model description

The base model was loaded in 4-bit quantized mode and fine-tuned using LoRA.

## Intended uses & limitations

The model classifies text as AI generated or not. Labels: `0` non-AI generated, `1` AI generated.

Code to run inference:

```python
import torch
import transformers
from peft import LoraConfig, TaskType

# Label mapping as stated in this model card.
id2label = {0: "non-AI generated", 1: "AI generated"}


class Model:
    def __init__(self, name) -> None:
        # Repository ID or local path of the fine-tuned model.
        self.name = name

        # Tokenizer: Llama defines no pad token, so the EOS token is reused.
        self.tokenizer = transformers.LlamaTokenizer.from_pretrained(self.name)
        self.tokenizer.pad_token = self.tokenizer.eos_token
        print(f"Tokenizer: {self.tokenizer.eos_token}; Pad {self.tokenizer.pad_token}")

        # Model: 4-bit NF4 quantization with double quantization.
        bnb_config = transformers.BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_use_double_quant=True,
            bnb_4bit_quant_type="nf4",
            bnb_4bit_compute_dtype="bfloat16",
        )

        # LoRA configuration, kept for reference; it is not applied in this inference path.
        self.peft_config = LoraConfig(
            task_type=TaskType.SEQ_CLS,
            r=8,
            lora_alpha=16,
            lora_dropout=0.05,
            bias="none",
        )

        self.model = transformers.LlamaForSequenceClassification.from_pretrained(
            self.name,
            num_labels=2,
            quantization_config=bnb_config,
            device_map="auto",
        )
        self.model.config.pad_token_id = self.model.config.eos_token_id
        self.model.eval()

    def tokenize(self, text):
        # Helper assumed here (it was missing from the original snippet):
        # encode the text and move the tensors to the model's device.
        return self.tokenizer(text, return_tensors="pt").to(self.model.device)

    def predict(self, text):
        inputs = self.tokenize(text)
        with torch.no_grad():  # no gradients are needed for inference
            outputs = self.model(**inputs)
        logits = outputs.logits
        predictions = torch.argmax(logits, dim=-1)
        return id2label[predictions.item()]
```

## Training and evaluation data

[tum-nlp/IDMGSP](https://huggingface.co/datasets/tum-nlp/IDMGSP) dataset, `classifier_input` subset.

## Training procedure

### Training hyperparameters

BitsAndBytes and LoRA config parameters:

![image/png](https://cdn-uploads.huggingface.co/production/uploads/638f0f9ab0525fa370479467/XI1imFyXmzFjCGCkBYClc.png)

GPU VRAM consumption during fine-tuning: 30.6 GB

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- lr_scheduler_warmup_steps: 500
- num_epochs: 5
- mixed_precision_training: Native AMP

### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy           | F1                 |
|:-------------:|:-----:|:----:|:---------------:|:------------------:|:------------------:|
| 0.0766        | 1.0   | 498  | 0.1165          | 0.9614708835341366 | 0.9612813721780804 |
| 0.182         | 2.0   | 996  | 0.0934          | 0.9657379518072289 | 0.9648059816939539 |
| 0.037         | 3.0   | 1494 | 0.1190          | 0.9716365461847389 | 0.9710182097973841 |
| 0.0349        | 4.0   | 1992 | 0.1884          | 0.96875            | 0.9692326702088224 |
| 0.0046        | 5.0   | 2490 | 0.1450          | 0.9759036144578314 | 0.9758125472411187 |

### Framework versions

- Transformers 4.35.0
- Pytorch 2.0.1
- Datasets 2.14.6
- Tokenizers 0.14.1
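
The learning-rate settings listed under the training hyperparameters (learning_rate 0.0001, linear scheduler, 500 warmup steps) together with the 2490 total steps from the results table can be illustrated with a small sketch. It assumes that the explicit warmup step count takes precedence over the warmup ratio and that the schedule has the usual shape of linear warmup followed by linear decay to zero. The function name `linear_warmup_lr` is hypothetical, and this is an illustration, not the training code itself.

```python
def linear_warmup_lr(step, base_lr=1e-4, warmup_steps=500, total_steps=2490):
    """Learning rate at `step` for linear warmup followed by linear decay.

    Defaults mirror the values reported in this card; the precedence of
    warmup_steps over warmup_ratio is an assumption.
    """
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: fall linearly from base_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Print the learning rate at a few points along the run.
for step in (0, 250, 500, 1495, 2490):
    print(step, linear_warmup_lr(step))
```

Under these assumptions the learning rate peaks at step 500, shortly after the end of the first epoch (step 498), and decays for the remaining four epochs.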