leesm/llama-2-7b-hf-lora-oki10p

Llama-2 model fine tuning (TREX-Lab at Seoul Cyber University)

Summary

Base Model : meta-llama/Llama-2-7b-hf
Dataset : heegyu/open-korean-instructions (10%)
Tuning Method
- PEFT(Parameter Efficient Fine-Tuning)
- LoRA(Low-Rank Adaptation of Large Language Models)
Related Articles : https://arxiv.org/abs/2106.09685
Fine-tunes the Llama-2 model on a random 10% sample of a Korean chatbot dataset (open-korean-instructions).
Tests whether a large language model can be fine-tuned on a single A30 GPU (it succeeded).

Developed by: TREX-Lab at Seoul Cyber University
Language(s) (NLP): Korean
Finetuned from model : meta-llama/Llama-2-7b-hf

Fine Tuning Detail

LoRA alpha (lora_alpha) : 16
LoRA rank (r) : 64 (this seems somewhat large)

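from peft import LoraConfig

# LoRA adapter settings for causal language modeling
peft_config = LoraConfig(
    lora_alpha=16,         # scaling numerator (the update is scaled by lora_alpha / r)
    lora_dropout=0.1,      # dropout applied on the LoRA branch
    r=64,                  # rank of the low-rank update
    bias='none',           # bias terms are not trained
    task_type='CAUSAL_LM'
)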
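To give a sense of what these values imply, here is a minimal sketch that computes the LoRA scaling factor and a rough count of added trainable parameters. The model card does not say which modules were adapted, so the choice of q_proj and v_proj is an assumption, as are the names `lora_scaling` and `lora_params_per_matrix`, which are illustrative helpers and not part of PEFT.

```python
# Sketch: LoRA scaling factor and added-parameter count for r=64, alpha=16.
# Assumptions (not stated in the model card): adapters on q_proj and v_proj
# only; Llama-2-7b shape of 32 decoder layers with hidden size 4096.

def lora_scaling(lora_alpha: int, r: int) -> float:
    """Standard LoRA scaling applied to the low-rank update B @ A."""
    return lora_alpha / r

def lora_params_per_matrix(d_in: int, d_out: int, r: int) -> int:
    """A is (r x d_in) and B is (d_out x r), so r * (d_in + d_out) weights."""
    return r * (d_in + d_out)

HIDDEN = 4096
LAYERS = 32
ADAPTED_PER_LAYER = 2  # assumed: q_proj and v_proj

scaling = lora_scaling(lora_alpha=16, r=64)
per_matrix = lora_params_per_matrix(HIDDEN, HIDDEN, r=64)
total = per_matrix * ADAPTED_PER_LAYER * LAYERS

print(f"scaling = {scaling}")
print(f"params per adapted matrix = {per_matrix:,}")
print(f"total trainable LoRA params = {total:,}")
```

Under these assumptions the adapter size grows linearly with r, so halving the rank would halve the number of trainable parameters.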

Quantization : 4-bit NF4 with double quantization (bnb_4bit_use_double_quant), float16 compute

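from transformers import BitsAndBytesConfig

# Load base weights in 4-bit NF4; computation runs in float16
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,   # also quantize the quantization constants
    bnb_4bit_quant_type='nf4',
    bnb_4bit_compute_dtype='float16',
)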
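A back-of-the-envelope estimate helps explain why 4-bit loading makes a single A30 feasible. The sketch below only counts weight storage for the 6.74B-parameter model and compares it with the A30's 24 GB of memory. It ignores activations, optimizer state, quantization constants, and any layers left unquantized, so treat the numbers as rough lower bounds. `weight_memory_gb` is a hypothetical helper.

```python
# Rough weight-memory estimate only; ignores activations, optimizer state,
# quantization constants, and layers that stay unquantized.
PARAMS = 6.74e9       # parameter count reported for this model
A30_MEMORY_GB = 24    # NVIDIA A30 memory capacity

def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Storage for n_params weights at the given bit width, in GB (1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9

fp16_gb = weight_memory_gb(PARAMS, 16)
nf4_gb = weight_memory_gb(PARAMS, 4)

print(f"fp16 weights: ~{fp16_gb:.2f} GB")
print(f"4-bit weights: ~{nf4_gb:.2f} GB")
print(f"headroom on one A30 with 4-bit weights: ~{A30_MEMORY_GB - nf4_gb:.2f} GB")
```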

Training uses the TRL SFTTrainer (https://huggingface.co/docs/trl/sft_trainer)

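from trl import SFTTrainer

# peft_model, dataset, tokenizer and training_args are created earlier in the training script
trainer = SFTTrainer(
    model=peft_model,
    train_dataset=dataset,
    dataset_text_field='text',
    max_seq_length=min(tokenizer.model_max_length, 2048),
    tokenizer=tokenizer,
    packing=True,   # pack several short examples into each fixed-length sequence
    args=training_args
)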
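The `packing=True` option can be illustrated with a toy example. The sketch below is a simplified illustration of the general idea (concatenate tokenized examples separated by an end-of-sequence token, then cut the stream into fixed-size blocks). It is not TRL's actual implementation, and `pack_sequences` plus the token ids are made up for the example.

```python
from typing import Iterable, List

def pack_sequences(tokenized: Iterable[List[int]], eos_id: int,
                   block_size: int) -> List[List[int]]:
    """Concatenate examples separated by EOS, then cut into fixed-size blocks.

    The trailing partial block is dropped in this sketch.
    """
    stream: List[int] = []
    for ids in tokenized:
        stream.extend(ids)
        stream.append(eos_id)
    n_blocks = len(stream) // block_size
    return [stream[i * block_size:(i + 1) * block_size] for i in range(n_blocks)]

# Toy token ids; real training would use the tokenizer's output and
# a block size equal to max_seq_length.
examples = [[5, 6, 7], [8, 9], [10, 11, 12, 13]]
blocks = pack_sequences(examples, eos_id=2, block_size=4)
print(blocks)
```

Packing avoids padding short examples, which is why the reported sample counts refer to packed sequences and not to individual dataset rows.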

Train Result

Time taken : 2d 0h 17m

TrainOutput(global_step=2001,
            training_loss=0.6940358212922347,
            metrics={
               'train_runtime': 173852.2333,
               'train_samples_per_second': 0.092,
               'train_steps_per_second': 0.012,
               'train_loss': 0.6940358212922347,
               'epoch': 3.0})
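The reported metrics can be cross-checked with a little arithmetic. The sketch below converts `train_runtime` to days, hours and minutes, and infers an approximate number of sequences per optimizer step. That last figure is derived from the rounded `train_samples_per_second` value and is not stated in the training output, so it is only an estimate. `format_duration` is a hypothetical helper.

```python
def format_duration(seconds: float) -> str:
    """Format a duration in seconds as 'Xd Yh Zm' (seconds truncated)."""
    total = int(seconds)
    days, rem = divmod(total, 86400)
    hours, rem = divmod(rem, 3600)
    minutes = rem // 60
    return f"{days}d {hours}h {minutes}m"

# Values copied from the TrainOutput
train_runtime = 173852.2333
global_step = 2001
samples_per_second = 0.092   # already rounded in the report

duration = format_duration(train_runtime)
steps_per_second = global_step / train_runtime
approx_samples = samples_per_second * train_runtime
approx_samples_per_step = approx_samples / global_step  # inferred, not reported

print(duration)
print(f"steps/s ~ {steps_per_second:.3f}")
print(f"sequences per optimizer step ~ {approx_samples_per_step:.1f}")
```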

Model size : 6.74B params (Safetensors)
Tensor type : FP16
