PEFT

licence: LICENCE

Model Card for barbaroo/gptsw3_lora_fo_40b

Model Details

Model Description

  • Developed by: Barbara Scalvini, Language Technology Center, University of the Faroe Islands

  • Model type: LoRA adapter for GPT-SW3, obtained by continued pre-training on Faroese data (the BLARK corpus and a private repository of Faroese books). Training ran for 4 epochs.

  • Language(s) (NLP): Swedish, English, Norwegian, Danish, Icelandic, Faroese

  • Base model: AI-Sweden-Models/gpt-sw3-40b

How to Get Started with the Model

Use the code below to get started with the model.

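from peft import PeftConfig, PeftModel
from transformers import AutoModelForCausalLM

# Adapter configuration (LoRA hyperparameters, base model reference); loaded for inspection only.
config = PeftConfig.from_pretrained("barbaroo/gptsw3_lora_fo_40b")
# Load the base GPT-SW3 model, then attach the Faroese LoRA adapter weights to it.
model = AutoModelForCausalLM.from_pretrained("AI-Sweden-Models/gpt-sw3-40b")
model = PeftModel.from_pretrained(model, "barbaroo/gptsw3_lora_fo_40b")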
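
The loading code stops at the model object, so a minimal sketch of text generation may be useful. It is an illustration under stated assumptions, not the author's documented workflow: it assumes the tokenizer is taken from the base model repository, that `accelerate` is installed so `device_map="auto"` works, and that enough memory is available for a 40B-parameter model. The function name `generate_text` and its default arguments are hypothetical.

```python
def generate_text(prompt, max_new_tokens=50):
    """Generate a continuation of `prompt` with the base model plus the LoRA adapter.

    Heavy dependencies are imported lazily so that defining the function is cheap.
    """
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    base_id = "AI-Sweden-Models/gpt-sw3-40b"
    adapter_id = "barbaroo/gptsw3_lora_fo_40b"

    # Assumption: the adapter reuses the base model's tokenizer unchanged.
    tokenizer = AutoTokenizer.from_pretrained(base_id)

    # Assumption: device_map="auto" (requires accelerate) spreads the weights
    # over the available devices.
    model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
    model = PeftModel.from_pretrained(model, adapter_id)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Decode the first (and only) sequence in the batch back to text.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_text("Føroyar eru")` would download the weights on first use and return the prompt followed by the model's continuation.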


Training Details

Training Data

We trained our model on a corpus derived from the Basic Language Resource Kit for Faroese (BLARK). For detailed information about the dataset, please see BLARK_small. Additional training data was taken from a private corpus of Faroese books (Faroese Books).

Testing Data, Factors & Metrics

Testing Data

Validation and testing were performed on the test split of the Faroese books corpus (Faroese Books).

Training procedure

The following bitsandbytes quantization config was used during training:

  • quant_method: bitsandbytes
  • load_in_8bit: True
  • load_in_4bit: False
  • llm_int8_threshold: 6.0
  • llm_int8_skip_modules: None
  • llm_int8_enable_fp32_cpu_offload: False
  • llm_int8_has_fp16_weight: False
  • bnb_4bit_quant_type: fp4
  • bnb_4bit_use_double_quant: False
  • bnb_4bit_compute_dtype: float32
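
The settings listed above can be expressed as a `BitsAndBytesConfig` from `transformers`. The sketch below is one way to reproduce them when loading the base model; it is not the author's training script. The names `QUANT_CONFIG`, `build_bnb_config`, and `load_base_model_8bit` are hypothetical, and `device_map="auto"` is an assumption that requires `accelerate`. The `quant_method` entry is left out because it is not a constructor argument.

```python
# Values copied from the quantization settings listed in this section.
QUANT_CONFIG = {
    "load_in_8bit": True,
    "load_in_4bit": False,
    "llm_int8_threshold": 6.0,
    "llm_int8_skip_modules": None,
    "llm_int8_enable_fp32_cpu_offload": False,
    "llm_int8_has_fp16_weight": False,
    "bnb_4bit_quant_type": "fp4",
    "bnb_4bit_use_double_quant": False,
    "bnb_4bit_compute_dtype": "float32",
}


def build_bnb_config():
    """Build a BitsAndBytesConfig from QUANT_CONFIG (needs transformers and torch)."""
    from transformers import BitsAndBytesConfig  # lazy import: heavy dependency

    return BitsAndBytesConfig(**QUANT_CONFIG)


def load_base_model_8bit(base_id="AI-Sweden-Models/gpt-sw3-40b"):
    """Load the base model with the 8-bit settings above (needs bitsandbytes and a GPU)."""
    from transformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(
        base_id,
        quantization_config=build_bnb_config(),
        device_map="auto",  # assumption: let accelerate place the layers
    )
```

Since `load_in_4bit` is False, the three `bnb_4bit_*` values are the library defaults recorded in the saved config and have no effect on 8-bit loading.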

Framework versions

  • PEFT 0.6.2.dev0