---
library_name: peft
---

## Config

```python
from datasets import DatasetDict, load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    AutoFeatureExtractor,
    AutoModelForSpeechSeq2Seq,
    AutoProcessor,
    AutoTokenizer,
)

model_name_or_path = "openai/whisper-large-v2"
language = "Marathi"
language_abbr = "mr"
task = "transcribe"
dataset_name = "mozilla-foundation/common_voice_11_0"

# Train on train+validation; hold out the test split for evaluation.
common_voice = DatasetDict()
common_voice["train"] = load_dataset(dataset_name, language_abbr, split="train+validation", use_auth_token=True)
common_voice["test"] = load_dataset(dataset_name, language_abbr, split="test", use_auth_token=True)

feature_extractor = AutoFeatureExtractor.from_pretrained(model_name_or_path)
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, language=language, task=task)
processor = AutoProcessor.from_pretrained(model_name_or_path, language=language, task=task)

# Load the base model in 8-bit and attach LoRA adapters to the attention query/value projections.
model = AutoModelForSpeechSeq2Seq.from_pretrained(model_name_or_path, load_in_8bit=True, device_map="auto")
config = LoraConfig(r=32, lora_alpha=64, target_modules=["q_proj", "v_proj"], lora_dropout=0.05, bias="none")
model = get_peft_model(model, config)
model.print_trainable_parameters()
# "trainable params: 15728640 || all params: 1559033600 || trainable%: 1.0088711365810203"
```

## Training procedure

The following `bitsandbytes` quantization config was used during training:

- load_in_8bit: True
- load_in_4bit: False
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: fp4
- bnb_4bit_use_double_quant: False
- bnb_4bit_compute_dtype: float32

### Framework versions

- PEFT 0.5.0

Word error rate (WER): 38.514602540132806
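
## Trainable parameter check

The trainable-parameter count printed in the Config section can be sanity-checked by hand from the `LoraConfig` values (`r=32`, targets `q_proj` and `v_proj`). The sketch below is only an illustration and rests on assumptions that this card does not state: that `openai/whisper-large-v2` has a hidden size of 1280 with 32 encoder and 32 decoder layers, that each decoder layer has both a self-attention and a cross-attention block, and that the target names match the projections in every one of those blocks. The constant and function names are hypothetical.

```python
# Assumed whisper-large-v2 architecture (not stated in this card).
D_MODEL = 1280
ENCODER_LAYERS = 32
DECODER_LAYERS = 32

LORA_RANK = 32             # r from the LoraConfig in the Config section
TARGETS_PER_ATTENTION = 2  # q_proj and v_proj


def lora_params_per_module(d_in: int, d_out: int, rank: int) -> int:
    """LoRA adds two matrices per wrapped linear layer: A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


def count_lora_params() -> int:
    # Assumption: one attention block per encoder layer (self-attention),
    # two per decoder layer (self-attention and cross-attention).
    attention_blocks = ENCODER_LAYERS * 1 + DECODER_LAYERS * 2
    wrapped_modules = attention_blocks * TARGETS_PER_ATTENTION
    return wrapped_modules * lora_params_per_module(D_MODEL, D_MODEL, LORA_RANK)


trainable = count_lora_params()
all_params = 1559033600  # "all params" as reported by print_trainable_parameters()

print(trainable)                     # 15728640
print(100 * trainable / all_params)  # trainable percentage
```

Under these assumptions the count works out to 192 wrapped projections at 81,920 parameters each, which agrees with the 15,728,640 trainable parameters reported above.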