Oct 24, 2023

Please, I'm getting an error when trying to fine-tune the model. This is the folder structure:
Mydataset/lj_speech/metadata.csv
Mydataset/lj_speech/wavs/audio1.wav

I am getting a KeyError when trying to load the dataset. Please provide sample code or a solution.
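For reference, a minimal sketch of reading a folder laid out like the one above using only the standard library. It assumes `metadata.csv` follows the original LJ Speech layout (pipe-delimited, no header row, columns `id|text|normalized_text`) and that each id matches a file under `wavs/`. The thread does not confirm either assumption, and the function name `load_ljspeech_metadata` is hypothetical.

```python
import csv
from pathlib import Path


def load_ljspeech_metadata(root):
    """Pair each metadata.csv row with its wav file.

    Assumption: rows are pipe-delimited with no header, as in the
    original LJ Speech release (id|text|normalized_text).
    """
    root = Path(root)
    records = []
    with open(root / "metadata.csv", newline="", encoding="utf-8") as f:
        # QUOTE_NONE keeps bare double quotes inside transcripts intact.
        reader = csv.reader(f, delimiter="|", quoting=csv.QUOTE_NONE)
        for row in reader:
            if not row:
                continue  # skip blank lines
            if len(row) < 2:
                raise ValueError(f"Expected at least 'id|text', got: {row}")
            clip_id, text = row[0], row[1]
            # Fall back to the raw text when the third column is absent.
            normalized = row[2] if len(row) > 2 else text
            wav_path = root / "wavs" / f"{clip_id}.wav"
            if not wav_path.is_file():
                raise FileNotFoundError(f"Missing audio for {clip_id}: {wav_path}")
            records.append(
                {
                    "id": clip_id,
                    "text": text,
                    "normalized_text": normalized,
                    "audio": str(wav_path),
                }
            )
    return records
```

Calling `load_ljspeech_metadata("Mydataset/lj_speech")` would return a list of dicts. If the `datasets` library is installed, that list can probably be wrapped with `Dataset.from_list(records)` and the `audio` column cast with `cast_column("audio", Audio())`, though that step is untested here.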

sanchit-gandhi

Oct 26, 2023

Hey @Mavisfsew - do you have a reproducible code snippet to load the audio data? You could, for instance, push the datasets to the Hugging Face Hub and write a short code snippet that loads this dataset. If you share that here, I'd be happy to take a look!

Mavisfsew

Oct 27, 2023

Please, here is the code. I used the LJ Speech dataset, which is on Hugging Face.

from transformers import VitsModel, AutoTokenizer
from transformers import TrainingArguments, Trainer
from datasets import load_dataset
import os
import pandas as pd

# Ensure the dataset is downloaded and prepared
dataset = load_dataset("lj_speech")

model_name = "facebook/mms-tts-eng"
model = VitsModel.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

def tokenize_function(examples):
    return tokenizer(examples["text"], padding="max_length", truncation=True)

training_args = TrainingArguments(
    output_dir="./output",
    per_device_train_batch_size=8,
    num_train_epochs=5,
    evaluation_strategy="steps",
    eval_steps=500,
    save_steps=500,
)

# Initialize the Trainer.
trainer = Trainer(
    model=model,
    args=training_args,
    data_collator=tokenize_function,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
)

trainer.train()
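One way to narrow down a KeyError like this is to check which splits and columns the loaded dataset actually exposes before indexing into it. The helper below is a sketch with a hypothetical name, `find_missing_keys`. It relies only on the dataset behaving like a dict of splits whose values have a `column_names` attribute, which is how a `DatasetDict` behaves as far as I know. The `lj_speech` dataset on the Hub appears to ship only a `train` split, so `dataset["validation"]` is a plausible culprit, but the thread does not include the traceback to confirm it.

```python
def find_missing_keys(dataset, required_splits, required_columns):
    """Return messages for every requested split or column that is absent.

    `dataset` is anything dict-like mapping split names to objects with a
    `column_names` attribute (for example a `datasets.DatasetDict`).
    """
    problems = []
    for split in required_splits:
        if split not in dataset:
            problems.append(
                f"split '{split}' not found; available: {sorted(dataset)}"
            )
            continue
        columns = list(dataset[split].column_names)
        for column in required_columns:
            if column not in columns:
                problems.append(
                    f"column '{column}' not in split '{split}'; available: {columns}"
                )
    return problems
```

Running `find_missing_keys(dataset, ["train", "validation"], ["text"])` right after `load_dataset` would list anything missing. If only `train` exists, a held-out set can be carved out with `dataset["train"].train_test_split(test_size=0.1)`.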
