
Model Card for MBart English to Urdu

This model is designed to translate text from English to Urdu using the mBART architecture, fine-tuned with the BitFit method.

Model Details

Model Description

This model uses the mBART (Multilingual BART) architecture to perform English-to-Urdu translation. It was fine-tuned on a custom English-Urdu dataset with BitFit, a parameter-efficient fine-tuning technique that updates only the model's bias terms while keeping all other weights frozen.

  • Developed by: Mudasir692
  • Model type: mBART (sequence-to-sequence)
  • Model size: 611M parameters (FP32, safetensors)
  • Language(s) (NLP): English, Urdu
  • License: MIT
  • Finetuned from model: mBART50
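The BitFit technique mentioned above amounts to freezing every parameter except the bias terms before training. The following is a minimal sketch in plain PyTorch on a toy model (not code from this repo); the same loop applies to an `MBartForConditionalGeneration` instance:

```python
import torch.nn as nn

def apply_bitfit(model: nn.Module) -> None:
    """Freeze all parameters except bias terms (BitFit)."""
    for name, param in model.named_parameters():
        param.requires_grad = name.endswith("bias")

# Toy stand-in for a large pretrained model
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
apply_bitfit(model)

# Only the bias vectors remain trainable
trainable = [n for n, p in model.named_parameters() if p.requires_grad]
```

Because only biases receive gradients, the optimizer state and the set of updated weights are a tiny fraction of the full model, which is what makes BitFit cheap compared to full fine-tuning.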

Uses

Direct Use

This model is intended for direct use in English-to-Urdu machine translation tasks.

Downstream Use

The model can be fine-tuned further for specific machine-translation domains or other NLP tasks involving multilingual data. Parameter-efficient methods such as adapters or prefix tuning are good candidates for this further specialization.
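As an illustration of the adapter approach mentioned above, here is a minimal bottleneck-adapter module in plain PyTorch (a generic sketch, not code shipped with this model; the hidden size of 1024 matches mBART-large):

```python
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Minimal adapter: down-project, nonlinearity, up-project, residual add."""
    def __init__(self, hidden_size: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)
        self.act = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual connection keeps the pretrained representation intact
        return x + self.up(self.act(self.down(x)))

adapter = BottleneckAdapter(hidden_size=1024)
out = adapter(torch.randn(2, 5, 1024))  # (batch, seq_len, hidden)
```

In practice such modules are inserted into each transformer layer and only the adapter weights are trained, so the base model stays untouched; libraries such as PEFT provide ready-made implementations of this and of prefix tuning.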

Out-of-Scope Use

The model is not suitable for domains where high accuracy is critical or where domain-specific training data is required. Misuse in sensitive domains like legal or medical translation should be avoided.

Bias, Risks, and Limitations

The model may inherit biases from the dataset, including socio-political biases and cultural influences in the training data. Testing on domain-specific data is recommended before using it in production.

Recommendations

Users should be made aware of the potential biases in the translations and consider domain-specific testing before deploying the model in real-world applications.

How to Get Started with the Model

To get started with the model, use the following code snippet to load the model and tokenizer, input English sentences, and generate translations to Urdu.

from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

# Load the tokenizer and model
tokenizer = MBart50TokenizerFast.from_pretrained("Mudasir692/mbart-eng-ur")
model = MBartForConditionalGeneration.from_pretrained("Mudasir692/mbart-eng-ur")

# mBART-50 is multilingual: tell the tokenizer the source language
tokenizer.src_lang = "en_XX"

# Example input text (English)
input_text = "This is an example sentence."

# Tokenize the input text
inputs = tokenizer(input_text, return_tensors="pt")

# Generate the translation, forcing Urdu as the target language
translated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.lang_code_to_id["ur_PK"],
)

# Decode the translation and print the result
output_text = tokenizer.decode(translated[0], skip_special_tokens=True)
print("Translated Text (Urdu):", output_text)