---
license: mit
datasets:
  - mwz/ur_para
language:
  - ur
tags:
  - paraphrase
pipeline_tag: text2text-generation
---

# Urdu Paraphrasing Model

This repository contains a pretrained model for Urdu paraphrasing. The model is based on the BERT architecture and has been fine-tuned on a large dataset of Urdu paraphrases.

## Model Description

The model follows the BERT architecture and is fine-tuned specifically for paraphrase generation in Urdu. It was trained on a large corpus of Urdu text to produce fluent, meaning-preserving paraphrases.

## Model Details

- **Model Name:** Urdu-Paraphrasing-BERT
- **Base Model:** BERT
- **Architecture:** Transformer
- **Language:** Urdu
- **Dataset:** Urdu Paraphrasing Dataset (`mwz/ur_para`)

## How to Use

You can use this pretrained model to generate paraphrases of Urdu text. Here's an example of how to use the model:

```python
from transformers import pipeline

# Load the model (replace the repository ID with a local path if needed)
model = pipeline("text2text-generation", model="mwz/UrduParaphraseBERT")

# Generate paraphrases (beam search is required to return multiple sequences)
input_text = "Urdu input text for paraphrasing."
paraphrases = model(input_text, max_length=128, num_beams=3, num_return_sequences=3)

# Print the generated paraphrases
print("Original Input Text:", input_text)
print("Generated Paraphrases:")
for paraphrase in paraphrases:
    print(paraphrase["generated_text"])
```

## Training

The model was trained with the Hugging Face `transformers` library by fine-tuning the base BERT model on the Urdu Paraphrasing Dataset (`mwz/ur_para`).
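As an illustration only, shaping paraphrase pairs into seq2seq training examples might look like the sketch below. The field names and the `"paraphrase: "` task prefix are assumptions, not the documented schema of `mwz/ur_para`; check the dataset before reusing this.

```python
# A minimal sketch of turning (sentence, paraphrase) pairs into
# input/target examples for seq2seq fine-tuning. The task prefix is a
# common text2text convention; whether this model used one is an
# assumption.

def build_examples(pairs, prefix="paraphrase: "):
    """Convert (sentence, paraphrase) pairs into input/target dicts."""
    examples = []
    for sentence, paraphrase in pairs:
        examples.append({
            "input_text": prefix + sentence,
            "target_text": paraphrase,
        })
    return examples

pairs = [("pehla jumla", "pehle jumle ka mutabadil")]
examples = build_examples(pairs)
print(examples[0]["input_text"])  # → "paraphrase: pehla jumla"
```

The resulting dicts can then be tokenized and passed to a standard `transformers` training loop.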

## Evaluation

The model's performance was evaluated on a held-out validation set using metrics such as BLEU, ROUGE, and perplexity. Note that results will vary with the domain and style of the input text.
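To make the BLEU metric mentioned above concrete, the sketch below computes modified n-gram precision, the core quantity behind BLEU. This is a simplified illustration, not the evaluation script used for this model: full BLEU combines several n-gram orders and applies a brevity penalty.

```python
from collections import Counter

def ngram_precision(candidate, reference, n=2):
    """Modified n-gram precision between a candidate and one reference.

    Tokens are split on whitespace; each candidate n-gram is credited
    at most as many times as it appears in the reference.
    """
    cand = candidate.split()
    ref = reference.split()
    cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    if not cand_ngrams:
        return 0.0
    overlap = sum(min(count, ref_ngrams[g]) for g, count in cand_ngrams.items())
    return overlap / sum(cand_ngrams.values())

# 2 of the candidate's 3 bigrams also occur in the reference
score = ngram_precision("yeh ek misal hai", "yeh ek acchi misal hai", n=2)
print(round(score, 2))  # → 0.67
```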

## Acknowledgments

- The pretrained model is based on the BERT architecture developed by Google Research.

## License

This model and the associated code are licensed under the MIT License.