Model Card for aktheroy/FT_Translate_en_el_hi

This model is a fine-tuned version of facebook/m2m100_418M for multilingual translation between English (en), Greek (el), and Hindi (hi). It builds on the M2M100 architecture, which supports direct many-to-many translation between all covered language pairs rather than pivoting through English.

Model Details

Model Description

  • Developed by: Aktheroy
  • Model type: Transformer-based encoder-decoder
  • Language(s) (NLP): English, Hindi, Greek
  • License: MIT
  • Finetuned from model: facebook/m2m100_418M

Model Sources

  • Repository: https://huggingface.co/aktheroy/FT_Translate_en_el_hi

Uses

Direct Use

The model can be used for translation tasks between the supported languages (English, Hindi, Greek). Use cases include:

  • Cross-lingual communication
  • Multilingual content generation
  • Language learning assistance

Downstream Use

The model can be fine-tuned further for domain-specific translation tasks, such as medical or legal translations.
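
As a rough sketch of such further fine-tuning (the dataset file, column names, and hyperparameters here are illustrative assumptions, not the author's setup), the standard Seq2SeqTrainer loop applies:

from datasets import load_dataset
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

model_name = "aktheroy/FT_Translate_en_el_hi"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Hypothetical domain corpus: one JSON object per line with "en" and "hi" fields
dataset = load_dataset("json", data_files="medical_en_hi.jsonl")

def preprocess(batch):
    tokenizer.src_lang = "en"
    tokenizer.tgt_lang = "hi"
    return tokenizer(batch["en"], text_target=batch["hi"],
                     truncation=True, max_length=128)

tokenized = dataset["train"].map(preprocess, batched=True,
                                 remove_columns=["en", "hi"])

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(output_dir="ft_domain",  # placeholder path
                                  num_train_epochs=3,
                                  per_device_train_batch_size=16,
                                  learning_rate=5e-5),
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()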

Out-of-Scope Use

The model is not suitable for:

  • Translating unsupported languages
  • Generating content for sensitive or harmful purposes

Bias, Risks, and Limitations

While the model supports multilingual translations, it might exhibit:

  • Biases from the pretraining and fine-tuning datasets.
  • Reduced performance for idiomatic expressions or cultural nuances.

Recommendations

Users should:

  • Verify translations, especially for critical applications.
  • Use supplementary tools to validate outputs in sensitive scenarios.

How to Get Started with the Model

Here is an example of how to use the model for translation tasks:

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "aktheroy/FT_Translate_en_el_hi"
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Example input: English source text
input_text = "Hello, how are you?"
tokenizer.src_lang = "en"

# Tokenize, then force the decoder to start with the Hindi language token.
# M2M100 selects the target language via forced_bos_token_id at generation
# time; setting tokenizer.tgt_lang alone has no effect on generate().
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, forced_bos_token_id=tokenizer.get_lang_id("hi"))
translation = tokenizer.batch_decode(outputs, skip_special_tokens=True)
print(translation)
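
The target language is selected entirely by the forced decoder-start token, so switching to Greek output changes only that one argument:

# Greek target instead of Hindi
outputs = model.generate(**inputs, forced_bos_token_id=tokenizer.get_lang_id("el"))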

Training Details

Training Data

The model was fine-tuned on a custom dataset containing parallel translations between English, Hindi, and Greek.

Training Procedure

Preprocessing

The dataset was preprocessed to:

  • Normalize text.
  • Tokenize using the M2M100 tokenizer (see the sketch below).
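
A minimal sketch of these two steps, assuming Unicode NFC plus whitespace cleanup as the normalization (the card does not specify the exact rules):

import unicodedata
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/m2m100_418M")

def normalize(text):
    # Assumed normalization: Unicode NFC plus whitespace collapsing
    return " ".join(unicodedata.normalize("NFC", text).split())

def encode_pair(src, tgt, src_lang, tgt_lang):
    tokenizer.src_lang = src_lang
    tokenizer.tgt_lang = tgt_lang
    return tokenizer(normalize(src), text_target=normalize(tgt),
                     truncation=True, max_length=128)

features = encode_pair("Hello, how are you?", "नमस्ते, आप कैसे हैं?", "en", "hi")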

Training Hyperparameters

  • Epochs: 10
  • Batch size: 16
  • Learning rate: 5e-5
  • Mixed Precision: Disabled (FP32 used; the sketch below maps these settings to code)
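
Expressed as Hugging Face Seq2SeqTrainingArguments (the output directory is a placeholder; the remaining values mirror the list above):

from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="ft_translate_en_el_hi",  # placeholder path
    num_train_epochs=10,
    per_device_train_batch_size=16,
    learning_rate=5e-5,
    fp16=False,  # mixed precision disabled; training ran in FP32
)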

Speeds, Sizes, Times

  • Training runtime: 20.3 hours
  • Training samples per second: 17.508
  • Training steps per second: 0.137
  • Final training loss: 0.873

Evaluation

Testing Data, Factors & Metrics

Testing Data

The model was evaluated on a held-out test set from the same domains as the training data.

Metrics

  • BLEU score (to be computed during final evaluation; a computation sketch follows).
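
One way the BLEU score could be computed, shown with the sacrebleu package and hypothetical outputs and references:

import sacrebleu

# Hypothetical system outputs and reference translations
hypotheses = ["नमस्ते, आप कैसे हैं?"]
references = [["नमस्ते, आप कैसे हैं?"]]  # one reference stream

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU = {bleu.score:.2f}")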

Results

  • Training Loss: 0.873
  • Detailed BLEU score results will be provided in subsequent updates.

Environmental Impact

  • Hardware Type: Apple MacBook Pro (M3 Pro chip)
  • Hours used: 20.3
  • Cloud Provider: None (trained on local hardware)
  • Carbon Emitted: Not formally estimated; expected to be low for a single ~20-hour run on laptop-class hardware

Technical Specifications

Model Architecture and Objective

The model is based on the M2M100 architecture, a transformer-based encoder-decoder designed for multilingual translation without relying on English as an intermediary language. The fine-tuned checkpoint contains roughly 484M parameters, stored as FP32 safetensors.
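
Because translation is direct rather than pivoted through English, a Greek-to-Hindi request is a single pass through the model; a small illustration:

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "aktheroy/FT_Translate_en_el_hi"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Greek source, Hindi target: no intermediate English step
tokenizer.src_lang = "el"
inputs = tokenizer("Γεια σου, τι κάνεις;", return_tensors="pt")
outputs = model.generate(**inputs, forced_bos_token_id=tokenizer.get_lang_id("hi"))
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))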

Compute Infrastructure

Hardware

  • Device: Apple MacBook Pro (M3 Pro chip)

Software

  • Transformers library from Hugging Face
  • Python 3.12

Citation

If you use this model, please cite it as:

APA: Aktheroy (2025). Fine-Tuned M2M100 Translation Model. Hugging Face. Retrieved from https://huggingface.co/aktheroy/FT_Translate_en_el_hi

Model Card Authors

  • Aktheroy

Model Card Contact

For questions or feedback, contact the author via Hugging Face.
