NanduVardhanreddy/Assamese_text_generation

NanduVardhanreddy committed 30 days ago

Commit 67a5ed3 · verified · 1 Parent(s): 709bd46

Create README.md

Files changed (1)

README.md +86 -0

README.md ADDED

	@@ -0,0 +1,86 @@

+# Assamese Instruction-Following Model using mT5-small
+This project fine-tunes the mT5-small model for instruction-following tasks in Assamese. The model is designed to take questions written in Assamese and generate relevant responses in the same language.
+## Model Description
+- Base Model: google/mt5-small (Multilingual T5)
+- Fine-tuned on: Assamese instruction-following dataset
+- Task: Question answering and instruction following in Assamese
+- Training Device: Google Colab T4 GPU
+## Dataset
+- Total Examples: 28,910
+- Training Set: 23,128 examples
+- Validation Set: 5,782 examples
+- Format: Instruction-Input-Output pairs in Assamese
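
The README does not say how the three fields of each record are combined into model input. The following is a minimal sketch of one common approach, assuming the instruction and the optional input are joined with a newline to form the source text and the output is used as the target. The function name `build_example` and the `source`/`target` keys are hypothetical, not taken from the original project.

```python
def build_example(instruction: str, input_text: str, output: str) -> dict:
    """Turn one Instruction-Input-Output record into a source/target pair."""
    # Assumption: the instruction comes first, followed by the optional input
    # on a new line. The original README does not specify the exact template.
    source = instruction.strip()
    if input_text.strip():
        source = f"{source}\n{input_text.strip()}"
    return {"source": source, "target": output.strip()}


# Records with an empty input field fall back to the instruction alone.
record = build_example("জীৱনত কেনেকৈ সফল হ'ব?", "", "<reference answer>")
print(record["source"])
```

Whatever template is used at training time should be reused unchanged at inference time, since the model only sees the joined source string.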
+## Training Configuration
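+```python
+from transformers import Seq2SeqTrainingArguments
+
+# Hyperparameters as reported; arguments not listed here keep their defaults.
+training_args = Seq2SeqTrainingArguments(
+    output_dir="mt5-assamese-instructions",  # placeholder; the original does not state an output path
+    num_train_epochs=2,
+    per_device_train_batch_size=4,
+    per_device_eval_batch_size=4,
+    warmup_steps=200,
+    weight_decay=0.01,
+    gradient_accumulation_steps=2,
+)
+```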
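
To put these settings in context, the short calculation below works out the effective batch size and the number of optimizer steps they imply for the 23,128 training examples. It is a sketch that assumes a single GPU, which matches the Colab T4 setup named above but is not stated explicitly as a device count.

```python
import math

train_examples = 23_128      # training set size from the Dataset section
per_device_batch = 4         # per_device_train_batch_size
grad_accum_steps = 2         # gradient_accumulation_steps
epochs = 2                   # num_train_epochs
warmup_steps = 200           # warmup_steps
num_devices = 1              # assumption: one Colab T4 GPU

# Examples consumed per optimizer update.
effective_batch = per_device_batch * grad_accum_steps * num_devices
# Optimizer updates per pass over the training set.
steps_per_epoch = math.ceil(train_examples / effective_batch)
total_steps = steps_per_epoch * epochs
warmup_fraction = warmup_steps / total_steps

print(effective_batch)            # 8
print(steps_per_epoch)            # 2891
print(total_steps)                # 5782
print(f"{warmup_fraction:.1%}")   # 3.5%
```

With these numbers the learning-rate warmup covers roughly the first 3.5% of training.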
+## Model Capabilities
+The model can:
+- Process Assamese script input
+- Recognize different question types
+- Maintain basic Assamese grammar
+- Generate responses in Assamese
+## Usage
+```python
+from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
+
+# Load model and tokenizer (replace the placeholder with the actual repository ID)
+tokenizer = AutoTokenizer.from_pretrained("your-username/mt5-assamese-instructions")
+model = AutoModelForSeq2SeqLM.from_pretrained("your-username/mt5-assamese-instructions")
+
+# Example input
+text = "জীৱনত কেনেকৈ সফল হ'ব?"  # How to succeed in life?
+
+# Tokenize the input and generate a response
+inputs = tokenizer(text, return_tensors="pt", padding=True)
+outputs = model.generate(**inputs)
+
+# Decode the generated token IDs back to text
+response = tokenizer.decode(outputs[0], skip_special_tokens=True)
+print(response)
+```
+## Limitations
+Current limitations include:
+- Tendency for repetitive responses
+- Limited coherence in longer answers
+- Basic response structure
+- Memory constraints due to T4 GPU
+## Future Improvements
+Planned improvements include:
+- Better response generation parameters
+- Enhanced data preprocessing
+- Structural markers in training data
+- Optimization for longer responses
+- Improved coherence in outputs
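
As an illustration of what adjusted generation parameters might look like, the sketch below passes repetition-limiting options to `generate`. The option names are standard `transformers` generation arguments, but the specific values, the `GENERATION_KWARGS` name, and the `generate_response` helper are assumptions made for this example. They are not settings reported by the author and have not been tuned on this model.

```python
# Illustrative decoding settings; the values are assumptions, not tuned results.
GENERATION_KWARGS = {
    "max_new_tokens": 128,        # allow longer answers than the library default
    "num_beams": 4,               # beam search instead of greedy decoding
    "no_repeat_ngram_size": 3,    # forbid repeating any 3-gram within one output
    "repetition_penalty": 1.2,    # down-weight tokens that were already generated
    "early_stopping": True,       # stop once all beams have finished
}


def generate_response(model, tokenizer, text, **overrides):
    """Generate a reply with the settings above; keyword overrides take precedence."""
    kwargs = {**GENERATION_KWARGS, **overrides}
    inputs = tokenizer(text, return_tensors="pt")
    outputs = model.generate(**inputs, **kwargs)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

With a loaded `model` and `tokenizer`, a call such as `generate_response(model, tokenizer, text, num_beams=2)` would reuse the defaults while overriding a single option, which makes it easy to compare settings side by side.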
+## Citation
+```bibtex
+@misc{mt5-assamese-instructions,
+  author = {NanduvardhanReddy},
+  title = {mT5-small Fine-tuned for Assamese Instructions},
+  year = {2024},
+  publisher = {Hugging Face},
+  journal = {Hugging Face Model Hub}
+}
+```
+## Acknowledgments
+- Google's mT5 team for the base model
+- Hugging Face for the transformers library
+- Google Colab for computation resources
+## License
+This project is licensed under the Apache License 2.0.