🇮🇳 Fine-Tuned DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit for Indian Knowledge 🇮🇳

🌟 Model Summary

This model is a fine-tuned version of unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit on India-specific knowledge 🌍. It is trained to provide factual and informative responses related to Indian history, culture, geography, government, and more. 🏛️🗺️📜

This fine-tuned model aims to provide accurate, unbiased, and pro-India responses, ensuring better representation of Indian topics in large language models. 🚀

🏛️ Model Details

📌 Model Description

  • Developed by: Susant-Achary 🇮🇳
  • Funded by: Open experimentation and research
  • Shared by: Susant-Achary
  • Model type: Causal Language Model (LLM) 🧠
  • Language(s) (NLP): English (focused on the Indian context)
  • License: MIT (inherited from the original DeepSeek release)
  • Fine-tuned from: unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit

🎯 Uses

✅ Direct Use

  • General Knowledge: Answering India-related questions 📚
  • Education: Useful for students & researchers 📖
  • Conversational AI: India-focused chatbots 🤖

✅ Downstream Use

  • News Summarization 📰
  • Legal & Policy AI Assistants ⚖️
  • Indian History & Culture Teaching Aids 🏺

🚫 Out-of-Scope Use

  • Generating Misinformation
  • Hate Speech & Political Propaganda
  • Real-time Legal or Financial Advice

⚠️ Bias, Risks, and Limitations

  • Biases in the training data: While factual accuracy was a priority, some biases may remain from pre-existing datasets.
  • Limited Multilingual Support: Currently optimized for English, though support for Hindi & regional languages is planned.

🔹 Recommendations

  • Users should cross-check critical information from verified sources.
  • Not suitable for real-time decision-making in critical areas like healthcare or finance.

🚀 How to Use the Model

📌 Load Model in Python

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_path = "Susant-Achary/Deepseek-R1-India-Finetuned-Distill-Llama-8B-unsloth-bnb-4bit"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,  # half precision for faster inference
    device_map="auto",          # place layers on the available GPU(s)
)
model.eval()

def generate_response(prompt, max_new_tokens=150):
    # Move inputs to the same device the model was placed on.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            pad_token_id=tokenizer.eos_token_id,
        )
    return tokenizer.decode(output[0], skip_special_tokens=True)

# Test the model
print(generate_response("To which country does Arunachal Pradesh belong?"))
```

Example response:

Arunachal Pradesh is a state in **India**.
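
If GPU memory is tight, the same checkpoint can also be loaded through bitsandbytes 4-bit quantization. The following is a minimal sketch; the BitsAndBytesConfig values are assumptions, not settings published on this card:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

model_path = "Susant-Achary/Deepseek-R1-India-Finetuned-Distill-Llama-8B-unsloth-bnb-4bit"

# Assumed 4-bit settings; tune these for your hardware.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    quantization_config=bnb_config,
    device_map="auto",
)
```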

📊 Training Details

📜 Training Data

  • Source: Curated datasets covering Indian history, geography, government policies, and cultural insights. 🇮🇳
  • Data augmentation: Supplemented with government documents, Wikipedia articles, and curated question-answer pairs (see the sketch below).
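
The exact prompt template used for the question-answer pairs is not published on this card. The following is a hypothetical sketch of how such pairs could be formatted into training text; the format_qa_pair helper and the "### Question/Response" layout are illustrative assumptions:

```python
# Hypothetical formatting helper; the actual template may differ.
def format_qa_pair(question: str, answer: str) -> str:
    return f"### Question:\n{question}\n\n### Response:\n{answer}"

print(format_qa_pair(
    "To which country does Arunachal Pradesh belong?",
    "Arunachal Pradesh is a state in India.",
))
```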

🔧 Training Procedure

  • Preprocessing: Standard tokenization with transformers.
  • Fine-tuning method: LoRA (Low-Rank Adaptation) applied to the DeepSeek-R1 distilled base for parameter-efficient training (see the sketch below).
  • Mixed precision (FP16) for better performance.
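
The LoRA hyperparameters are not listed on this card, so the peft sketch below uses assumed values; r, lora_alpha, lora_dropout, and target_modules are illustrative, not the actual training configuration:

```python
from peft import LoraConfig, get_peft_model

# Assumes `model` is the loaded base model from the usage section above.
lora_config = LoraConfig(
    r=16,                    # assumed rank
    lora_alpha=32,           # assumed scaling
    lora_dropout=0.05,       # assumed dropout
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA adapters are trainable
```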

⚡ Hyperparameters

  • Batch Size: 4 per GPU
  • Gradient Accumulation Steps: 2
  • Learning Rate: 2e-5
  • Epochs: 3
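
For reference, these settings map onto transformers TrainingArguments roughly as follows. This is a sketch rather than the actual training script; output_dir and any field not listed above are assumptions:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./deepseek-r1-india-lora",  # assumed path
    per_device_train_batch_size=4,
    gradient_accumulation_steps=2,
    learning_rate=2e-5,
    num_train_epochs=3,
    fp16=True,  # mixed precision, as noted above
)
```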

📈 Evaluation

🎯 Metrics

  • Perplexity: Evaluated on a held-out dataset (see the sketch after this list) 🎯
  • BLEU score: Measured on QA responses 🎯
  • Human evaluation: Review by subject-matter experts 🎯
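
As a reference point, perplexity on a held-out text can be computed with the loaded model and tokenizer as follows; this is a minimal sketch, not the evaluation harness used for this card:

```python
import torch

def perplexity(model, tokenizer, text: str) -> float:
    # Perplexity = exp(mean negative log-likelihood of the tokens).
    enc = tokenizer(text, return_tensors="pt").to(model.device)
    with torch.no_grad():
        loss = model(**enc, labels=enc["input_ids"]).loss
    return torch.exp(loss).item()

print(perplexity(model, tokenizer, "New Delhi is the capital of India."))
```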

🌍 Environmental Impact

  • Hardware Used: 2× NVIDIA Tesla T4 GPUs
  • Total Training Time: Approximately 15 minutes

🏛️ Citation

If you use this model in your research, please cite it as:

```bibtex
@misc{deepseek_india_finetuned,
  title  = {Fine-Tuned DeepSeek Llama for Indian Knowledge},
  author = {Susant-Achary},
  year   = {2025}
}
```

🇮🇳 Jai Hind! 🚀
