---
license: apache-2.0
language:
- fa
- en
library_name: transformers
pipeline_tag: text-generation
datasets:
- myrkur/persian-alpaca-deep-clean
---

# Shotor (Llama 3 8B Instruction Tuned on Farsi)

Shotor is a Persian language model built on Llama 3 8B, a multilingual large language model (LLM). It was fine-tuned with supervised learning, using DoRA (Weight-Decomposed Low-Rank Adaptation) for parameter-efficient training. The training data is Persian, specifically the [persian-alpaca-deep-clean](https://huggingface.co/datasets/myrkur/persian-alpaca-deep-clean) dataset.

## Usage

The following Python snippet shows how to use Shotor for text generation:

```python
import transformers
import torch

# Load the Shotor model into a text-generation pipeline
model_id = "myrkur/shotor"
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

# Define the user messages (the Persian prompt asks: "Is knowledge better, or wealth?")
messages = [
    {"role": "user", "content": "علم بهتر است یا ثروت؟"},
]

# Render the messages with the tokenizer's chat template
prompt = pipeline.tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

# Stop generation at either the EOS token or the end-of-turn token
terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]

outputs = pipeline(
    prompt,
    max_new_tokens=512,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.5,
    top_p=0.9,
    repetition_penalty=1.1,
)

# The pipeline returns prompt + completion, so slice off the prompt
print(outputs[0]["generated_text"][len(prompt):])
```

## Contributions

Contributions to Shotor are welcome! Whether you enhance the model's capabilities, improve its performance on specific tasks, or evaluate it, your contributions can help advance Persian natural language processing.

## Contact

For questions or further information, please contact:

- Amir Masoud Ahmadi: [amirmasoud.ahkol@gmail.com](mailto:amirmasoud.ahkol@gmail.com)
- Sahar Mirzapour: [saharmirzapoursahar@gmail.com](mailto:saharmirzapoursahar@gmail.com)
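
## Prompt Format

The Usage snippet relies on `apply_chat_template` to turn the message list into a prompt string and on `<|eot_id|>` to end a turn. As a rough illustration of what that string may look like, here is a minimal sketch that assumes Shotor keeps the stock Llama 3 Instruct chat layout. That is an assumption, not something confirmed by this model card, and `build_llama3_prompt` is a hypothetical helper rather than part of `transformers`. The template shipped with the tokenizer (`pipeline.tokenizer.chat_template`) remains the authoritative source.

```python
def build_llama3_prompt(messages, add_generation_prompt=True):
    """Render chat messages in the (assumed) Llama 3 Instruct layout.

    Hypothetical helper for illustration only; in practice prefer
    tokenizer.apply_chat_template, which reads the model's real template.
    """
    parts = ["<|begin_of_text|>"]
    for message in messages:
        # Each turn is a role header, a blank line, the content, then <|eot_id|>
        parts.append(
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content'].strip()}<|eot_id|>"
        )
    if add_generation_prompt:
        # An open assistant header cues the model to write its reply
        parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


messages = [{"role": "user", "content": "علم بهتر است یا ثروت؟"}]
print(build_llama3_prompt(messages))
```

If the layout assumed here matches the model's template, the printed string should be identical to the `prompt` variable in the Usage snippet. Comparing the two is a quick way to check whether the assumption holds for Shotor.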