|
--- |
|
license: afl-3.0 |
|
language: |
|
- kk |
|
base_model: nur-dev/llama-1.9B-kaz |
|
library_name: transformers |
|
--- |
|
|
|
# LLaMA 1.9B Kazakh Instruct Model |
|
|
|
This repository contains the LLaMA 1.9B model fine-tuned on a Kazakh language dataset for instruction-based tasks. The model is trained to provide helpful, relevant, and context-aware responses to various prompts in Kazakh. It is particularly effective in answering questions, providing explanations, and assisting in educational and professional contexts. |
|
The model ships with an integrated chat template that structures conversations into the prompt format the model expects. The tokenizer exposes this through `apply_chat_template`, which formats a list of messages before they are passed to the model.
|
|
|
The template follows this structure: |
|
|
|
```jinja
{%- if messages[0]['role'] == 'system' %}
{%- set offset = 1 %}
{%- else %}
{%- set offset = 0 %}
{%- endif %}
<|begin_of_text|>
{%- for message in messages %}
{{- '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n' + message['content'] | trim + '<|eot_id|>' }}
{%- endfor %}
{{- '<|start_header_id|>' + 'көмекші' + '<|end_header_id|>\n\n' }}
```
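
Rendering this template over a conversation produces a single prompt string that ends with the assistant (`көмекші`) header, so generation continues as the assistant's reply. A minimal sketch (the message text here is illustrative, not from the training data):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("nur-dev/llama-1.9B-kaz-instruct")

# A short conversation using the same role names as the examples below
messages = [
    {"role": "system", "content": "Сіз сенімді AI көмекшісісіз."},
    {"role": "пайдаланушы", "content": "Қазақстанның астанасы қай қала?"},
]

# Render without tokenizing to inspect the prompt string: each message is
# wrapped in <|start_header_id|>role<|end_header_id|> ... <|eot_id|> markers
prompt = tokenizer.apply_chat_template(messages, tokenize=False)
print(prompt)
```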
|
## Model Details |
|
|
|
- **Model Name**: LLaMA 1.9B Kazakh Instruct |
|
- **Model ID**: `nur-dev/llama-1.9B-kaz-instruct` |
|
- **Parameters**: 1.94 billion |
|
- **Architecture**: Causal Language Model (LLaMA) |
|
- **Tokenizer**: LLaMA tokenizer |
|
- **Language**: Kazakh |
|
|
|
## Training Data |
|
|
|
The model was fine-tuned on a dataset of 22,000 samples designed for instruction-based tasks. The dataset includes a diverse set of prompts and responses, from everyday queries to specialized questions, to help the model handle a wide range of topics.
|
|
|
## How to Use |
|
|
|
### Using the Model Directly for Inference |
|
This example uses the `LlamaForCausalLM` and `AutoTokenizer` classes to load the model, format a conversation with the chat template, and generate a response using sampling parameters such as `top_k`, `top_p`, and `temperature`.
|
|
|
```python
from transformers import LlamaForCausalLM, AutoTokenizer
import torch

# Load the model and tokenizer
model_directory = "nur-dev/llama-1.9B-kaz-instruct"
model = LlamaForCausalLM.from_pretrained(model_directory)
tokenizer = AutoTokenizer.from_pretrained(model_directory)

# Set the model to evaluation mode and move it to the appropriate device
model.eval()
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# Example conversation history in Kazakh
conversation_history = [
    {"role": "system", "content": "Сіз сұрақтарға жауап беріп, ақпарат ұсынатын сенімді AI көмекшісісіз."},
    {"role": "пайдаланушы", "content": "Жасанды интеллект денсаулық сақтау саласына қандай өзгерістер енгізе алады?"}
]

# Format the conversation using the built-in chat template
formatted_conversation = tokenizer.apply_chat_template(conversation_history, tokenize=False)

# Tokenize the input; the template already inserts <|begin_of_text|>,
# so skip the tokenizer's own special tokens to avoid a duplicate BOS
input_ids = tokenizer.encode(
    formatted_conversation, return_tensors="pt", add_special_tokens=False
).to(device)

# Generate a response from the model
with torch.no_grad():
    output = model.generate(
        input_ids,
        max_length=1000,
        num_return_sequences=1,
        pad_token_id=tokenizer.eos_token_id,
        no_repeat_ngram_size=2,
        do_sample=True,
        top_k=10,
        top_p=0.5,
        eos_token_id=tokenizer.eos_token_id,
        temperature=1.3,
    )

# Decode and print the model's response
response = tokenizer.decode(output[0], skip_special_tokens=False)
print(response)
```
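
For interactive use, you can stream tokens to stdout as they are generated with the `TextStreamer` utility from `transformers`. This is an optional variation on the example above, reusing the `model`, `tokenizer`, `input_ids`, and `device` already defined:

```python
from transformers import TextStreamer

# Print decoded tokens as they are produced, omitting the echoed prompt
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

with torch.no_grad():
    model.generate(
        input_ids,
        max_new_tokens=512,
        do_sample=True,
        top_k=10,
        top_p=0.5,
        pad_token_id=tokenizer.eos_token_id,
        eos_token_id=tokenizer.eos_token_id,
        streamer=streamer,
    )
```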
|
### Using the Pipeline for Text Generation |
|
The `pipeline` API abstracts most of the setup, letting you generate responses with less boilerplate. Here the assistant answers a user question about AI in healthcare.
|
|
|
```python
from transformers import pipeline

# Initialize the text generation pipeline
pipe = pipeline("text-generation", model="nur-dev/llama-1.9B-kaz-instruct")

# Define the conversation messages
messages = [
    {"role": "system", "content": "Сіз сұрақтарға жауап беріп, ақпарат ұсынатын сенімді AI көмекшісісіз."},
    {"role": "пайдаланушы", "content": "Жасанды интеллект денсаулық сақтау саласына қандай өзгерістер енгізе алады?"}
]

# Generate and print the response
response = pipe(messages, max_new_tokens=128)[0]['generated_text']
print(response)
```
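
Note that with chat-style (list-of-messages) input, recent `transformers` versions return `generated_text` as the full conversation, including the new assistant message, while older versions may return a plain string. A defensive way to extract just the reply:

```python
result = pipe(messages, max_new_tokens=128)[0]['generated_text']

if isinstance(result, list):
    # Chat input: the last message is the newly generated assistant reply
    print(result[-1]['content'])
else:
    # Plain-string output from older pipeline versions
    print(result)
```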
|
|
|
|
|
## Citation

```bibtex
@misc{nurgali_kadyrbek_2024,
  author    = {{NURGALI Kadyrbek}},
  title     = {llama-1.9B-kaz-instruct (Revision 4059a4e)},
  year      = 2024,
  url       = {https://huggingface.co/nur-dev/llama-1.9B-kaz-instruct},
  doi       = {10.57967/hf/3114},
  publisher = {Hugging Face}
}
```