---
license: cc-by-nc-4.0
language:
- tr
---

# Model Card for Metin/gemma-2b-tr

A version of gemma-2b fine-tuned for Turkish text generation.

## Model Details

### Model Description

- **Language(s) (NLP):** Turkish, English
- **License:** Creative Commons Attribution-NonCommercial 4.0 (CC BY-NC 4.0), chosen because the training data includes restricted/gated datasets.
- **Finetuned from model:** [gemma-2b](https://huggingface.co/google/gemma-2b)

## Uses

The model is specifically designed for Turkish text generation. It is not suitable for instruction-following or question-answering tasks.

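## How to Get Started with the Model

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the fine-tuned tokenizer and model from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("Metin/gemma-2b-tr")
model = AutoModelForCausalLM.from_pretrained("Metin/gemma-2b-tr")

system_prompt = "You are a helpful assistant. Always reply in Turkish."
# Turkish for: "I couldn't go to the cinema today because"
instruction = "Bugün sinemaya gidemedim çünkü"
prompt = f"{system_prompt} [INST] {instruction} [/INST]"

# The tokenizer returns a dict-like object (input_ids and attention_mask).
inputs = tokenizer(prompt, return_tensors="pt")

# max_new_tokens=64 is an illustrative value; without it, generate() can fall
# back to a short default length and cut the continuation off early.
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```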
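
Decoder-only models return the prompt together with the continuation, so it can help to separate the two. The following is a minimal sketch of the prompt template used above plus a prompt-stripping step. The helper names `build_prompt` and `strip_prompt` are hypothetical and not part of the model or of `transformers`, and the decoded string is a made-up stand-in for real model output.

```python
SYSTEM_PROMPT = "You are a helpful assistant. Always reply in Turkish."


def build_prompt(instruction: str, system_prompt: str = SYSTEM_PROMPT) -> str:
    """Wrap the text in the same template as the getting-started snippet."""
    return f"{system_prompt} [INST] {instruction} [/INST]"


def strip_prompt(decoded: str, prompt: str) -> str:
    """Return only the continuation when the decoded output echoes the prompt."""
    decoded = decoded.strip()
    if decoded.startswith(prompt):
        return decoded[len(prompt):].strip()
    return decoded


prompt = build_prompt("Bugün sinemaya gidemedim çünkü")

# Stand-in for tokenizer.decode(...); the continuation is a placeholder,
# not actual output from the model.
fake_decoded = prompt + " hastaydım."
print(strip_prompt(fake_decoded, prompt))
```

In real use, `fake_decoded` would be replaced by the string returned from `tokenizer.decode(outputs[0], skip_special_tokens=True)`.
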

## Training Details

### Training Data

- Dataset size: approximately 190 million tokens (about 100K documents)
- Dataset content: Web crawl data
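
To put these figures in perspective, the short calculation below derives the average document length and a rough count of training sequences at the 1024-token context length listed under Training Hyperparameters. It is a back-of-the-envelope sketch: the inputs are the rounded numbers from this card, and the sequence count assumes documents were packed into full-length sequences, which the card does not state.

```python
import math

total_tokens = 190_000_000   # "~190 million tokens" from this card (rounded)
total_documents = 100_000    # "100K documents" from this card (rounded)
context_length = 1024        # from Training Hyperparameters

avg_tokens_per_document = total_tokens / total_documents

# Assumption: documents are concatenated and cut into full-length sequences
# (packing). The card does not say whether packing was used.
packed_sequences = math.ceil(total_tokens / context_length)

print(f"Average tokens per document: {avg_tokens_per_document:.0f}")
print(f"Sequences per epoch if packed: {packed_sequences:,}")
```

With an average of roughly 1,900 tokens per document, many documents would exceed a single 1024-token window, so some form of chunking or truncation was presumably involved.
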

### Training Procedure

#### Training Hyperparameters

- **Adapter:** QLoRA
- **Epochs:** 1
- **Context length:** 1024
- **LoRA Rank:** 32
- **LoRA Alpha:** 32
- **LoRA Dropout:** 0.05
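
As a rough illustration of what the rank and alpha values imply, the sketch below computes the standard LoRA scaling factor (alpha divided by rank) and the number of trainable parameters an adapter adds to one linear layer. The `LoraSettings` class is a hypothetical container written for this example, not a library API, and the 2048 x 2048 layer shape is an assumption chosen for illustration, since the card does not list the target modules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraSettings:
    rank: int = 32        # LoRA Rank from this card
    alpha: int = 32       # LoRA Alpha from this card
    dropout: float = 0.05 # LoRA Dropout from this card

    @property
    def scaling(self) -> float:
        # Standard LoRA scales the low-rank update by alpha / rank.
        return self.alpha / self.rank

    def adapter_params(self, d_in: int, d_out: int) -> int:
        # One adapted linear layer adds A (rank x d_in) and B (d_out x rank).
        return self.rank * (d_in + d_out)


settings = LoraSettings()

# Hypothetical layer shape for illustration; the card does not list target modules.
d_in, d_out = 2048, 2048

print(settings.scaling)                      # 1.0
print(settings.adapter_params(d_in, d_out))  # 131072
```

With rank equal to alpha, the low-rank update is applied at a scale of 1.0, and the adapter for such a layer holds about 3% as many parameters as the 2048 x 2048 weight it modifies.
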