---
base_model:
- mistralai/Mistral-Nemo-Instruct-2407
language:
- ku
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- mistral
datasets:
- nazimali/kurdish-wikipedia-articles
library_name: transformers
---

Continued pre-training on `mistralai/Mistral-Nemo-Instruct-2407` using the Kurdish wiki dataset with `unsloth`.
This model should be fine-tuned further, since the pre-training was done to improve Kurdish language understanding.
It is quantized with `bitsandbytes` so that it uses less memory. See the [bitsandbytes documentation](https://huggingface.co/docs/transformers/main/en/quantization/bitsandbytes#bitsandbytes).

There isn't a standard, or even a good, metric to evaluate the model on Kurdish (that I could find).
My next project will be to create an evaluation so that there is a reproducible baseline for Kurdish.

I will also look into a multi-GPU training setup so I don't have to wait all day for results, and I would like to train with both Kurmanji and Sorani.

### Use

Should be fine-tuned further for a specific task.

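A minimal loading sketch with 4-bit `bitsandbytes` quantization, assuming `transformers`, `bitsandbytes`, and a GPU are available. The model id below is a placeholder, not this repository's actual id:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Placeholder: replace with this repository's model id
model_id = "your-username/your-model"

# Load the weights in 4-bit to reduce memory use
bnb_config = BitsAndBytesConfig(load_in_4bit=True)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
```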
### Training

- Transformers `4.44.2`
- 1x NVIDIA A100 80GB PCIe
- Duration: 6h 31m 4s

```json
{
  "total_flos": 4121524790259794000,
  "train/epoch": 1,
  "train/global_step": 1960,
  "train/grad_norm": 3.1958093643188477,
  "train/learning_rate": 0,
  "train/loss": 1.2108,
  "train_loss": 1.256846008738693,
  "train_runtime": 23227.1752,
  "train_samples_per_second": 2.7,
  "train_steps_per_second": 0.084
}
```

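A quick sanity check on these numbers (values copied from the stats above):

```python
# Values from the training stats above
train_runtime = 23227.1752   # seconds
samples_per_second = 2.7
global_step = 1960

# ~6.45 hours of pure training time, matching the reported ~6.5 h duration
hours = train_runtime / 3600

# Samples seen in the single epoch: throughput x runtime
approx_samples = samples_per_second * train_runtime

# Implied effective batch size (samples per optimizer step)
effective_batch = approx_samples / global_step

print(round(hours, 2), round(approx_samples), round(effective_batch))  # → 6.45 62713 32
```

The ~62,713 samples line up with the 62,720 rows used for one epoch, and the implied effective batch size is 32.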
#### Pre-training data:

- `nazimali/kurdish-wikipedia-articles`
- Dataset number of rows: 63,076
- Filtered to the `title` and `text` columns
- Rows must have at least 1 character
- Number of rows used for training: 62,720

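The filtering step can be sketched as follows. This is my reading of the filter (non-empty `title` and `text`), not the exact published preprocessing, and the sample rows are made up:

```python
# Toy rows standing in for dataset records (made-up examples)
rows = [
    {"title": "Kurdistan", "text": "Kurdistan herêmek e li Rojhilata Navîn.", "url": "..."},
    {"title": "", "text": ""},  # would be dropped: empty fields
    {"title": "Zimanê kurdî", "text": "Zimanekî hind-ewropî ye."},
]

def keep(row):
    # Assumed filter: both retained columns need at least 1 character
    return len(row["title"]) >= 1 and len(row["text"]) >= 1

# Keep only the two columns used for training
filtered = [{"title": r["title"], "text": r["text"]} for r in rows if keep(r)]
print(len(filtered))  # → 2
```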
#### Training prompt format:

```python
training_prompt = """Gotara Wikipedia
### Sernav: {}

### Gotar:
{}"""
```