Update README.md
README.md
```python
outputs = model.generate(**input_ids, max_new_tokens=100)
print(tokenizer.decode(outputs[0]))
```

## Training Data

The model is fine-tuned on the **Dolly-15k Dutch** dataset, specifically its `train_sft` training split. The dataset is not state of the art; the goal is to demonstrate the capabilities of the setup, and its examples fit within 1,024 tokens.
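
As a rough illustration, the split can be loaded with the `datasets` library; the Hub identifier below is an assumption (only the dataset name and split are stated above), so substitute the actual ID:

```python
from datasets import load_dataset

# Hypothetical Hub ID for the Dutch Dolly-15k dataset; replace with the real identifier.
dolly_nl = load_dataset("BramVanroy/dolly-15k-dutch", split="train_sft")

print(dolly_nl[0])    # inspect one instruction/response example
print(len(dolly_nl))  # number of examples in the training split
```
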
### Fine-tuning with Learning Rate Optimization

Fine-tuning includes an advanced learning rate optimization system, implemented through the `LROptimizerCallback` class; the callback handles the optimization automatically during training. A usage sketch is shown below.
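
The sketch assumes the callback follows the standard Hugging Face `TrainerCallback` interface; the import path and constructor arguments are illustrative, since only the class name is given here:

```python
from transformers import Trainer, TrainingArguments

# Assumed import path -- check the repository for the actual location of the callback.
from lr_optimizer_callback import LROptimizerCallback

training_args = TrainingArguments(
    output_dir="./zamba2-dolly-nl",
    per_device_train_batch_size=4,
    num_train_epochs=1,
    learning_rate=2e-5,  # starting value; the callback adjusts the learning rate during training
)

trainer = Trainer(
    model=model,                        # the Zamba2-1.2B model loaded earlier
    args=training_args,
    train_dataset=dolly_nl,             # the train_sft split loaded above
                                        # (tokenization/collation is omitted here for brevity)
    callbacks=[LROptimizerCallback()],  # hypothetical no-argument construction
)
trainer.train()
```
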
## Notice

Zamba2-1.2B is a pretrained base model and therefore does not have any moderation mechanism and may output toxic or otherwise harmful language. In addition, one should not expect good instruct or chat performance, as this model was not fine-tuned for instruction following or chat.