ssmits committed
Commit c60d7a0 · verified · 1 Parent(s): 8ab0af2

Update README.md

Files changed (1): README.md (+6 -1)

README.md CHANGED
@@ -60,6 +60,10 @@ outputs = model.generate(**input_ids, max_new_tokens=100)
  print(tokenizer.decode(outputs[0]))
  ```

+ ## Training Data
+
+ The model is fine-tuned on the **Dolly-15k Dutch** dataset, specifically its training split (`train_sft`). The dataset is not state-of-the-art (SoTA), but the goal is to demonstrate the fine-tuning capabilities, and its examples fit within 1024 tokens.
+
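+ For illustration, a minimal loading sketch; the Hub dataset id `BramVanroy/dolly-15k-dutch` below is an assumption, so substitute the actual repo id if it differs:
+
+ ```python
+ from datasets import load_dataset
+
+ # Assumed Hub id for the Dolly-15k Dutch dataset; replace if it lives elsewhere.
+ dataset = load_dataset("BramVanroy/dolly-15k-dutch", split="train_sft")
+ print(dataset[0])  # inspect a single instruction/response record
+ ```
+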
  ### Fine-tuning with Learning Rate Optimization

  The model includes a learning rate optimization system for fine-tuning, implemented through the `LROptimizerCallback` class; the callback tunes the learning rate automatically during training. Here's how to use it:
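+
+ A minimal usage sketch (assuming `LROptimizerCallback` implements the standard `transformers.TrainerCallback` interface and takes no required constructor arguments; both are assumptions, since its signature is defined elsewhere in this repository):
+
+ ```python
+ from transformers import Trainer, TrainingArguments
+
+ # Hypothetical import path; adjust to wherever LROptimizerCallback is defined.
+ from lr_optimizer_callback import LROptimizerCallback
+
+ training_args = TrainingArguments(
+     output_dir="zamba2-dolly-nl",  # illustrative output directory
+     per_device_train_batch_size=1,
+     num_train_epochs=1,
+     learning_rate=2e-5,            # starting LR; the callback adjusts it during training
+ )
+
+ trainer = Trainer(
+     model=model,                   # the model loaded earlier in this README
+     args=training_args,
+     train_dataset=dataset,         # e.g. the train_sft split loaded above
+     callbacks=[LROptimizerCallback()],  # assumed zero-argument constructor
+ )
+ trainer.train()
+ ```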
 
@@ -141,4 +145,5 @@ And memory overhead

  ## Notice

- Zamba2-1.2B is a pretrained base model and therefore does not have any moderation mechanism and may output toxic or otherwise harmful language. In addition, one should not expect good instruct or chat performance, as this model was not fine-tuned for instruction following or chat.
+ Zamba2-1.2B is a pretrained base model and therefore does not have any moderation mechanism and may output toxic or otherwise harmful language. In addition, one should not expect good instruct or chat performance, as this model was not fine-tuned for instruction following or chat.
+