sambanovasystems
/

SambaLingo-Bulgarian-Chat

Text Generation

text-generation-inference

Inference Endpoints

Model card Files Files and versions Community

zolicsaki commited on Feb 22

Commit

b662482

•

1 Parent(s): 018f4e8

Update README.md

Files changed (1) hide show

README.md +1 -1

README.md CHANGED Viewed

@@ -70,7 +70,7 @@ The alignment phase follows the recipe for [Zephyr-7B](https://huggingface.co/Hu
 The SFT phase was done on the [ultrachat_200k](https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k) dataset mixed with the Google translated version of the ultrachat_200k dataset. It was trained for one epoch with global batch size 512 and max sequence length 2048 tokens. We used a linear decay learning rate of 2e-5 and 10% warmup.
-The DPO phase was done on the [ultrafeedback](https://huggingface.co/datasets/HuggingFaceH4/ultrafeedback_binarized) dataset and cai-conversation-harmless dataset, mixed with 10% of the data Google translated. It was trained with global batch size 32 and for three epochs. We used a linear decay learning rate of 5e-7, 10% warmup and β=0.1 as the regularization factor for DPO.
 ## Tokenizer Details

 The SFT phase was done on the [ultrachat_200k](https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k) dataset mixed with the Google translated version of the ultrachat_200k dataset. It was trained for one epoch with global batch size 512 and max sequence length 2048 tokens. We used a linear decay learning rate of 2e-5 and 10% warmup.
+The DPO phase was done on the [ultrafeedback](https://huggingface.co/datasets/HuggingFaceH4/ultrafeedback_binarized) dataset and [cai-conversation-harmless](https://huggingface.co/datasets/HuggingFaceH4/cai-conversation-harmless) dataset, mixed with 10% of the data Google translated. It was trained with global batch size 32 and for three epochs. We used a linear decay learning rate of 5e-7, 10% warmup and β=0.1 as the regularization factor for DPO.
 ## Tokenizer Details