chargoddard/servile-harpsichord-cdpo

chargoddard committed on Dec 10, 2023

Commit 13cdf6b (1 parent: bea3712)

Update README.md

Files changed (1)

README.md +2 -0

@@ -13,4 +13,6 @@ datasets:
 Trained on a different random sampling of the same datasets used by [loyal-piano-m7](https://huggingface.co/chargoddard/loyal-piano-m7), then with cDPO on a blend of RLHF datasets.
+Several intermediate checkpoints (of cDPO training) are on branches.
 Uses the Alpaca prompt format.
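
The README names the Alpaca prompt format but does not show the template. As a rough illustration, here is a minimal Python sketch that builds a prompt, assuming the commonly used Stanford Alpaca wording. The exact preamble text and the helper name `build_alpaca_prompt` are assumptions, not taken from this repository.

```python
from typing import Optional

# Assumed preambles: the widely used Stanford Alpaca wording.
# The model card does not confirm the exact text.
PREAMBLE_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request."
)
PREAMBLE_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request."
)


def build_alpaca_prompt(instruction: str, context: Optional[str] = None) -> str:
    """Return an Alpaca-style prompt ending at the response header.

    `build_alpaca_prompt` is a hypothetical helper for illustration only.
    """
    if context:
        # Variant with an extra "### Input:" section for supporting context.
        return (
            f"{PREAMBLE_WITH_INPUT}\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{context}\n\n"
            "### Response:\n"
        )
    # Variant without input: instruction followed directly by the response header.
    return (
        f"{PREAMBLE_NO_INPUT}\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )


prompt = build_alpaca_prompt("Summarize the plot of Hamlet in two sentences.")
```

The resulting string would be passed to the model as the prompt, with generation continuing after `### Response:`. If outputs look off, the template wording is the first assumption to check against the model card.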