Update README.md
README.md
CHANGED
@@ -1,3 +1,15 @@
----
-license: apache-2.0
----
+---
+license: apache-2.0
+---
+
+# Towards Efficient Exact Optimization of Language Model Alignment
+
+- **model**: [exo-imdb-sft-model](https://huggingface.co/ehzoah/exo-imdb-sft-model)
+
+- Finetuned from model: [pythia-2.8b](https://huggingface.co/EleutherAI/pythia-2.8b)
+
+- **dataset**: [imdb](https://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz) (original stanford version)
+
+SFT model used in the imdb experiment of the ICML'24 paper [*Towards Efficient Exact Optimization of Language Model Alignment*](https://arxiv.org/pdf/2402.00856).
+
+For details of the dataset, training and inference of this model, please refer to https://github.com/haozheji/exact-optimization/blob/main/exp/imdb_exp/README.md
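
The README in this change points to the model repository but leaves inference to the linked experiment guide. As a rough illustration only, the checkpoint might be loaded with the Hugging Face `transformers` library as sketched below. This assumes the repository `ehzoah/exo-imdb-sft-model` ships weights and tokenizer files in the standard `transformers` layout, which the README does not confirm. The helper name `generate_continuation`, the prompt, and the sampling settings are hypothetical and are not taken from the paper or its code.

```python
# Repo id taken from the model link in the README.
MODEL_ID = "ehzoah/exo-imdb-sft-model"


def generate_continuation(prompt: str, max_new_tokens: int = 32) -> str:
    """Sample a continuation of `prompt` from the SFT model.

    Hypothetical helper: assumes the repo can be loaded through the
    generic Auto* classes, as Pythia (GPT-NeoX) checkpoints usually are.
    """
    # Imported lazily so this module can be imported without transformers
    # (and PyTorch) installed; both are needed once the function is called.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,  # illustrative setting, not the paper's configuration
        pad_token_id=tokenizer.eos_token_id,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    # Downloads the 2.8B-parameter checkpoint on first use.
    print(generate_continuation("This movie was"))
```

For the dataset preparation, training, and inference settings actually used in the experiment, the linked `imdb_exp` README remains the authoritative source.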