Update README.md
README.md (CHANGED)
@@ -4,8 +4,14 @@ I just made a gguf file for my own use, and then share it, please support the or
---
This repo contains GGUF format model files for **[haoranxu/ALMA-7B-R](https://huggingface.co/haoranxu/ALMA-7B-R)**
---
-license: mit
---
+---
+---
+---
+the original model card:
+---
+license: mit
+
**[ALMA-R](https://arxiv.org/abs/2401.08417)** builds upon [ALMA models](https://arxiv.org/abs/2309.11674), adding further LoRA fine-tuning with our proposed **Contrastive Preference Optimization (CPO)** in place of the supervised fine-tuning used in ALMA. CPO fine-tuning requires our [triplet preference data](https://huggingface.co/datasets/haoranxu/ALMA-R-Preference) for preference learning. ALMA-R can now match or even exceed GPT-4 and the WMT winners!
```
@misc{xu2024contrastive,
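
For anyone grabbing the files from this repo, here is a minimal usage sketch loading one of the GGUF files with llama-cpp-python. The quantization file name below is an assumption (check the repo's file list for the actual names), and the prompt follows the translation template from the ALMA model card.

```python
# Minimal sketch, assuming llama-cpp-python is installed and that the
# file name below matches one actually published in this repo.
from llama_cpp import Llama

llm = Llama(
    model_path="alma-7b-r.Q4_K_M.gguf",  # hypothetical file name; check the repo's file list
    n_ctx=2048,
)

# ALMA models expect a fixed translation prompt template.
prompt = (
    "Translate this from Chinese to English:\n"
    "Chinese: 我爱机器翻译。\n"
    "English:"
)
out = llm(prompt, max_tokens=64, temperature=0.0)
print(out["choices"][0]["text"].strip())
```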
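
The CPO objective mentioned in the model card trains on triplets of a source sentence, a preferred translation, and a dis-preferred translation. As a rough sketch of the loss described in the CPO paper (not the authors' code): `logp_w` and `logp_l` below stand for the summed token log-probabilities of the preferred and rejected translations under the current model, and `beta` is a scaling hyperparameter.

```python
import torch
import torch.nn.functional as F

def cpo_loss(logp_w: torch.Tensor, logp_l: torch.Tensor, beta: float = 0.1) -> torch.Tensor:
    """Sketch of the CPO objective: a DPO-style preference term without a
    reference model, plus an NLL term on the preferred translation."""
    # Preference term: push the preferred translation above the rejected one.
    prefer = -F.logsigmoid(beta * (logp_w - logp_l)).mean()
    # Regularizer: plain negative log-likelihood on the preferred translation.
    nll = -logp_w.mean()
    return prefer + nll
```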
|