Update README.md
README.md
CHANGED
@@ -8,6 +8,8 @@ tags:
 - reinforcement-learning
 ---
 
+![pull_figure](https://huggingface.co/datasets/trl-internal-testing/example-images/resolve/main/images/stack-llama.png)
+
 # Llama-se-rl-peft
 Adapter weights of an RL fine-tuned model based on LLaMa. Authored by Edward Beeching, Younes Belkada, Kashif Rasul, Lewis Tunstall and Leandro von Werra.
 For more info check out the [blog post]() and [github example]().