Update README.md (#1)
- Update README.md (a5343d3e9c853a4ad6402179ac63ad23b50cc0a0)
README.md
CHANGED
@@ -8,6 +8,8 @@ tags:
 - reinforcement-learning
 ---
 
+![pull_figure](https://huggingface.co/datasets/trl-internal-testing/example-images/resolve/main/images/stack-llama.png)
+
 # Llama-se-rl-peft
 Adapter weights of an RL fine-tuned model based on LLaMa. Authored by Edward Beeching, Younes Belkada, Kashif Rasul, Lewis Tunstall and Leandro von Werra.
 For more info check out the [blog post](https://huggingface.co/blog/stackllama) and [github example](https://github.com/lvwerra/trl/tree/main/examples/stack_llama/scripts).