---
license: bigscience-openrail-m
language:
  - en
inference: false
tags:
  - trl
  - transformers
  - rlhf
datasets:
  - lvwerra/stack-exchange-paired
---

Llama-se-rl-peft

Adapter weights of a Reinforcement Learning fine-tuned model based on the LLaMA model (see Meta's LLaMA release for the original model). The model is designed to generate human-like answers to questions in Stack Exchange domains such as programming, mathematics, and physics. For more information, see the accompanying blog post and GitHub example.
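
Because this repository contains only the PEFT adapter weights, they must be loaded on top of the original LLaMA base weights, which are distributed separately by Meta. A minimal loading sketch (not part of the official card; the base-model path is a placeholder):

```python
# Minimal sketch: apply the RL adapter to base LLaMA-7B weights.
# "path/to/llama-7b" is a placeholder for locally converted base weights.
from peft import PeftModel
from transformers import LlamaForCausalLM, LlamaTokenizer

base_path = "path/to/llama-7b"
tokenizer = LlamaTokenizer.from_pretrained(base_path)
model = LlamaForCausalLM.from_pretrained(base_path)
model = PeftModel.from_pretrained(model, "trl-lib/llama-7b-se-rl-peft")
model.eval()
```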

Model Details

Model Description

Developed by: Hugging Face

Model type: An auto-regressive language model based on the transformer architecture, fine-tuned on Stack Exchange data.

Languages: Predominantly English, with additional data from languages with the following ISO codes:

bg ca cs da de es fr hr hu it nl pl pt ro ru sl sr sv uk

License: bigscience-openrail-m

Finetuned from: LLaMA

Model Sources

Repository: https://huggingface.co/trl-lib/llama-7b-se-rl-peft/tree/main

Base Model Repository: https://github.com/facebookresearch/llama

Demo: https://huggingface.co/spaces/trl-lib/stack-llama

Uses

Direct Use

  • Long-form question-answering on topics of programming, mathematics, and physics
  • Demonstrating a large language model's ability to follow a target behavior: generating answers to a question that would be highly rated on Stack Exchange.

Out of Scope Use

  • Replacing human expertise

Bias, Risks, and Limitations

Recommendations

  • Answers should be validated against external sources.
  • Disparities between the data contributors and the direct and indirect users of the technology should inform developers in assessing what constitutes an appropriate use case.
  • Further research is needed to attribute model generations to sources in the training data, especially in cases where the model copies answers from the training data.

Training Details

Training Data

Original datasets are described in the LLaMA Model Card. Fine-tuning datasets for this model are based on Stack Exchange Paired, which consists of questions and answers from Stack Exchange domains such as programming, mathematics, and physics. Specifically (a loading sketch follows these links):

Traditional Fine-tuning: https://huggingface.co/datasets/lvwerra/stack-exchange-paired/tree/main/data/finetune

RL Fine-tuning: https://huggingface.co/datasets/lvwerra/stack-exchange-paired/tree/main/data/rl

Reward Model: https://huggingface.co/trl-lib/llama-7b-se-rm-peft
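
The two data splits are subdirectories of the same dataset repository and can be pulled with the datasets library. A hedged sketch (the data_dir values mirror the links above; field names are as documented on the dataset card):

```python
# Load the supervised fine-tuning and RL splits of Stack Exchange Paired.
from datasets import load_dataset

sft_data = load_dataset("lvwerra/stack-exchange-paired", data_dir="data/finetune", split="train")
rl_data = load_dataset("lvwerra/stack-exchange-paired", data_dir="data/rl", split="train")

# Each row pairs a question with a preferred answer (response_j) and a
# less-preferred one (response_k), per the dataset card.
example = rl_data[0]
print(example["question"][:200])
```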

Training Procedure

The model was first fine-tuned on the Stack Exchange question and answer pairs and then RL fine-tuned using a Stack Exchange Reward Model. It is trained to respond to prompts with the following template:

```
Question: <Query>

Answer: <Response>
```
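
At inference time, queries should be wrapped in the same template. A hedged sketch, reusing the model and tokenizer from the loading example above (the query and generation settings are illustrative):

```python
# Wrap the user query in the training-time template before generating.
query = "How do I check if a Python dictionary contains a given key?"
prompt = f"Question: {query}\n\nAnswer: "

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```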

Citation

BibTeX:

```bibtex
@misc{beeching2023stackllama,
    author       = { Edward Beeching and
                     Younes Belkada and
                     Kashif Rasul and
                     Lewis Tunstall and
                     Leandro von Werra and
                     Nazneen Rajani and
                     Nathan Lambert
                   },
    title        = { StackLLaMa: An RL Fine-tuned LLaMa Model for Stack Exchange Question and Answering },
    year         = 2023,
    url          = { https://huggingface.co/trl-lib/llama-7b-se-rl-peft },
    doi          = { 10.57967/hf/0513 },
    publisher    = { Hugging Face Blog }
}
```

Model Card Authors

Nathan Lambert, Leandro von Werra, Edward Beeching, Kashif Rasul, Younes Belkada, Margaret Mitchell