File size: 4,291 Bytes
0b10459
 
5d7f537
 
706d5c0
 
 
5d7f537
 
 
 
 
 
 
 
 
 
 
 
 
 
0b10459
73bd97d
5d7f537
 
a1fb961
 
5d7f537
 
 
 
864bbc9
 
5d7f537
 
3b92c54
5d7f537
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
3b92c54
 
5d7f537
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
73bd97d
 
 
8cd2153
 
 
 
 
 
 
 
 
011d716
 
 
8cd2153
 
 
 
 
 
 
 
 
73bd97d
 
 
 
8cd2153
 
 
 
 
 
 
 
73bd97d
8cd2153
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
---
license: apache-2.0
language:
- en
- de
- es
- fr
tags:
- sft
pipeline_tag: text-generation
widget:
- text: >-
    <|prompter|>What is a meme, and what's the history behind this
    word?<|endoftext|><|assistant|>
- text: <|prompter|>What's the Earth total population<|endoftext|><|assistant|>
- text: >-
    <|prompter|>Write a story about future of AI
    development<|endoftext|><|assistant|>
datasets:
- OpenAssistant/oasst1
library_name: transformers
---

# Open-Assistant Falcon 7B SFT OASST-TOP1 Model

This model is a fine-tuning of TII's [Falcon 7B](https://huggingface.co/tiiuae/falcon-7b) LLM.
It was trained with 11,123 top-1 (high-quality) demonstrations of the OASST data set (exported on July 2, 2023) with a batch size of 128 for 8 epochs with LIMA style dropout (p=0.2) and a context-length of 2048 tokens.

## Model Details

- **Finetuned from:** [tiiuae/falcon-7b](https://huggingface.co/tiiuae/falcon-7b)
- **Model type:** Causal decoder-only transformer language model
- **Language:** English, German, Spanish, French (and limited capabilities in Italian, Portuguese, Polish, Dutch, Romanian, Czech, Swedish);
- **Weights & Biases:** [public-sft/runs/25apbcld](https://wandb.ai/open-assistant/public-sft/runs/25apbcld)
- **Code:** [Open-Assistant/model/model_training](https://github.com/LAION-AI/Open-Assistant/tree/main/model/model_training)
- **Demo:** [Continuations for 250 random prompts](https://open-assistant.github.io/oasst-model-eval/?f=https%3A%2F%2Fraw.githubusercontent.com%2FOpen-Assistant%2Foasst-model-eval%2Fmain%2Fsampling_reports%2Fchat-gpt%2F2023-04-11_gpt-3.5-turbo_lottery.json%0Ahttps%3A%2F%2Fraw.githubusercontent.com%2FOpen-Assistant%2Foasst-model-eval%2Fmain%2Fsampling_reports%2Foasst-sft%2F2023-06-05_OpenAssistant_falcon-7b-sft-top1-696_sampling_noprefix2.json)
- **License:** Apache 2.0
- **Contact:** [Open-Assistant Discord](https://ykilcher.com/open-assistant-discord)


## Prompting

Two special tokens are used to mark the beginning of user and assistant turns:
`<|prompter|>` and `<|assistant|>`. Each turn ends with a `<|endoftext|>` token.

Input prompt example:
```
<|prompter|>What is a meme, and what's the history behind this word?<|endoftext|><|assistant|>
```
The input ends with the `<|assistant|>` token to signal that the model should 
start generating the assistant reply.


## Sample Code

```python
from transformers import AutoTokenizer
import transformers
import torch

model = "OpenAssistant/falcon-7b-sft-top1-696"

tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
)

input_text="<|prompter|>What is a meme, and what's the history behind this word?<|endoftext|><|assistant|>"

sequences = pipeline(
    input_text,
    max_length=500,
    do_sample=True,
    return_full_text=False,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")
```


## Configuration Details

Model:
```
falcon-7b:
  dtype: bf16
  log_dir: "falcon_log_7b"
  learning_rate: 1e-5
  model_name: "tiiuae/falcon-7b"
  deepspeed_config: configs/zero_config.json
  output_dir: falcon
  weight_decay: 0.0
  max_length: 2048
  save_strategy: steps
  eval_steps: 80
  save_steps: 80
  warmup_steps: 20
  gradient_checkpointing: true
  gradient_accumulation_steps: 4
  per_device_train_batch_size: 4
  per_device_eval_batch_size: 8
  num_train_epochs: 8
  save_total_limit: 4
  residual_dropout: 0.2
  residual_dropout_lima: true
```

Dataset:
```
oasst-top1:
  # oasst_export: 11123 (100.00%)
  datasets:
    - oasst_export:
        lang: "bg,ca,cs,da,de,en,es,fr,hr,hu,it,nl,pl,pt,ro,ru,sl,sr,sv,uk" # sft-8.0
        input_file_path: 2023-06-02_oasst_all_labels.jsonl.gz
        val_split: 0.05
        top_k: 1
```

Train command:
```
deepspeed trainer_sft.py --configs defaults falcon-7b oasst-top1 --cache_dir <data_cache_dir> --output_dir <output_path> --deepspeed
```

Export command:
```
python export_model.py --dtype bf16 --hf_repo_name OpenAssistant/falcon-7b-sft-top1 --trust_remote_code --auth_token <auth_token> <output_path> --max_shard_size 2GB
```