Text Generation · PEFT · Safetensors
Commit 1a8b424, committed by dfurman
Parent: 5c6c6e1

Update README.md

Files changed (1)
  1. README.md +3 -8
README.md CHANGED
@@ -10,7 +10,7 @@ Falcon-7b-chat-oasst1 is a chatbot-like model for dialogue generation. It was bu
 This model was fine-tuned in 8-bit using 🤗 [peft](https://github.com/huggingface/peft) adapters, [transformers](https://github.com/huggingface/transformers), and [bitsandbytes](https://github.com/TimDettmers/bitsandbytes).
 - The training relied on a recent method called "Low-Rank Adaptation" ([LoRA](https://arxiv.org/pdf/2106.09685.pdf)): instead of fine-tuning the entire model, you fine-tune only small adapter layers and load them into the base model (see the sketch after this hunk).
 - Training took approximately 6 hours and was executed on a workstation with a single NVIDIA A100-SXM 40GB GPU (via Google Colab).
-- See attached [Notebook](https://huggingface.co/intellio-NLP/falcon-7b-chat-oasst1/blob/main/finetune_falcon7b_oasst1_with_bnb_peft.ipynb) for the code (and hyperparams) used to train the model.
+- See attached [Notebook](https://huggingface.co/dfurman/falcon-7b-chat-oasst1/blob/main/finetune_falcon7b_oasst1_with_bnb_peft.ipynb) for the code (and hyperparams) used to train the model.
 
 ## Model Summary
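For orientation, here is a minimal sketch of the LoRA-in-8-bit setup the bullets above describe. The hyperparameters (r, alpha, dropout) are illustrative assumptions, not the notebook's actual values; the linked notebook is authoritative.

```python
# Sketch: LoRA adapters on top of a frozen 8-bit Falcon-7B base.
# r/lora_alpha/lora_dropout are assumptions, not the notebook's values.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-7b",
    load_in_8bit=True,        # bitsandbytes: quantize the frozen base weights
    device_map="auto",
    trust_remote_code=True,   # Falcon shipped custom modeling code at the time
)
base = prepare_model_for_kbit_training(base)  # fp32 norms, input-grad hooks

lora_config = LoraConfig(
    r=16,                                # adapter rank (assumption)
    lora_alpha=32,                       # scaling factor (assumption)
    target_modules=["query_key_value"],  # Falcon's fused attention projection
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only adapter weights train; base stays frozen
```

Only the small adapter matrices receive gradients, which is why a single 40GB A100 can fine-tune a 7B model in 8-bit.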
 
@@ -92,11 +92,6 @@ We recommend users of this model to develop guardrails and to take appropriate p
 import torch
 from peft import PeftModel, PeftConfig
 from transformers import AutoModelForCausalLM, AutoTokenizer
-
-# Login to HF
-from huggingface_hub import notebook_login
-
-notebook_login() # use personal HF token for access to intellio-nlp
 ```
 
 ### GPU Inference in 8-bit
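The removed cell gated access behind notebook_login(), which this commit drops along with the intellio-NLP namespace. If you ever do need authenticated access (private or gated repos), a script-friendly sketch of the equivalent, with a placeholder token you must supply yourself:

```python
# Only needed for private/gated repos; a script-friendly alternative
# to notebook_login(). "hf_xxx" is a placeholder, not a real token.
from huggingface_hub import login

login(token="hf_xxx")  # or run `huggingface-cli login` once instead
```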
@@ -105,7 +100,7 @@ This requires a GPU with at least 12GB memory.
 
 ```python
 # load the model
-peft_model_id = "intellio-NLP/falcon-7b-chat-oasst1"
+peft_model_id = "dfurman/falcon-7b-chat-oasst1"
 config = PeftConfig.from_pretrained(peft_model_id)
 
 model = AutoModelForCausalLM.from_pretrained(
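The hunk's context window cuts off mid-call. For reference, here is a typical completion of this 8-bit loading pattern, sketched under the assumption of the standard peft workflow and continuing from the card's imports above (the model card itself carries the authoritative version):

```python
# Sketch of the standard peft 8-bit inference loading pattern;
# see the model card for the authoritative version.
model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    load_in_8bit=True,       # needs a GPU with roughly 12GB+ memory
    device_map="auto",
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(model, peft_model_id)  # attach the LoRA adapters

tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
```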
@@ -153,7 +148,7 @@ print('\n\n', tokenizer.decode(output_tokens[0], skip_special_tokens=True))
 
 ## Reproducibility
 
-- See attached [Notebook](https://huggingface.co/intellio-NLP/falcon-40b-chat-oasst1/blob/main/finetune_falcon40b_oasst1_with_bnb_peft.ipynb) for the code (and hyperparams) used to train the model.
+- See attached [Notebook](https://huggingface.co/dfurman/falcon-7b-chat-oasst1/blob/main/finetune_falcon7b_oasst1_with_bnb_peft.ipynb) for the code (and hyperparams) used to train the model.
 
 ### CUDA Info
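The final hunk's header shows the card's decode-and-print line, which implies an output_tokens tensor from model.generate. For orientation, a minimal generation sketch continuing from the loading snippet above and ending in that same print; the prompt format and sampling settings are assumptions, not the card's exact values:

```python
# Minimal generation sketch; the "<human>/<bot>" prompt format and the
# sampling settings are assumptions, not the model card's exact values.
prompt = "<human>: Write a short poem about the ocean.\n<bot>:"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output_tokens = model.generate(
        **inputs,
        max_new_tokens=128,
        do_sample=True,
        temperature=0.7,
        pad_token_id=tokenizer.eos_token_id,
    )

print('\n\n', tokenizer.decode(output_tokens[0], skip_special_tokens=True))
```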
 
 