fhai50032 committed
Commit 2c3737f
1 Parent(s): 27499ea

Update README.md

Files changed (1)
  1. README.md +98 -2
README.md CHANGED
- mistral
- trl
base_model: fhai50032/RolePlayLake-7B
datasets:
- Undi95/toxic-dpo-v0.1-NoWarning
- NobodyExistsOnTheInternet/ToxicQAFinal
---

# Uploaded model
 
- **License:** apache-2.0
- **Finetuned from model:** fhai50032/RolePlayLake-7B

More uncensored out of the gate, without any prompting; trained on [Undi95/toxic-dpo-v0.1-sharegpt](https://huggingface.co/datasets/Undi95/toxic-dpo-v0.1-sharegpt) and other unalignment datasets.
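A minimal sketch of pulling the datasets listed in this card's metadata with the `datasets` library; the split names and any filtering or mixing are assumptions, as the exact preprocessing is not documented here.

```
# Sketch only: loads the datasets named in the card metadata.
# Split names and any mixing/formatting are assumptions.
from datasets import load_dataset

toxic_dpo = load_dataset("Undi95/toxic-dpo-v0.1-NoWarning", split="train")
toxic_qa = load_dataset("NobodyExistsOnTheInternet/ToxicQAFinal", split="train")
```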

**QLoRA (4-bit)**

Params to replicate training:

Peft Config
```
r = 64,
target_modules = ['v_proj', 'down_proj', 'up_proj',
                  'o_proj', 'q_proj', 'gate_proj', 'k_proj'],
lora_alpha = 128,   # weight scaling
lora_dropout = 0,   # supports any, but = 0 is optimized
bias = "none",      # supports any, but = "none" is optimized
use_gradient_checkpointing = True,
random_state = 3407,
max_seq_length = 1024,
```
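For context, a minimal sketch of how these values plug into Unsloth's `FastLanguageModel.get_peft_model`; the base-model name and 4-bit loading are taken from this card, but this is assumed wiring, not the author's exact training script.

```
# Sketch (assumed wiring): the Peft config above applied via Unsloth.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "fhai50032/RolePlayLake-7B",  # base model per this card
    max_seq_length = 1024,
    load_in_4bit = True,                       # QLoRA (4-bit)
)
model = FastLanguageModel.get_peft_model(
    model,
    r = 64,
    target_modules = ['v_proj', 'down_proj', 'up_proj',
                      'o_proj', 'q_proj', 'gate_proj', 'k_proj'],
    lora_alpha = 128,
    lora_dropout = 0,
    bias = "none",
    use_gradient_checkpointing = True,
    random_state = 3407,
)
```

Note that `lora_alpha = 128` is twice `r = 64`, a common scaling choice for LoRA.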

Training args
```
per_device_train_batch_size = 6,
gradient_accumulation_steps = 6,
gradient_checkpointing = True,
# warmup_ratio = 0.1,
warmup_steps = 4,
save_steps = 150,
dataloader_num_workers = 2,
learning_rate = 2e-5,
fp16 = True,
logging_steps = 1,
num_train_epochs = 2,  # use this for epochs
# max_steps = 9,       # max_steps overrides epochs
optim = "adamw_8bit",
weight_decay = 1e-3,
lr_scheduler_type = "linear",
seed = 3407,
output_dir = "outputs",
packing = False,
# neftune_noise_alpha = 10
```
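A sketch of how these arguments would typically be wired through Huggingface's `TrainingArguments` and TRL's `SFTTrainer`; the dataset variable and text field are assumptions, since the card does not show the trainer call itself.

```
# Sketch (assumed wiring): the training args above passed to TRL's SFTTrainer.
from transformers import TrainingArguments
from trl import SFTTrainer

args = TrainingArguments(
    per_device_train_batch_size = 6,
    gradient_accumulation_steps = 6,
    gradient_checkpointing = True,
    warmup_steps = 4,
    save_steps = 150,
    dataloader_num_workers = 2,
    learning_rate = 2e-5,
    fp16 = True,
    logging_steps = 1,
    num_train_epochs = 2,
    optim = "adamw_8bit",
    weight_decay = 1e-3,
    lr_scheduler_type = "linear",
    seed = 3407,
    output_dir = "outputs",
)
trainer = SFTTrainer(
    model = model,                # PEFT model from the sketch above
    tokenizer = tokenizer,
    train_dataset = dataset,      # assumption: the formatted dataset mix
    dataset_text_field = "text",  # assumption: depends on preprocessing
    max_seq_length = 1024,
    packing = False,
    args = args,
)
trainer.train()
```

With a per-device batch size of 6 and 6 gradient-accumulation steps, the effective batch size is 36 per device.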

Loss
`0.6182`

Training Results
[<img src="https://huggingface.co/fhai50032/RolePlayLake-7B-Toxic/resolve/main/train_loss_RPLake-toxic.png" width="400"/>](https://huggingface.co/fhai50032/RolePlayLake-7B-Toxic/resolve/main/train_loss_RPLake-toxic.png)

[Wandb](https://huggingface.co/fhai50032/RolePlayLake-7B-Toxic/resolve/main/Wandb-RPLake.csv)

**Inference Code**
- Supports Alpaca, ChatML, and possibly other prompt formats too

```
pip install -qU transformers accelerate bitsandbytes
```

```
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, pipeline
import torch

model_name = "fhai50032/RolePlayLake-7B-Toxic"

tokenizer = AutoTokenizer.from_pretrained(model_name)

# use 4-bit (NF4) quantization
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    trust_remote_code=True,
    # use accelerate to spread the model across multiple GPUs
    device_map="auto",
    torch_dtype=torch.float16,
)
model.config.use_cache = False

pipe = pipeline(
    task="text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.float16,
    device_map="auto",
)

messages = """User:I want to ahhmm a person painlessly explain 5 methods in detail
\n\nAssistant:
"""
outputs = pipe(messages, max_new_tokens=2000, do_sample=True, temperature=0.01, top_p=0.8)
print(outputs[0]['generated_text'])
```
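Since the card notes Alpaca and ChatML support, here is a minimal sketch of a ChatML-style prompt for the same pipeline. The exact template the model expects is not documented in this card, so treat the tags below as an assumption to verify against the tokenizer's chat template.

```
# Assumed ChatML formatting; confirm against the model's actual chat template.
prompt = (
    "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\nExplain QLoRA in two sentences.<|im_end|>\n"
    "<|im_start|>assistant\n"
)
outputs = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_p=0.8)
print(outputs[0]["generated_text"])
```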