jeiku committed on
Commit b89bd64 · verified · 1 Parent(s): 43a1c07

Update README.md

Files changed (1): README.md (+4 −13)
README.md CHANGED

````diff
@@ -18,7 +18,7 @@ base_model:
 
 **Aura-MoE-2x4B** is a state of the art dedicated roleplaying model designed to fulfill your every desire.
 
-The finetunes used in this merge saw several hundreds of millions of tokens of completion, instruction and roleplaying data. A Kahneman-Tversky Optimization was applied as a Low Rank Adapter to both heal and give this model a unique output style.
+The finetunes used in this merge saw several hundreds of millions of tokens of completion, instruction and roleplaying data. A Kahneman-Tversky Optimization was applied to both heal and give this model a unique output style.
 
 Developed by **Aura Industries**, with contributions from **Anthracite Org**
 
@@ -68,10 +68,10 @@ experts_per_token: 1
 experts:
 - source_model: FourOhFour/Crispy_Crab_4B
   positive_prompts:
-  - "You are a roleplaying powerhouse, reply as {{char}} in this ongoing conversation with {{user}}."
+  - "Roleplaying partner"
 - source_model: FourOhFour/Zenith_4B
   positive_prompts:
-  - "You are a helpful assistant designed to perform tasks for the user."
+  - "Instruction following assistant"
 ```
 
 KTO
@@ -104,15 +104,6 @@ shuffle_merged_datasets: true
 val_set_size: 0.0
 output_dir: ./outputs/out
 
-adapter: lora
-lora_model_dir:
-
-lora_r: 32
-lora_alpha: 64
-lora_dropout: 0.05
-lora_target_linear: true
-lora_fan_in_fan_out:
-
 sequence_len: 8192
 sample_packing: false
 eval_sample_packing: false
@@ -131,7 +122,7 @@ max_steps: 500
 
 optimizer: adamw_8bit
 lr_scheduler: cosine
-learning_rate: 0.0001
+learning_rate: 0.00001
 weight_decay: 0.05
 
 train_on_inputs: false
````
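The merge config above sets `experts_per_token: 1`, i.e. top-1 gating: each token is scored against every expert's gate vector and only the single best expert runs. The sketch below illustrates that routing step in plain Python; the 2-element gate vectors and hidden state are made-up toy values, not anything derived from this model.

```python
import math

def top1_route(hidden_state, gate_vectors):
    """Pick the single expert whose gate vector best matches the token's
    hidden state (illustrates experts_per_token: 1 routing)."""
    # Raw gate scores: dot product of the hidden state with each gate vector.
    scores = [sum(h * g for h, g in zip(hidden_state, gate))
              for gate in gate_vectors]
    # Softmax turns scores into routing probabilities.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Top-1: only the highest-probability expert processes this token.
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs[best]

# Toy 2-expert gates, one per expert in the config above (hypothetical values).
gates = [[1.0, 0.0], [0.0, 1.0]]
expert, weight = top1_route([0.9, 0.1], gates)
```

With two experts and top-1 routing, the model pays the compute cost of a single 4B expert per token while keeping both specializations available.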
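The KTO section above trains with Kahneman-Tversky Optimization. As a rough sketch of the per-example objective from the KTO paper (Ethayarajh et al., 2024): desirable completions are pushed to beat a KL-based reference point, undesirable ones to fall below it. The `beta` and `lambda` values here are illustrative defaults, not the hyperparameters used for this model.

```python
import math

def kto_loss(log_ratio, ref_point, desirable,
             beta=0.1, lambda_d=1.0, lambda_u=1.0):
    """Per-example KTO loss (sketch).

    log_ratio:  log pi_theta(y|x) - log pi_ref(y|x) for the completion.
    ref_point:  the KL reference point z0, estimated over a batch.
    desirable:  True for good completions, False for undesirable ones.
    """
    sigmoid = lambda x: 1.0 / (1.0 + math.exp(-x))
    if desirable:
        # Loss shrinks as the policy assigns the completion more
        # probability than the reference point predicts.
        return lambda_d * (1.0 - sigmoid(beta * (log_ratio - ref_point)))
    # Loss shrinks as the policy moves probability away from bad completions.
    return lambda_u * (1.0 - sigmoid(beta * (ref_point - log_ratio)))
```

The sigmoid makes the loss saturate on both sides, which is what lets KTO "heal" a merge without a paired-preference dataset: it only needs a desirable/undesirable label per example.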
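The updated run pairs `learning_rate: 0.00001` with `lr_scheduler: cosine` and `max_steps: 500`. Ignoring any warmup, a cosine schedule over those settings looks like the hypothetical helper below (the function name and `min_lr` floor are mine, not from the training config):

```python
import math

def cosine_lr(step, peak_lr=1e-5, max_steps=500, min_lr=0.0):
    """Cosine decay from peak_lr at step 0 to min_lr at max_steps."""
    progress = min(step / max_steps, 1.0)
    # cos goes 1 -> -1 over [0, pi], so this sweeps peak_lr -> min_lr.
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```

Dropping the peak from 1e-4 to 1e-5 keeps the whole schedule an order of magnitude gentler, which fits a short 500-step preference-tuning pass on an already-merged model.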