Update README.md
Browse files
README.md
CHANGED
@@ -18,7 +18,7 @@ base_model:
|
|
18 |
|
19 |
**Aura-MoE-2x4B** is a state-of-the-art dedicated roleplaying model designed to fulfill your every desire.
|
20 |
|
21 |
-
The finetunes used in this merge saw several hundreds of millions of tokens of completion, instruction and roleplaying data. A Kahneman-Tversky Optimization was applied
|
22 |
|
23 |
Developed by **Aura Industries**, with contributions from **Anthracite Org**
|
24 |
|
@@ -68,10 +68,10 @@ experts_per_token: 1
|
|
68 |
experts:
|
69 |
- source_model: FourOhFour/Crispy_Crab_4B
|
70 |
positive_prompts:
|
71 |
-
- "
|
72 |
- source_model: FourOhFour/Zenith_4B
|
73 |
positive_prompts:
|
74 |
-
- "
|
75 |
```
|
76 |
|
77 |
KTO
|
@@ -104,15 +104,6 @@ shuffle_merged_datasets: true
|
|
104 |
val_set_size: 0.0
|
105 |
output_dir: ./outputs/out
|
106 |
|
107 |
-
adapter: lora
|
108 |
-
lora_model_dir:
|
109 |
-
|
110 |
-
lora_r: 32
|
111 |
-
lora_alpha: 64
|
112 |
-
lora_dropout: 0.05
|
113 |
-
lora_target_linear: true
|
114 |
-
lora_fan_in_fan_out:
|
115 |
-
|
116 |
sequence_len: 8192
|
117 |
sample_packing: false
|
118 |
eval_sample_packing: false
|
@@ -131,7 +122,7 @@ max_steps: 500
|
|
131 |
|
132 |
optimizer: adamw_8bit
|
133 |
lr_scheduler: cosine
|
134 |
-
learning_rate: 0.
|
135 |
weight_decay: 0.05
|
136 |
|
137 |
train_on_inputs: false
|
|
|
18 |
|
19 |
**Aura-MoE-2x4B** is a state-of-the-art dedicated roleplaying model designed to fulfill your every desire.
|
20 |
|
21 |
+
The finetunes used in this merge saw several hundreds of millions of tokens of completion, instruction and roleplaying data. A Kahneman-Tversky Optimization was applied to both heal and give this model a unique output style.
|
22 |
|
23 |
Developed by **Aura Industries**, with contributions from **Anthracite Org**
|
24 |
|
|
|
68 |
experts:
|
69 |
- source_model: FourOhFour/Crispy_Crab_4B
|
70 |
positive_prompts:
|
71 |
+
- "Roleplaying partner"
|
72 |
- source_model: FourOhFour/Zenith_4B
|
73 |
positive_prompts:
|
74 |
+
- "Instruction following assistant"
|
75 |
```
|
76 |
|
77 |
KTO
|
|
|
104 |
val_set_size: 0.0
|
105 |
output_dir: ./outputs/out
|
106 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
107 |
sequence_len: 8192
|
108 |
sample_packing: false
|
109 |
eval_sample_packing: false
|
|
|
122 |
|
123 |
optimizer: adamw_8bit
|
124 |
lr_scheduler: cosine
|
125 |
+
learning_rate: 0.00001
|
126 |
weight_decay: 0.05
|
127 |
|
128 |
train_on_inputs: false
|