passthepizza committed on
Commit
84c1b91
1 Parent(s): b5a1c17

Update README.md

  1. README.md +51 -3
README.md CHANGED
@@ -1,3 +1,51 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ language:
+ - en
+ base_model:
+ - mistralai/Mistral-Large-Instruct-2411
+ tags:
+ - axolotl
+ datasets:
+ - NarrativAI/CakrawalaRP
+ pipeline_tag: text-generation
+ ---
+ # 🎭 Cakrawala-123B
+ > *Where Worlds Converge and Adventures Begin!*
+
+ ## 🌟 What's Special About This Model?
+ Cakrawala-123B is a fine-tuned variant of the Mistral-Large-Instruct-2411 model, optimised for generating rich roleplaying conversations and character interactions. It is trained to produce detailed, contextually appropriate character dialogue with vivid descriptions of physical actions, expressions, and emotional states, while keeping character voices and perspectives consistent throughout extended interactions.
+
+ ## 🧪 The Secret Sauce
+ ### Training Diet:
+ - Fed with the NarrativAI/CakrawalaRP dataset
+ - Conversation pairs with detailed interactions
+ - Focused on maintaining character consistency and rich descriptions
+
+ ### Tech Wizardry:
+ - Base Model: Mistral-Large-Instruct-2411
+ - Fine-tuned using QLoRA
+ - Trained over 2 epochs
+
+ ## Training Parameters
+ - Gradient Accumulation Steps: 1
+ - Micro Batch Size: 4
+ - Learning Rate: 0.000015
+ - Optimizer: AdamW
+ - Scheduler: Cosine
+ - Mixed Precision: BF16 & FP16 with TF32 support
+
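As a rough illustration of how the batch and schedule settings in the list above interact, here is a minimal Python sketch. It assumes a plain cosine decay with no warmup and a floor of zero, which the README does not state. The `num_devices` parameter and the function names are hypothetical, since the commit gives neither a device count nor a total step count.

```python
import math

# Values taken from the "Training Parameters" list in the diff.
PEAK_LR = 0.000015
MICRO_BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 1


def effective_batch_size(num_devices: int = 1) -> int:
    """Sequences per optimizer step; num_devices is an assumption, not from the README."""
    return MICRO_BATCH_SIZE * GRAD_ACCUM_STEPS * num_devices


def cosine_lr(step: int, total_steps: int, peak_lr: float = PEAK_LR,
              min_lr: float = 0.0) -> float:
    """Cosine decay from peak_lr to min_lr (no warmup; that is an assumption)."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp progress to [0, 1] so out-of-range steps stay at the endpoints.
    progress = min(max(step / total_steps, 0.0), 1.0)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    total = 1000  # hypothetical step count, for illustration only
    for s in (0, 250, 500, 750, 1000):
        print(s, cosine_lr(s, total))
```

With gradient accumulation at 1, each optimizer step sees only the micro batch on each device, so the effective batch size scales with however many devices were used.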
+ ## 🔧 Under the Hood
+ - LoRA Configuration:
+   - Rank (r): 32
+   - Alpha: 64
+   - Dropout: 0.1
+ - Sequence Length: 2048
+ - Gradient Checkpointing: Enabled
+ - Flash Attention: Enabled
+
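To put the LoRA numbers above in context, the following Python sketch computes the adapter scaling factor and the number of trainable parameters an adapter adds to one weight matrix. It assumes the conventional LoRA scaling of `alpha / r`; the README does not say whether a variant such as rank-stabilised scaling was used. The 4096 x 4096 projection is a made-up example shape, not the base model's actual dimensions.

```python
# Values taken from the "LoRA Configuration" entries in the diff.
LORA_RANK = 32
LORA_ALPHA = 64


def lora_scale(alpha: float = LORA_ALPHA, r: int = LORA_RANK) -> float:
    """Conventional LoRA scaling applied to the low-rank update (assumed: alpha / r)."""
    return alpha / r


def lora_param_count(d_in: int, d_out: int, r: int = LORA_RANK) -> int:
    """Trainable parameters for one adapted matrix: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


if __name__ == "__main__":
    d_in = d_out = 4096  # hypothetical layer shape, for illustration only
    adapter = lora_param_count(d_in, d_out)
    full = d_in * d_out
    print("scale:", lora_scale())
    print("adapter params:", adapter, "of", full, f"({adapter / full:.2%})")
```

Under these assumptions the update is scaled by 2.0, and the adapter trains only a small fraction of the parameters in each adapted matrix.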
+ ## 🎬 License & Credits
+ - Licensed under MIT
+ - Based on mistralai/Mistral-Large-Instruct-2411
+
+ *Built with ❤️ for roleplayers, by roleplayers*