Triangle104 committed on
Commit 4bf3622 · verified · 1 Parent(s): fb29b5e

Update README.md

Files changed (1):
  1. README.md +87 -0
README.md CHANGED
@@ -26,6 +26,93 @@ model-index:
 This model was converted to GGUF format from [`EVA-UNIT-01/EVA-Qwen2.5-32B-v0.2`](https://huggingface.co/EVA-UNIT-01/EVA-Qwen2.5-32B-v0.2) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
 Refer to the [original model card](https://huggingface.co/EVA-UNIT-01/EVA-Qwen2.5-32B-v0.2) for more details on the model.
 
+ ---
+ ## Model details
+ 
+ An RP/storywriting specialist model: a full-parameter finetune of Qwen2.5-32B on a mixture of synthetic and natural data.
+ 
+ It uses the Celeste 70B 0.1 data mixture, greatly expanding it to improve the versatility, creativity and "flavor" of the resulting model.
+ 
+ Dedicated to Nev.
+ 
+ Version notes for 0.2: the whole dataset was reprocessed due to a severe mistake in the previously used pipeline, which left the data poisoned with a lot of non-Unicode characters. Now there are no more weird generation artifacts, and stability is improved. Major kudos to Cahvay for his work on fixing this critical issue.
+ 
+ Prompt format is ChatML.
+ 
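For readers unfamiliar with ChatML: it wraps every turn in `<|im_start|>role ... <|im_end|>` markers. Below is a minimal sketch in plain Python, assuming only the standard ChatML template with no model-specific additions:

```python
# Minimal ChatML prompt builder: each turn becomes
# <|im_start|>{role}\n{content}<|im_end|>
def to_chatml(messages, add_generation_prompt=True):
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages]
    if add_generation_prompt:
        # Leave the assistant turn open so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = to_chatml([
    {"role": "system", "content": "You are a creative roleplay and storywriting partner."},
    {"role": "user", "content": "Describe a rainy night in a small coastal town."},
])
print(prompt)
```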
+ Recommended sampler values:
+ 
+ - Temperature: 1
+ - Min-P: 0.05
+ - Top-A: 0.2
+ - Repetition Penalty: 1.03
+ 
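One way to apply the values above outside SillyTavern is the third-party llama-cpp-python bindings; the sketch below is an assumption, not part of this card. Temperature, Min-P and Repetition Penalty map to `temperature`, `min_p` and `repeat_penalty`; Top-A is not exposed by llama.cpp's standard samplers and is omitted. The model filename is a placeholder.

```python
# Hedged usage sketch (assumes a recent llama-cpp-python; the filename is a placeholder).
from llama_cpp import Llama

llm = Llama(
    model_path="./eva-qwen2.5-32b-v0.2-q4_k_m.gguf",  # placeholder quant file
    n_ctx=8192,        # context window; adjust to your hardware
    n_gpu_layers=-1,   # offload all layers to GPU if available
)

prompt = (
    "<|im_start|>system\nYou are a creative storywriting partner.<|im_end|>\n"
    "<|im_start|>user\nWrite the opening line of a ghost story.<|im_end|>\n"
    "<|im_start|>assistant\n"
)

out = llm(
    prompt,
    max_tokens=256,
    temperature=1.0,      # Temperature: 1
    min_p=0.05,           # Min-P: 0.05
    repeat_penalty=1.03,  # Repetition Penalty: 1.03
    # Top-A (0.2) has no direct llama.cpp equivalent and is left out here.
    stop=["<|im_end|>"],
)
print(out["choices"][0]["text"])
```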
+ Recommended SillyTavern presets (via CalamitousFelicitousness):
+ 
+ - Context
+ - Instruct and System Prompt
+ 
+ Training data:
+ 
+ - Celeste 70B 0.1 data mixture minus the Opus Instruct subset. See that model's card for details.
+ - Kalomaze's Opus_Instruct_25k dataset, filtered for refusals.
+ - A subset (1k rows) of ChatGPT-4o-WritingPrompts by Gryphe
+ - A subset (2k rows) of Sonnet3.5-Charcards-Roleplay by Gryphe
+ - Synthstruct and SynthRP datasets by Epiculous
+ - A subset from Dolphin-2.9.3, including a filtered version of not_samantha and a small subset of systemchat.
+ 
+ Training time and hardware:
+ 
+ 7 hours on 8xH100 SXM, provided by FeatherlessAI
+ 
+ The model was created by Kearm, Auri and Cahvay.
+ 
+ Special thanks:
+ 
+ - to Cahvay for his work on investigating and reprocessing the corrupted dataset, removing the single biggest source of data poisoning,
+ - to FeatherlessAI for generously providing an 8xH100 SXM node for training this model,
+ - to Gryphe, Lemmy, Kalomaze, Nopm, Epiculous and CognitiveComputations for the data,
+ - and to Allura-org for support, feedback, beta-testing and quality control of EVA models.
 ## Use with llama.cpp
 Install llama.cpp through brew (works on Mac and Linux)
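The standard brew/llama-cli instructions continue below the hunk shown here. As a hedged alternative, this is a sketch of pulling a quant straight from the Hub with `huggingface_hub`; the repo id and filename are placeholders, since the exact quant name is not visible in this diff:

```python
# Hypothetical download sketch: fetch a GGUF quant from the Hub, then point
# llama.cpp (llama-cli -m <path>) or llama-cpp-python (model_path=<path>) at it.
from huggingface_hub import hf_hub_download

local_path = hf_hub_download(
    repo_id="Triangle104/EVA-Qwen2.5-32B-v0.2-Q4_K_M-GGUF",  # placeholder repo id
    filename="eva-qwen2.5-32b-v0.2-q4_k_m.gguf",             # placeholder filename
)
print(local_path)
```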