Panchovix committed
Commit: a125036
Parent: d7e5827

Update README.md

Files changed (1): README.md (+29, -0)
README.md CHANGED

---
license: llama2
tags:
- not-for-all-audiences
---

5 bpw (bits per weight) quantization of [Venus-103b-v1.1](https://huggingface.co/nsfwthrowitaway69/Venus-103b-v1.1), for use with exllamav2.

The calibration dataset was the cleaned PIPPA dataset (https://huggingface.co/datasets/royallab/PIPPA-cleaned), the same one used on the original model card.

You can use the measurement.json from this repo to produce your own quant sizes (exllamav2's convert.py accepts a saved measurement via its `-m` flag, which skips the measurement pass).
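
For reference, here is a minimal sketch of loading an exl2 quant with exllamav2's Python API, following the pattern of exllamav2's example scripts; the model path and sampler settings are placeholders, not values from this repo:

```python
# Minimal exllamav2 inference sketch (patterned on exllamav2's examples).
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "/models/Venus-103b-v1.1-5bpw-exl2"  # placeholder path
config.prepare()

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)  # allocated while layers load
model.load_autosplit(cache)               # split weights across available GPUs

tokenizer = ExLlamaV2Tokenizer(config)
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)

settings = ExLlamaV2Sampler.Settings()  # arbitrary example sampler values
settings.temperature = 0.8
settings.top_p = 0.9

print(generator.generate_simple("Once upon a time,", settings, num_tokens=128))
```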

# Original model card

# Venus 103b - version 1.1

![image/png](https://cdn-uploads.huggingface.co/production/uploads/655febd724e0d359c1f21096/BSKlxWQSbh-liU8kGz4fF.png)

## Model Details

- A result of interleaving layers of [Sao10K/Euryale-1.3-L2-70B](https://huggingface.co/Sao10K/Euryale-1.3-L2-70B), [migtissera/SynthIA-70B-v1.5](https://huggingface.co/migtissera/SynthIA-70B-v1.5), and [Xwin-LM/Xwin-LM-70B-V0.1](https://huggingface.co/Xwin-LM/Xwin-LM-70B-V0.1) using [mergekit](https://github.com/cg123/mergekit) (see the illustrative sketch after this list).
- The resulting model has 120 layers and 103 billion parameters.
- See mergekit-config.yml for details on the merge method used.
- See the `exl2-*` branches for exllama2 quantizations. The 5.65 bpw quant should fit in 80 GB of VRAM, and the 3.35 bpw quant should fit in 48 GB.
- Inspired by [Goliath-120b](https://huggingface.co/alpindale/goliath-120b).
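
As a purely illustrative aside on what "interleaving layers" means: a mergekit passthrough merge stacks partial layer ranges from the source models into one deeper network. The slice ranges below are invented placeholders (the actual recipe is in mergekit-config.yml); they only show how three 40-layer slices of 80-layer Llama-2-70B models add up to 120 layers:

```python
# Illustrative passthrough ("frankenmerge") layer plan. The ranges are
# made up for this example; the real boundaries are in mergekit-config.yml.
slices = [
    ("Sao10K/Euryale-1.3-L2-70B",   (0, 40)),   # hypothetical slice
    ("migtissera/SynthIA-70B-v1.5", (20, 60)),  # hypothetical slice
    ("Xwin-LM/Xwin-LM-70B-V0.1",    (40, 80)),  # hypothetical slice
]

# Concatenate the slices in order to get the merged model's layer stack.
layer_plan = [
    (model_name, layer_idx)
    for model_name, (start, end) in slices
    for layer_idx in range(start, end)
]

print(len(layer_plan))  # 120 layers, as in Venus-103b
```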

**Warning: This model will produce NSFW content!**

## Results

1. Seems to be more "talkative" than Venus-103b-v1.0 (i.e. characters speak more often in roleplays).
2. Sometimes struggles to pay attention to small details in scenes.
3. Prose seems pretty creative and more logical than Venus-120b-v1.0.