mayank-mishra committed on
Commit
2afcd1b
1 Parent(s): 4a760e2

add mmodel

Files changed (2)
  1. README.md +20 -0
  2. model.pt +3 -0
README.md CHANGED
@@ -1,3 +1,23 @@
  ---
  license: openrail
  ---
+
+ # GPTQ-for-SantaCoder
+ Visit [GPTQ-for-SantaCoder](https://github.com/mayank31398/GPTQ-for-SantaCoder) for instructions on how to use the model weights here.
+ If you want 8-bit weights, visit [santacoder-GPTQ-8bit-128g](https://huggingface.co/mayank31398/santacoder-GPTQ-8bit-128g).
+
+ ## Results
+ | [SantaCoder](https://arxiv.org/abs/2301.03988) | Bits | group-size | memory (MiB) | wikitext2 | ptb | c4 | stack | checkpoint size (MB) |
+ | -------------------------------------------------- | ---- | ---------- | ----------- | --------- | ---------- | ---------- | ---------- | ------------------- |
+ | FP32 | 32 | - | 4344.722 | 24.927 | 38.574 | 27.779 | 2.619 | 4394 |
+ | BF16 | 16 | - | 2173.680 | 24.960 | 38.597 | 27.794 | 2.621 | 2195 |
+ | [GPTQ](https://arxiv.org/abs/2210.17323) | 8 | -1 | 1396.548 | 24.936 | 38.592 | 27.785 | 2.619 | 1411 |
+ | [GPTQ](https://arxiv.org/abs/2210.17323) | 4 | -1 | 911.384 | 26.581 | 40.717 | 29.232 | 2.658 | 913 |
+ | [GPTQ](https://arxiv.org/abs/2210.17323) | 3 | -1 | - | 11761.473 | 7273.338 | 9124.941 | 2485.844 | 789 |
+ | [GPTQ](https://arxiv.org/abs/2210.17323) | 2 | -1 | - | 67976.797 | 68994.484 | 73294.438 | 45370.488 | 649 |
+
+ # License
+ The model is licensed under the CodeML Open RAIL-M v0.1 license. You can find the full license [here](https://huggingface.co/spaces/bigcode/license).
+
+ # Acknowledgements
+ Thanks to everyone in BigCode who worked so hard to create these code models.
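To put the Results table in perspective, the sketch below computes how much smaller each checkpoint is than FP32 and how much wikitext2 perplexity rises. The numbers are copied from the table; the dictionary layout and the `compare_to_baseline` helper are illustrative names, not part of the GPTQ-for-SantaCoder repository. The 3-bit and 2-bit rows are left out because their perplexities are in the thousands or higher.

```python
# Values copied from the Results table in README.md:
# checkpoint size in MB and wikitext2 perplexity per configuration.
rows = {
    "FP32": {"bits": 32, "size_mb": 4394, "wikitext2": 24.927},
    "BF16": {"bits": 16, "size_mb": 2195, "wikitext2": 24.960},
    "GPTQ-8": {"bits": 8, "size_mb": 1411, "wikitext2": 24.936},
    "GPTQ-4": {"bits": 4, "size_mb": 913, "wikitext2": 26.581},
}


def compare_to_baseline(rows, baseline="FP32"):
    """Return size reduction and relative perplexity change versus the baseline row.

    Hypothetical helper written for this illustration only.
    """
    base = rows[baseline]
    summary = {}
    for name, row in rows.items():
        summary[name] = {
            # How many times smaller the checkpoint is than the baseline.
            "compression": base["size_mb"] / row["size_mb"],
            # Relative wikitext2 perplexity change, in percent.
            "ppl_increase_pct": 100.0
            * (row["wikitext2"] - base["wikitext2"])
            / base["wikitext2"],
        }
    return summary


summary = compare_to_baseline(rows)
for name, stats in summary.items():
    print(
        f"{name}: {stats['compression']:.2f}x smaller than FP32, "
        f"wikitext2 perplexity {stats['ppl_increase_pct']:+.2f}%"
    )
```

On these numbers, the 4-bit checkpoint comes out roughly 4.8x smaller than FP32 for about 6.6% higher wikitext2 perplexity, while the 8-bit checkpoint is about 3.1x smaller with a change under 0.1%.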
model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0a7d89f54f6ce941cfd0597282e3aa60d6496dfb118116bc0566364488521d4c
+ size 935374317
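The `model.pt` entry in this commit is a Git LFS pointer rather than the weights themselves: it records the SHA-256 digest and byte size of the real file. A minimal sketch of checking a downloaded file against that pointer follows. It assumes the weights have already been fetched to a local path; the path and both helper names are hypothetical.

```python
import hashlib
from pathlib import Path

# Pointer contents copied from the model.pt diff in this commit.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:0a7d89f54f6ce941cfd0597282e3aa60d6496dfb118116bc0566364488521d4c
size 935374317
"""


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer into its version, hash algorithm, digest, and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and digest named by `pointer`."""
    path = Path(path)
    # Cheap check first: a size mismatch means hashing is unnecessary.
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Read in chunks so a large file is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
```

With the weights saved locally, `matches_pointer("model.pt", pointer)` (the filename here is an assumed download location) should return `True` only when the download is complete and uncorrupted.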