Quantization made by Richard Erkhov.

[Github](https://github.com/RichardErkhov)

[Discord](https://discord.gg/pvy7H8DZMG)

[Request more models](https://github.com/RichardErkhov/quant_request)


MultiPL-T-StarCoderBase_1b - AWQ
- Model creator: https://huggingface.co/nuprl/
- Original model: https://huggingface.co/nuprl/MultiPL-T-StarCoderBase_1b/


Original model description:
---
license: bigscience-openrail-m
library_name: transformers
tags:
- code
- gpt_bigcode
datasets:
- nuprl/MultiPL-T
metrics:
- code_eval
model-index:
- name: MultiPLCoder-1b-OCaml
  results:
  - task:
      type: text-generation
    dataset:
      name: MultiPL-HumanEval (Lua)
      type: nuprl/MultiPL-E
    metrics:
    - type: pass@1
      value: 0.173
      name: pass@1
      verified: true
    - type: pass@1
      value: 0.113
      name: pass@1
      verified: true
    - type: pass@1
      value: 0.097
      name: pass@1
      verified: true
---
# MultiPLCoder-1b

The 1-billion-parameter version of MultiPLCoder, a set of StarCoder-based models fine-tuned on the [MultiPL-T dataset](https://huggingface.co/datasets/nuprl/MultiPL-T).
These models are state-of-the-art on low-resource languages such as Lua, Racket, and OCaml.


## Language Revision Index

This is the revision index for the best-performing model for each respective language.

| Language | Revision ID | Epoch |
| -------- | ----------- | ----- |
| Lua      | `7e96d931547e342ad0661cdd91236fe4ccf52545` | 3 |
| Racket   | `2cdc541bee1db4da80c0b43384b0d6a0cacca5b2` | 5 |
| OCaml    | `e8a24f9e2149cbda8c3cca264a53c2b361b7a031` | 6 |
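The table above can also be captured in code so the right revision is picked programmatically. A minimal sketch — the dictionary and helper function are illustrative, not part of the repository:

```python
# Revision IDs copied from the table above, keyed by target language.
BEST_REVISIONS = {
    "lua": "7e96d931547e342ad0661cdd91236fe4ccf52545",
    "racket": "2cdc541bee1db4da80c0b43384b0d6a0cacca5b2",
    "ocaml": "e8a24f9e2149cbda8c3cca264a53c2b361b7a031",
}

def revision_for(language: str) -> str:
    """Return the best-performing commit revision for a language."""
    try:
        return BEST_REVISIONS[language.lower()]
    except KeyError:
        raise ValueError(
            f"No tuned revision for {language!r}; "
            f"choose one of {sorted(BEST_REVISIONS)}"
        ) from None
```

The returned string can then be passed as the `revision` argument shown in the Usage section below.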
## Usage

To use one of the models in this repository, first select the commit revision for that model from the table above.
For example, to load the Lua model:
```py
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("nuprl/MultiPLCoder-1b")
lua_revision = "7e96d931547e342ad0661cdd91236fe4ccf52545"
model = AutoModelForCausalLM.from_pretrained("nuprl/MultiPLCoder-1b", revision=lua_revision)
```
Note that the model's default configuration does not enable caching, so you must explicitly enable the KV cache at generation time:
```py
toks = tokenizer.encode("-- Hello World", return_tensors="pt")
out = model.generate(toks, use_cache=True, do_sample=True, temperature=0.2, top_p=0.95, max_length=50)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```
```
-- Hello World!
-- :param name: The name of the person to say hello to
-- :return: A greeting
local function say_hello(name)
    return "Hello ".. name
end
```
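The pass@1 scores in the model metadata above are conventionally computed with the unbiased pass@k estimator introduced alongside HumanEval: given n generated samples per problem of which c pass the tests, pass@k = 1 - C(n-c, k) / C(n, k). A minimal sketch (the function name is illustrative):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate: probability that at least one of k
    samples drawn from n generations (c of them correct) passes.

    pass@k = 1 - C(n - c, k) / C(n, k)
    """
    if n - c < k:
        # Fewer incorrect samples than k: some draw must include a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# With 20 samples per problem and 4 correct, pass@1 = 4/20 = 0.2.
print(pass_at_k(20, 4, 1))
```

Averaging this estimate over all benchmark problems yields the reported score.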