RichardErkhov/nuprl_-_MultiPL-T-StarCoderBase_1b-awq

4-bit precision


RichardErkhov committed on Nov 19, 2024

Commit 9089ba7 · verified · 1 Parent(s): 1ab8dee

uploaded readme

Files changed (1)

README.md +91 -0

README.md ADDED

	@@ -0,0 +1,91 @@

+Quantization made by Richard Erkhov.
+[Github](https://github.com/RichardErkhov)
+[Discord](https://discord.gg/pvy7H8DZMG)
+[Request more models](https://github.com/RichardErkhov/quant_request)
+MultiPL-T-StarCoderBase_1b - AWQ
+- Model creator: https://huggingface.co/nuprl/
+- Original model: https://huggingface.co/nuprl/MultiPL-T-StarCoderBase_1b/
+Original model description:
+---
+license: bigscience-openrail-m
+library_name: transformers
+tags:
+- code
+- gpt_bigcode
+datasets:
+- nuprl/MultiPL-T
+metrics:
+- code_eval
+model-index:
+- name: MultiPLCoder-1b-OCaml
+  results:
+  - task:
+      type: text-generation
+    dataset:
+      name: MultiPL-HumanEval (Lua)
+      type: nuprl/MultiPL-E
+    metrics:
+    - type: pass@1
+      value: 0.173
+      name: pass@1
+      verified: true
+    - type: pass@1
+      value: 0.113
+      name: pass@1
+      verified: true
+    - type: pass@1
+      value: 0.097
+      name: pass@1
+      verified: true
+---
+# MultiPLCoder-1b
+A 1-billion-parameter version of MultiPLCoder, a set of StarCoder-based models fine-tuned on the [MultiPL-T dataset](https://huggingface.co/datasets/nuprl/MultiPL-T).
+These models are state-of-the-art on low-resource languages such as Lua, Racket, and OCaml.
+## Language Revision Index
+This is the revision index of the best-performing model for each language.
+| Language      | Revision ID | Epoch |
+| ------------- | ----------- | ----- |
+| Lua           | `7e96d931547e342ad0661cdd91236fe4ccf52545`         | 3    |
+| Racket        | `2cdc541bee1db4da80c0b43384b0d6a0cacca5b2`         | 5    |
+| OCaml         | `e8a24f9e2149cbda8c3cca264a53c2b361b7a031`         | 6    |
+## Usage
+To use one of the models in this repository, first select that model's commit revision from the table above.
+For example, to use the Lua model:
+```py
+from transformers import AutoTokenizer, AutoModelForCausalLM
+tokenizer = AutoTokenizer.from_pretrained("nuprl/MultiPLCoder-1b")
+# Pin the checkpoint to the Lua revision listed in the table above.
+lua_revision = "7e96d931547e342ad0661cdd91236fe4ccf52545"
+model = AutoModelForCausalLM.from_pretrained("nuprl/MultiPLCoder-1b", revision=lua_revision)
+```
+Note that the model's default configuration does not enable caching, so you must pass `use_cache=True` when generating.
+```py
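+# Tokenize the prompt: a Lua comment for the model to continue.
+toks = tokenizer.encode("-- Hello World", return_tensors="pt")
+# Enable the cache explicitly; sample with a low temperature and nucleus sampling.
+out = model.generate(toks, use_cache=True, do_sample=True, temperature=0.2, top_p=0.95, max_length=50)
+print(tokenizer.decode(out[0], skip_special_tokens=True))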
+```
+```
+-- Hello World!
+-- :param name: The name of the person to say hello to
+-- :return: A greeting
+local function say_hello(name)
+  return "Hello ".. name
+end
+```
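
The revision selection described in the Usage section can be wrapped in a small lookup helper. The following is a minimal sketch, not part of the original README: `REVISIONS`, `REPO_ID`, `revision_for`, and `load_for_language` are hypothetical names introduced for illustration, the revision IDs are copied from the Language Revision Index table, and the repository name is the one the usage snippet passes to `from_pretrained`.

```python
# Revision IDs copied from the Language Revision Index table.
REVISIONS = {
    "lua": "7e96d931547e342ad0661cdd91236fe4ccf52545",
    "racket": "2cdc541bee1db4da80c0b43384b0d6a0cacca5b2",
    "ocaml": "e8a24f9e2149cbda8c3cca264a53c2b361b7a031",
}

# Repository name as used in the README's usage snippet.
REPO_ID = "nuprl/MultiPLCoder-1b"


def revision_for(language: str) -> str:
    """Return the commit revision listed for `language` (case-insensitive)."""
    key = language.strip().lower()
    if key not in REVISIONS:
        known = ", ".join(sorted(REVISIONS))
        raise ValueError(f"no revision listed for {language!r}; known: {known}")
    return REVISIONS[key]


def load_for_language(language: str):
    """Hypothetical helper: load the tokenizer and the model pinned to a language's revision."""
    # Imported lazily so the lookup above works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(REPO_ID)
    model = AutoModelForCausalLM.from_pretrained(
        REPO_ID, revision=revision_for(language)
    )
    return tokenizer, model
```

Calling `load_for_language("Racket")` would then stand in for the two `from_pretrained` calls in the usage snippet, and an unlisted language raises a `ValueError` instead of silently loading the default branch.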