Add link to paper

#1
by osanseviero HF staff - opened
Files changed (1) hide show
  1. README.md +40 -38
README.md CHANGED
@@ -1,38 +1,40 @@
1
- ---
2
- license: apache-2.0
3
- ---
4
-
5
- We introduced a new model designed for the Code generation task. Its test accuracy on the HumanEval base dataset surpasses that of GPT-4 Turbo (April 2024). (90.9% vs 90.2%).
6
-
7
- Additionally, compared to previous open-source models, AutoCoder offers a new feature: it can **automatically install the required packages** and attempt to run the code until it deems there are no issues, **whenever the user wishes to execute the code**.
8
-
9
- See details on the [AutoCoder GitHub](https://github.com/bin123apple/AutoCoder).
10
-
11
- Simple test script:
12
-
13
- ```
14
- model_path = ""
15
- tokenizer = AutoTokenizer.from_pretrained(model_path)
16
- model = AutoModelForCausalLM.from_pretrained(model_path,
17
- device_map="auto")
18
-
19
- HumanEval = load_dataset("evalplus/humanevalplus")
20
-
21
- Input = "" # input your question here
22
-
23
- messages=[
24
- { 'role': 'user', 'content': Input}
25
- ]
26
- inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True,
27
- return_tensors="pt").to(model.device)
28
-
29
- outputs = model.generate(inputs,
30
- max_new_tokens=1024,
31
- do_sample=False,
32
- temperature=0.0,
33
- top_p=1.0,
34
- num_return_sequences=1,
35
- eos_token_id=tokenizer.eos_token_id)
36
-
37
- answer = tokenizer.decode(outputs[0][len(inputs[0]):], skip_special_tokens=True)
38
- ```
 
 
 
1
+ ---
2
+ license: apache-2.0
3
+ ---
4
+
5
+ We introduced a new model designed for the Code generation task. Its test accuracy on the HumanEval base dataset surpasses that of GPT-4 Turbo (April 2024). (90.9% vs 90.2%).
6
+
7
+ Additionally, compared to previous open-source models, AutoCoder offers a new feature: it can **automatically install the required packages** and attempt to run the code until it deems there are no issues, **whenever the user wishes to execute the code**.
8
+
9
+ See details on the [AutoCoder GitHub](https://github.com/bin123apple/AutoCoder).
10
+
11
+ Simple test script:
12
+
13
+ ```
14
+ model_path = ""
15
+ tokenizer = AutoTokenizer.from_pretrained(model_path)
16
+ model = AutoModelForCausalLM.from_pretrained(model_path,
17
+ device_map="auto")
18
+
19
+ HumanEval = load_dataset("evalplus/humanevalplus")
20
+
21
+ Input = "" # input your question here
22
+
23
+ messages=[
24
+ { 'role': 'user', 'content': Input}
25
+ ]
26
+ inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True,
27
+ return_tensors="pt").to(model.device)
28
+
29
+ outputs = model.generate(inputs,
30
+ max_new_tokens=1024,
31
+ do_sample=False,
32
+ temperature=0.0,
33
+ top_p=1.0,
34
+ num_return_sequences=1,
35
+ eos_token_id=tokenizer.eos_token_id)
36
+
37
+ answer = tokenizer.decode(outputs[0][len(inputs[0]):], skip_special_tokens=True)
38
+ ```
39
+
40
+ Paper: https://arxiv.org/abs/2405.14906