lvkaokao committed
Commit
4121a2d
1 Parent(s): 0b39ee8
Files changed (1)
  1. src/display/about.py +19 -9
src/display/about.py CHANGED
@@ -40,22 +40,32 @@ We chose these benchmarks as they test a variety of reasoning and general knowle
 
 ## REPRODUCIBILITY
 To reproduce our results, here are the commands you can run, using [v0.4.2](https://github.com/EleutherAI/lm-evaluation-harness/tree/v0.4.2) of the Eleuther AI Harness:
-`python main.py --model=hf-causal-experimental --model_args="pretrained=<your_model>,use_accelerate=True,revision=<your_model_revision>"`
-` --tasks=<task_list> --num_fewshot=<n_few_shot> --batch_size=1 --output_path=<output_path>`
 
 ```
 python main.py --model=hf-causal-experimental \
 --model_args="pretrained=<your_model>,use_accelerate=True,revision=<your_model_revision>" \
 --tasks=<task_list> \
 --num_fewshot=<n_few_shot> \
 --batch_size=1 \
 --output_path=<output_path>
 ```
 
-**Note:** You can expect results to vary slightly for different batch sizes because of padding.
+**Note:**
+- We run `llama.cpp` series models on Xeon CPUs and all other models on NVIDIA GPUs.
+- If model parameters > 7B, we use `--batch_size 4`; if model parameters < 7B, we use `--batch_size 2`; for `llama.cpp` models we set `--batch_size 1`. You can expect results to vary slightly for different batch sizes because of padding.
 
-The tasks and few shots parameters are:
+### The tasks and few-shot parameters are:
 - ARC-C: 0-shot, *arc_challenge* (`acc`)
 - ARC-E: 0-shot, *arc_easy* (`acc`)
 - HellaSwag: 0-shot, *hellaswag* (`acc`)
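
The placeholder-filled command above can be assembled programmatically. The sketch below is a hypothetical helper (`build_harness_command` is not part of the harness itself); it only substitutes the documented placeholders into the flags shown in the section, using ARC-Challenge (0-shot) from the task list as the example:

```python
def build_harness_command(
    pretrained: str,
    revision: str,
    tasks: str,
    num_fewshot: int,
    batch_size: int,
    output_path: str,
) -> str:
    """Fill the documented placeholders of the lm-evaluation-harness
    invocation shown above. Hypothetical helper for illustration only."""
    model_args = f"pretrained={pretrained},use_accelerate=True,revision={revision}"
    return (
        "python main.py --model=hf-causal-experimental "
        f'--model_args="{model_args}" '
        f"--tasks={tasks} "
        f"--num_fewshot={num_fewshot} "
        f"--batch_size={batch_size} "
        f"--output_path={output_path}"
    )

# ARC-C is listed as 0-shot on *arc_challenge*; the model placeholder is
# left as-is since the section does not name a concrete model.
cmd = build_harness_command(
    pretrained="<your_model>",
    revision="main",
    tasks="arc_challenge",
    num_fewshot=0,
    batch_size=1,
    output_path="./results",
)
print(cmd)
```

Passing the returned string to a shell (or `subprocess.run(cmd, shell=True)`) reproduces the documented invocation.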
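
The batch-size policy in the note can be stated as a small decision rule. This is a sketch, assuming the function name `pick_batch_size` and a parameter count given in billions (neither appears in the original); it encodes only the three cases the note describes:

```python
def pick_batch_size(runtime: str, param_count_billions: float) -> int:
    """Batch-size policy from the note above (hypothetical helper).

    llama.cpp models always run with batch size 1; other models use
    batch size 4 above 7B parameters and 2 below 7B.
    """
    if runtime == "llama.cpp":
        return 1
    return 4 if param_count_billions > 7 else 2
```

As the note warns, padding makes results vary slightly across batch sizes, so the chosen value matters when comparing numbers against the leaderboard.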