pminervini committed on
Commit
9cbf014
1 Parent(s): a008a91
Files changed (1)
  1. src/display/about.py +3 -5
src/display/about.py CHANGED
@@ -54,11 +54,9 @@ As large language models (LLMs) get better at creating believable texts, address
  # Reproducibility
  To reproduce our results, here is the commands you can run, using [this script](https://huggingface.co/spaces/hallucinations-leaderboard/leaderboard/blob/main/backend-cli.py): python backend-cli.py.
 
- Alternatively, if you're interested in evaluating a specific task with a particular model, you can use [this script](https://github.com/EleutherAI/lm-evaluation-harness/tree/b281b0921b636bc36ad05c0b0b0763bd6dd43463) of the Eleuther AI Harness:
- `python main.py --model=hf-causal-experimental --model_args="pretrained=<your_model>,revision=<your_model_revision>"`
- ` --tasks=<task_list> --num_fewshot=<n_few_shot> --batch_size=1 --output_path=<output_path>` (Note that you may need to add tasks from [here](https://huggingface.co/spaces/hallucinations-leaderboard/leaderboard/tree/main/src/backend/tasks) to [this folder](https://github.com/EleutherAI/lm-evaluation-harness/tree/b281b0921b636bc36ad05c0b0b0763bd6dd43463/lm_eval/tasks))
-
- The total batch size we get for models which fit on one A100 node is 8 (8 GPUs * 1). If you don't use parallelism, adapt your batch size to fit. You can expect results to vary slightly for different batch sizes because of padding.
+ Alternatively, if you're interested in evaluating a specific task with a particular model, you can use the [EleutherAI Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness/):
+ `python main.py --model=hf-causal-experimental --model_args="pretrained=<your_model>,parallelize=True,revision=<your_model_revision>"`
+ ` --tasks=<task_list> --num_fewshot=<n_few_shot> --batch_size=auto --output_path=<output_path>` (Note that you may need to add tasks from [here](https://huggingface.co/spaces/hallucinations-leaderboard/leaderboard/tree/main/src/backend/tasks) to [this folder](https://github.com/EleutherAI/lm-evaluation-harness/tree/b281b0921b636bc36ad05c0b0b0763bd6dd43463/lm_eval/tasks))
 
  The tasks and few shots parameters are:
 
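For readers who want to script the updated harness invocation rather than type it by hand, the following is a minimal sketch of how the command from the added lines might be assembled. The helper name `build_harness_command` and all example values are hypothetical, and the sketch assumes `<task_list>` is a comma-separated string; only the flags themselves come from the diff.

```python
import shlex


def build_harness_command(model, revision, tasks, num_fewshot, output_path):
    """Assemble the argument list for the harness's main.py.

    Hypothetical helper: it only mirrors the flags shown in the diff
    and does not run or validate anything.
    """
    # parallelize=True is the model argument introduced by this commit.
    model_args = f"pretrained={model},parallelize=True,revision={revision}"
    return [
        "python",
        "main.py",
        "--model=hf-causal-experimental",
        f"--model_args={model_args}",
        # Assumption: the task list is passed as a comma-separated string.
        f"--tasks={','.join(tasks)}",
        f"--num_fewshot={num_fewshot}",
        # batch_size=auto replaces the fixed batch_size=1 of the old text.
        "--batch_size=auto",
        f"--output_path={output_path}",
    ]


if __name__ == "__main__":
    # Placeholder values, not taken from the leaderboard configuration.
    cmd = build_harness_command(
        model="my-org/my-model",
        revision="main",
        tasks=["task_a", "task_b"],
        num_fewshot=0,
        output_path="results.json",
    )
    # Print a shell-quoted command line that can be pasted into a terminal.
    print(shlex.join(cmd))
```

Building the command as a list keeps each flag as a single argument, so the same list could be handed to `subprocess.run` from the harness checkout without extra quoting.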