Cyrile committed on
Commit bb07343
1 parent: 28765ca

Update README.md

Files changed (1):
  1. README.md +12 -11
README.md CHANGED
@@ -51,17 +51,18 @@ Evaluation of the model was conducted using the PoLL (Pool of LLM) technique, as
 (two per evaluator). The evaluators included GPT-4o, Gemini-1.5-pro, and Claude3.5-sonnet.
 
 **Performance Scores (on a scale of 5):**
-| Model                                        | Score   |
-|---------------------------------------------:|:--------|
-| gpt-4o                                       | 4.13    |
-| mistralai/Mixtral-8x7B-Instruct-v0.1         | 3.71    |
-| gpt-3.5-turbo                                | 3.66    |
-| cmarkea/bloomz-7b1-mt-sft-chat               | 1.69    |
-| cmarkea/bloomz-3b-dpo-chat                   | 1.68    |
-| cmarkea/bloomz-3b-sft-chat                   | 1.51    |
-| croissantllm/CroissantLLMChat-v0.1           | 1.19    |
-| cmarkea/bloomz-560m-sft-chat                 | 1.04    |
-| OpenLLM-France/Claire-Mistral-7B-0.1         | 0.38    |
+| Model                                        | Score   | # params |
+|---------------------------------------------:|:-------:|:--------:|
+| gpt-4o                                       | 4.13    | N/A      |
+| mistralai/Mixtral-8x7B-Instruct-v0.1         | 3.71    | 46.7b    |
+| gpt-3.5-turbo                                | 3.66    | 175b     |
+| mistralai/Mistral-7B-Instruct-v0.2           | 1.98    | 7.25b    |
+| cmarkea/bloomz-7b1-mt-sft-chat               | 1.69    | 7.1b     |
+| cmarkea/bloomz-3b-dpo-chat                   | 1.68    | 3b       |
+| cmarkea/bloomz-3b-sft-chat                   | 1.51    | 3b       |
+| croissantllm/CroissantLLMChat-v0.1           | 1.19    | 1.3b     |
+| cmarkea/bloomz-560m-sft-chat                 | 1.04    | 0.56b    |
+| OpenLLM-France/Claire-Mistral-7B-0.1         | 0.38    | 7.25b    |
 
 The bloomz-3b-dpo-chat model demonstrates improved performance over its SFT counterpart, particularly in zero-shot contexts, making it a competitive choice for
 production environments.
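The hunk header above describes the evaluation setup: the PoLL (Pool of LLM) technique, with three evaluator models (GPT-4o, Gemini-1.5-pro, Claude3.5-sonnet) each rating two prompts on a 0-5 scale. A minimal sketch of how such multi-judge scores might be aggregated follows; the function name, the example ratings, and the simple mean-of-means aggregation are illustrative assumptions, not the exact procedure used for this README's table.

```python
from statistics import mean


def poll_score(judge_scores):
    """Aggregate PoLL (Pool of LLM) judge ratings into one score.

    judge_scores maps each evaluator model to its list of ratings
    (here, two prompts per evaluator, each rated on a 0-5 scale).
    Averages each judge's ratings first, then averages across judges,
    so every judge carries equal weight.
    """
    per_judge = {judge: mean(ratings) for judge, ratings in judge_scores.items()}
    return mean(per_judge.values())


# Hypothetical ratings from the three evaluators named in the README.
scores = {
    "gpt-4o": [4.0, 4.5],
    "gemini-1.5-pro": [4.0, 4.0],
    "claude-3.5-sonnet": [4.5, 3.5],
}
print(round(poll_score(scores), 2))  # prints 4.08
```

Averaging per judge before averaging across judges keeps one verbose or harsh judge from dominating the pooled score, which is the usual motivation for pooling several LLM evaluators instead of relying on a single one.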