Locutusque committed
Commit 9a917e2 (Parent: b9e7d10)
Update README.md

README.md CHANGED

@@ -76,16 +76,7 @@ The model is evaluated based on several metrics, including loss, reward, penalty
- Perplexity: 19
- Loss: 1.7

Although these metrics may seem mediocre, the trade-off is intentional: it lets the model produce open-ended responses while remaining coherent with the user's input.
This model was also evaluated on Hugging Face's Open LLM Leaderboard:

| Model | Average | ARC (25-shot) | HellaSwag (10-shot) | MMLU (5-shot) | TruthfulQA (0-shot) |
| --- | --- | --- | --- | --- | --- |
| Locutusque/gpt2-conversational-or-qa | 30.9 | 21.3 | 27.6 | 27.5 | 47.3 |
| gpt2 | 30.4 | 21.9 | 31.6 | 27.5 | 40.7 |

*This model performed excellently in TruthfulQA, outperforming gpt2 by nearly 7 points.*
## Limitations and Bias

This model is not suitable for all use cases because of its limited training time on weak hardware. It may produce irrelevant or nonsensical responses. Additionally, it has not been fine-tuned to remember chat history, cannot provide follow-up responses, and does not know the answers to many questions (it was only fine-tuned to respond in a conversational way). For optimal performance, we recommend using a GPU with at least 4 GB of VRAM and downloading the model manually rather than using the Transformers library or deploying it on the Inference API. Here's how you should deploy the model:
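The deployment example itself is cut off in this diff. A minimal sketch of the manual-download approach described above might look like the following; the prompt template, local directory name, and generation settings are illustrative assumptions, not taken from the original model card:

```python
# Hedged sketch: download the repository manually (e.g. `git clone
# https://huggingface.co/Locutusque/gpt2-conversational-or-qa`), then
# load the weights from the local directory. The prompt format below
# is an assumption; check the model card for the exact template.
from pathlib import Path


def build_prompt(user_message: str) -> str:
    """Wrap the user's message in a simple conversational template
    (illustrative only; the fine-tuning tokens may differ)."""
    return f"User: {user_message}\nAssistant:"


def load_model(local_dir: str):
    """Load manually downloaded GPT-2 weights from a local directory
    and move the model to the GPU when one is available."""
    import torch
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained(local_dir)
    model = GPT2LMHeadModel.from_pretrained(local_dir)
    device = "cuda" if torch.cuda.is_available() else "cpu"
    return tokenizer, model.to(device), device


if __name__ == "__main__":
    local_dir = "gpt2-conversational-or-qa"  # assumed clone location
    if Path(local_dir).exists():
        import torch

        tokenizer, model, device = load_model(local_dir)
        inputs = tokenizer(
            build_prompt("Hello, how are you?"), return_tensors="pt"
        ).to(device)
        with torch.no_grad():
            output_ids = model.generate(
                **inputs, max_new_tokens=50, do_sample=True, top_p=0.95
            )
        print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Loading from a local clone avoids the automatic download path and keeps the 4 GB VRAM recommendation in your control, since you decide when the weights move to the GPU.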