Add results comparison to SmolLM
README.md
CHANGED
@@ -115,6 +115,16 @@ Here are the evaluation results for DCLM-1B on various tasks (using [llm-foundry
 Note: All scores are presented as decimal values between 0 and 1, representing the proportion of correct answers or the model's performance on each task.
 
 
+Below we compare to the recently released SmolLM (https://huggingface.co/blog/smollm) on key benchmarks. As described in the paper, Core accuracy is the average of
+centered accuracy on 22 tasks (including HellaSwag and ARC-E), and Extended is centered accuracy averaged over 53 tasks.
+We evaluate the models using llm-foundry.
+
+
+| Task    | Core | Extended | MMLU 5-shot |
+|---------|------|----------|-------------|
+| DCLM-1B | 42.3 | 25.1     | 41.9        |
+| SmolLM  | 36.3 | 21.2     | 30.0        |
+
 ## Limitations and Biases
 
 While DCLM-1B demonstrates strong performance across a range of tasks, it's important to note:
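For readers unfamiliar with the aggregation in the added text, the sketch below illustrates how a centered-accuracy average such as Core might be computed. The rescaling formula `(acc - baseline) / (1 - baseline)` and the example task baselines are assumptions based on the description above (random guessing maps to 0, perfect accuracy to 1); the actual DCLM/llm-foundry evaluation code may differ in detail.

```python
# Illustrative sketch of a "centered accuracy" average (assumed definition,
# not the official DCLM/llm-foundry implementation).

def centered_accuracy(acc: float, random_baseline: float) -> float:
    """Rescale accuracy so chance performance scores 0 and perfect scores 1."""
    return (acc - random_baseline) / (1.0 - random_baseline)

def suite_score(results: dict[str, tuple[float, float]]) -> float:
    """Average centered accuracy over a suite of (accuracy, random_baseline) pairs, in percent."""
    centered = [centered_accuracy(acc, base) for acc, base in results.values()]
    return 100.0 * sum(centered) / len(centered)

# Hypothetical raw accuracies and chance baselines for two of the Core tasks.
example = {
    "hellaswag": (0.60, 0.25),  # 4-way multiple choice -> 25% chance
    "arc_easy":  (0.65, 0.25),
}
print(f"Core (example subset): {suite_score(example):.1f}")
```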