nicholasKluge committed
Commit 98c4584
1 Parent(s): 462feae

Update README.md

Files changed (1): README.md (+13 −11)
README.md CHANGED
@@ -43,7 +43,7 @@ inference:
     top_p: 0.5
     max_new_tokens: 200
 co2_eq_emissions:
-  emissions: 15
+  emissions: 5.6
   source: CodeCarbon
   training_type: pre-training
   geographical_location: Germany
@@ -75,8 +75,8 @@ Teeny-tiny-llama has been trained by leveraging scaling laws to determine the op
 - **Optimizer:** `torch.optim.AdamW` (warmup_ratio = 0.01, learning_rate = 6e-4, epsilon = 1e-8)
 - **GPU:** 1 NVIDIA A100-SXM4-40GB
 - **Training time**: ~ 36 hours
-- **Emissions:** 15 KgCO2 (Germany)
-- **Total Energy Consumption:** 42 kWh
+- **Emissions:** 5.6 KgCO2 (Germany)
+- **Total Energy Consumption:** 15.5 kWh
 
 This repository has the [source code](https://github.com/Nkluge-correa/Aira) used to train this model.
 
@@ -138,18 +138,20 @@ model = AutoModelForCausalLM.from_pretrained("nicholasKluge/Teeny-tiny-llama-162
 
 ## Evaluations
 
+| Steps   | Evaluation Loss | Perplexity | Total Energy Consumption | Emissions  |
+|---------|-----------------|------------|--------------------------|------------|
+| 100.000 | 3.19            | 24.52      | 3.75 kWh                 | 1.28 CO2eq |
+| 200.000 | 3.02            | 20.58      | 7.51 kWh                 | 2.56 CO2eq |
+| 300.000 | 2.83            | 16.98      | 11.25 kWh                | 3.84 CO2eq |
+| 400.000 | 2.79            | 16.41      | 14.52 kWh                | 5.11 CO2eq |
+
+## Benchmarks
+
 | Models | Average | [ARC](https://arxiv.org/abs/1803.05457) | [Hellaswag](https://arxiv.org/abs/1905.07830) | [MMLU](https://arxiv.org/abs/2009.03300) | [TruthfulQA](https://arxiv.org/abs/2109.07958) |
 |-------------------------------------------------------------------------------------|---------|-----------------------------------------|-----------------------------------------------|------------------------------------------|------------------------------------------------|
 | [Gpt2-portuguese-small](https://huggingface.co/pierreguillou/gpt2-small-portuguese) | 30.22   | 22.48                                   | 29.62                                         | 27.36                                    | 41.44                                          |
 
-* Evaluations were performed using the [Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness) (by [EleutherAI](https://www.eleuther.ai/)). Thanks to [Laiviet](https://github.com/laiviet/lm-evaluation-harness) for translating some of the tasks in the LM-Evaluation-Harness.
-
-| Steps   | Evaluation Loss | Perplexity | Total Energy Consumption |
-|---------|-----------------|------------|--------------------------|
-| 100.000 | 3.19            | 24.52      | 3.75 kWh                 |
-| 200.000 | 3.02            | 20.58      | 7.51 kWh                 |
-| 300.000 | 2.83            | 16.98      | 11.25 kWh                |
-| 400.000 | 2.79            | 16.41      | 30.20 kWh                |
+* Evaluations on benchmarks were performed using the [Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness) (by [EleutherAI](https://www.eleuther.ai/)). Thanks to [Laiviet](https://github.com/laiviet/lm-evaluation-harness) for translating some of the tasks in the LM-Evaluation-Harness.
 
 ## Cite as 🤗
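
As a sanity check on the per-checkpoint evaluation table added in this diff: perplexity is the exponential of the cross-entropy evaluation loss, so the two columns can be cross-checked directly. A minimal sketch (the small discrepancies come from the losses being reported to only two decimal places):

```python
import math

# Reported (step, loss, perplexity) triples from the updated README table.
# Perplexity = exp(cross-entropy loss); agreement is within rounding,
# since the losses are given to two decimal places.
checkpoints = [
    (100_000, 3.19, 24.52),
    (200_000, 3.02, 20.58),
    (300_000, 2.83, 16.98),
    (400_000, 2.79, 16.41),
]

for step, loss, reported_ppl in checkpoints:
    ppl = math.exp(loss)
    print(f"step {step}: exp({loss}) = {ppl:.2f} (reported {reported_ppl})")
```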