shanearora committed
Commit 501c498
Parent: 704314f

Update README.md

Files changed (1):
  1. README.md +4 -4
README.md CHANGED
@@ -168,7 +168,7 @@ Both stages contribute equally to the final performance of the OLMo model. After
 OLMo 7B architecture with peer models for comparison.
 
 | | **OLMo 7B July 2024** | [OLMo 1.0 7B](https://huggingface.co/allenai/OLMo-7B-hf) | [Llama 2 7B](https://huggingface.co/meta-llama/Llama-2-7b) | [OpenLM 7B](https://laion.ai/blog/open-lm/) | [Falcon 7B](https://huggingface.co/tiiuae/falcon-7b) | PaLM 8B |
-|------------------------|-------------------|-------------------|---------------------|--------------------|------------------|
+|------------------------|-------------------|-------------------|---------------------|--------------------|--------------------|------------------|
 | d_model | 4096 | 4096 | 4096 | 4096 | 4544 | 4096 |
 | num heads | 32 | 32 | 32 | 32 | 71 | 16 |
 | num layers | 32 | 32 | 32 | 32 | 32 | 32 |
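The corrected delimiter row gives the architecture table one dash cell per column, so all seven columns now render. The values can be cross-checked against the released config; a minimal sketch, assuming the July 2024 checkpoint is hosted as `allenai/OLMo-7B-0724-hf` and exposes the standard `transformers` config fields:

```python
# Cross-check the architecture table against the released config.
# The repo id below is an assumption for the July 2024 checkpoint;
# adjust it if the model is hosted under a different name.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("allenai/OLMo-7B-0724-hf")

print(config.hidden_size)          # d_model: table says 4096
print(config.num_attention_heads)  # num heads: table says 32
print(config.num_hidden_layers)    # num layers: table says 32
```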
@@ -197,7 +197,7 @@ AdamW optimizer parameters are shown below.
 Optimizer settings comparison with peer models.
 
 | | **OLMo 7B July 2024** | [OLMo 1.0 7B](https://huggingface.co/allenai/OLMo-7B-hf) | [Llama 2 7B](https://huggingface.co/meta-llama/Llama-2-7b) | [OpenLM 7B](https://laion.ai/blog/open-lm/) | [Falcon 7B](https://huggingface.co/tiiuae/falcon-7b) |
-|-----------------------|------------------|---------------------|--------------------|--------------------|
+|-----------------------|------------------|------------------|---------------------|--------------------|--------------------\
 | warmup steps | 2500 | 5000 | 2000 | 2000 | 1000 |
 | peak LR | 3.0E-04 | 3.0E-04 | 3.0E-04 | 3.0E-04 | 6.0E-04 |
 | minimum LR | 3.0E-05 | 3.0E-05 | 3.0E-05 | 3.0E-05 | 1.2E-05 |
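The second fix likewise adds the missing dash cell to the optimizer table's delimiter row (the committed line is reproduced verbatim above, including its trailing backslash). The warmup/peak/minimum rows describe a warmup-then-decay schedule; as an illustration only, since the excerpt does not state the decay shape, a linear-warmup-plus-cosine-decay sketch using the OLMo 7B July 2024 numbers:

```python
import math

# Illustrative sketch only: warmup steps, peak LR, and minimum LR come from
# the table; the linear-warmup + cosine-decay shape and the total step count
# are assumptions, not taken from this README excerpt.
WARMUP_STEPS = 2500   # from the table
PEAK_LR = 3.0e-4      # from the table
MIN_LR = 3.0e-5       # from the table

def lr_at(step: int, total_steps: int) -> float:
    """Learning rate at `step` under linear warmup then cosine decay."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS  # linear ramp 0 -> peak
    progress = (step - WARMUP_STEPS) / (total_steps - WARMUP_STEPS)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))  # 1 -> 0
    return MIN_LR + (PEAK_LR - MIN_LR) * cosine  # peak -> min

print(lr_at(2_500, 100_000))    # 3.0e-4 at the end of warmup
print(lr_at(100_000, 100_000))  # 3.0e-5 at the assumed final step
```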
@@ -212,7 +212,7 @@ Optimizer settings comparison with peer models.
 
 
 
-## Environmental Impact
+<!-- ## Environmental Impact
 
 OLMo 7B variants were either trained on MI250X GPUs at the LUMI supercomputer, or A100-40GB GPUs provided by MosaicML.
 A summary of the environmental impact. Further details are available in the paper.
@@ -220,7 +220,7 @@ A summary of the environmental impact. Further details are available in the pape
 | | GPU Type | Power Consumption From GPUs | Carbon Intensity (kg CO₂e/KWh) | Carbon Emissions (tCO₂eq) |
 |-----------|------------|-----------------------------|--------------------------------|---------------------------|
 | OLMo 7B Twin | MI250X ([LUMI supercomputer](https://www.lumi-supercomputer.eu)) | 135 MWh | 0* | 0* |
-| OLMo 7B | A100-40GB ([MosaicML](https://www.mosaicml.com)) | 104 MWh | 0.656 | 75.05 |
+| OLMo 7B | A100-40GB ([MosaicML](https://www.mosaicml.com)) | 104 MWh | 0.656 | 75.05 | -->
 
 ## Bias, Risks, and Limitations
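The last two hunks comment out the Environmental Impact section. For reference, the figures in the hidden table are internally consistent; a minimal arithmetic check, where the ~1.1 datacenter overhead (PUE) factor is inferred from the numbers rather than stated in this excerpt:

```python
# Arithmetic check of the commented-out emissions row for OLMo 7B.
# 104 MWh at 0.656 kg CO2e/kWh is ~68.2 tCO2eq from the GPUs alone; the
# table's 75.05 tCO2eq matches once a ~1.1 PUE overhead is applied.
# The 1.1 factor is an inference from the figures, not stated here.
gpu_energy_kwh = 104 * 1000  # "Power Consumption From GPUs": 104 MWh
carbon_intensity = 0.656     # kg CO2e per kWh
pue = 1.1                    # assumed datacenter overhead factor

emissions_tco2eq = gpu_energy_kwh * pue * carbon_intensity / 1000
print(f"{emissions_tco2eq:.2f} tCO2eq")  # 75.05
```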
 
 