shanearora committed • Commit 501c498 • Parent(s): 704314f

Update README.md

README.md CHANGED
@@ -168,7 +168,7 @@ Both stages contribute equally to the final performance of the OLMo model. After
 OLMo 7B architecture with peer models for comparison.
 
 | | **OLMo 7B July 2024** | [OLMo 1.0 7B](https://huggingface.co/allenai/OLMo-7B-hf) | [Llama 2 7B](https://huggingface.co/meta-llama/Llama-2-7b) | [OpenLM 7B](https://laion.ai/blog/open-lm/) | [Falcon 7B](https://huggingface.co/tiiuae/falcon-7b) | PaLM 8B |
-
+|------------------------|-------------------|-------------------|---------------------|--------------------|--------------------|------------------|
 | d_model | 4096 | 4096 | 4096 | 4096 | 4544 | 4096 |
 | num heads | 32 | 32 | 32 | 32 | 71 | 16 |
 | num layers | 32 | 32 | 32 | 32 | 32 | 32 |
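The architecture rows above can be cross-checked against the published checkpoint config. A minimal sketch, assuming a recent transformers release with native OLMo support and that the [OLMo 1.0 7B](https://huggingface.co/allenai/OLMo-7B-hf) repo linked in the table exposes the standard config fields:

```python
# Hypothetical cross-check of the architecture table: load the published config
# for OLMo 1.0 7B (linked above) and print the fields the table reports.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("allenai/OLMo-7B-hf")

print("d_model    :", config.hidden_size)          # table says 4096
print("num heads  :", config.num_attention_heads)  # table says 32
print("num layers :", config.num_hidden_layers)    # table says 32
```

Substituting the July 2024 model's repo id should report the same three values, per the table.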
@@ -197,7 +197,7 @@ AdamW optimizer parameters are shown below.
 Optimizer settings comparison with peer models.
 
 | | **OLMo 7B July 2024** | [OLMo 1.0 7B](https://huggingface.co/allenai/OLMo-7B-hf) | [Llama 2 7B](https://huggingface.co/meta-llama/Llama-2-7b) | [OpenLM 7B](https://laion.ai/blog/open-lm/) | [Falcon 7B](https://huggingface.co/tiiuae/falcon-7b) |
-
+|-----------------------|------------------|------------------|---------------------|--------------------|--------------------|
 | warmup steps | 2500 | 5000 | 2000 | 2000 | 1000 |
 | peak LR | 3.0E-04 | 3.0E-04 | 3.0E-04 | 3.0E-04 | 6.0E-04 |
 | minimum LR | 3.0E-05 | 3.0E-05 | 3.0E-05 | 3.0E-05 | 1.2E-05 |
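To make the warmup/peak/minimum numbers in the optimizer table concrete, here is an illustrative schedule sketch. The table does not state the schedule shape, so linear warmup followed by cosine decay to the minimum LR is assumed, and the total step count is a placeholder:

```python
import math

# Illustrative only: warmup steps, peak LR, and minimum LR come from the
# OLMo 7B July 2024 column; the schedule shape and TOTAL_STEPS are assumptions.
WARMUP_STEPS = 2_500   # warmup steps
PEAK_LR = 3.0e-4       # peak LR
MIN_LR = 3.0e-5        # minimum LR
TOTAL_STEPS = 100_000  # hypothetical run length, not from the table

def learning_rate(step: int) -> float:
    """Linear warmup to PEAK_LR, then cosine decay down to MIN_LR."""
    if step < WARMUP_STEPS:
        return PEAK_LR * (step + 1) / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return MIN_LR + 0.5 * (PEAK_LR - MIN_LR) * (1.0 + math.cos(math.pi * progress))

# Print a few points of the assumed schedule.
for step in (0, 2_499, 10_000, 50_000, TOTAL_STEPS - 1):
    print(f"step {step:>6}: lr = {learning_rate(step):.2e}")
```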
@@ -212,7 +212,7 @@ Optimizer settings comparison with peer models.
 
 
 
-## Environmental Impact
+<!-- ## Environmental Impact
 
 OLMo 7B variants were either trained on MI250X GPUs at the LUMI supercomputer, or A100-40GB GPUs provided by MosaicML.
 A summary of the environmental impact. Further details are available in the paper.
@@ -220,7 +220,7 @@ A summary of the environmental impact. Further details are available in the paper.
 | | GPU Type | Power Consumption From GPUs | Carbon Intensity (kg CO₂e/KWh) | Carbon Emissions (tCO₂eq) |
 |-----------|------------|-----------------------------|--------------------------------|---------------------------|
 | OLMo 7B Twin | MI250X ([LUMI supercomputer](https://www.lumi-supercomputer.eu)) | 135 MWh | 0* | 0* |
-| OLMo 7B | A100-40GB ([MosaicML](https://www.mosaicml.com)) | 104 MWh | 0.656 | 75.05 |
+| OLMo 7B | A100-40GB ([MosaicML](https://www.mosaicml.com)) | 104 MWh | 0.656 | 75.05 | -->
 
 ## Bias, Risks, and Limitations
 
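As a rough arithmetic check on the (now commented-out) emissions row: 104 MWh of GPU energy at 0.656 kg CO₂e/kWh reproduces the reported 75.05 tCO₂eq only if an additional datacenter overhead (PUE) of about 1.1 is assumed; that factor is an assumption of this sketch, not a value stated in the table:

```python
# Rough consistency check of the emissions row above. The PUE factor is an
# assumption introduced here to account for datacenter overhead beyond the
# GPUs; it is not stated in the table.
GPU_ENERGY_MWH = 104      # "Power Consumption From GPUs" for OLMo 7B
CARBON_INTENSITY = 0.656  # kg CO2e per kWh
ASSUMED_PUE = 1.1         # hypothetical datacenter overhead factor

emissions_t = GPU_ENERGY_MWH * 1_000 * ASSUMED_PUE * CARBON_INTENSITY / 1_000
print(f"{emissions_t:.2f} tCO2eq")  # ~75.05, matching the table's figure
```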