BSC-LT/salamandraTA-7b-instruct

javi8979 committed 20 days ago

Commit 348bb9a (verified) · 1 parent: 9d9580e

Update README.md

Files changed (1)

README.md +26 -0

@@ -51,6 +51,32 @@ SalamandraTA-7b-instruct is a translation LLM that has been instruction-tuned fr
 > **DISCLAIMER:** This version of Salamandra is tailored exclusively for translation tasks. It lacks chat capabilities and has not been trained with any chat instructions.
+---
+## Hardware and Software
+### Training Framework
+SalamandraTA-7b-base was continually pre-trained using NVIDIA’s [NeMo Framework](https://docs.nvidia.com/nemo-framework/index.html),
+which leverages PyTorch Lightning for efficient model training in highly distributed settings.
+SalamandraTA-7b-instruct was produced with [FastChat](https://github.com/lm-sys/FastChat).
+### Compute Infrastructure
+All models were trained on [MareNostrum 5](https://www.bsc.es/ca/marenostrum/marenostrum-5), a pre-exascale EuroHPC supercomputer hosted and
+operated by Barcelona Supercomputing Center.
+The accelerated partition is composed of 1,120 nodes with the following specifications:
+- 4x NVIDIA Hopper GPUs with 64 GB of HBM2 memory each
+- 2x Intel Sapphire Rapids 8460Y+ at 2.3 GHz with 32 cores each (64 cores per node)
+- 4x NDR200 (800 Gb/s bandwidth per node)
+- 512 GB of main memory (DDR5)
+- 460 GB of NVMe storage
+---
 ## How to use
 You can translate between the following 37 languages:
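
The per-node figures in the Compute Infrastructure list added by this commit imply some aggregate numbers for the accelerated partition. The sketch below works them out in Python. The constant and function names are hypothetical, and the 200 Gb/s per-link rate is an assumption inferred from the stated 800 Gb/s per node across 4 NDR200 links. The results describe the whole partition, not the resources actually used to train SalamandraTA.

```python
# Aggregate figures implied by the per-node specifications in the README.
# All names here are illustrative; only the numeric inputs come from the text.
NODES = 1_120
GPUS_PER_NODE = 4
HBM_PER_GPU_GB = 64
CPU_SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 32
LINKS_PER_NODE = 4
LINK_GBPS = 200          # assumption: one NDR200 link carries 200 Gb/s
MAIN_MEMORY_PER_NODE_GB = 512


def partition_totals(nodes: int = NODES) -> dict:
    """Return aggregate resources for `nodes` accelerated nodes."""
    gpus = nodes * GPUS_PER_NODE
    return {
        "gpus": gpus,
        "hbm_gb": gpus * HBM_PER_GPU_GB,
        "cpu_cores": nodes * CPU_SOCKETS_PER_NODE * CORES_PER_SOCKET,
        "node_bandwidth_gbps": LINKS_PER_NODE * LINK_GBPS,
        "main_memory_gb": nodes * MAIN_MEMORY_PER_NODE_GB,
    }


if __name__ == "__main__":
    for name, value in partition_totals().items():
        print(f"{name}: {value:,}")
```

Under these assumptions the full partition has 4,480 GPUs, and the per-node bandwidth matches the 800 Gb/s stated in the list.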