---
pipeline_tag: text-generation
inference: true
license: apache-2.0
---
# Table of Contents

1. [Model Summary](#model-summary)
2. [Use](#use)
3. [Training](#training)
4. [Citation](#citation)
# Model Summary

GritLM is a generative-representational instruction-tuned language model: a single model that performs well at both text representation (embedding) and text generation.
- Repository: ContextualAI/gritlm
- Paper: TODO
# Use

The model's usage is documented in the [GritLM repository](https://github.com/ContextualAI/gritlm). It supports inference via GritLM, Transformers, and Sentence Transformers.
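As a minimal sketch of how prompts are formatted for the model's two modes, the helpers below wrap text in the chat-template tokens used by the GritLM repository (`<|user|>`, `<|embed|>`, `<|assistant|>`). The exact template and model identifier should be verified against the repository; the function names here are illustrative, not part of any published API.

```python
def gritlm_instruction(instruction: str) -> str:
    """Wrap an embedding instruction in GritLM's chat template.

    An empty instruction yields just the bare <|embed|> marker,
    i.e. the text is embedded without any task instruction.
    """
    if instruction:
        return "<|user|>\n" + instruction + "\n<|embed|>\n"
    return "<|embed|>\n"


def generation_prompt(user_message: str) -> str:
    """Format a single user turn for generation mode."""
    return "<|user|>\n" + user_message + "\n<|assistant|>\n"


# Embedding mode: pass the formatted instruction alongside the documents.
query_fmt = gritlm_instruction("Given a query, retrieve relevant passages")

# Generation mode: feed the formatted prompt to the model's generate call.
gen_fmt = generation_prompt("Summarize the abstract below.")
```

With the `gritlm` package, such a formatted instruction would typically be passed to the model's `encode` call (and the generation prompt to `generate`); consult the repository's README for the current signatures.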
# Training

## Model

- Architecture: Mixtral-8x7B
- Steps: 250k pretraining & 30 instruction tuning
- Tokens: ? pretraining & 2M instruction tuning
- Precision: bfloat16
## Hardware

- Pretraining:
  - GPUs: 512 Tesla A100
  - Training time: 1 day
- Instruction tuning:
  - GPUs: 8 Tesla A100
  - Training time: 4 hours
## Software

The training code is available at https://github.com/ContextualAI/gritlm.
# Citation

TODO