---
pipeline_tag: text-generation
inference: true
license: apache-2.0
---

# Table of Contents

1. [Model Summary](#model-summary)
2. [Use](#use)
3. [Training](#training)
4. [Citation](#citation)

# Model Summary

> GritLM is a generative-representational instruction-tuned language model: a single model that performs well at both text representation (embedding) and text generation.

- **Repository:** [ContextualAI/gritlm](https://github.com/ContextualAI/gritlm)
- **Paper:** [TODO](https://arxiv.org/abs/2308.07124)

# Use

The model's usage is documented [here](TODO). It supports the GritLM library, Transformers, and Sentence Transformers.
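GritLM routes an input to embedding or generation via its prompt format: embedding inputs end with an `<|embed|>` marker, while generation inputs use `<|user|>`/`<|assistant|>` chat turns. Below is a minimal sketch of these two formats; the helper name `gritlm_instruction` and the exact special tokens follow the upstream ContextualAI/gritlm repository and should be treated as assumptions, not a specification. The commented-out model calls are illustrative only.

```python
# Sketch of GritLM's dual prompt formats. Token strings follow the
# upstream ContextualAI/gritlm repository (assumptions, not a spec).

def gritlm_instruction(instruction: str) -> str:
    """Format an embedding prompt; an empty instruction yields the bare embed prefix."""
    return "<|user|>\n" + instruction + "\n<|embed|>\n" if instruction else "<|embed|>\n"

def generation_prompt(user_message: str) -> str:
    """Format a single chat turn for text generation."""
    return "<s><|user|>\n" + user_message + "\n<|assistant|>\n"

# Illustrative usage with the gritlm package (downloads the model; the exact
# method signatures are assumptions based on the upstream repository):
# from gritlm import GritLM
# model = GritLM("GritLM/GritLM-8x7B", torch_dtype="auto")
# reps = model.encode(["GritLM unifies embedding and generation."],
#                     instruction=gritlm_instruction("Represent this sentence"))

print(gritlm_instruction(""))       # bare embed prefix, no instruction
print(generation_prompt("What is GRIT?"))
```

The same formatting applies whether the model is driven through the GritLM library or directly through Transformers; only the loading code differs.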

# Training

## Model

- **Architecture:** Mixtral-8x7B
- **Steps:** 250k pretraining & 30 instruction tuning
- **Tokens:** ? pretraining & 2M instruction tuning
- **Precision:** bfloat16

## Hardware

- **Pretraining:**
  - **GPUs:** 512 NVIDIA A100
  - **Training time:** 1 day
- **Instruction tuning:**
  - **GPUs:** 8 NVIDIA A100
  - **Training time:** 4 hours

## Software

[ContextualAI/gritlm](https://github.com/ContextualAI/gritlm)

# Citation

```bibtex
TODO
```