---
library_name: transformers
tags: []
---

# mambaoutai

# Usage

You need to install `transformers` from `main` until `transformers==4.39.0` is released.

```bash
pip install git+https://github.com/huggingface/transformers@main
```

We also recommend installing both `causal-conv1d` and `mamba-ssm`:

```bash
pip install "causal-conv1d>=1.2.0"
pip install mamba-ssm
```

If either of these two packages is not installed, the "eager" implementation will be used; otherwise, the more optimised CUDA kernels will be used.
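
One way to check in advance which code path will run is to test that both packages are importable (their import names are `mamba_ssm` and `causal_conv1d`); this is a minimal sketch, not an official API:

```python
from importlib.util import find_spec

# Both modules must be importable for the optimised kernels to be picked up.
has_fast_kernels = (
    find_spec("mamba_ssm") is not None
    and find_spec("causal_conv1d") is not None
)
print("optimised CUDA kernels available:", has_fast_kernels)
```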

## Generation

Use the following snippet to generate text from the model:

```python
from transformers import MambaForCausalLM, AutoTokenizer

# Load the tokenizer and model weights from the Hub.
tokenizer = AutoTokenizer.from_pretrained("lightonai/mambaoutai")
model = MambaForCausalLM.from_pretrained("lightonai/mambaoutai")
input_ids = tokenizer("What is a mamba?", return_tensors="pt")["input_ids"]

# Greedy decoding of up to 10 new tokens.
out = model.generate(input_ids, max_new_tokens=10)
print(tokenizer.batch_decode(out))
```
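
If a CUDA device is available, you can also load the weights in half precision and move the model to the GPU; a sketch of that variant, assuming a CUDA-capable machine:

```python
import torch
from transformers import MambaForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("lightonai/mambaoutai")
# float16 halves the memory footprint; assumes a CUDA device is present.
model = MambaForCausalLM.from_pretrained(
    "lightonai/mambaoutai", torch_dtype=torch.float16
).to("cuda")

input_ids = tokenizer("What is a mamba?", return_tensors="pt")["input_ids"].to("cuda")
out = model.generate(input_ids, max_new_tokens=10)
print(tokenizer.batch_decode(out))
```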

## Training checkpoints

Some of the training checkpoints are available on branches of this repository; each branch holds the model as it was at some point during training.

You can run inference with these training checkpoints by passing the `revision` parameter to the `from_pretrained` method. For example, to load the model checkpoint after 30000 steps of pretraining, you can use the following code:

```python
from transformers import MambaForCausalLM, AutoTokenizer

# Load the tokenizer and weights saved on the "pre-30000" branch.
tokenizer = AutoTokenizer.from_pretrained("lightonai/mambaoutai", revision="pre-30000")
model = MambaForCausalLM.from_pretrained("lightonai/mambaoutai", revision="pre-30000")
input_ids = tokenizer("What is a mamba?", return_tensors="pt")["input_ids"]

out = model.generate(input_ids, max_new_tokens=10)
print(tokenizer.batch_decode(out))
```
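
To discover which checkpoint branches exist, one option is to list the repository's refs with `huggingface_hub` (the `pre-<step>` naming is inferred from the example above, not guaranteed for every branch):

```python
from huggingface_hub import list_repo_refs

# Each training checkpoint lives on its own branch, e.g. "pre-30000".
refs = list_repo_refs("lightonai/mambaoutai")
for branch in refs.branches:
    print(branch.name)
```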