Files changed (1)
  1. README.md +7 -3
README.md CHANGED
@@ -6,6 +6,9 @@ datasets:
  language:
  - fr
  - en
+ metrics:
+ - accuracy
+ - perplexity
  ---

  # Mambaoutai 1.6B
@@ -20,14 +23,14 @@ You need to install `transformers` from `main` until `transformers=4.39.0` is released
  pip install git+https://github.com/huggingface/transformers@main
  ```

- We also recommend you to install both `causal_conv_1d` and `mamba-ssm` using:
+ We also recommend you to install both `causal-conv1d` and `mamba-ssm` using:

  ```bash
  pip install causal-conv1d>=1.2.0
  pip install mamba-ssm
  ```

- If any of these two is not installed, the "eager" implementation will be used. Otherwise the more optimised `cuda` kernels will be used.
+ If any of these two is not installed, the "eager" implementation will be used(not recommended). Otherwise the more optimised `cuda` kernels will be used.

  ### Generation

@@ -56,7 +59,8 @@ print(tokenizer.batch_decode(out))

  You can find some of the training checkpoints in the repo branch. On branch corresponding to the model at some point in time during training.

- You can do inference with these training checkpoints by adding the `revision` parameter to the `from_pretrained` method. For example, to load the model checkpoint after 30000 steps of pretraining, you can use the following code:
+ You can do inference with these training checkpoints by adding the `revision` parameter to the `from_pretrained` method.
+ For example, to load the model checkpoint after 30000 steps of pretraining, you can use the following code:

  ```python
  from transformers import MambaConfig, MambaForCausalLM, AutoTokenizer
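
The hunk ends right where the README's example code block opens. For reference, below is a minimal sketch of what loading an intermediate training checkpoint via `revision` can look like; the repo id (`lightonai/mambaoutai`) and the checkpoint branch name are assumptions made for illustration, not taken from the diff, so substitute the values given in the full model card and repository branch list.

```python
from transformers import MambaForCausalLM, AutoTokenizer

# Assumed repo id and a placeholder branch name: replace both with the
# values listed in the model card / repository branches.
repo_id = "lightonai/mambaoutai"
checkpoint_branch = "<branch-of-30000-step-checkpoint>"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
# `revision` selects the branch (or tag/commit) that holds the checkpoint.
model = MambaForCausalLM.from_pretrained(repo_id, revision=checkpoint_branch)

# Short generation to sanity-check the loaded checkpoint.
inputs = tokenizer("Il était une fois", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.batch_decode(out))
```

If `causal-conv1d` and `mamba-ssm` are installed as recommended above, the same code uses the optimised `cuda` kernels; otherwise it falls back to the slower "eager" implementation.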