pglo committed
Commit 7494a05
1 Parent(s): 360d2b3

Update README.md

Files changed (1)
  1. README.md +8 -7
README.md CHANGED
@@ -9,22 +9,23 @@ Zamba-7B-v1 is a hybrid model between Mamba, a state-space model, and transformers

### Prerequisites

- Zamba requires you use `transformers` version 4.39.0 or higher:
- ```bash
- pip install transformers>=4.39.0
- ```
+ To download Zamba, clone Zyphra's fork of transformers:
+ 1. `git clone https://github.com/Zyphra/transformers_zamba`
+ 2. `cd transformers_zamba`
+ 3. Install the repository: `pip install -e .`
+
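A quick way to confirm the editable install above took effect; this check is illustrative and not part of the README:

```python
# Hypothetical check: after `pip install -e .`, transformers should resolve
# to the cloned transformers_zamba tree rather than a PyPI install.
import transformers

print(transformers.__version__)
print(transformers.__file__)  # expect a path inside transformers_zamba/
```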

- In order to run optimized Mamba implementations on a CUDA device, you first need to install `mamba-ssm` and `causal-conv1d`:
+ In order to run optimized Mamba implementations on a CUDA device, you need to install `mamba-ssm` and `causal-conv1d`:
```bash
pip install mamba-ssm causal-conv1d>=1.2.0
```

- You can run the model not using the optimized Mamba kernels, but it is **not** recommended as it will result in significantly higher latency.
+ You can run the model without using the optimized Mamba kernels, but it is **not** recommended as it will result in significantly higher latency.

To run on CPU, please specify `use_mamba_kernels=False` when loading the model using `AutoModelForCausalLM.from_pretrained`.
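For reference, a minimal sketch of such a CPU load; only the `use_mamba_kernels` flag comes from the text above, and the checkpoint id `Zyphra/Zamba-7B-v1` is an assumption:

```python
# Minimal sketch (assumed checkpoint id): load Zamba for CPU inference with
# the optimized Mamba kernels disabled, as described above.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "Zyphra/Zamba-7B-v1",
    use_mamba_kernels=False,  # pure-PyTorch Mamba path; slower, but runs on CPU
)
```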


- ## Inference
+ ### Inference

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
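# (The hunk is truncated here. A minimal sketch of how such an inference
# snippet typically continues; the checkpoint id and prompt below are
# assumptions, not part of this commit.)
tokenizer = AutoTokenizer.from_pretrained("Zyphra/Zamba-7B-v1")
model = AutoModelForCausalLM.from_pretrained("Zyphra/Zamba-7B-v1")

inputs = tokenizer("The meaning of life is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0]))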