pglo committed on
Commit 360d2b3
1 Parent(s): 625e567

Update README.md

Files changed (1)
  1. README.md +4 -1
README.md CHANGED
@@ -19,7 +19,10 @@ In order to run optimized Mamba implementations on a CUDA device, you first need
 pip install mamba-ssm causal-conv1d>=1.2.0
 ```
 
-You can run the model not using the optimized Mamba kernels, but it is **not** recommended as it will result in significantly higher latency. In order to do that, you'll need to specify `use_mamba_kernels=False` when loading the model.
+You can run the model not using the optimized Mamba kernels, but it is **not** recommended as it will result in significantly higher latency.
+
+To run on CPU, please specify `use_mamba_kernels=False` when loading the model using ``AutoModelForCausalLM.from_pretrained``.
+
 
 ## Inference
 
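The added README text tells readers to pass `use_mamba_kernels=False` when loading. A minimal sketch of such a call might look like the following — note the repo id is a placeholder for illustration, not something named in this commit:

```python
# Sketch only: assumes the `transformers` package is installed and that the
# target checkpoint is a Mamba-style model whose loader accepts `use_mamba_kernels`.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "org/some-mamba-model",   # placeholder repo id, not from this commit
    use_mamba_kernels=False,  # skip the optimized CUDA kernels (e.g. to run on CPU)
)
```

With the kernels disabled, generation falls back to a slower non-fused path, which is why the README marks this as not recommended on CUDA devices.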