Update README.md
Mostly, we took evaluation results from both leaderboards. For the models marked with a *star*, we evaluated the tasks internally, while for the models marked with two *stars*, the results were taken from the paper or model card.

## Throughput
This model achieves throughput and performance comparable to other transformer-based models that use optimized kernels such as Flash Attention 2. Make sure to install the optimized Mamba kernels with the following commands:
```bash
pip install "causal-conv1d>=1.4.0" mamba-ssm
```
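As an optional sanity check (not part of the original card), you can confirm that both kernel packages are importable after installation. This sketch assumes the pip packages above expose the import names `causal_conv1d` and `mamba_ssm`:

```python
# Sanity check: report whether the optimized Mamba kernel packages
# installed by the pip command above can be found by Python.
import importlib.util


def kernels_available():
    """Return a dict mapping each kernel package name to True if importable."""
    packages = ("causal_conv1d", "mamba_ssm")
    return {pkg: importlib.util.find_spec(pkg) is not None for pkg in packages}


if __name__ == "__main__":
    for pkg, ok in kernels_available().items():
        print(f"{pkg}: {'available' if ok else 'missing'}")
```

If either package reports `missing`, re-run the pip command above; without the kernels the model falls back to slower, non-optimized code paths.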
Refer to our [FalconMamba blogpost](https://huggingface.co/blog/falconmamba) for more details about performance evaluation.
<br>
# Technical Specifications
## Model Architecture and Objective