Update README.md
README.md
@@ -198,8 +198,8 @@ We evaluate our model on all benchmarks of the new leaderboard's version using t
 | `model name`                     |`IFEval`| `BBH` |`MATH LvL5`| `GPQA`| `MUSR`|`MMLU-PRO`|`Average`|
 |:---------------------------------|:------:|:-----:|:---------:|:-----:|:-----:|:--------:|:-------:|
 | ***Pure SSM models***            |        |       |           |       |       |          |         |
-| `FalconMamba-7B`
-| `TRI-ML/mamba-7b-rw
+| `FalconMamba-7B`                 | 33.36  | 19.88 |   3.63    |  8.05 | 10.86 |  14.47   |**15.04**|
+| `TRI-ML/mamba-7b-rw`<sup>*</sup> | 22.46  |  6.71 |   0.45    |  1.12 |  5.51 |   1.69   |   6.25  |
 |***Hybrid SSM-attention models*** |        |       |           |       |       |          |         |
 |`recurrentgemma-9b`               | 30.76  | 14.80 |   4.83    |  4.70 |  6.60 |  17.88   |  13.20  |
 | `Zyphra/Zamba-7B-v1`             | 24.06  | 21.12 |   3.32    |  3.03 |  7.74 |  16.02   |  12.55  |
@@ -229,6 +229,8 @@ Also, we evaluate our model on the benchmarks of the first leaderboard using `li
 | `Mistral-7B-v0.1`         | 59.98  | 83.31 |   64.16   | 78.37 | 42.15 |  37.83   |  60.97  |
 | `gemma-7B`                | 61.09  | 82.20 |   64.56   | 79.01 | 44.79 |  50.87   |  63.75  |
 
+The evaluation results were borrowed from both leaderboards. For the models with no leaderboard results (marked by *star*), we evaluated the tasks internally.
+
 ## Throughput
 
 This model can achieve comparable throughput and performance compared to other transformer based models that use optimized kernels such as Flash Attention 2. Make sure to install the optimized Mamba kernels with the following commands:
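A minimal sketch, assuming the optimized kernels come from the publicly released `causal-conv1d` and `mamba-ssm` packages (a CUDA toolchain is required to build them):

```bash
# Sketch only: install the CUDA kernels commonly used for Mamba-based models.
# The package names and version bound are assumptions, not quoted from this commit.
pip install "causal-conv1d>=1.4.0" mamba-ssm
```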
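For the starred entries in the tables above that were evaluated internally, one plausible way to approximate the new leaderboard's benchmarks with `lm-evaluation-harness` is sketched below; the `leaderboard` task group, model id, and flags are assumptions, not the exact command used:

```bash
# Illustrative sketch, not the authors' exact setup: run the Open LLM
# Leaderboard v2 task group from lm-evaluation-harness on a Hugging Face model.
lm_eval --model hf \
  --model_args pretrained=tiiuae/falcon-mamba-7b,dtype=bfloat16 \
  --tasks leaderboard \
  --batch_size auto \
  --output_path results/falcon-mamba-7b
```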