JingweiZuo committed · d4d0c04 (parent: fcc67e6)
Update README.md

README.md CHANGED
**Do you believe in a better tomorrow? We do. Our team of expert researchers live the dream and work to build it every day.**

* 🦅🐍 **The first SSLM model of the Falcon series has been released open-access, featuring [FalconMamba-7B](https://huggingface.co/collections/tiiuae/falconmamba-7b-66b9a580324dd1598b0f6d4a).**
* 🦅🦅 **The second generation of Falcon models has been released open-access, featuring [Falcon2-11B](https://huggingface.co/tiiuae/falcon-11B) and [Falcon2-11B-vlm](https://huggingface.co/tiiuae/falcon-11B-vlm).**
* 🔥 **[Falcon-180B](https://huggingface.co/tiiuae/falcon-180b) is now available in open-access! [Try it now in our chat demo!](https://huggingface.co/spaces/tiiuae/falcon-180b-demo)**

# News

* 🐍 **[FalconMamba-7B](https://huggingface.co/tiiuae/falcon-mamba-7b) is now available.** The first pure SSM model of the Falcon series, released under the same permissive license. You can interact with it [here](https://huggingface.co/spaces/tiiuae/falcon-mamba-playground), and check the **[FalconMamba Technical Report](https://arxiv.org/abs/2410.05355)** and the **[FalconMamba blogpost](https://huggingface.co/blog/falconmamba)**.
* 📸 **[Falcon2-11B-vlm](https://huggingface.co/tiiuae/falcon-11B-vlm) is now available.** Built on top of the Falcon2-11B model and released under the same permissive license, this open-source model allows users to interact with image content via text.
* 🚀 **TII has just released a new generation of models, starting with [Falcon2-11B](https://huggingface.co/tiiuae/falcon-11B)**, an 11B-parameter causal decoder-only model trained on over 5,000B tokens of [RefinedWeb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb) enhanced with curated corpora. The model is made available under the [TII Falcon License 2.0](https://falconllm-staging.tii.ae/falcon-2-terms-and-conditions.html), a permissive Apache 2.0-based software license which includes an [acceptable use policy](https://falconllm-staging.tii.ae/falcon-2-acceptable-use-policy.html) that promotes the responsible use of AI.
* 🔥 **TII has open-sourced Falcon-180B for research and commercial utilization!** Access the [180B](https://huggingface.co/tiiuae/falcon-180b), as well as the [7B](https://huggingface.co/tiiuae/falcon-7b)/[40B](https://huggingface.co/tiiuae/falcon-40b) models, and explore our high-quality web dataset, [RefinedWeb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb).

We are excited to announce the release of our groundbreaking LLM with a pure SSM architecture, setting a new benchmark by outperforming all previous SSM models and achieving performance on par with leading transformer-based models.
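For readers unfamiliar with state-space models, the core idea can be sketched with a toy scalar version of the linear SSM recurrence below. This is the generic textbook form only, not FalconMamba's actual architecture: its selective (input-dependent) SSM blocks are described in the technical report.

```python
# Toy scalar linear state-space recurrence (illustrative only):
#   h_t = a * h_{t-1} + b * x_t      (hidden state update)
#   y_t = c * h_t                    (output readout)
# Real SSM layers use vector/matrix parameters, and selective SSMs like
# Mamba make a, b, c depend on the input x_t.
def ssm_scan(a: float, b: float, c: float, xs: list[float]) -> list[float]:
    h, ys = 0.0, []
    for x in xs:
        h = a * h + b * x   # state carries a decaying summary of the past
        ys.append(c * h)
    return ys

# An impulse input shows the exponentially decaying memory of the state.
print(ssm_scan(0.5, 1.0, 2.0, [1.0, 0.0, 0.0]))  # [2.0, 1.0, 0.5]
```

Because the state is a fixed-size summary, generation cost per token stays constant with sequence length, which is the main efficiency argument for SSMs over attention.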
Papers:
- [FalconMamba Technical Report, Zuo et al. 2024](https://arxiv.org/abs/2410.05355)

More details on the new models and their performance can also be found in our [FalconMamba blogpost](https://huggingface.co/blog/falconmamba).
| **Artefact** | **Link** | **Type** | **Details** |
|---------------------|------------------------------------------------------------------|-------------------------|-------------------------------------------------------------------|
| 🐍 **FalconMamba-7B** | [Here](https://huggingface.co/tiiuae/falcon-mamba-7b) | *pretrained model* | 7B-parameter pure SSM trained on ~5,800 billion tokens. |
| FalconMamba-7B-Instruct | [Here](https://huggingface.co/tiiuae/falcon-mamba-7b-instruct) | *instruction/chat model* | FalconMamba-7B finetuned using only SFT. |
| FalconMamba-7B-pre-decay | [Here](https://huggingface.co/tiiuae/falcon-mamba-7b-pre-decay) | *pretrained model* | FalconMamba-7B pre-decay checkpoint. |
| FalconMamba-7B-4bit | [Here](https://huggingface.co/tiiuae/falcon-mamba-7b-4bit) | *pretrained model* | 4-bit quantized version using GGUF. |
| FalconMamba-7B-Instruct-4bit | [Here](https://huggingface.co/tiiuae/falcon-mamba-7b-instruct-4bit) | *instruction/chat model* | 4-bit quantized version using GGUF. |
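To see why the 4-bit GGUF variants matter, a rough back-of-the-envelope weight-size calculation helps. The numbers below are illustrative arithmetic only, not official checkpoint sizes (real files add overhead for buffers, metadata, and non-quantized layers):

```python
# Rough estimate of raw weight storage for an n-parameter model at a
# given precision. Illustrative only; actual checkpoints differ.
def weight_gb(n_params: float, bits_per_param: float) -> float:
    return n_params * bits_per_param / 8 / 1e9  # bits -> bytes -> GB

params = 7e9  # FalconMamba-7B
print(weight_gb(params, 16))  # ~14 GB at bf16 full precision
print(weight_gb(params, 4))   # ~3.5 GB at 4-bit quantization
```

So the 4-bit quantized variants bring the weights into range for a single consumer GPU, at the usual quantization trade-off in output quality.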