JingweiZuo committed
Commit d4d0c04 • 1 Parent(s): fcc67e6

Update README.md

Files changed (1)
  1. README.md +7 -2
README.md CHANGED
@@ -9,12 +9,13 @@ pinned: false
 
 **Do you believe in a better tomorrow? We do. Our team of expert researchers live the dream and work to build it every day.**
 
+ * 🦅🐍 **The first SSLM of the Falcon series has been released open-access, featuring [FalconMamba-7B](https://huggingface.co/collections/tiiuae/falconmamba-7b-66b9a580324dd1598b0f6d4a).**
 * 🦅🦅 **The second generation of Falcon models has been released open-access, featuring [Falcon2-11B](https://huggingface.co/tiiuae/falcon-11B) and [Falcon2-11B-vlm](https://huggingface.co/tiiuae/falcon-11B-vlm).**
 * 🔥 **[Falcon-180B](https://huggingface.co/tiiuae/falcon-180b) is now available in open-access! [Try it now in our chat demo!](https://huggingface.co/spaces/tiiuae/falcon-180b-demo)**
 
 # News
 
- * 🐍 **[FalconMamba-7B](https://huggingface.co/tiiuae/falcon-mamba-7b) is now available.** The first pure SSM model of the Falcon series, released under the same permissive license. You can interact with it [here](https://huggingface.co/spaces/tiiuae/falcon-mamba-playground), and find the FalconMamba blogpost **[here](https://huggingface.co/blog/falconmamba)**.
+ * 🐍 **[FalconMamba-7B](https://huggingface.co/tiiuae/falcon-mamba-7b) is now available.** The first pure SSM model of the Falcon series, released under the same permissive license. You can interact with it [here](https://huggingface.co/spaces/tiiuae/falcon-mamba-playground), and check the **[FalconMamba Technical Report](https://arxiv.org/abs/2410.05355)** and the **[FalconMamba blogpost](https://huggingface.co/blog/falconmamba)**.
 * 📸 **[Falcon2-11B-vlm](https://huggingface.co/tiiuae/falcon-11B-vlm) is now available.** Built on top of the Falcon2-11B model and released under the same permissive license, this open-source model allows users to interact with image content via text.
 * 🎉 **TII has just released a new generation of models, starting with [Falcon2-11B](https://huggingface.co/tiiuae/falcon-11B)**, an 11B-parameter causal decoder-only model trained on over 5,000B tokens of [RefinedWeb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb) enhanced with curated corpora. The model is made available under the [TII Falcon License 2.0](https://falconllm-staging.tii.ae/falcon-2-terms-and-conditions.html), a permissive Apache 2.0-based software license which includes an [acceptable use policy](https://falconllm-staging.tii.ae/falcon-2-acceptable-use-policy.html) that promotes the responsible use of AI.
 * 💥 **TII has open-sourced Falcon-180B for research and commercial utilization!** Access the [180B](https://huggingface.co/tiiuae/falcon-180b), as well as the [7B](https://huggingface.co/tiiuae/falcon-7b)/[40B](https://huggingface.co/tiiuae/falcon-40b) models, and explore our high-quality web dataset, [RefinedWeb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb).
@@ -24,12 +25,16 @@ pinned: false
 
 We are excited to announce the release of our groundbreaking LLM with a pure SSM architecture, setting a new benchmark by outperforming all previous SSM models and achieving performance on par with leading transformer-based models.
 
+ Papers:
+ - [FalconMamba Technical Report, Zuo et al. 2024](https://arxiv.org/abs/2410.05355)
+
 More details on the new models and their performance can also be found in our [FalconMamba blogpost](https://huggingface.co/blog/falconmamba).
 
 | **Artefact** | **Link** | **Type** | **Details** |
 |---------------------|------------------------------------------------------------------|-------------------------|-------------------------------------------------------------------|
- | 🐍 **FalconMamba-7B** | [Here](https://huggingface.co/tiiuae/falcon-mamba-7b) | *pretrained model* | 7B-parameter pure SSM trained on ~5,500 billion tokens. |
+ | 🐍 **FalconMamba-7B** | [Here](https://huggingface.co/tiiuae/falcon-mamba-7b) | *pretrained model* | 7B-parameter pure SSM trained on ~5,800 billion tokens. |
 | FalconMamba-7B-Instruct | [Here](https://huggingface.co/tiiuae/falcon-mamba-7b-instruct) | *instruction/chat model* | Falcon-Mamba-7B fine-tuned using SFT only. |
+ | FalconMamba-7B-pre-decay | [Here](https://huggingface.co/tiiuae/falcon-mamba-7b-pre-decay) | *pretrained model* | Falcon-Mamba-7B pre-decay checkpoint. |
 | FalconMamba-7B-4bit | [Here](https://huggingface.co/tiiuae/falcon-mamba-7b-4bit) | *pretrained model* | 4-bit quantized version using GGUF. |
 | FalconMamba-7B-Instruct-4bit | [Here](https://huggingface.co/tiiuae/falcon-mamba-7b-instruct-4bit) | *instruction/chat model* | 4-bit quantized version using GGUF. |
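
For reference, the checkpoints listed in the table load through the standard Hugging Face transformers API. Below is a minimal sketch using the FalconMamba-7B-Instruct artefact; it assumes a recent transformers release with FalconMamba support, the accelerate package for `device_map="auto"`, and an illustrative prompt.

```python
# Minimal sketch (not from the commit): loading FalconMamba-7B-Instruct with transformers.
# Assumes a transformers version that includes FalconMamba support and enough memory
# for the bf16 weights; see the -4bit GGUF repos in the table for quantized variants.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-mamba-7b-instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # requires the accelerate package
)

# Build the prompt with the model's own chat template (illustrative question).
messages = [{"role": "user", "content": "Explain state space models in one paragraph."}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```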