JingweiZuo committed · d4d0c04 (parent: fcc67e6)
Update README.md

README.md CHANGED
**Do you believe in a better tomorrow? We do. Our team of expert researchers live the dream and work to build it every day.**

* 🦅🐍 **The first SSLM model of the Falcon series has been released open-access, featuring [FalconMamba-7B](https://huggingface.co/collections/tiiuae/falconmamba-7b-66b9a580324dd1598b0f6d4a).**
* 🦅🦅 **The second generation of Falcon models has been released open-access, featuring [Falcon2-11B](https://huggingface.co/tiiuae/falcon-11B) and [Falcon2-11B-vlm](https://huggingface.co/tiiuae/falcon-11B-vlm).**
* 🔥 **[Falcon-180B](https://huggingface.co/tiiuae/falcon-180b) is now available in open-access! [Try it now in our chat demo!](https://huggingface.co/spaces/tiiuae/falcon-180b-demo)**

# News

* 🐍 **[FalconMamba-7B](https://huggingface.co/tiiuae/falcon-mamba-7b) is now available.** The first pure SSM model of the Falcon series, released under the same permissive license. You can interact with it [here](https://huggingface.co/spaces/tiiuae/falcon-mamba-playground), and check the **[FalconMamba Technical Report](https://arxiv.org/abs/2410.05355)** and the **[FalconMamba blogpost](https://huggingface.co/blog/falconmamba)**.
* 📸 **[Falcon2-11B-vlm](https://huggingface.co/tiiuae/falcon-11B-vlm) is now available.** Built on top of the Falcon2-11B model and released under the same permissive license, this open-source model allows users to interact with image content via text.
* 🚀 **TII has just released a new generation of models, starting with [Falcon2-11B](https://huggingface.co/tiiuae/falcon-11B)**, an 11B-parameter causal decoder-only model trained on over 5,000B tokens of [RefinedWeb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb) enhanced with curated corpora. The model is made available under the [TII Falcon License 2.0](https://falconllm-staging.tii.ae/falcon-2-terms-and-conditions.html), a permissive Apache 2.0-based software license which includes an [acceptable use policy](https://falconllm-staging.tii.ae/falcon-2-acceptable-use-policy.html) that promotes the responsible use of AI.
* 🔥 **TII has open-sourced Falcon-180B for research and commercial utilization!** Access the [180B](https://huggingface.co/tiiuae/falcon-180b), as well as the [7B](https://huggingface.co/tiiuae/falcon-7b)/[40B](https://huggingface.co/tiiuae/falcon-40b) models, and explore our high-quality web dataset, [RefinedWeb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb).

We are excited to announce the release of our groundbreaking LLM with a pure SSM architecture, setting a new benchmark by outperforming all previous SSM models and achieving performance on par with leading transformer-based models.
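For readers unfamiliar with state-space models, the core idea can be sketched with a toy scalar version of the linear SSM recurrence below. This is the generic textbook form only, not FalconMamba's actual architecture: its selective (input-dependent) SSM blocks are described in the technical report.

```python
# Toy scalar linear state-space recurrence (illustrative only):
#   h_t = a * h_{t-1} + b * x_t      (hidden state update)
#   y_t = c * h_t                    (output readout)
# Real SSM layers use vector/matrix parameters, and selective SSMs like
# Mamba make a, b, c depend on the input x_t.
def ssm_scan(a: float, b: float, c: float, xs: list[float]) -> list[float]:
    h, ys = 0.0, []
    for x in xs:
        h = a * h + b * x   # state carries a decaying summary of the past
        ys.append(c * h)
    return ys

# An impulse input shows the exponentially decaying memory of the state.
print(ssm_scan(0.5, 1.0, 2.0, [1.0, 0.0, 0.0]))  # [2.0, 1.0, 0.5]
```

Because the state is a fixed-size summary, generation cost per token stays constant with sequence length, which is the main efficiency argument for SSMs over attention.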
Papers:
- [FalconMamba Technical Report, Zuo et al. 2024](https://arxiv.org/abs/2410.05355)

More details on the new models and their performance can also be found in our [FalconMamba blogpost](https://huggingface.co/blog/falconmamba).
| **Artefact** | **Link** | **Type** | **Details** |
|---------------------|------------------------------------------------------------------|-------------------------|-------------------------------------------------------------------|
| 🐍 **FalconMamba-7B** | [Here](https://huggingface.co/tiiuae/falcon-mamba-7b) | *pretrained model* | 7B-parameter pure SSM trained on ~5,800 billion tokens. |
| FalconMamba-7B-Instruct | [Here](https://huggingface.co/tiiuae/falcon-mamba-7b-instruct) | *instruction/chat model* | FalconMamba-7B finetuned using only SFT. |
| FalconMamba-7B-pre-decay | [Here](https://huggingface.co/tiiuae/falcon-mamba-7b-pre-decay) | *pretrained model* | FalconMamba-7B pre-decay checkpoint. |
| FalconMamba-7B-4bit | [Here](https://huggingface.co/tiiuae/falcon-mamba-7b-4bit) | *pretrained model* | 4-bit quantized version using GGUF. |
| FalconMamba-7B-Instruct-4bit | [Here](https://huggingface.co/tiiuae/falcon-mamba-7b-instruct-4bit) | *instruction/chat model* | 4-bit quantized version using GGUF. |
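To see why the 4-bit GGUF variants matter, a rough back-of-the-envelope weight-size calculation helps. The numbers below are illustrative arithmetic only, not official checkpoint sizes (real files add overhead for buffers, metadata, and non-quantized layers):

```python
# Rough estimate of raw weight storage for an n-parameter model at a
# given precision. Illustrative only; actual checkpoints differ.
def weight_gb(n_params: float, bits_per_param: float) -> float:
    return n_params * bits_per_param / 8 / 1e9  # bits -> bytes -> GB

params = 7e9  # FalconMamba-7B
print(weight_gb(params, 16))  # ~14 GB at bf16 full precision
print(weight_gb(params, 4))   # ~3.5 GB at 4-bit quantization
```

So the 4-bit quantized variants bring the weights into range for a single consumer GPU, at the usual quantization trade-off in output quality.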