Update README.md
README.md CHANGED

```diff
@@ -2,6 +2,7 @@
 base_model:
 - tiiuae/Falcon3-10B-Base
 library_name: transformers
+license: other
 tags:
 - mergekit
 - merge
@@ -23,7 +24,7 @@ Quantizations available [here](https://huggingface.co/arcee-ai/Virtuoso-Lite-GGU
 - **Distillation Data:**
   - ~1.1B tokens/logits from Deepseek-v3’s training data.
   - Logit-level distillation using a proprietary “fusion merging” approach for maximum fidelity.
-- **License:** [
+- **License:** [falcon-llm-license](https://falconllm.tii.ae/falcon-terms-and-conditions.html)
 
 ### Background on Deepseek Distillation
 Deepseek-v3 serves as the teacher model, from which we capture logits across billions of tokens. Rather than standard supervised fine-tuning, Virtuoso-Lite applies a full logit-level replication to preserve the most crucial insights from the teacher. This approach enables:
@@ -76,6 +77,6 @@ Virtuoso-Lite demonstrates strong results across multiple benchmarks (e.g., BBH,
 - **Content Generation Risks:** Like any language model, Virtuoso-Lite can generate potentially harmful or biased content if prompted in certain ways.
 
 ### License
-**Virtuoso-Lite (10B)** is released under the [
+**Virtuoso-Lite (10B)** is released under the [falcon-llm-license License](https://falconllm.tii.ae/falcon-terms-and-conditions.html). You are free to use, modify, and distribute this model in both commercial and non-commercial applications, subject to the terms and conditions of the license.
 
 If you have questions or would like to share your experiences using Virtuoso-Lite (10B), please connect with us on social media. We’re excited to see what you build—and how this model helps you innovate!
```
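The model card text describes logit-level distillation only at a high level, and the "fusion merging" recipe itself is proprietary. As a rough illustration of the general idea (not Arcee's actual method), standard logit distillation trains the student to match the teacher's full token distribution rather than a single hard label, typically by minimizing the KL divergence between temperature-softened softmaxes. The function names, the temperature value, and the toy 4-token vocabulary below are all illustrative assumptions:

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw logits into a probability distribution at a given temperature.

    Higher temperatures flatten the distribution, exposing more of the
    teacher's "dark knowledge" about near-miss tokens.
    """
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def logit_distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over the softened vocabulary distributions.

    This is the classic distillation objective; it is zero only when the
    student reproduces the teacher's distribution exactly.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy vocabulary of 4 tokens: the student is pulled toward the teacher's
# entire distribution (including low-probability tokens), not just its argmax.
teacher = [4.0, 1.0, 0.5, -2.0]
student = [2.0, 2.0, 0.0, -1.0]
loss = logit_distillation_loss(teacher, student)
```

In practice this loss is computed per token position over billions of tokens with tensor libraries, but the per-position objective has this shape: a perfectly matching student drives the KL term to zero.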
|