Crystalcareai committed
Commit ef3a352 · verified · 1 Parent(s): efc17a8

Update README.md

Files changed (1): README.md (+3 -2)
README.md CHANGED
@@ -2,6 +2,7 @@
 base_model:
 - tiiuae/Falcon3-10B-Base
 library_name: transformers
+license: other
 tags:
 - mergekit
 - merge
@@ -23,7 +24,7 @@ Quantizations available [here](https://huggingface.co/arcee-ai/Virtuoso-Lite-GGU
 - **Distillation Data:**
   - ~1.1B tokens/logits from Deepseek-v3’s training data.
   - Logit-level distillation using a proprietary “fusion merging” approach for maximum fidelity.
-- **License:** [Apache-2.0](#license)
+- **License:** [falcon-llm-license](https://falconllm.tii.ae/falcon-terms-and-conditions.html)
 
 ### Background on Deepseek Distillation
 Deepseek-v3 serves as the teacher model, from which we capture logits across billions of tokens. Rather than standard supervised fine-tuning, Virtuoso-Lite applies a full logit-level replication to preserve the most crucial insights from the teacher. This approach enables:
@@ -76,6 +77,6 @@ Virtuoso-Lite demonstrates strong results across multiple benchmarks (e.g., BBH,
 - **Content Generation Risks:** Like any language model, Virtuoso-Lite can generate potentially harmful or biased content if prompted in certain ways.
 -
 ### License
-**Virtuoso-Lite (10B)** is released under the [Apache-2.0 License](https://www.apache.org/licenses/LICENSE-2.0). You are free to use, modify, and distribute this model in both commercial and non-commercial applications, subject to the terms and conditions of the license.
+**Virtuoso-Lite (10B)** is released under the [falcon-llm-license License](https://falconllm.tii.ae/falcon-terms-and-conditions.html). You are free to use, modify, and distribute this model in both commercial and non-commercial applications, subject to the terms and conditions of the license.
 
 If you have questions or would like to share your experiences using Virtuoso-Lite (10B), please connect with us on social media. We’re excited to see what you build—and how this model helps you innovate!
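
For context on the distillation note in the diff above: the “fusion merging” method is proprietary and not published, so the sketch below shows only what generic logit-level distillation means, i.e. training the student against the teacher's full output distribution rather than hard labels. It uses the standard temperature-scaled KL loss in PyTorch; the function name, signature, and `temperature` parameter are illustrative assumptions, not details of the Virtuoso-Lite pipeline, and it assumes the student and teacher share a vocabulary.

```python
# Generic logit-level distillation loss (illustrative; NOT the proprietary
# "fusion merging" approach referenced in the model card).
import torch
import torch.nn.functional as F

def logit_distillation_loss(student_logits: torch.Tensor,
                            teacher_logits: torch.Tensor,
                            temperature: float = 1.0) -> torch.Tensor:
    """Mean per-token KL(teacher || student) over the vocabulary.

    Both tensors are (batch, seq_len, vocab_size). Assumes matching
    tokenizers; real cross-model distillation (e.g. Deepseek-v3 teacher,
    Falcon3 student) must align vocabularies separately.
    """
    vocab = student_logits.size(-1)
    # Flatten (batch, seq_len) into one token axis so "batchmean"
    # averages the KL divergence over every token position.
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1).view(-1, vocab)
    p_teacher = F.softmax(teacher_logits / temperature, dim=-1).view(-1, vocab)
    # The T^2 factor is the usual gradient-scale correction from
    # Hinton et al. (2015) when distilling with a softened softmax.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature ** 2
```

In a full pipeline the teacher logits for the ~1.1B-token corpus would typically be precomputed and streamed to the student trainer; that plumbing, and whatever “fusion merging” adds on top, is omitted here.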