Crystalcareai committed
Commit ef3a352 · verified · 1 Parent(s): efc17a8

Update README.md

Files changed (1): README.md (+3 -2)
README.md CHANGED
@@ -2,6 +2,7 @@
 base_model:
 - tiiuae/Falcon3-10B-Base
 library_name: transformers
+license: other
 tags:
 - mergekit
 - merge
@@ -23,7 +24,7 @@ Quantizations available [here](https://huggingface.co/arcee-ai/Virtuoso-Lite-GGU
 - **Distillation Data:**
   - ~1.1B tokens/logits from Deepseek-v3’s training data.
   - Logit-level distillation using a proprietary “fusion merging” approach for maximum fidelity.
-- **License:** [Apache-2.0](#license)
+- **License:** [falcon-llm-license](https://falconllm.tii.ae/falcon-terms-and-conditions.html)
 
 ### Background on Deepseek Distillation
 Deepseek-v3 serves as the teacher model, from which we capture logits across billions of tokens. Rather than standard supervised fine-tuning, Virtuoso-Lite applies a full logit-level replication to preserve the most crucial insights from the teacher. This approach enables:
@@ -76,6 +77,6 @@ Virtuoso-Lite demonstrates strong results across multiple benchmarks (e.g., BBH,
 - **Content Generation Risks:** Like any language model, Virtuoso-Lite can generate potentially harmful or biased content if prompted in certain ways.
 -
 ### License
-**Virtuoso-Lite (10B)** is released under the [Apache-2.0 License](https://www.apache.org/licenses/LICENSE-2.0). You are free to use, modify, and distribute this model in both commercial and non-commercial applications, subject to the terms and conditions of the license.
+**Virtuoso-Lite (10B)** is released under the [falcon-llm-license License](https://falconllm.tii.ae/falcon-terms-and-conditions.html). You are free to use, modify, and distribute this model in both commercial and non-commercial applications, subject to the terms and conditions of the license.
 
 If you have questions or would like to share your experiences using Virtuoso-Lite (10B), please connect with us on social media. We’re excited to see what you build—and how this model helps you innovate!
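
For context on the distillation note in the diff above: the “fusion merging” method is proprietary and not published, so the sketch below shows only what generic logit-level distillation means, i.e. training the student against the teacher's full output distribution rather than hard labels. It uses the standard temperature-scaled KL loss in PyTorch; the function name, signature, and `temperature` parameter are illustrative assumptions, not details of the Virtuoso-Lite pipeline, and it assumes the student and teacher share a vocabulary.

```python
# Generic logit-level distillation loss (illustrative; NOT the proprietary
# "fusion merging" approach referenced in the model card).
import torch
import torch.nn.functional as F

def logit_distillation_loss(student_logits: torch.Tensor,
                            teacher_logits: torch.Tensor,
                            temperature: float = 1.0) -> torch.Tensor:
    """Mean per-token KL(teacher || student) over the vocabulary.

    Both tensors are (batch, seq_len, vocab_size). Assumes matching
    tokenizers; real cross-model distillation (e.g. Deepseek-v3 teacher,
    Falcon3 student) must align vocabularies separately.
    """
    vocab = student_logits.size(-1)
    # Flatten (batch, seq_len) into one token axis so "batchmean"
    # averages the KL divergence over every token position.
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1).view(-1, vocab)
    p_teacher = F.softmax(teacher_logits / temperature, dim=-1).view(-1, vocab)
    # The T^2 factor is the usual gradient-scale correction from
    # Hinton et al. (2015) when distilling with a softened softmax.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature ** 2
```

In a full pipeline the teacher logits for the ~1.1B-token corpus would typically be precomputed and streamed to the student trainer; that plumbing, and whatever “fusion merging” adds on top, is omitted here.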