|
--- |
|
base_model: |
|
- Qwen/Qwen2.5-1.5B-Instruct |
|
base_model_relation: finetune |
|
library_name: peft |
|
tags: |
|
- mergekit |
|
- merge |
|
- llama-factory |
|
- lora |
|
datasets: |
|
- allura-org/fujin-cleaned-stage-1 |
|
- Dampfinchen/Creative_Writing_Multiturn |
|
- ToastyPigeon/SpringDragon |
|
- allura-org/medquad_sharegpt |
|
- allura-org/scienceqa_sharegpt |
|
- Alignment-Lab-AI/orcamath-sharegpt |
|
--- |
|
# Q25-1.5-VeoLu-R2 |
|
![made with StableNoobAI-IterSPO in sd-webui-forge](veolu.png) |
|
[*A source of life and hope for the land.*](https://www.youtube.com/watch?v=TJRq1Ag2Wmw) |
|
|
|
Q25-1.5B-Veo Lu is a tiny General-Purpose Creative model, made up of a merge of bespoke finetunes on Qwen 2.5-1.5B-Instruct. |
|
|
|
Inspired by the success of [MN-12B-Mag Mell](https://huggingface.co/inflatebot/MN-12B-Mag-Mell-R1) and [MS-Meadowlark-22B](https://huggingface.co/allura-org/MS-Meadowlark-22B), Veo Lu was trained on a healthy, balanced diet of of Internet fiction, roleplaying, adventuring, and reasoning/general knowledge. |
|
|
|
The components of Veo Lu are: |
|
|
|
* Bard (pretrain, writing): [Fujin (Cleaned/extended Rosier)](https://huggingface.co/datasets/allura-org/fujin-cleaned-stage-1) |
|
* Scribe (pretrain, roleplay): [Creative Writing Multiturn](https://huggingface.co/datasets/Dampfinchen/Creative_Writing_Multiturn) |
|
* Cartographer (pretrain, adventuring): [SpringDragon](https://huggingface.co/datasets/ToastyPigeon/SpringDragon) |
|
* Alchemist (SFT, science/reasoning): [ScienceQA,](https://huggingface.co/datasets/allura-org/scienceqa_sharegpt) [MedquadQA,](https://huggingface.co/datasets/allura-org/medquad_sharegpt) [Orca Math Word Problems](https://huggingface.co/datasets/Alignment-Lab-AI/orcamath-sharegpt) |
|
|
|
This model is capable of carrying on a scene without going completely off the rails. That being said, it only has 1.5B parameters. So please, for the love of God, *manage your expectations.* |
|
Since it's Qwen, use ChatML formatting. Turn the temperature down to ~0.7-0.8 and try a dash of rep-pen. |
|
|
|
GGUFs coming soon, but honestly, the full-precision model is 3.5GB in size. You might wanna have a go at running this unquantized with vLLM. |
|
``` |
|
pip install vllm |
|
vllm serve Alfitaria/Q25-1.5B-VeoLu --max-model-len 16384 --max-num-seqs 1 |
|
``` |
|
|
|
Made by inflatebot. |
|
|
|
Special thanks to our friends at [Allura](https://huggingface.co/allura-org), and especially to [Auri](https://huggingface.co/AuriAetherwiing), who basically held my hand through the whole process. Her effort and enthusiasm carried this project forward. |
|
|
|
### Configuration |
|
|
|
The following YAML configuration was used to produce this model: |
|
|
|
```yaml |
|
base_model: Qwen/Qwen2.5-1.5B-Instruct |
|
dtype: bfloat16 |
|
merge_method: task_arithmetic |
|
parameters: |
|
normalize: 1.0 |
|
slices: |
|
- sources: |
|
- layer_range: [0, 28] |
|
model: bard |
|
parameters: |
|
weight: 1.0 |
|
- layer_range: [0, 28] |
|
model: scribe |
|
parameters: |
|
weight: 1.0 |
|
- layer_range: [0, 28] |
|
model: cartographer |
|
parameters: |
|
weight: 1.0 |
|
- layer_range: [0, 28] |
|
model: alchemist |
|
parameters: |
|
weight: 1.0 |
|
- layer_range: [0, 28] |
|
model: Qwen/Qwen2.5-1.5B-Instruct |
|
``` |