---
license: apache-2.0
base_model:
- Qwen/Qwen2.5-7B
pipeline_tag: text-generation
tags:
- not-for-all-audiences
language:
- en
library_name: transformers
---

## Model Description

Model created by analyzing and selecting the optimal layers from other Qwen2.5-7B models based on their dimensional utilization efficiency, as measured by Normalized Effective Rank (NER). NER is computed for each layer as follows (a minimal code sketch follows the list):

- Input: the weight matrix of a model layer
- Compute its singular values σᵢ (σᵢ ≥ 0); each σᵢ reflects the importance of one dimension
- Keep only values above a numerical threshold (> 1e-12)
- Sum the retained singular values: S = Σσᵢ (normalization factor)
- Form a probability distribution: pᵢ = σᵢ/S (values sum to 1)
- Compute the Shannon entropy: H = -Σ(pᵢ · log₂(pᵢ)) (information content)
- Compute the maximum possible entropy: H_max = log₂(n), where n is the number of retained singular values
- Final NER score = H/H_max, which normalizes the score to the [0, 1] range for each model layer
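
For reference, the computation above can be expressed in a few lines of PyTorch. This is a minimal illustrative sketch, not the exact implementation in ner_merge.py; the function name and the use of `torch.linalg.svdvals` are assumptions.

```python
import math

import torch


def normalized_effective_rank(weight: torch.Tensor, eps: float = 1e-12) -> float:
    """Sketch of the per-layer NER computation described above (illustrative only)."""
    # Singular values of the layer's weight matrix (all >= 0).
    sigma = torch.linalg.svdvals(weight.float())
    # Keep only values above the numerical threshold.
    sigma = sigma[sigma > eps]
    n = sigma.numel()
    if n <= 1:
        return 0.0  # entropy is zero when at most one dimension carries weight
    # Probability distribution over dimensions: p_i = sigma_i / S.
    p = sigma / sigma.sum()
    # Shannon entropy H = -sum(p_i * log2(p_i)).
    h = -(p * torch.log2(p)).sum()
    # Maximum possible entropy H_max = log2(n); NER = H / H_max in [0, 1].
    return (h / math.log2(n)).item()
```

Applying this to each named weight matrix of a loaded model yields the per-layer scores used for selection.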

## Creating Composite Model

Code here: https://huggingface.co/jeffmeloy/Qwen2.5-7B-nerd-uncensored-v1.0/blob/main/ner_merge.py

Code functions:

- Download the selected models from the Hugging Face Hub
- Calculate the Normalized Effective Rank (NER) of each layer within each model
- For each layer, identify the model/layer pair with the highest NER score
- Incrementally build a composite model using the highest-NER layer from the model pool (a sketch of this selection step follows the list)
- Save merge reports documenting the source of each layer
- Copy the config and tokenizer files from the base model
- Save the composite model with complete weights (model ready to use)
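
The layer-selection step can be illustrated as below. This is a simplified sketch that assumes per-layer NER scores and candidate state dicts are already in memory; the names `state_dicts`, `ner_scores`, and `build_composite_state_dict` are hypothetical and not taken from ner_merge.py.

```python
from typing import Dict, Tuple

import torch


def build_composite_state_dict(
    state_dicts: Dict[str, Dict[str, torch.Tensor]],  # model name -> state dict
    ner_scores: Dict[str, Dict[str, float]],          # model name -> layer name -> NER
) -> Tuple[Dict[str, torch.Tensor], Dict[str, dict]]:
    """For every layer name, take the weights from the model with the highest NER."""
    composite: Dict[str, torch.Tensor] = {}
    report: Dict[str, dict] = {}
    # Layer names from any one model; all candidates share the same architecture.
    layer_names = next(iter(state_dicts.values())).keys()
    for layer in layer_names:
        # Donor model whose version of this layer scores highest.
        best = max(ner_scores, key=lambda m: ner_scores[m].get(layer, -1.0))
        composite[layer] = state_dicts[best][layer].clone()
        report[layer] = {"source_model": best, "ner": ner_scores[best].get(layer)}
    return composite, report
```

The composite state dict is then saved alongside the base model's config and tokenizer files, as listed above.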

Config file:

```yaml
base_model: "Qwen/Qwen2.5-7B"

fine_tuned_models: # uncomment the models you want to merge
#- "Qwen/Qwen2.5-7B"
#- "Qwen/Qwen2.5-7B-Instruct"
#- "EVA-UNIT-01/EVA-Qwen2.5-7B-v0.1"
#- "FourOhFour/Vapor_v2_7B"
#- "Goekdeniz-Guelmez/Josiefied-Qwen2.5-7B-Instruct-abliterated-v2"
#- "happzy2633/qwen2.5-7b-ins-v3"
#- "huihui-ai/Qwen2.5-7B-Instruct-abliterated-v2"
#- "HumanLLMs/Humanish-Qwen2.5-7B-Instruct"
#- "Orion-zhen/Qwen2.5-7B-Instruct-Uncensored"
#- "Orion-zhen/Meissa-Qwen2.5-7B-Instruct"
#- "jeffmeloy/Qwen2.5-7B-nerd-uncensored-v0.9"
#- "jeffmeloy/Qwen2.5-7B-nerd-uncensored-v1.0"
#- "jeffmeloy/Qwen2.5-7B-nerd-uncensored-v1.1"
#- "jeffmeloy/Qwen2.5-7B-nerd-uncensored-v1.2"
#- "AmberYifan/Qwen2.5-7B-dpo-2k"
#- "sethuiyer/Qwen2.5-7B-Anvita"
#- "rombodawg/Rombos-LLM-V2.5-Qwen-7b"
#- "Cran-May/T.E-8.1"
#- "beomi/Qwen2.5-7B-Instruct-kowiki-qa"
#- "Orion-zhen/Qwen2.5-7B-Gutenberg-KTO"
#- "fblgit/cybertron-v4-qw7B-MGS"
#- "nguyentd/FinancialAdvice-Qwen2.5-7B"
#- "WhiteRabbitNeo/WhiteRabbitNeo-2.5-Qwen-2.5-Coder-7B"
#- "edgerunner-ai/EdgeRunner-Command-Nested"
#- "katanemo/Arch-Function-7B"
#- "DeepGlint-AI/llava-mlcd-qwen2.5-7b"
#- "mergekit-community/mergekit-slerp-aflqaqy"
#- "mergekit-community/mergekit-ties-inxwsfo"
#- "Qwen/Qwen2.5-Coder-7B-Instruct"
#- "Qwen/Qwen2.5-Math-7B-Instruct"
#- "Qwen/Qwen2.5-Coder-7B"
#- "Qwen/Qwen2.5-Math-7B"
#- "thomas-yanxin/XinYuan-Qwen2.5-7B-0917"
#- "jbjeong91/Qwen2.5_7B_IST_StoryGen_vanilla"
#- "AmberYifan/Qwen2.5-7B-dpo-2k-hhrlhf"
#- "jbjeong91/Qwen2.5_7B_IST_StoryGen_test2"

models_dir: "./input_models/"
output_dir: "./merged_model/"
metric_dir: "./metrics/"
```
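
As a small usage illustration, the config can be read with PyYAML before running the merge. This is a hedged sketch, not part of ner_merge.py; the filename config.yaml and the printed summary are assumptions.

```python
import yaml  # pip install pyyaml

# Assumed filename; adjust to wherever the config above is saved.
with open("config.yaml") as f:
    cfg = yaml.safe_load(f)

base_model = cfg["base_model"]
# The key parses to None while every entry is still commented out.
models = cfg.get("fine_tuned_models") or []
print(f"Merging {len(models)} fine-tuned models onto {base_model}")
print("models_dir:", cfg["models_dir"])
print("output_dir:", cfg["output_dir"])
print("metric_dir:", cfg["metric_dir"])
```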