Qwenvergence-14B-v9
metadata
base_model:
  - sometimesanotion/Lamarck-14B-v0.6
  - sthenno-com/miscii-14b-1225
  - sometimesanotion/Qwentinuum-14B-v013
  - Krystalan/DRT-o1-14B
  - sometimesanotion/Qwenvergence-14B-v3-Prose
  - arcee-ai/Virtuoso-Small
  - huihui-ai/Qwen2.5-14B-Instruct-abliterated-v2
library_name: transformers
tags:
  - mergekit
  - merge
license: apache-2.0
language:
  - en
metrics:
  - accuracy
pipeline_tag: text-generation

This model isn't meant for end users. The merge is meant to expand the range of weights and scores available on a standardized base, and it may be showing signs of overfitting. I'm leaving it public because it set new record scores on MUSR and MATH among 14B Qwen2.5 models, and that's worth studying.

This model was merged with the Model Stock merge method, using sometimesanotion/Lamarck-14B-v0.7-Base-001 as the base.
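
For readers unfamiliar with Model Stock, here is a minimal sketch of the idea as I understand it from the published method: measure the angle between the fine-tuned models' offsets from the base, turn it into an interpolation ratio, and blend the average of the fine-tuned weights back toward the base. The function name `model_stock_layer` and the flat-list representation of a layer are assumptions made for illustration. This is not mergekit's code, which works on tensors and may differ in details.

```python
import math


def model_stock_layer(base, finetuned):
    """Merge one layer, given flat weight lists for the base and >= 2 fine-tuned models.

    Hypothetical helper for illustration only; not part of mergekit.
    """
    n = len(finetuned)
    if n < 2:
        raise ValueError("Model Stock needs at least two fine-tuned models")

    # Offsets of each fine-tuned layer from the base layer.
    deltas = [[w - b for w, b in zip(ft, base)] for ft in finetuned]

    def cosine(u, v):
        dot = sum(x * y for x, y in zip(u, v))
        norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
        return dot / norm if norm > 0 else 0.0

    # Average cosine similarity over all pairs of offsets.
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    cos_theta = sum(cosine(deltas[i], deltas[j]) for i, j in pairs) / len(pairs)

    # Interpolation ratio: t = N*cos / (1 + (N - 1)*cos).
    denominator = 1 + (n - 1) * cos_theta
    if abs(denominator) < 1e-12:
        raise ZeroDivisionError("interpolation ratio is undefined for this angle")
    t = n * cos_theta / denominator

    # Blend the fine-tuned average with the base: t * average + (1 - t) * base.
    average = [sum(column) / n for column in zip(*finetuned)]
    return [t * a + (1 - t) * b for a, b in zip(average, base)]


if __name__ == "__main__":
    base = [0.0, 0.0]
    print(model_stock_layer(base, [[1.0, 0.0], [1.0, 1.0]]))
```

When the offsets agree (cosine near 1) the result stays close to the plain average of the fine-tuned models. When they are orthogonal the ratio drops to 0 and the layer falls back to the base.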

Models Merged

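The following models were included in the merge, as listed in the configuration below (shown here without the -qv suffixes):
- sometimesanotion/Lamarck-14B-v0.6
- sometimesanotion/Qwenvergence-14B-v3-Prose
- Krystalan/DRT-o1-14B
- arcee-ai/Virtuoso-Small
- sometimesanotion/Qwentinuum-14B-v013
- sthenno-com/miscii-14b-1225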

Configuration

The following YAML configuration was used to produce this model. Note that the -qv suffixes indicate that a LoRA from sometimesanotion/Abliterate-Qwenvergence was applied; that model is almost identical to huihui-ai/Qwen2.5-14B-Instruct-abliterated-v2.

```yaml
name:                Qwenvergence-14B-v9
merge_method:        model_stock
base_model:          sometimesanotion/Lamarck-14B-v0.7-Base-001
tokenizer_source:    sometimesanotion/Abliterate-Qwenvergence
dtype:               float32
out_dtype:           bfloat16
parameters:
  int8_mask:         true
  normalize:         true
  rescale:           false
models:
  - model:           sometimesanotion/Lamarck-14B-v0.6
  - model:           sometimesanotion/Qwenvergence-14B-v3-Prose-qv256
  - model:           Krystalan/DRT-o1-14B-qv128
  - model:           arcee-ai/Virtuoso-Small-qv64
  - model:           sometimesanotion/Lamarck-14B-v0.6
  - model:           sometimesanotion/Qwentinuum-14B-v013-qv512
  - model:           sthenno-com/miscii-14b-1225-qv64
  - model:           sometimesanotion/Qwenvergence-14B-v3-Prose-qv256
  - model:           sometimesanotion/Lamarck-14B-v0.6
```
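
Several entries in the models list repeat, and most carry a -qv suffix. The sketch below splits each entry into its source model and the number after -qv, then counts how often each source appears. The helper name `split_lora_suffix` is hypothetical. The card does not say what the number after -qv means, so the code only extracts it and does not interpret it.

```python
import re
from collections import Counter

# Model entries copied from the configuration above.
MODELS = [
    "sometimesanotion/Lamarck-14B-v0.6",
    "sometimesanotion/Qwenvergence-14B-v3-Prose-qv256",
    "Krystalan/DRT-o1-14B-qv128",
    "arcee-ai/Virtuoso-Small-qv64",
    "sometimesanotion/Lamarck-14B-v0.6",
    "sometimesanotion/Qwentinuum-14B-v013-qv512",
    "sthenno-com/miscii-14b-1225-qv64",
    "sometimesanotion/Qwenvergence-14B-v3-Prose-qv256",
    "sometimesanotion/Lamarck-14B-v0.6",
]


def split_lora_suffix(name):
    """Return (source_model, qv_number), with qv_number None if there is no -qv suffix.

    Hypothetical helper; the meaning of the number is not stated in this card.
    """
    match = re.fullmatch(r"(.+)-qv(\d+)", name)
    if match:
        return match.group(1), int(match.group(2))
    return name, None


parsed = [split_lora_suffix(name) for name in MODELS]
counts = Counter(source for source, _ in parsed)

if __name__ == "__main__":
    for source, occurrences in counts.most_common():
        print(f"{occurrences}x {source}")
```

Run as a script, this lists six distinct source models, with Lamarck-14B-v0.6 appearing three times and Qwenvergence-14B-v3-Prose twice.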