General discussion.

#1 opened by Lewdiculous

@Endevor - I'm just hoping for a miracle here.

slices:
  - sources:
      - model: Epiculous/Fett-uccine-Long-Noodle-7B-120k-Context
        layer_range: [0, 32]
      - model: Endevor/InfinityRP-v1-7B
        layer_range: [0, 32]
merge_method: slerp
base_model: Epiculous/Fett-uccine-Long-Noodle-7B-120k-Context
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16
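
For anyone who wants to reproduce the merge, here's a minimal sketch of running the config above through mergekit's mergekit-yaml CLI, assuming it's saved as config.yaml and the output should land in ./merged-model (both paths are just placeholders):

    import subprocess

    # Run the slerp merge defined in config.yaml; mergekit-yaml writes the
    # merged weights (and, with --copy-tokenizer, the tokenizer files)
    # into ./merged-model.
    subprocess.run(
        ["mergekit-yaml", "config.yaml", "./merged-model", "--copy-tokenizer"],
        check=True,
    )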

Quants to be uploaded:

    quantization_options = [
        "Q4_K_M", "IQ4_XS", "Q5_K_M", "Q5_K_S", "Q6_K",
        "Q8_0", "IQ3_M", "IQ3_S", "IQ3_XXS"
    ]
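
For reference, a rough sketch of how these could be produced with llama.cpp's quantize tool, assuming the merge has already been converted to an fp16 GGUF (merged-f16.gguf here) and an importance matrix (imatrix.dat) has been generated for the IQ types; the file names and the ./quantize binary path are placeholders that depend on your llama.cpp build:

    import subprocess

    quantization_options = [
        "Q4_K_M", "IQ4_XS", "Q5_K_M", "Q5_K_S", "Q6_K",
        "Q8_0", "IQ3_M", "IQ3_S", "IQ3_XXS",
    ]

    for quant in quantization_options:
        # llama.cpp usage: ./quantize [--imatrix file] input.gguf output.gguf TYPE
        cmd = ["./quantize", "merged-f16.gguf", f"merged-{quant}.gguf", quant]
        if quant.startswith("IQ"):
            # IQ quants benefit from (and some sizes require) an importance matrix.
            cmd = ["./quantize", "--imatrix", "imatrix.dat",
                   "merged-f16.gguf", f"merged-{quant}.gguf", quant]
        subprocess.run(cmd, check=True)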

Testing at --contextsize 12288... It "works": InfinityRP alone would basically break above 8192, but this merge can still produce responses that make sense at 12K, with some continuity inconsistencies. It mostly holds up, and the formatting is fine. Even at slightly less context, say 10240, it's already a boost, but I'd need to evaluate the writing quality over longer sessions in practice.
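
For reference, --contextsize is a koboldcpp flag, so the test launch was presumably something along these lines; the runner and the GGUF filename are assumptions on my part:

    import subprocess

    # Assumed koboldcpp launch at 12288 context; the model path is a placeholder
    # for one of the quants listed above.
    subprocess.run([
        "python", "koboldcpp.py",
        "--model", "merged-Q5_K_M.gguf",
        "--contextsize", "12288",
    ], check=True)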

@Endevor - To infinity and beyond!
