Hello, I would like to ask you how to merge the 20B model?

#1
by win10 - opened

Hello, I would like to ask you how to merge the 20B model?
Your approach to MLewd-ReMM-L2-Chat-20B has produced errors.

I use Mergekit; this tool lets you use layers of multiple models to create only one.
The approach used for this model is the same as the others, chunks of 16 layers. What are the errors you got?
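For reference, a chunk-of-16 stacked merge looks roughly like this in mergekit's YAML "slices" format (a sketch only: the model names and exact ranges below are illustrative, not the exact recipe of this model):

```yaml
# Illustrative sketch: interleave 16-layer chunks of two llama2 13B
# finetunes (40 layers each) into one taller ~20B stack.
slices:
  - sources:
      - model: Undi95/MLewd-L2-13B      # placeholder donor model
        layer_range: [0, 16]
  - sources:
      - model: Undi95/ReMM-v2-L2-13B    # placeholder donor model
        layer_range: [8, 24]
  - sources:
      - model: Undi95/MLewd-L2-13B
        layer_range: [16, 32]
  - sources:
      - model: Undi95/ReMM-v2-L2-13B
        layer_range: [24, 40]
merge_method: passthrough  # stack the slices as-is, no weight averaging
dtype: float16
```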

Hello, the problem is solved. How do I merge a custom_code model?

I don't really understand your question. What is "custom_code"? Do you want to merge the 20B with another model?
Sadly, a 20B can only be merged with other 20Bs, but you can do it, with Mergekit as always.
You can try to apply a 13B LoRA on a 20B model, since it has the layers of a 13B, but I don't recommend it.
Here is the tool: https://github.com/cg123/mergekit
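For example, blending two 20B stacks with identical architecture could look like this (a sketch only; the second repo name is a made-up placeholder):

```yaml
# Sketch: weighted average of two 20B frankenmerges that have the
# same layer count and architecture. The second name is hypothetical.
models:
  - model: Undi95/MLewd-ReMM-L2-Chat-20B
    parameters:
      weight: 0.5
  - model: your-name/another-llama2-20b-stack   # hypothetical repo
    parameters:
      weight: 0.5
merge_method: linear
dtype: float16
```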

Don't hesitate to post again if you need help!

Like this: ValueError: The repository for D:/oobabooga_windows/text-generation-webui/models/THUDM_chatglm3-6b-32k contains custom code which must be executed to correctly load the model. You can inspect the repository content at https://hf.co/D:/oobabooga_windows/text-generation-webui/models/THUDM_chatglm3-6b-32k.
Please pass the argument trust_remote_code=True to allow custom code to be run.

It's because this is a different architecture than a llama2 model; you can't merge 2 models from different architectures.
Also, you're trying to load this model in ooba, but I'm not sure it supports it. I don't use ooba anymore and I don't know this model haha.
Try to merge models with the same size and the same architecture, and use mergekit to stack layers and make a bigger model, like the sketch below:

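This is a sketch only (placeholder names; both models are assumed to be 7B finetunes of the same architecture with the same 32 layers), in the bakllama layer_slices format:

```yaml
# Sketch only: stack overlapping layer ranges from two models of the
# same size and architecture. Names below are placeholders.
layer_slices:
  - model: some-user/llama2-7b-finetune-A
    start: 0
    end: 16
  - model: some-user/llama2-7b-finetune-B
    start: 8
    end: 24
  - model: some-user/llama2-7b-finetune-A
    start: 16
    end: 32
```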

How to merge models with different vocab_sizes?

Edit one config file.
If model1 has a 32000 vocab and model2 has, for example, 32021, put 32000 in the config file of model2 for the vocab size.
It's dirty, but it works.
If you just stack layers (bakllama), you don't even need to do that, it works (I tried). See the excerpt below.
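For illustration, the hack is just this one field in model2's config.json (excerpt only; everything else in the file stays untouched):

```json
{
  "vocab_size": 32000
}
```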

Sorry if it bothers you, I am getting the following error while trying to merge 20b:
Traceback (most recent call last):
  File "D:\mergekit-main\mergekit\scripts\bakllama.py", line 83, in <module>
    _main()
  File "D:\mergekit-main\mergekit\scripts\bakllama.py", line 79, in _main
    typer.run(main)
  File "D:\mergekit-main\mergekit\scripts\bakllama.py", line 72, in main
    merge_config = MergeConfiguration(
                   ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\jmes1\AppData\Local\Programs\Python\Python311\Lib\site-packages\pydantic\main.py", line 164, in __init__
    __pydantic_self__.__pydantic_validator__.validate_python(data, self_instance=__pydantic_self__)
pydantic_core._pydantic_core.ValidationError: 8 validation errors for MergeConfiguration
slices.0
  Input should be a valid dictionary or instance of OutputSliceDefinition [type=model_type, input_value=InputSliceDefinition(mode...e=(0, 8), parameters={}), input_type=InputSliceDefinition]
    For further information visit https://errors.pydantic.dev/2.4/v/model_type
slices.1
  Input should be a valid dictionary or instance of OutputSliceDefinition [type=model_type, input_value=InputSliceDefinition(mode...=(4, 12), parameters={}), input_type=InputSliceDefinition]
    For further information visit https://errors.pydantic.dev/2.4/v/model_type
slices.2
  Input should be a valid dictionary or instance of OutputSliceDefinition [type=model_type, input_value=InputSliceDefinition(mode...=(9, 16), parameters={}), input_type=InputSliceDefinition]
    For further information visit https://errors.pydantic.dev/2.4/v/model_type
slices.3
  Input should be a valid dictionary or instance of OutputSliceDefinition [type=model_type, input_value=InputSliceDefinition(mode...(13, 22), parameters={}), input_type=InputSliceDefinition]
    For further information visit https://errors.pydantic.dev/2.4/v/model_type
slices.4
  Input should be a valid dictionary or instance of OutputSliceDefinition [type=model_type, input_value=InputSliceDefinition(mode...(17, 24), parameters={}), input_type=InputSliceDefinition]
    For further information visit https://errors.pydantic.dev/2.4/v/model_type
slices.5
  Input should be a valid dictionary or instance of OutputSliceDefinition [type=model_type, input_value=InputSliceDefinition(mode...(23, 32), parameters={}), input_type=InputSliceDefinition]
    For further information visit https://errors.pydantic.dev/2.4/v/model_type
slices.6
  Input should be a valid dictionary or instance of OutputSliceDefinition [type=model_type, input_value=InputSliceDefinition(mode...(25, 32), parameters={}), input_type=InputSliceDefinition]
    For further information visit https://errors.pydantic.dev/2.4/v/model_type
slices.7
  Input should be a valid dictionary or instance of OutputSliceDefinition [type=model_type, input_value=InputSliceDefinition(mode...(33, 40), parameters={}), input_type=InputSliceDefinition]
    For further information visit https://errors.pydantic.dev/2.4/v/model_type

@win10 Copy/paste your merging .yaml, I will take a look.

layer_slices:
  - model: OpenBuddy/openbuddy-zephyr-7b-v14.1
    start: 0
    end: 8
  - model: Undi95/Toppy-M-7B
    start: 4
    end: 12
  - model: openchat/openchat_3.5
    start: 9
    end: 16
  - model: Undi95/Toppy-M-7B
    start: 13
    end: 22
  - model: openchat/openchat_3.5
    start: 17
    end: 24
  - model: Undi95/Toppy-M-7B
    start: 23
    end: 32
  - model: OpenBuddy/openbuddy-zephyr-7b-v14.1
    start: 25
    end: 32
  - model: Undi95/Toppy-M-7B
    start: 33
    end: 40

@win10 A 7B model only has 32 layers; you go up to 40, that is the problem.
Try to go only to 32 max.
Good luck! Post here if you need more help.
I can't read at 4AM lmao, forget about the 3 models, I'm dumb.
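For example, one way to fix the config above is to drop the out-of-range last slice and keep every range within the 32 layers a 7B actually has (just an illustration; the exact ranges are your choice):

```yaml
# Last slice (start: 33, end: 40) removed: these 7B models only have
# 32 layers, so every range must stay at or below 32.
layer_slices:
  - model: OpenBuddy/openbuddy-zephyr-7b-v14.1
    start: 0
    end: 8
  - model: Undi95/Toppy-M-7B
    start: 4
    end: 12
  - model: openchat/openchat_3.5
    start: 9
    end: 16
  - model: Undi95/Toppy-M-7B
    start: 13
    end: 22
  - model: openchat/openchat_3.5
    start: 17
    end: 24
  - model: Undi95/Toppy-M-7B
    start: 23
    end: 32
  - model: OpenBuddy/openbuddy-zephyr-7b-v14.1
    start: 25
    end: 32
```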
