Unable to replicate using LazyMergeKit Colab

#3
by sudhir2016 - opened

I tried to replicate this using your LazyMergeKit notebook as a learning exercise. This yaml_config gives an error saying that the positive_prompts are the same for both experts.
base_model: cognitivecomputations/dolphin-2_6-phi-2
gate_mode: cheap_embed
experts:
  - source_model: cognitivecomputations/dolphin-2_6-phi-2
    positive_prompts: [""]
  - source_model: lxuechen/phi-2-dpo
    positive_prompts: [""]
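
The error reported for this config is that both experts list the same positive_prompts. A minimal sketch of how to spot that before launching a merge, assuming the config has already been parsed into a Python dict (for example with a YAML loader). The helper name find_duplicate_prompts is hypothetical and this is not mergekit's own check, only an illustration of the condition the error message describes:

```python
from collections import defaultdict


def find_duplicate_prompts(config):
    """Return {prompts: [models]} for every prompt list shared by 2+ experts.

    Hypothetical helper for illustration; not part of mergekit.
    """
    groups = defaultdict(list)
    for expert in config.get("experts", []):
        # Lists are not hashable, so use a tuple as the grouping key.
        key = tuple(expert.get("positive_prompts", []))
        groups[key].append(expert["source_model"])
    return {prompts: models for prompts, models in groups.items() if len(models) > 1}


# The config from the post, written as the dict a YAML loader would produce.
config = {
    "base_model": "cognitivecomputations/dolphin-2_6-phi-2",
    "gate_mode": "cheap_embed",
    "experts": [
        {
            "source_model": "cognitivecomputations/dolphin-2_6-phi-2",
            "positive_prompts": [""],
        },
        {
            "source_model": "lxuechen/phi-2-dpo",
            "positive_prompts": [""],
        },
    ],
}

duplicates = find_duplicate_prompts(config)
for prompts, models in duplicates.items():
    print(f"{models} share positive_prompts {list(prompts)}")
```

Giving each expert a distinct prompt list (such as "code" and "math") makes the helper return an empty dict.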

Then I changed yaml_config like this:
MODEL_NAME = "Phixtral-Merge"
yaml_config = """
base_model: cognitivecomputations/dolphin-2_6-phi-2
gate_mode: cheap_embed
experts:
  - source_model: cognitivecomputations/dolphin-2_6-phi-2
    positive_prompts: ["code"]
  - source_model: lxuechen/phi-2-dpo
    positive_prompts: ["math"]
"""
This time I got this error:
  File "/content/mergekit/mergekit/io/lazy_tensor_loader.py", line 127, in get_tensor
    raise KeyError(key)
KeyError: 'model.embed_tokens.weight'

Please help.

Sorry, that's normal: I modified mergekit's code to produce phixtral. That branch hasn't been released yet (I still need to work on it).

sudhir2016 changed discussion status to closed
