conversion to HF

#1
by ehartford - opened
Mistral Community org

I cannot find how to convert this to HF format. @v2ray, can you please show me the way?

Mistral Community org

I'm aware of the script.

How to use it to convert 8x22B is far from self-evident.

Mistral Community org
edited May 25

@ehartford https://huggingface.co/v2ray/Mixtral-8x22B-v0.1/blob/main/convert.py

python convert.py --input-dir /path/to/original --model-size 22B --output-dir /path/to/save

Mistral Community org

Thanks!

Mistral Community org

I will do this immediately

Mistral Community org

max_position_embeddings = params["max_seq_len"]
                          ~~~~~~^^^^^^^^^^^^^^^

The script expects a "max_seq_len" key.

I see there isn't one in params.json:

{
    "dim": 6144,
    "n_layers": 56,
    "head_dim": 128,
    "hidden_dim": 16384,
    "n_heads": 48,
    "n_kv_heads": 8,
    "norm_eps": 1e-05,
    "vocab_size": 32768,
    "rope_theta": 1000000.0,
    "moe": {
        "num_experts": 8,
        "num_experts_per_tok": 2
    }
}

I will try setting it to 32768
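For anyone hitting the same error, here is a minimal sketch of that workaround: it adds the missing "max_seq_len" key to params.json before convert.py is run. The helper name `ensure_max_seq_len` is hypothetical and not part of convert.py, and the value to pass is whatever you settle on (32768 is only the first guess from this thread, not a confirmed figure).

```python
import json
from pathlib import Path


def ensure_max_seq_len(params_path, max_seq_len):
    """Add "max_seq_len" to params.json if it is missing; return the params dict.

    Hypothetical helper for illustration; it is not part of convert.py.
    """
    path = Path(params_path)
    params = json.loads(path.read_text())
    # Only fill the gap; never overwrite a value that shipped with the checkpoint.
    if "max_seq_len" not in params:
        params["max_seq_len"] = max_seq_len
        path.write_text(json.dumps(params, indent=4) + "\n")
    return params


# Example call (path is a placeholder, value is an assumption):
# ensure_max_seq_len("/path/to/original/params.json", 32768)
```

Because the helper leaves an existing value alone, it is safe to rerun, but that also means a wrong first guess has to be corrected by hand in params.json.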

Mistral Community org

I thought it was 64k?

Mistral Community org

Ok thank you 😊

Mistral Community org

OK, that worked, but it didn't create a tokenizer.

Mistral Community org

it came with this file
tokenizer.model.v3

Mistral Community org

and no tokenizer.config file

Mistral Community org

OK, it looks like I may need to rename that to tokenizer.model and then rerun.

Mistral Community org
edited May 25

@ehartford I just copied the tokenizer from 8x7B when I did the conversion for 8x22B v0.1, since it's the same one.
Wait a minute, v0.3?!

Mistral Community org

nope that didn't do it

Mistral Community org

oh yeah I could copy the tokenizer from mistral-7b-v0.3
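A rough sketch of that copy step, assuming the source model's files are already in a local directory. The helper name `copy_tokenizer` is hypothetical, and the file list is an assumption based on the tokenizer file names commonly found in Hugging Face model repos; a given repo may ship only some of them.

```python
import shutil
from pathlib import Path

# Assumed names: tokenizer files commonly found in Hugging Face model repos.
TOKENIZER_FILES = (
    "tokenizer.model",
    "tokenizer.json",
    "tokenizer_config.json",
    "special_tokens_map.json",
)


def copy_tokenizer(src_dir, dst_dir, names=TOKENIZER_FILES):
    """Copy whichever tokenizer files exist in src_dir into dst_dir.

    Hypothetical helper for illustration. Returns the names that were copied.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in names:
        if (src / name).is_file():
            shutil.copy2(src / name, dst / name)
            copied.append(name)
    return copied


# Example call (both paths are placeholders):
# copy_tokenizer("/path/to/mistral-7b-v0.3", "/path/to/save")
```

Checking the returned list is a quick way to see which tokenizer files actually made it into the converted model directory before uploading.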

Mistral Community org

ok I think I got it. Uploading

Mistral Community org

> @ehartford I just copied the tokenizer from 8x7B since it's the same one.
> Wait a minute v0.3?!

yeah - they say it's the same but with a new tokenizer

Mistral Community org

finished uploading mistral-community/mixtral-8x22B-v0.3

ehartford changed discussion status to closed
