max_position_embeddings

#5 opened by the-hir0

Why is "max_position_embeddings": 4096 in config.json, and not 32768? Don't we need to change it when increasing the context length via RoPE? Thank you in advance!

Nope, I think that's only the model's original length (4096 for Llama-2).

There is a line in config.json that pertains to linear RoPE scaling ("rope_scaling": {"factor": 8.0, "type": "linear"}, which is normally missing or null), but not all clients pay attention to it, and many have their own GUI or command-line argument to override it. Each client also has its own name for it (e.g., Ooba calls it compress_pos_emb). This is usually what sets the actual final context. It's a mess.
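As a rough illustration, here is a minimal sketch of how linear RoPE scaling relates the two numbers, assuming the usual interpretation (position indices are divided by the factor before computing the rotary angles, so the effective window is max_position_embeddings times the factor); the helper function and values here are illustrative, not this model's actual code:

```python
# Sketch: linear RoPE ("compress_pos_emb") scaling.
# Positions are divided by the scaling factor, so a model trained on 4096
# positions can address 4096 * factor tokens.

def rope_angles(position: int, dim: int, base: float = 10000.0,
                linear_factor: float = 1.0) -> list[float]:
    """Rotary angles for one position; linear_factor > 1 compresses positions."""
    scaled_pos = position / linear_factor  # the linear scaling step
    return [scaled_pos / base ** (2 * i / dim) for i in range(dim // 2)]

config = {
    "max_position_embeddings": 4096,
    "rope_scaling": {"factor": 8.0, "type": "linear"},
}

effective_context = config["max_position_embeddings"] * config["rope_scaling"]["factor"]
print(effective_context)  # 32768.0 -- the 32k window the question asks about
```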

RoPE theta scaling (or adjusted base frequency), on the other hand, is specified in config.json and read automatically by most clients that read the Hugging Face format (i.e., not GGML), but I didn't use that method for this model.
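For contrast, a sketch of the base-frequency alternative mentioned above (which, per the post, was not used for this model). The assumption here is the common convention where a "rope_theta" value in config.json is used directly as the RoPE base; the numbers are hypothetical:

```python
# Sketch: adjusted base frequency ("rope_theta") scaling.
# A larger base makes the rotary frequencies lower, stretching how far apart
# positions can be before the angles wrap around.

def rope_angles_theta(position: int, dim: int, rope_theta: float) -> list[float]:
    """Rotary angles computed with an enlarged base instead of scaled positions."""
    return [position / rope_theta ** (2 * i / dim) for i in range(dim // 2)]

config_theta = {
    "max_position_embeddings": 4096,
    "rope_theta": 1_000_000.0,  # hypothetical enlarged base, not this model's value
}

angles = rope_angles_theta(4096, dim=128, rope_theta=config_theta["rope_theta"])
```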
