Context Size of EM German Mistral

#2 by weissenbacherpwc - opened

Hi,

I am wondering what the maximum context size of the model is. Is it 2048?

It's trained with a maximum context window of 4096 tokens (the next version will probably be 8k+), and the maximum context size of the architecture is 32k tokens; see details here: https://huggingface.co/jphme/em_german_mistral_v01/blob/main/config.json . You should be able to scale the context window up to at least 8k using rope scaling (see here: https://huggingface.co/docs/transformers/main/en/model_doc/llama#transformers.LlamaConfig.rope_scaling ).
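
For illustration, here is a minimal sketch of how rope scaling could be applied when loading the model with transformers. The "linear" scaling type and factor of 2.0 are assumptions chosen to roughly double the 4096-token training window to ~8k; the exact `rope_scaling` keys and whether the Mistral architecture accepts them may depend on your transformers version.

```python
# Sketch: extend the usable context window via RoPE scaling (assumptions:
# "linear" scaling, factor 2.0 to go from 4096 to ~8192 tokens).
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

model_id = "jphme/em_german_mistral_v01"

# Load the model config and attach a rope_scaling entry.
config = AutoConfig.from_pretrained(model_id)
config.rope_scaling = {"type": "linear", "factor": 2.0}  # assumed format

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    config=config,
    torch_dtype="auto",
)
```

With linear scaling, the position indices are effectively compressed by the given factor, so a factor of 2.0 stretches the 4096-token training window to roughly 8k positions, usually at some cost in quality for the longest contexts.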

jphme changed discussion status to closed
