Ollama version doesn't properly truncate tokens to 512 max

#14
by shuaiscott - opened

When using the official Ollama model of snowflake-arctic-embed-l (latest/335m - 21ab8b9b0545), if the input is longer than 512 tokens, the model fails somewhere instead of truncating and returns an all-zero embedding ([0, 0, 0, ...]).

I've checked my Ollama parameters, and this occurs when "truncate": true. Other embedding models properly truncate the input, and Ollama's INFO log reports "input truncated". I don't see that message with snowflake-arctic-embed-l.

When "truncate" is set to false, I get the expected "input length exceeds maximum context length" error.
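For anyone who wants to check whether their setup is affected, here's a minimal reproduction sketch. It assumes a local Ollama server on the default port (http://localhost:11434) and uses Ollama's `/api/embed` endpoint with its `truncate` parameter; the model name and 512-token limit are from the report above, and the helper names are my own:

```python
"""Sketch: probe whether an over-length input yields an all-zero embedding."""
import json
import urllib.request

# Ollama's embedding endpoint (default local install assumed)
OLLAMA_URL = "http://localhost:11434/api/embed"


def is_zero_embedding(vec, eps=1e-12):
    """True if every component is numerically zero -- the failure mode reported."""
    return all(abs(x) < eps for x in vec)


def embed(text, truncate=True, model="snowflake-arctic-embed-l"):
    """POST one input to Ollama's /api/embed and return the embedding vector."""
    payload = json.dumps(
        {"model": model, "input": text, "truncate": truncate}
    ).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["embeddings"][0]


def repro():
    # A prompt well past the model's 512-token context; with "truncate": true
    # this reportedly comes back all-zero instead of being truncated.
    long_text = "word " * 2000
    vec = embed(long_text, truncate=True)
    print("all-zero embedding:", is_zero_embedding(vec))


# Call repro() with an Ollama instance running to check for the bug.
```

If `is_zero_embedding` prints True for the long input but a short input embeds normally, you're hitting the same issue.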

Also just leaving a thanks for building these embedding models!