Feedback

#2 opened by KeyboardMasher

Here are my observations after trying this model with Q8_0 quantization and greedy decoding, compared against Llama 3.1 Instruct with the same quantization, sampling parameters, and system prompt:

  1. Does not handle false-premise questions well. Unlike L3.1, it does not correct the user, but makes up a wrong justification.
    Example - "Why can the numbers in a Slitherlink puzzle only go up to 2?" (they can actually go up to 3).
  2. Hallucinates about obscure real-world facts noticeably more than L3.1.
    Example - ask it about small towns around the world and compare its answers to the corresponding Wikipedia entries.
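For anyone who wants to reproduce the comparison, here is a minimal sketch of the setup described above, assuming llama-cpp-python and local Q8_0 GGUF files (the file names and the system prompt are placeholders, not the exact ones I used):

```python
# Minimal sketch: same system prompt, greedy decoding, Q8_0 quants for both models.
# Model file names below are hypothetical - substitute your own GGUF paths.
from llama_cpp import Llama

SYSTEM = "You are a helpful assistant."  # identical system prompt for both models
QUESTION = "Why can the numbers in a Slitherlink puzzle only go up to 2?"  # false premise

for path in ("this-model-Q8_0.gguf", "Meta-Llama-3.1-8B-Instruct-Q8_0.gguf"):
    llm = Llama(model_path=path, n_ctx=4096, verbose=False)
    out = llm.create_chat_completion(
        messages=[
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": QUESTION},
        ],
        temperature=0.0,  # temperature 0 -> greedy decoding (always the top token)
        max_tokens=512,
    )
    print(f"=== {path} ===")
    print(out["choices"][0]["message"]["content"])
```

With greedy decoding the outputs are deterministic, so differences between the two answers come from the models themselves rather than from sampling noise.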
