Output ends with </s>

#20
by bartowski - opened

For some reason, GGUF and the original safetensors both output </s> at the end of generation, though this shouldn't be the case (even looking at your config it doesn't make sense why it does)

Anyone else encountered this? I assume it's a problem with the model itself considering it happens even with transformers

Also happens with https://huggingface.co/gradientai/Llama-3-70B-Instruct-Gradient-1048k
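Since the thread mentions the config, one quick sanity check is to compare the `eos_token_id` declared in config.json against the one in generation_config.json; a runtime that only honors one of them could keep generating past (or literally print) the other stop tokens. A minimal sketch; the token ids below are placeholders for illustration, not the model's actual values:

```python
def eos_ids(value):
    """Normalize eos_token_id, which may be missing, an int, or a list of ints."""
    if value is None:
        return set()
    return {value} if isinstance(value, int) else set(value)

def check_eos_consistency(config, generation_config):
    """Return ids listed in generation_config.json but absent from config.json."""
    return eos_ids(generation_config.get("eos_token_id")) - eos_ids(config.get("eos_token_id"))

# Hypothetical values for illustration; read the real JSON files from the model repo.
config = {"eos_token_id": 128009}
generation_config = {"eos_token_id": [128001, 128009]}
print(check_eos_consistency(config, generation_config))  # ids a runtime might not stop on
```

If the set is non-empty, a loader that only reads config.json would never treat those ids as stop tokens.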

Gradient AI org

Hey @bartowski , thanks for your quants! Potentially, that was introduced through the training data. I will follow up on this tomorrow.

Playing around a bit on gradientai/Llama-3-8B-Instruct-Gradient-1048k I'm seeing this as well, but it mostly seems to happen on longer answers. For example, using text-generation-webui and the transformers loader, if I ask "What's the capital of France?" it'll answer "Paris." with no </s>. If I follow up with "What's good to do with kids there?" it'll generate a list of 10 items that ends with </s>

Same thing here using "RichardErkhov/gradientai_-_Llama-3-8B-Instruct-262k-gguf/Llama-3-8B-Instruct-262k.Q5_1.gguf" in LM Studio with the Llama 3 format. Here's a typical output:

USER
Tell me about climate change.

AI
Climate change refers to long-term shifts in weather patterns, temperature and sea levels that can have significant effects on our planet's ecosystems and human societies. It is caused by a variety of factors including greenhouse gases released into the atmosphere through human activities such as burning fossil fuels for transportation and electricity, deforestation, industrial processes, and livestock farming.

The most common greenhouse gas contributing to climate change is carbon dioxide (CO2), which is produced when fossil fuels are burned or when organic matter decays in the absence of oxygen. Other greenhouse gases include methane from rice paddies, livestock farms, and landfills; nitrous oxide from fertilizers used in agriculture; and fluorinated gases released by air conditioning units.

The effects of climate change vary depending on the region and can range from more frequent and severe heat waves, droughts, and floods to rising sea levels that threaten coastal cities and low-lying areas. It also increases the risk of wildfires, reduces water availability, and can lead to food insecurity.

There are several ways we can take action to help mitigate climate change, such as reducing our carbon footprint by using public transportation or electric cars, eating less meat, reducing energy consumption at home by turning off lights when not in use, and supporting policies that promote renewable energy sources. However, the most effective way to address climate change is through global cooperation and collective effort. /s (Modified since it did not display the complete tag)

USER
What is the name of the largest ocean?

AI
The Pacific Ocean is the largest ocean. /s (Modified since it did not display the complete tag)

USER
What is the capital of Canada?

AI
Ottawa is the capital of Canada. /s (Modified since it did not display the complete tag)

USER
What is the name of Jupiter's largest moon? Give me just the name.

AI
Ganymede /s (Modified since it did not display the complete tag)
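Until the model side is fixed, a common client-side workaround is to strip the stray marker after generation. A minimal sketch, assuming the marker shows up either as the full </s> tag or, as in the outputs above, as /s:

```python
def strip_trailing_eos(text, eos_markers=("</s>", "/s")):
    """Remove an EOS marker the model emitted as literal text at the end of a reply."""
    cleaned = text.rstrip()
    for marker in eos_markers:
        if cleaned.endswith(marker):
            cleaned = cleaned[: -len(marker)].rstrip()
            break
    return cleaned

print(strip_trailing_eos("Ottawa is the capital of Canada. </s>"))
# prints "Ottawa is the capital of Canada."
```

This only hides the symptom; the model still wastes a few tokens emitting the marker.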

Gradient AI org

Hi all, we highly recommend running this model with transformers / torch based implementations.
We have seen some pretty weird things going on for long context if you are using, e.g., llama.cpp (and any project using llama.cpp as its engine) @bartowski @banahogg @wenrolland

Other than the output ending with /s, this model rocks in LM Studio (Q4 and Q5). I copied a complete course notebook, more than 20,000 tokens, and it's amazing how it can find and summarize the data. I can tolerate /s. :) I understand some have wilder rides with the model with llama.cpp, but it's not useless in my situation, far from it.

Gradient AI org
•
edited May 21

@wenrolland Cool! We also made this model reason over the whole 262k context length. We are still looking into this, but for now that's how the model is.

Also, here is an experiment; let me know what your thoughts are:

one.txt

Thanks to Michael Feil. Michael works at Gradient.ai.
{10k tokens of essays}

ten.txt

{10k tokens of essays}
Thanks to all the employees of Gradient.ai for proofreading this.
Next time, I'll owe you an invitation for pancakes. (Everyone's favorite dish at Gradient.ai.)

seven.txt

{10k tokens of essays}
Thanks for the great discussions I had while eating pancakes with the employees of this one company whose name I forgot.
That was an interesting discussion, with insights about their super secret project of long context modeling with Llama 3.

-> The model can answer questions like this:
What's the project Michael Feil worked on?

Sadly, I don't have enough VRAM to use the full context. Keep up the great work with these models. I did many needle tests, and it finds the information very efficiently, in both French and English texts.
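For anyone who wants to reproduce such needle tests, here is a minimal sketch of how one might build the prompts. The helper name, needle sentence, and question are made up for illustration; the haystack would be real filler text in practice:

```python
def build_needle_prompt(haystack_paragraphs, needle, depth):
    """Insert the needle at a relative depth (0.0 = start, 1.0 = end) of the haystack
    and append a retrieval question."""
    idx = round(depth * len(haystack_paragraphs))
    parts = haystack_paragraphs[:idx] + [needle] + haystack_paragraphs[idx:]
    context = "\n\n".join(parts)
    question = "What is the secret number mentioned in the text above?"
    return f"{context}\n\n{question}"

# Toy example: 10 filler paragraphs, needle buried in the middle.
haystack = [f"Filler paragraph {i}." for i in range(10)]
prompt = build_needle_prompt(haystack, "The secret number is 42.", depth=0.5)
```

Sweeping `depth` from 0.0 to 1.0 (and varying the haystack length) is the usual way to check whether retrieval quality degrades at particular positions in the context.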
