Responses from the model are being shown to other users
We have an LLM chat space: https://huggingface.co/spaces/speakleash/Bielik-7B-Instruct-v0.1. When a popular YouTuber published a video about our model, many users started using the space. Both we and the users observed that the interface shows responses to questions asked by other users.
I don't think it is a bug in our code; the code is quite standard, similar to other chat spaces.
One of the comments from YouTube: "they have a bug with this model. I noticed that sometimes replies are sent to the wrong recipients. I had a situation where I sat and watched the replies generated by people for 15 minutes (in the place where the reply to me should have been)."
@djstrong why do you have https://huggingface.co/spaces/speakleash/Bielik-7B-Instruct-v0.1/blob/main/app.py#L132 inside the GPU function? Try moving any stateful operation that doesn't require the GPU outside the function decorated with @spaces.GPU.
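To illustrate the general idea (line 132 of app.py may be something else entirely; run_on_gpu, save_results, shared_history and the other names below are hypothetical stand-ins, not the space's actual code):

import spaces

def run_on_gpu(messages):
    # hypothetical stand-in for the actual model call
    ...

def save_results(message, reply):
    # hypothetical stand-in for whatever the space persists
    ...

shared_history = []  # module-level state, shared by every request

# Pattern the advice warns about: stateful, CPU-side work inside the GPU function
@spaces.GPU
def generate_and_log(message):
    shared_history.append(message)    # touches shared state from inside the decorated function
    return run_on_gpu(shared_history)

# Suggested shape: the decorated function is a pure function of its inputs
@spaces.GPU
def generate(messages):
    return run_on_gpu(messages)

def predict(message, history):
    messages = history + [message]    # per-request state, built outside @spaces.GPU
    reply = generate(messages)
    save_results(message, reply)      # persistence also stays outside the GPU function
    return reply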
So, the main function predict should be invoked without @spaces.GPU, and the model call inside it should be the only thing decorated with @spaces.GPU:
@spaces.GPU  # the only function with the decorator
def generate_response():
    ...

def predict(message, history, system_prompt, temperature, max_new_tokens, top_k, repetition_penalty, top_p):
    prepare_data()
    yield from generate_response()
    save_results()
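For reference, a rough self-contained sketch of that split, assuming the usual transformers streaming setup with TextIteratorStreamer and a Gradio-style history of (user, assistant) pairs; save_results is a hypothetical placeholder, and this is not the actual app.py:

from threading import Thread

import spaces
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer

MODEL_ID = "speakleash/Bielik-7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)
model.to("cuda")

def save_results(message, history):
    # hypothetical stand-in for whatever per-conversation logging the space does
    pass

@spaces.GPU  # the only decorated function: GPU work only, all of its state is per-request
def generate_response(input_ids, temperature, max_new_tokens, top_k, repetition_penalty, top_p):
    streamer = TextIteratorStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
    generation_kwargs = dict(
        input_ids=input_ids.to(model.device),
        streamer=streamer,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        temperature=temperature,
        top_k=top_k,
        top_p=top_p,
        repetition_penalty=repetition_penalty,
    )
    Thread(target=model.generate, kwargs=generation_kwargs).start()
    partial = ""
    for chunk in streamer:
        partial += chunk
        yield partial

def predict(message, history, system_prompt, temperature, max_new_tokens, top_k, repetition_penalty, top_p):
    # CPU-side preparation stays outside the GPU function
    messages = [{"role": "system", "content": system_prompt}]
    for user_msg, assistant_msg in history:
        messages += [{"role": "user", "content": user_msg},
                     {"role": "assistant", "content": assistant_msg}]
    messages.append({"role": "user", "content": message})
    input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
    yield from generate_response(input_ids, temperature, max_new_tokens, top_k, repetition_penalty, top_p)
    save_results(message, history)  # persistence outside @spaces.GPU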
No, you can print stuff; it will be routed to the logs.
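For example, print statements like these show up in the Space's container logs (the request_id tag is only an illustration for correlating log lines, not something from the space's code):

import uuid

request_id = uuid.uuid4().hex[:8]                         # illustrative per-request tag
print(f"[{request_id}] starting generation", flush=True)  # stdout is captured in the Space logs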