different responses on each prediction?
Once the model is loaded, the responses vary noticeably from one prediction to the next. Are there any settings to control this behavior?
For example, for the instruction/question
"What is the capital of France?"
I get three different answers, ranging from one word to multiple sentences.
You can experiment with different generation config settings. Here's some reference documentation: https://huggingface.co/docs/transformers/main_classes/text_generation
Our pipeline has some default settings that cause it to generate a different sample each time you run it. If you follow the link below, you can see what we're using.
https://huggingface.co/databricks/dolly-v2-7b/blob/main/instruct_pipeline.py#L62
In particular, set do_sample=False.
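For reference, here is a minimal sketch of passing do_sample=False through the pipeline. The prompt is just illustrative, and loading dolly-v2-7b requires a GPU with enough memory; trust_remote_code is assumed to be needed because Dolly ships a custom pipeline class.

```python
from transformers import pipeline

# Illustrative sketch: load Dolly's instruct pipeline.
# trust_remote_code=True is assumed here because the repo defines a
# custom InstructionTextGenerationPipeline.
generator = pipeline(
    model="databricks/dolly-v2-7b",
    trust_remote_code=True,
    device_map="auto",
)

# do_sample=False switches generation to greedy decoding, so repeated
# calls with the same prompt should return the same answer.
result = generator("What is the capital of France?", do_sample=False)
print(result)
```

With sampling disabled, temperature and top_p no longer have any effect, since the model always picks the highest-probability next token.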
Thank you, that helped