Wrong Output in Usage Example Code?
#5, opened by hmomin
When I run the example code in the Usage section, I don't get the output 47.4404296875 as suggested at the bottom.
Instead, I get -221.7861328125.
Thanks
Hi,
We fixed a bug in the demo yesterday, correcting [\INST] to [/INST], and haven't yet had time to retest the reward with the corrected code. If you are using the latest code, you may well get a different output from the one listed in the Usage section.
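To illustrate why a one-character change in the tag can matter, here is a minimal sketch. It is not the demo's actual code: `build_prompt` is a hypothetical helper, and the `[INST] ... [/INST]` layout is assumed from the tags mentioned above. It only shows that the two tags produce different prompt strings, which a tokenizer would then encode differently.

```python
# Hypothetical helper for illustration only; the demo's real
# prompt-building code is not shown in this thread.
def build_prompt(user_message: str, closing_tag: str) -> str:
    """Wrap a user message in instruction tags (assumed layout)."""
    return f"[INST] {user_message} {closing_tag}"


old_prompt = build_prompt("Hello?", "[\\INST]")  # tag before the fix
new_prompt = build_prompt("Hello?", "[/INST]")   # tag after the fix

# The strings differ in one character, so the model input differs too.
print(old_prompt)
print(new_prompt)
print(old_prompt == new_prompt)
```

Since the reward model scores the exact text it receives, a difference like this could plausibly shift the reward, though by how much depends on the model.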