Is this model's long-text performance really this poor?

by BeautyCJ - opened Jul 2

Jul 2

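Following https://huggingface.co/xverse/XVERSE-13B-256K/blob/main/modeling_xverse.py#L755, I processed the input with the chat_template before running inference.
On short texts, the results are as expected.
On long texts, the results are far too poor: the output is garbled text and nonsense, with no meaningful information at all.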

Jul 2

I switched to the latest transformers environment (transformers==4.42.3, tokenizers==0.19.1) and used tokenizer.json.update and tokenizer_config.json.update; the results are the same as above.
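
For anyone trying to reproduce this, here is a minimal sketch for confirming which package versions are actually installed in the active environment. The helper name `installed_versions` is hypothetical, and the sketch assumes the PyPI distribution names `transformers` and `tokenizers`; it only reports versions and does not load the model.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("transformers", "tokenizers")):
    """Return {package name: installed version string, or None if missing}.

    `installed_versions` is an illustrative helper, not part of any library.
    """
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}=={ver}" if ver else f"{name}: not installed")
```

Running it in the same interpreter used for inference helps rule out a mismatch between the intended and the active environment.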
