How to handle truncation when concatenating multiple sequences in the pretraining phase?
#61
by
feiyulv
- opened
Hi, when pretraining, we concatenate multiple sequences into one 8192-token batch. How should the last sequence be handled when, together with the previous sequences, it exceeds 8192?
- discard the last sequence, and pad the previous sequences to 8192
- truncate the last sequence so that the batch ends at exactly 8192
Which strategy do we use?
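To make the two options concrete, here is a minimal sketch of what I mean. It is only an illustration, not the actual pretraining code: the function name `pack_row`, the `strategy` argument, and `pad_id=0` are all hypothetical, and what happens to the discarded or cut-off tokens afterwards (dropped, or carried into the next batch) is left open.

```python
from typing import List, Tuple


def pack_row(
    seqs: List[List[int]],
    max_len: int = 8192,
    pad_id: int = 0,          # hypothetical padding token id
    strategy: str = "pad",    # "pad" = option 1, "truncate" = option 2
) -> Tuple[List[int], int]:
    """Pack token sequences into one row of exactly max_len tokens.

    Returns (row, n_used), where n_used counts the input sequences that
    were placed in the row, fully or partially.
    """
    if strategy not in ("pad", "truncate"):
        raise ValueError(f"unknown strategy: {strategy}")

    row: List[int] = []
    n_used = 0
    for seq in seqs:
        remaining = max_len - len(row)
        if remaining == 0:
            break  # row is already full
        if len(seq) <= remaining:
            # The whole sequence fits, so append it unchanged.
            row.extend(seq)
            n_used += 1
            continue
        if strategy == "truncate":
            # Option 2: keep only the part of the sequence that still fits.
            row.extend(seq[:remaining])
            n_used += 1
        # Option 1: leave the overflowing sequence out of this row.
        break

    # Fill whatever space is left with padding (a no-op if the row is full).
    row.extend([pad_id] * (max_len - len(row)))
    return row, n_used


if __name__ == "__main__":
    toy = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    print(pack_row(toy, max_len=8, strategy="pad"))
    print(pack_row(toy, max_len=8, strategy="truncate"))
```

With option 1 the row contains only complete sequences but wastes some positions on padding; with option 2 every position holds a real token but the last sequence is cut in the middle.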