Special tokens for instruction template

#3
by Weker - opened

I hope this is a good place to ask this, but when I run the original Llama 3 model, special tokens like

  • '<|begin_of_text|>'
  • '<|start_header_id|>'
  • '<|end_header_id|>'

get tokenized into one token each (128000, 128006, and 128007 respectively), but when running this model every special token gets tokenized into many more tokens. For example, '<|start_header_id|>' gets translated to:

  • 128000 - ''
  • 27 - '<'
  • 91 - '|'
  • 2527 - 'start'
  • 8932 - '_header'
  • 851 - '_id'
  • 91 - '|'
  • 29 - '>'

Is that intended behavior, or am I doing something wrong? I noticed this when I used a lot of short "user" sections and ran out of context fast.
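To illustrate the difference: when a tokenizer treats these markers as added special tokens, each one maps to a single reserved ID; when it doesn't, the marker falls through to the ordinary subword vocabulary and is split into many pieces. The sketch below is a toy model of that behavior, not the real Llama 3 BPE tokenizer — the special-token IDs are the ones quoted above, and per-character IDs stand in for BPE pieces.

```python
import re

# Special-token IDs as quoted in the post (Llama 3 instruct template).
SPECIAL_TOKENS = {
    "<|begin_of_text|>": 128000,
    "<|start_header_id|>": 128006,
    "<|end_header_id|>": 128007,
}

def encode(text, treat_specials_atomically=True):
    """Toy encoder: split out special tokens first when enabled,
    otherwise fall back to per-character IDs (a stand-in for BPE)."""
    if not treat_specials_atomically:
        return [ord(c) for c in text]
    pattern = "(" + "|".join(re.escape(t) for t in SPECIAL_TOKENS) + ")"
    ids = []
    for piece in re.split(pattern, text):
        if piece in SPECIAL_TOKENS:
            ids.append(SPECIAL_TOKENS[piece])  # one ID per marker
        else:
            ids.extend(ord(c) for c in piece)  # placeholder for BPE pieces
    return ids

# Atomic handling: the whole marker becomes a single ID.
print(encode("<|start_header_id|>"))  # [128006]
# Without special handling, the same marker is split into 19 pieces,
# which is how context gets eaten up in templated chats.
print(len(encode("<|start_header_id|>", treat_specials_atomically=False)))
```

This is why the template markers burning many tokens each adds up quickly when a conversation has many short sections.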

Edit: I am using Text generation web UI. I don't know if that is relevant.
