Stanisław Szymczyk
sszymczyk
Recent Activity
New activity
3 days ago
AIDC-AI/Marco-o1:Can you provide code for inference with MCTS?
New activity
3 days ago
allenai/Llama-3.1-Tulu-3-70B:Reason behind not using special tokens in the prompt format?
New activity
5 days ago
mistralai/Mistral-Large-Instruct-2411:The curse of the Consolidated Safetensors strikes again...
sszymczyk's activity
Can you provide code for inference with MCTS?
4
#3 opened 3 days ago
by
sszymczyk
Reason behind not using special tokens in the prompt format?
1
#2 opened 4 days ago
by
Doctor-Shotgun
The curse of the Consolidated Safetensors strikes again...
2
#4 opened 6 days ago
by
jukofyork
What call() function parameters besides "query" can be used by the model when doing brave_search and wolfram_alpha tool calls?
#89 opened 4 months ago
by
sszymczyk
What form of the built-in brave_search and wolfram_alpha tool call output is expected by the model?
3
#88 opened 4 months ago
by
sszymczyk
The model often enters infinite generation loops
13
#32 opened 4 months ago
by
sszymczyk
Translation to German doesn't work in 3B model
#8 opened 5 months ago
by
sszymczyk
Calculation of _mscale during YARN RoPE scaling
1
#4 opened 6 months ago
by
sszymczyk
Wrong BOS and EOS tokens in tokenizer.model file
1
#12 opened 7 months ago
by
sszymczyk
Confusing ArcticDecoderLayer::forward() implementation
#11 opened 7 months ago
by
sszymczyk
Problem with repeated generation of newline characters
2
#3 opened 7 months ago
by
sszymczyk
Possible error in tokenizer.json
6
#6 opened 7 months ago
by
sszymczyk
Model is paraphrasing text instead of citing it verbatim
3
#7 opened 7 months ago
by
sszymczyk