Dmytro Dzhulgakov
dzhulgakov
5 followers · 7 following
AI & ML interests
None yet
Organizations
dzhulgakov's activity
New activity in
deepseek-ai/DeepSeek-V3
2 months ago
Bug in fp8_cast_bf16.py
1
#4 opened 2 months ago by
dzhulgakov
New activity in
meta-llama/Llama-3.2-11B-Vision-Instruct
5 months ago
Tokenizer needs to be fixed for BOS handling
#18 opened 5 months ago by
dzhulgakov
New activity in
meta-llama/Llama-3.2-1B-Instruct
5 months ago
Tokenizer BOS behavior is inconsistent with Llama 3.1
1
#5 opened 5 months ago by
dzhulgakov
New activity in
deepseek-ai/DeepSeek-Coder-V2-Instruct
8 months ago
How important is the grouped_topk?
#6 opened 8 months ago by
dzhulgakov
New activity in
google/gemma-2-9b
8 months ago
Can't repro MMLU: sliding window attention implementation seems broken
3
#11 opened 8 months ago by
dzhulgakov
New activity in
meta-llama/Meta-Llama-3-70B-Instruct
10 months ago
clean_up_tokenization_spaces=True causes formatting issues, why is it set?
2
#44 opened 10 months ago by
dzhulgakov
New activity in
google/gemma-7b-it
about 1 year ago
Running sample code gives me a shape error
1
#22 opened about 1 year ago by
dzhulgakov
New activity in
DiscoResearch/mixtral-7b-8expert
about 1 year ago
Update modeling_moe_mistral.py
2
#1 opened about 1 year ago by
bjoernp
commented
a paper
over 1 year ago
Mistral 7B
Paper
•
2310.06825
•
Published
Oct 10, 2023
•
46
•
8