[AUTOMATED] Model Memory Requirements
#21 opened 9 months ago
by
model-sizer-bot
What is the max sequence length that the model can compute if I use flash attention?
1
#20 opened 10 months ago
by
halfmoon039
Do I need to apply_chat_template before Supervised Fine-tuning Gemma-1.1-7b-it?
2
#19 opened 10 months ago
by
Syax19
Is 1.1 trained from the same SFT model as 1.0?
1
#18 opened 10 months ago
by
chujiezheng
Fine-tuning error: "triu_tril_cuda_template" not implemented for 'BFloat16'
2
#17 opened 10 months ago
by
Saicy
Update README.md
#16 opened 10 months ago
by
ssalvo41
TemplateError: System role not supported
7
#15 opened 10 months ago
by
luogy
Consider adding <start_of_context> and <stop_of_context> or similar special tokens for context ingestion.
#13 opened 10 months ago
by
qnixsynapse
loss padding_side
1
#12 opened 10 months ago
by
NickyNicky
Why is this completely broken?
2
#11 opened 10 months ago
by
rombodawg
Number of parameters
8
#9 opened 10 months ago
by
HugoLaurencon