Bowen Peng
bloc97
AI & ML interests
Machine Learning, Computer Graphics, Language Models
bloc97's activity
How did you train this without going OOM in RAM & VRAM?
3
#15 opened 9 months ago by vicplus
VRAM usage for full 128k tokens
7
#5 opened about 1 year ago by Hypersniper
sliding_window = 131072? Sliding window attention doesn't work for 128?
1
#4 opened about 1 year ago by keyishen
Hardware requirements for the model.
2
#1 opened about 1 year ago by Sc0urge