csabakecskemeti posted an update 2 days ago
-UPDATED-
4-bit inference is working! The blogpost has been updated with a code snippet and requirements.txt.
https://devquasar.com/uncategorized/all-about-amd-and-rocm/
-UPDATED-
I've played around with an MI100 and ROCm and collected my experience in a blogpost:
https://devquasar.com/uncategorized/all-about-amd-and-rocm/
Unfortunately, I could not get inference or training to work with the model loaded in 8-bit, or get BnB working, but I did everything else and documented my findings.
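For reference, 4-bit loading in this stack is typically done through the standard transformers + bitsandbytes API, roughly as sketched below. This is a generic illustration, not the snippet from the blogpost: the model name is an example, and it assumes a ROCm-compatible bitsandbytes build is installed.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Standard QLoRA-style 4-bit quantization settings (NF4)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_id = "meta-llama/Meta-Llama-3-8B"  # example model, not from the post
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # place layers on the available GPU (e.g. the MI100)
)

# Quick smoke test of generation
inputs = tokenizer("Hello from ROCm!", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

The same `BitsAndBytesConfig` is what a QLoRA fine-tune would pass when loading the base model, so getting this path working is the prerequisite for both inference and training in 4-bit.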

How many tokens per second do you receive? I didn't see this on the blog.


Good callout, I'll add it this evening.
Llama 3 8B q8 was around 80 t/s generation.

QLoRA model loaded in 4-bit

[Attachment: Screenshot From 2025-03-03 17-08-25.png]