The AMD Instinct MI50 (~$110) is surprisingly fast for inference with quantized models.
This runs Llama 3.1 8B at Q8 with llama.cpp:
DevQuasar/Mi50
A short blog post about the hardware:
http://devquasar.com/uncategorized/amd-radeon-instinct-mi50-cheap-inference/