Improved precision / reduced frequency of NaN outputs, allow bf16 t5, f32 rmsnorm, larger clamp f708e90 aredden committed on Sep 7, 2024
Remove f8 flux, instead configure at load, improved quality & corrected configs 1f9e684 aredden committed on Aug 24, 2024
Dynamic swap with cublas linear / optional improved precision with vram drawback 37bd8c1 aredden committed on Aug 24, 2024
Remove unnecessary code, hide prints behind debug flag, hide warnings 0f3134f aredden committed on Aug 20, 2024
Fix non-offload inference & add option to load from prequantized flux 2f2c44c aredden committed on Aug 18, 2024