Fix issue where cublas linear not being installed caused a TypeError 56c313c aredden committed on Oct 12, 2024
Improve precision / reduce frequency of NaN outputs; allow bf16 T5, fp32 RMSNorm, and a larger clamp f708e90 aredden committed on Sep 7, 2024
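The numerics changes in f708e90 (fp32 RMSNorm, a larger clamp to avoid fp16 overflow) follow a common stability pattern: normalize in fp32 and clamp fp16 intermediates to the representable range. A minimal sketch of that pattern; the class and helper names are illustrative, not the repo's actual code:

```python
import torch

class RMSNorm(torch.nn.Module):
    """RMSNorm that computes in fp32 regardless of input dtype."""

    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = torch.nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Normalize in fp32 for stability, then cast back to the input dtype.
        x32 = x.float()
        x32 = x32 * torch.rsqrt(x32.pow(2).mean(-1, keepdim=True) + self.eps)
        return (x32 * self.weight.float()).to(x.dtype)

def clamp_activations(x: torch.Tensor, limit: float = 65504.0) -> torch.Tensor:
    # fp16 overflows past +/-65504; clamping intermediates keeps an
    # occasional large value from turning into inf and then NaN.
    return x.clamp(-limit, limit) if x.dtype == torch.float16 else x
```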
Fix issues with loading F8Linear from a state dict when init_scale is not initialized and the module is loaded from a meta device 3ddaa67 aredden committed on Sep 1, 2024
Fix issue where LoRA alpha is incorrect when the LoRA comes from a transformers checkpoint 7a7b2c1 aredden committed on Aug 28, 2024
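For context on 7a7b2c1: transformers-style LoRA checkpoints carry a per-module alpha, and the merged update must be scaled by alpha / rank, so reading the wrong alpha (or assuming it equals the rank) silently mis-scales the weights. A hedged sketch of that scaling, with illustrative names rather than the repo's API:

```python
import torch

def merge_lora(weight: torch.Tensor, lora_a: torch.Tensor,
               lora_b: torch.Tensor, alpha: float) -> torch.Tensor:
    # lora_a: (rank, in_features), lora_b: (out_features, rank).
    # The update is scaled by alpha / rank; if the checkpoint's alpha
    # is ignored, the LoRA's effective strength comes out wrong.
    rank = lora_a.shape[0]
    return weight + (alpha / rank) * (lora_b @ lora_a)
```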
Small fix for issue where f16 CublasLinear layers weren't being used even when available 6d82dcc aredden committed on Aug 28, 2024
Merge pull request #3 from aredden/improved_precision af20799 aredden committed on Aug 24, 2024
Remove f8 flux and configure it at load instead; improved quality & corrected configs 1f9e684 aredden committed on Aug 24, 2024
Fix issue where a torch.dtype value throws an error when converted to a dtype fb7df61 aredden committed on Aug 24, 2024
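fb7df61 reads like the classic coercion bug: a value that is already a torch.dtype being fed to a string-to-dtype conversion. A defensive helper in that spirit (coerce_dtype is a hypothetical name, not the repo's API):

```python
import torch

def coerce_dtype(value) -> torch.dtype:
    # Already a dtype: converting again (e.g. getattr(torch, value))
    # would raise, so pass it through unchanged.
    if isinstance(value, torch.dtype):
        return value
    # Accept strings like "float16" or "torch.float16" (Python 3.9+).
    dtype = getattr(torch, str(value).removeprefix("torch."), None)
    if not isinstance(dtype, torch.dtype):
        raise ValueError(f"not a valid torch dtype: {value!r}")
    return dtype
```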
Dynamic swap with cublas linear / optional improved precision with a VRAM drawback 37bd8c1 aredden committed on Aug 24, 2024
Allow overriding config values from load_pipeline_from_config_path 25ae92b aredden committed on Aug 24, 2024
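load_pipeline_from_config_path is the function named in the log; how 25ae92b actually layers the overrides is not shown, so the merging below is an assumption: caller-supplied keyword arguments winning over values read from the config file. A sketch with hypothetical names:

```python
import json

def load_config(path: str, **overrides) -> dict:
    # Hypothetical helper: kwargs passed by the caller take precedence
    # over the values stored in the JSON config on disk.
    with open(path) as f:
        config = json.load(f)
    config.update({k: v for k, v in overrides.items() if v is not None})
    return config

# e.g. config = load_config("config-dev.json", debug=True)  # illustrative key
```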
Remove an unnecessary synchronize, add more universal seeding & limit it when run on Windows ffa6ff7 aredden committed on Aug 21, 2024
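"More universal seeding" in ffa6ff7 presumably means seeding every RNG source at once, which is the standard reproducibility pattern. A sketch (seed_everything is an illustrative name):

```python
import random
import numpy as np
import torch

def seed_everything(seed: int) -> None:
    # Seed every RNG source so one seed reproduces the same outputs
    # no matter which library draws the random numbers.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)           # seeds CPU and all CUDA devices
    torch.cuda.manual_seed_all(seed)  # explicit CUDA seeding, for clarity
```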
Remove unnecessary code, hide prints behind a debug flag, hide warnings 0f3134f aredden committed on Aug 20, 2024
Merge branch 'main' of https://github.com/aredden/flux-fp16-acc-api into main 58082af aredden committed on Aug 20, 2024
Add device-specific configs & more input image type options, plus a small change to deriving the model spec from args e81fa57 aredden committed on Aug 20, 2024