flux new models

#1
by iafun - opened

I've messed around with Flux a bit, but I haven't set up a conversion environment yet.
It's midnight in my country right now, so I'll give it a try tomorrow or later.

thank you john, good night !

Bad news.

I did a quick test before going to bed and found that even with the latest dev version of diffusers I got an error. I then forcibly modified the diffusers code and still got an error; it seems the 8-bit quantization process slightly changes the structure of the unet.
Forcing the conversion through would produce a busted file.

If diffusers does not support quantized safetensors, it is probably necessary to first manually split the unet off from the rest and convert each part to Diffusers format separately (roughly along the lines of the sketch after this paragraph).
That would require me to look into the details of the quantization methods used by sayakpaul, lllyasviel, and others (just to make sure I'm not mistaken).
Tomorrow I have some business to attend to, and of course my routine conversion work as well, so it will have to happen in between, but I'll take the opportunity to give it a try.
I've been meaning to try it for a while anyway.
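Just to make the idea concrete, splitting by key prefix would look roughly like this. Only a sketch: the prefixes and output filenames are assumptions based on typical ComfyUI-style single-file FLUX checkpoints, not the exact ones used by any particular model.

from safetensors.torch import load_file, save_file

# Assumed ComfyUI-style prefixes -> output files (adjust to whatever the actual checkpoint uses)
PREFIXES = {
    "model.diffusion_model.": "transformer.safetensors",  # the "unet" (Flux transformer)
    "vae.": "vae.safetensors",
    "text_encoders.clip_l.": "text_encoder.safetensors",
    "text_encoders.t5xxl.": "text_encoder_2.safetensors",
}

state_dict = load_file("combined_checkpoint.safetensors")  # placeholder filename
parts = {out: {} for out in PREFIXES.values()}
for key, tensor in state_dict.items():
    for prefix, out in PREFIXES.items():
        if key.startswith(prefix):
            parts[out][key[len(prefix):]] = tensor  # drop the prefix, keep the tensor as-is
            break

for out, sd in parts.items():
    if sd:
        save_file(sd, out, metadata={"format": "pt"})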

However, my PC, and especially my GPU environment, is extremely poor, so don't expect me to be able to do it right away.
It might be faster to wait for diffusers to officially support quantized files. Or rather, I'd really like them to support it as soon as possible.😭

good night!

thank you for your effort ! don't try too hard to convert it here if it's impossible, i didn't know it was so difficult ! But it means for us that maybe future flux models won't be available here on hf.

No, no, no, the short answer is that the current version of diffusers (the newly created Flux-related part) is just buggy, and I'm sure it will be fixed in the not-too-distant future.
If it really isn't possible right now, I'll just sit tight and wait for the new version to come out.

My biggest fear is that I'll end up mass-producing broken files.

It's being discussed on GitHub, so I don't think this means it won't be supported.
https://github.com/huggingface/diffusers/issues/9053
https://github.com/huggingface/diffusers/issues/9149
https://github.com/huggingface/diffusers/discussions/7023

thank you for sharing the discussion i am really interested to read it

It was a step forward, but the reason turned out to be too stupid.😭
When the parts of the quantized model were merged into one file (probably by ComfyUI), it took the liberty of renaming the keys according to its own rules, so the key names no longer match the Diffusers side, or rather the official model.

The remaining problem is that this still breaks the behavior of from_single_file, and that the file cannot be converted (in a straightforward way) without passing personal information.
And the biggest problem is that unless Diffusers officially supports quantized files, the size of the repo will be staggeringly bloated.
The loading time of models for Inference will be extremely long, and, whether anyone cares or not, it is a fool's errand that wastes HF's RAM, VRAM and HDD resources and accelerates global warming.

If the Diffusers built into HF's Inference API fully supported NF4 or qfloat8, we could reduce the size while keeping the accuracy there...
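For what it's worth, if that support ever lands, loading in NF4 might look something like the sketch below. This is hypothetical: BitsAndBytesConfig on the diffusers side and the quantization_config argument are assumptions here, not a confirmed API.

import torch
from diffusers import FluxPipeline, FluxTransformer2DModel, BitsAndBytesConfig  # BitsAndBytesConfig: assumed

# NF4 config (assumed to mirror the usual bitsandbytes options)
nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
transformer = FluxTransformer2DModel.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    subfolder="transformer",
    quantization_config=nf4_config,  # assumed keyword
    torch_dtype=torch.bfloat16,
)
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)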

The code below is the preprocessing needed before the conversion. About half of it is just the logging part.

import json
import torch
from safetensors.torch import load_file, save_file
from pathlib import Path

# read safetensors metadata (the JSON header at the start of the file)
def read_safetensors_metadata(path):
    with open(path, 'rb') as f:
        header_size = int.from_bytes(f.read(8), 'little')  # first 8 bytes = header length
        header_json = f.read(header_size).decode('utf-8')
        header = json.loads(header_json)
        metadata = header.get('__metadata__', {})
        return metadata

# strip the ComfyUI-style prefixes so the key names match the official (Diffusers-side) model
def normalize_key(k: str):
    return k.replace("vae.", "").replace("model.diffusion_model.", "")\
        .replace("text_encoders.clip_l.transformer.text_model.", "")\
        .replace("text_encoders.t5xxl.transformer.", "")

filename = "fluxunchainedArtfulNSFW_fuT516xfp8E4m3fnV11.safetensors"
savename = Path(filename).stem + "_fixed" + Path(filename).suffix
oldlogname = Path(filename).stem + "_fixed" + ".old_keys.txt"
newlogname = Path(filename).stem + "_fixed" + ".new_keys.txt"

metadata = read_safetensors_metadata(filename)
print(json.dumps(metadata, indent=4))  # show metadata

state_dict = load_file(filename)
new_sd = dict()

keys_old = []
keys_new = []
with torch.no_grad():
    try:
        for k, v in state_dict.items():
            nkey = normalize_key(k)
            print(f"{k} => {nkey}")  # log every rename
            keys_old.append(k)
            keys_new.append(nkey)
            new_sd[nkey] = torch.dequantize(v).to(torch.bfloat16)  # upcast to bfloat16; too large...
    except Exception as e:
        print(e)

# save the renamed (and upcast) state dict, carrying the original metadata over
save_file(new_sd, savename, metadata={"format": "pt", **metadata})

# write the old and new key names to text files so the renaming can be checked later
with open(oldlogname, encoding='utf-8', mode='w') as f:
    f.write("\n".join(keys_old))
with open(newlogname, encoding='utf-8', mode='w') as f:
    f.write("\n".join(keys_new))

I think I have almost finished the program to convert ComfyUI-formatted FLUX.1 models to Diffusers format, but it is too heavy to run on my PC as it is now.😭
I let an HF space do it instead: the torch computation was too slow on the CPU space and took over 30 minutes; it ran much better on Zero GPU, but then it got stuck on the quota.

Need to optimize.
Download and convert FLUX.1 ComfyUI formatted safetensors to Diffusers and create your repo
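One direction for the optimization, assuming the main cost is holding the whole source state dict in RAM at once, would be to stream tensors one at a time with safetensors' safe_open instead of load_file. A sketch only; save_file still needs the full converted dict in memory, so this lowers the peak rather than removing it.

import torch
from safetensors import safe_open
from safetensors.torch import save_file

# same normalize_key as in the script above
def normalize_key(k: str):
    return k.replace("vae.", "").replace("model.diffusion_model.", "")\
        .replace("text_encoders.clip_l.transformer.text_model.", "")\
        .replace("text_encoders.t5xxl.transformer.", "")

src = "fluxunchainedArtfulNSFW_fuT516xfp8E4m3fnV11.safetensors"
new_sd = {}
with safe_open(src, framework="pt", device="cpu") as f:
    for k in f.keys():
        t = f.get_tensor(k)  # read one tensor from disk at a time
        new_sd[normalize_key(k)] = t.to(torch.bfloat16)
        del t
save_file(new_sd, "streamed_fixed.safetensors", metadata={"format": "pt"})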

wow awesome John ! I think this is a huge contribution here on HF from you. You seem to have a very solid informatics background.

No, I just took the appropriate pieces and spliced them together with duct tape. Ha-ha-ha.🤗

I have duplicated this space and I get an error when I submit the flux model I want to import from civitai. It starts downloading, then I get an error.

Sorry, I haven't tested downloading from Civitai yet.😅
But I'm using a module from DiffuseCraft that downloads LoRA and VAE, so if it doesn't work, I've probably done something wrong.
I'll look into it.

yeah but I wonder about which URL to use for the civitai model. Do we have to put the URL of the web page or some specific URL like the download one ?

example for Fluxunchained :
url model page : https://civitai.com/models/645943/fluxunchained-artful-nsfw-capable-fluxd-tuned-model-by-socalguitarist
download link : https://civitai.com/api/download/models/722620?type=Model&format=SafeTensor&size=pruned&fp=fp8

what's the use of the single safetensors file in the new repo ?

I just tried it now with a suitable SDXL model and the download from Civitai was successful.

For the former: the direct download link (the second URL) is the correct one. I can't mechanically narrow down the right file from the model page because there are several candidate downloads. (I could have tried all of them... but not in this case...)
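Outside the Space, fetching that direct link yourself would look roughly like this. A sketch; it assumes requests is installed and that this particular model needs no Civitai API token, and the output filename is just a placeholder.

import requests

# The direct download URL from the model page's download button (not the page URL itself)
url = "https://civitai.com/api/download/models/722620?type=Model&format=SafeTensor&size=pruned&fp=fp8"

with requests.get(url, stream=True, allow_redirects=True, timeout=60) as r:
    r.raise_for_status()
    with open("fluxunchained_fp8.safetensors", "wb") as out:  # placeholder output name
        for chunk in r.iter_content(chunk_size=1 << 20):  # 1 MiB chunks
            out.write(chunk)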

The latter is to preserve the little bit of information that gets lost in the process of converting the model, and to provide a backup for everyone.
Yntec, for example, uses that style.
But doing that with FLUX.1 would add 20GB...
https://huggingface.co/Yntec/epiCPhotoGasm/tree/main

it downloads, but the download aborts every time after many tries. I have a fast connection but maybe an unstable one. The download timer keeps running though. It's difficult to follow the download progress or to know if the download is still active.

I don't know if Civitai uses Cloudflare or some other CDN, but in any case, if you get a lucky server it finishes within minutes, and if you get a bad one you can wait for hours.
Either way, it's not as stable as HF, so the safe bet is to upload to HF first and then convert.
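If it helps, the "upload to HF first" step can be done with huggingface_hub directly; a minimal sketch (the repo id is a placeholder, and it assumes you are already logged in or pass a token):

from huggingface_hub import HfApi

api = HfApi()
repo_id = "your-username/flux-single-file-backup"  # placeholder
api.create_repo(repo_id, repo_type="model", exist_ok=True)
api.upload_file(
    path_or_fileobj="fluxunchainedArtfulNSFW_fuT516xfp8E4m3fnV11.safetensors",
    path_in_repo="fluxunchainedArtfulNSFW_fuT516xfp8E4m3fnV11.safetensors",
    repo_id=repo_id,
)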

It would be best if I could convert locally, but I dread to think how much RAM and VRAM the people who can convert locally must have. My PC is old, but it's not that low-end.

In conclusion, it seems that we have to wait for Diffusers, or rather the HF Inference API itself, to be upgraded.

The converter aborts in the middle of the process due to lack of memory, but the temporary files were still there, so I uploaded them and discovered a terrible fact:
Diffusers, or rather the PyTorch on HF's servers, probably does not support torch float8.
So, if I want to run it serverless, I have to convert it to torch.bfloat16.
I don't have that much RAM.😭

https://huggingface.co/John6666/testa
https://huggingface.co/spaces/John6666/demo

runtime error
Exit code: 1. Reason: -packages/fsspec/asyn.py", line 103, in sync
~~~
module 'torch' has no attribute 'float8_e4m3fn'
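For anyone hitting the same error: the server's torch build simply predates the float8 dtypes (they arrived around torch 2.1, if I remember right). A small guard like this, placed before touching the tensors, at least makes the failure obvious:

import torch

if not hasattr(torch, "float8_e4m3fn"):
    # This torch build has no float8 dtypes; either upgrade torch or convert the
    # checkpoint to bfloat16 on a machine whose torch does have them.
    raise RuntimeError("torch.float8_e4m3fn not available in this environment")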
