Parameter Format for Serverless Inference (Text to Image)

#1 by DeFactOfficial - opened

Hey... thank you for sharing this with the community. Your code works great, and I'm curious how you got it running, because the official docs at https://huggingface.co/docs/api-inference/tasks/text-to-image are TOTALLY wrong. I'm bringing this up so the information is clearly laid out for other devs who attempt to use the Serverless Inference endpoint for image generation tasks. BTW... if you work for HF or know anyone involved in maintaining the documentation, please let them know that the text-to-image interface laid out in the docs is VERY different from the actual interface for these models.

This issue really should be fixed - I know documentation is never a top priority at a fast-growing startup, but something like an API reference really needs to be accurate!

Here is a comparison between what's in the docs and the actual code in your space, which appears to work perfectly.

DOCUMENTED VERSION (Does Not Work):

- inputs* (string): The input text data (sometimes called the "prompt")
- parameters (object): Additional inference parameters for Text To Image
    - guidance_scale (number): A higher guidance scale encourages the model to generate images closely linked to the text prompt, but values that are too high may cause saturation and other artifacts.
    - negative_prompt (string[]): One or several prompts to guide what NOT to include in the image.
    - num_inference_steps (integer): The number of denoising steps. More denoising steps usually lead to a higher-quality image at the expense of slower inference.
    - target_size (object): The size in pixels of the output image
        - width* (integer)
        - height* (integer)
    - scheduler (string): Override the scheduler with a compatible one.
    - seed (integer): Seed for the random number generator.
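
For reference, here is what a payload following that documented schema looks like; all the values below are placeholders I made up, and (per the comparison in this thread) this shape does NOT actually work against the serverless endpoint:

    # Payload shaped exactly as the official docs describe it.
    # Placeholder values throughout; this is the DOCUMENTED shape,
    # not the one the endpoint actually accepts.
    payload = {
        "inputs": "an astronaut riding a horse",
        "parameters": {
            "guidance_scale": 7.5,
            "negative_prompt": ["blurry", "low quality"],
            "num_inference_steps": 30,
            "target_size": {"width": 1024, "height": 1024},
            "scheduler": "EulerDiscreteScheduler",  # placeholder name
            "seed": 42,
        },
    }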

NYMBO'S VERSION (Works Amazing):

    import random  # needed for the random fallback seed below

    payload = {
        "inputs": prompt,                 # the text prompt
        "is_negative": is_negative,       # negative prompt, passed as a plain string
        "steps": steps,                   # number of denoising steps
        "cfg_scale": cfg_scale,           # guidance scale
        "seed": seed if seed != -1 else random.randint(1, 1000000000),
        "strength": strength,
        "parameters": {
            "width": width,   # pass the width to the API
            "height": height  # pass the height to the API
        }
    }
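
For anyone else trying to reproduce this, here is a minimal end-to-end sketch of sending that payload to the serverless endpoint. The model ID, prompt, and parameter values are placeholders of my own choosing, and it assumes a token in the HF_TOKEN environment variable:

    import io
    import os
    import random

    import requests
    from PIL import Image

    # Placeholder model ID; swap in whichever text-to-image model you use.
    API_URL = "https://api-inference.huggingface.co/models/stabilityai/stable-diffusion-xl-base-1.0"
    HEADERS = {"Authorization": f"Bearer {os.environ['HF_TOKEN']}"}

    payload = {
        "inputs": "a watercolor lighthouse at dusk",   # the prompt
        "is_negative": "blurry, low quality",          # negative prompt as a string
        "steps": 30,
        "cfg_scale": 7,
        "seed": random.randint(1, 1000000000),
        "strength": 0.7,
        "parameters": {"width": 1024, "height": 1024},
    }

    response = requests.post(API_URL, headers=HEADERS, json=payload)
    response.raise_for_status()

    # On success, the endpoint returns raw image bytes rather than JSON.
    image = Image.open(io.BytesIO(response.content))
    image.save("output.png")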

P.S. Happy to help with fixing these docs... I'm not technically a technical writer, but I like to think I write better than the average developer :)

Yeah, TBH I got this running mostly by hacking together stuff that I saw working in other spaces. The API inference docs are actually brand new as of this month, I believe, so I imagine quite a bit of them could still be corrected, and critical documentation is still missing for some tasks. I'm just glad we have the API inference docs at all; they were badly needed. Feels like parameter syntax is magic knowledge reserved for wizards.

I don't work at HF (pls reach out if ur hiring, Clem), but I do know they are working on this diligently. There's also been a recent effort to modernize the inference infrastructure, and there have been lots of changes. Hopefully some of these inconsistencies get smoothed out sooner rather than later; I think they will.

P.S. thank you for the glazing!

Thank you so much for the work you've done to curate all these spaces, @Nymbo. I actually just learned something else very important from one of your other spaces: the availability of serverless inference for LLMs on the hub is not being accurately reported through either the API or the web UI. The Command R+ model supposedly had inference turned off (based on its hub page), but I discovered it's actually turned ON, because one of your spaces was using it! (I believe the same may be true for Llama 405B...)
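
If anyone wants to check this for themselves, the simplest test I know of is to just probe the model and look at the HTTP status, regardless of what the hub page claims. The model IDs below are my best guesses for the models mentioned above, and the 200/503 interpretation assumes the API's usual behavior:

    import os

    import requests

    HEADERS = {"Authorization": f"Bearer {os.environ['HF_TOKEN']}"}

    def probe(model_id: str) -> str:
        """POST a tiny request and interpret the HTTP status code."""
        url = f"https://api-inference.huggingface.co/models/{model_id}"
        r = requests.post(url, headers=HEADERS, json={"inputs": "ping"})
        if r.status_code == 200:
            return "available"
        if r.status_code == 503:
            return "available but cold (model is loading)"
        return f"unavailable (HTTP {r.status_code})"

    # Model IDs are my assumptions for the models discussed above.
    for model_id in [
        "CohereForAI/c4ai-command-r-plus",
        "meta-llama/Meta-Llama-3.1-405B-Instruct",
    ]:
        print(model_id, "->", probe(model_id))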
