import spaces
import gradio as gr
import torch
#import transformers
#from transformers import AutoTokenizer
#from transformers import pipeline
from diffusers import StableDiffusionXLPipeline, UNet2DConditionModel, EulerDiscreteScheduler
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

base = "stabilityai/stable-diffusion-xl-base-1.0"
repo = "ByteDance/SDXL-Lightning"
ckpt = "sdxl_lightning_4step_unet.safetensors"  # Use the correct ckpt for your step setting!

# Holds the loaded pipeline so it is created at most once per process.
pipe_box = []


@spaces.GPU()
def main():
    def init():
        """Load the SDXL pipeline onto the GPU and cache it in pipe_box."""
        device = "cuda:0"
        # SDXL-Lightning UNet loading is currently disabled, so the base SDXL UNet is used.
        #unet = UNet2DConditionModel.from_config(base, subfolder="unet").to(device, torch.float16)
        #unet.load_state_dict(load_file(hf_hub_download(repo, ckpt), device=device))
        #pipe = StableDiffusionXLPipeline.from_pretrained(base, unet=unet, torch_dtype=torch.float16, variant="fp16").to(device)
        pipe = StableDiffusionXLPipeline.from_pretrained(base, torch_dtype=torch.float16, variant="fp16").to(device)

        # Ensure sampler uses "trailing" timesteps.
        pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config, timestep_spacing="trailing")
        pipe_box.append(pipe)

    #init()

    @spaces.GPU()
    def run():
        # Load the pipeline only if this process has not loaded it yet;
        # calling init() unconditionally would append a new copy on every click.
        if not pipe_box:
            init()
        pipe = pipe_box[0]
        # Ensure using the same inference steps as the loaded model and CFG set to 0.
        # Note: 4 steps with guidance_scale=0 matches the 4-step SDXL-Lightning UNet,
        # which is commented out in init() above.
        image = pipe("A cat", num_inference_steps=4, guidance_scale=0).images[0]
        image.save("output.png")
        # Return the image itself: Image.save() returns None, which would leave gr.Image empty.
        return image

    '''
    tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
    model = transformers.AutoModelForCausalLM.from_pretrained(
      'mosaicml/mpt-7b-instruct',
      trust_remote_code=True
    )
    pipe = pipeline('text-generation', model=model, tokenizer=tokenizer, device='cuda:0')

    INSTRUCTION_KEY = "### Instruction:"
    RESPONSE_KEY = "### Response:"
    INTRO_BLURB = "Below is an instruction that describes a task. Write a response that appropriately completes the request."
    PROMPT_FOR_GENERATION_FORMAT = """{intro}
    {instruction_key}
    {instruction}
    {response_key}
    """.format(
        intro=INTRO_BLURB,
        instruction_key=INSTRUCTION_KEY,
        instruction="{instruction}",
        response_key=RESPONSE_KEY,
    )

    example = "James decides to run 3 sprints 3 times a week. He runs 60 meters each sprint. How many total meters does he run a week? Explain before answering."
    fmt_ex = PROMPT_FOR_GENERATION_FORMAT.format(instruction=example)

    @spaces.GPU
    def run():
        with torch.autocast('cuda', dtype=torch.bfloat16):
            return(
                pipe('Here is a recipe for vegan banana bread:\n',
                    max_new_tokens=100,
                    do_sample=True,
                    use_cache=True))
    '''

    # Minimal UI: one button that triggers generation and one image output.
    with gr.Blocks() as app:
        btn = gr.Button()
        #outp=gr.Textbox()
        outp = gr.Image()
        btn.click(run, None, outp)
    app.launch()


if __name__ == "__main__":
    main()