tttoaster committed (verified)
Commit 070a748 · Parent(s): 4086dbb

Update app.py

Files changed (1): app.py (+13 -2)
app.py CHANGED

@@ -26,7 +26,7 @@ from flask import Flask
 import json
 from typing import Optional
 import cv2
-from diffusers import AutoencoderKL, UNet2DConditionModel, EulerDiscreteScheduler
+from diffusers import AutoencoderKL, UNet2DConditionModel, EulerDiscreteScheduler, StableDiffusionImg2ImgPipeline
 
 pyrootutils.setup_root(__file__, indicator=".project-root", pythonpath=True)
 
@@ -185,6 +185,10 @@ class LLMService:
 
         self.visual_encoder.to(self.vit_sd_device, dtype=self.dtype)
 
+        model_id_or_path = "stablediffusionapi/realistic-vision-v51"
+        self.vae_pipe = StableDiffusionImg2ImgPipeline.from_pretrained(model_id_or_path, torch_dtype=torch.float16)
+        self.vae_pipe = self.vae_pipe.to(self.vit_sd_device)
+
         self.boi_token_id = self.tokenizer.encode(BOI_TOKEN, add_special_tokens=False)[0]
         self.eoi_token_id = self.tokenizer.encode(EOI_TOKEN, add_special_tokens=False)[0]
 
@@ -355,6 +359,13 @@ def generate(text_list, image_list, max_new_tokens, force_boi, force_bbox):
         for img_idx in range(output['num_gen_imgs']):
             img_feat = img_gen_feat[img_idx:img_idx + 1]
             generated_image = service.sd_adapter.generate(image_embeds=img_feat, num_inference_steps=50)[0]
+
+            init_image = generated_image.resize((1024, 1024))
+            prompt = ""
+            images = service.vae_pipe(prompt=prompt, image=init_image,
+                                      num_inference_steps=50, guidance_scale=8.0, strength=0.38).images
+            generated_image = images[0]
+
             image_base64 = encode_image(generated_image)
             gen_imgs_base64_list.append(image_base64)
 
@@ -628,7 +639,7 @@ SEED-X-I can follow multimodal instruction (including images with **dynamic reso
 
 SEED-X-I **does not support image manipulation**. If you want to experience **SEED-X-Edit** for high-precision image editing, please refer to [[Inference Code]](https://github.com/AILab-CVC/SEED-X).
 
-Due to insufficient GPU memory, when generating images, we need to offload the LLM to the CPU and move the de-tokenizer to the CPU, which will **result in a long processing time**. If you want to experience the normal model inference speed, you can run [[Inference Code]](https://github.com/AILab-CVC/SEED-X) locally.
+If you want to experience the normal model inference speed, you can run [[Inference Code]](https://github.com/AILab-CVC/SEED-X) locally.
 
 ## Tips:
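For reference, the refinement pass this commit adds can be exercised on its own. The sketch below is a minimal standalone version under the same assumptions as the diff (the `stablediffusionapi/realistic-vision-v51` checkpoint, fp16 weights, and a CUDA device); the `refine` helper name is introduced here for illustration and is not part of app.py.

```python
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

# Load the img2img refiner once at startup, as the commit does in LLMService.
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "stablediffusionapi/realistic-vision-v51",  # checkpoint used in the diff
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")  # assumption: a CUDA device is available

def refine(generated_image: Image.Image) -> Image.Image:
    """Lightly re-denoise a SEED-X output to sharpen local detail."""
    init_image = generated_image.resize((1024, 1024))
    result = pipe(
        prompt="",               # empty prompt: no text target, pure refinement
        image=init_image,
        num_inference_steps=50,
        guidance_scale=8.0,
        strength=0.38,           # only the last ~38% of the schedule is re-run
    )
    return result.images[0]
```

A low `strength` like 0.38 re-noises only the tail of the diffusion schedule, so the img2img pass sharpens texture while preserving the composition produced by the SEED-X de-tokenizer, and the empty prompt keeps it acting as a generic detail refiner rather than steering content.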