Shitao committed
Commit aff8a9e
1 Parent(s): ac48fca

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -39,7 +39,7 @@ More information please refer to our github repo: https://github.com/VectorSpaceLab/OmniGen
 
 ## 1. Overview
 
-OmniGen is a unified image generation model that can generate a wide range of images from multi-modal prompts. It is designed to be simple, flexible, and easy to use. We provide [inference code](#4-quick-start) so that everyone can explore more functionalities of OmniGen.
+OmniGen is a unified image generation model that can generate a wide range of images from multi-modal prompts. It is designed to be simple, flexible, and easy to use. We provide [inference code](#5-quick-start) so that everyone can explore more functionalities of OmniGen.
 
 Existing image generation models often require loading several additional network modules (such as ControlNet, IP-Adapter, Reference-Net, etc.) and performing extra preprocessing steps (e.g., face detection, pose estimation, cropping, etc.) to generate a satisfactory image. However, **we believe that the future image generation paradigm should be simpler and more flexible, that is, generating various images directly through arbitrary multi-modal instructions without the need for additional plugins and operations, similar to how GPT works in language generation.**
 
@@ -61,7 +61,7 @@ You can see details in our [paper](https://arxiv.org/abs/2409.11340).
 
 
 ## 4. What Can OmniGen do?
-![demo](./imgs/demo_cases.png)
+![demo](./demo_cases.png)
 
 OmniGen is a unified image generation model that you can use to perform various tasks, including but not limited to text-to-image generation, subject-driven generation, identity-preserving generation, image editing, and image-conditioned generation. **OmniGen doesn't need additional plugins or operations; it can automatically identify the features (e.g., required object, human pose, depth mapping) in input images according to the text prompt.**
 We showcase some examples in [inference.ipynb](https://github.com/VectorSpaceLab/OmniGen/blob/main/inference.ipynb), and in [inference_demo.ipynb](https://github.com/VectorSpaceLab/OmniGen/blob/main/inference_demo.ipynb) we show an interesting pipeline to generate and modify an image.
 
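The quick-start section the updated link points to (`#5-quick-start`) amounts to loading `OmniGenPipeline` and calling it with a text or multi-modal prompt. Below is a minimal, unofficial sketch based on the repository's published quick-start examples; the model ID `Shitao/OmniGen-v1`, the `<img><|image_1|></img>` placeholder syntax, and parameter names such as `guidance_scale` and `img_guidance_scale` follow the repo's README and may differ in later versions.

```python
# Sketch of the quick-start flow linked from this README (unofficial, for illustration).
# Assumes the OmniGen package from https://github.com/VectorSpaceLab/OmniGen is installed.
from OmniGen import OmniGenPipeline

# Load the released weights; a local checkpoint path should also work here.
pipe = OmniGenPipeline.from_pretrained("Shitao/OmniGen-v1")

# Text-to-image: a plain prompt, no auxiliary modules or preprocessing.
images = pipe(
    prompt="A curly-haired man in a red shirt is drinking tea.",
    height=1024,
    width=1024,
    guidance_scale=2.5,
    seed=0,
)
images[0].save("example_t2i.png")

# Multi-modal prompt: reference an input image inline via the <img><|image_1|></img>
# placeholder, so editing or subject-driven generation needs no separate adapter or detector.
images = pipe(
    prompt="The man in <img><|image_1|></img> is now waving his hand.",
    input_images=["./example_t2i.png"],
    height=1024,
    width=1024,
    guidance_scale=2.5,
    img_guidance_scale=1.6,
    seed=0,
)
images[0].save("example_edit.png")
```

The inline image placeholder in the second call is what lets a single prompt mix text with reference images, which is the "no additional plugins or operations" behavior the overview describes.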