init readme
README.md
CHANGED
@@ -16,7 +16,7 @@ inference: False
 
 ![不同风格、不同prompt的生成效果展示](./imgs/high-resolution.jpg)
 
-文生图模型如谷歌的Imagen、OpenAI的DALL-E 3和Stability AI的Stable Diffusion引领了AIGC和数字艺术创作的新浪潮。然而,基于SD v1.5的中文文生图模型,如
+文生图模型如谷歌的Imagen、OpenAI的DALL-E 3和Stability AI的Stable Diffusion引领了AIGC和数字艺术创作的新浪潮。然而,基于SD v1.5的中文文生图模型,如Taiyi-Diffusion-v0.1和Alt-Diffusion的效果仍然一般。中国的许多AI绘画平台仅支持英文,或依赖中译英的翻译工具。目前的开源文生图模型主要支持英文,双语支持有限。我们的工作,Taiyi-Diffusion-XL(Taiyi-XL),在这些发展的基础上,专注于保留英文理解能力的同时增强中文文生图生成能力,更好地支持双语文生图。
 
 The surge in text-to-image models like Google's Imagen, OpenAI's DALL-E 3, and Stability AI's Stable Diffusion has revolutionized digital art creation. However, the effectiveness of Chinese text-to-image models, such as taiyi-diffusion-v0.1 and alt-diffusion based on SD v1.5, remains moderate. Many AI art platforms in China support only English or rely on Chinese-to-English translation tools. Current open-source text-to-image models predominantly support English, with limited bilingual capabilities. Our work, Taiyi-Diffusion-XL (Taiyi-XL), builds on these developments, focusing on enhancing Chinese text-to-image generation while retaining English proficiency, addressing the unique challenges of bilingual language processing.
 
@@ -55,11 +55,11 @@ Our machine evaluation involved a comprehensive comparison of various models. Th
 
 ## 人类偏好评估 Human Preference Evaluation
 
-如下图所示,比较了不同模型在中英文文生图生成方面的表现。XL版本模型,如SD-XL和Taiyi-XL,在1.5版本模型如SD-v1.5和Alt-Diffusion上显示出显著改进。DALL-E 3
+如下图所示,比较了不同模型在中英文文生图生成方面的表现。XL版本模型,如SD-XL和Taiyi-XL,在1.5版本模型如SD-v1.5和Alt-Diffusion上显示出显著改进。DALL-E 3以其生动的色彩和prompt-following的能力而著称。Taiyi-XL模型偏向生成摄影风格的图片,与Midjourney较为类似,但是Taiyi-XL并在双语(中英文)文生图生成方面表现更出色。
 
 As shown in the figures below, a comparison of different models in Chinese and English text-to-image generation performance is presented. The XL version models, such as SD-XL and Taiyi-XL, show significant improvements over the 1.5 version models like SD-v1.5 and Alt-Diffusion. DALL-E 3 is renowned for its vibrant colors and its ability to closely follow text prompts, setting a high standard. Our Taiyi-XL model, with its photographic style, closely matches the performance of Midjourney and excels in bilingual (Chinese and English) text-to-image generation.
 
-尽管Taiyi-XL
+尽管Taiyi-XL可能还未能与商业模型相媲美,但它比当前双语开源模型优越不少。我们认为我们模型与商业模型的差距主要归因于训练数据的数量、质量和多样性的差异。我们的模型仅使用符合版权要求的图文数据进行训练。正如大家所知的,版权问题仍然是文生图和AIGC模型最大的问题。
 
 Although Taiyi-XL may not yet rival commercial models, it excels among current bilingual open-source models. The gap with commercial models is mainly due to differences in the quantity, quality, and diversity of training data. Our model is trained exclusively on copyright-compliant image-text data. As is well known, copyright issues remain the biggest challenge in text-to-image and AI-generated content (AIGC) models.
 