---
title: README
emoji: 🏃
colorFrom: red
colorTo: yellow
sdk: static
pinned: false
---
<img src="https://github.com/dome272/Wuerstchen/assets/61938694/0617c863-165a-43ee-9303-2a17299a0cf9">
Welcome to <b>WARP</b>, our small organization for multimodal generative models with a focus on the visual domain. So far we have worked mainly on generative image models,
and we plan to work on video models as well. Our main team consists of:

- [Pablo Pernias](https://github.com/pabloppp/)
- [Dominic Rampas](https://github.com/dome272)
- [Marc Aubreville](https://www.linkedin.com/in/marc-aubreville-48a977120/?locale=en_US)
- [Mats L. Richter](https://scholar.google.com/citations?user=xtlV5SAAAAAJ&hl=de)

A special thanks to the Hugging Face team for helping to bring our research to Diffusers, in particular [Kashif](https://github.com/kashif/), [Patrick](https://github.com/patrickvonplaten) and [Sayak](https://github.com/sayakpaul)!


Feel free to join our [Discord](https://discord.gg/BTUAzb8vFY) channel!

Models:
<details>
<summary>
  Paella
</summary>
  <img src="https://user-images.githubusercontent.com/61938694/231021615-38df0a0a-d97e-4f7a-99d9-99952357b4b1.png" width=1200>
  <ul>
    <li>A simple & straightforward text-conditional image generation model that works on quantized latents.</li>
    <li>More details can be found in the <a href="https://arxiv.org/abs/2211.07292v2">paper</a>, the <a href="https://laion.ai/blog/paella/">blog post</a> and the <a href="https://www.youtube.com/watch?v=zdE1I6kYKYc">YouTube video</a>.</li>
    <li>Only accessible through <a href="https://github.com/dome272/Paella">GitHub</a>.</li>
  </ul>
</details>

<details>
<summary>
  Würstchen
</summary>
  <img src="https://github.com/dome272/Wuerstchen/assets/61938694/647b6781-8b07-4467-ad7d-9932d0069aa3">
  <ul>
    <li>A text-to-image model that is efficient both to train and to run at inference. It achieves performance competitive with state-of-the-art methods while needing only a fraction of the compute.</li>
    <li>More details can be found in the <a href="https://arxiv.org/abs/2306.006372">paper</a>.</li>
    <li>Versions:</li>
    <ul>
      <li>v1: Only accessible through <a href="https://github.com/dome272/Wuerstchen/">GitHub</a>.</li>
      <li>v2: Accessible through <a href="https://github.com/dome272/Wuerstchen/">GitHub</a> and <a href="https://huggingface.co/docs/diffusers/main/en/api/pipelines/wuerstchen">Diffusers</a>.</li>
    </ul>
  </ul>
</details>