---
license: creativeml-openrail-m
tags:
- pytorch
- diffusers
- stable-diffusion
- text-to-image
- diffusion-models-class
- dreambooth-hackathon
- landscape
widget:
- text: a photo of ppaine landscape, NIKON Z FX, cinematic light, galaxy sky
---
# DreamBooth Hackathon '23: How can we use a text-to-image generative model to explore the cinematographic appeal of Torres del Paine 🇨🇱?
> _Torres del Paine National Park is a national park encompassing mountains, glaciers, lakes, and rivers in southern Chilean Patagonia._
> _It is also part of the End of the World Route, a tourist scenic route. [Wikipedia](https://en.wikipedia.org/wiki/Torres_del_Paine_National_Park)_
<figure>
<img src="https://huggingface.co/alkzar90/ppaine-landscape/resolve/main/assets/snowscat-H3oXiq7_bII-unsplash.jpg" alt="Torres del Paine photo by Snowscat, Unsplash">
<figcaption>Photo by <a href="https://unsplash.com/@snowscat?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Snowscat</a> on <a href="https://unsplash.com/es/fotos/H3oXiq7_bII?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a>
</figcaption>
</figure>
- Reddit post #1: [Dreambooth Hackaton: How can we use a text-to-image model to explore the cinematographic appeal of Torres del Paine 🇨🇱?](https://www.reddit.com/r/StableDiffusion/comments/109fjdu/dreambooth_hackaton_how_can_we_use_a_texttoimage/)
- Reddit post #2: [Animal statues at Torres del Paine, ppain landscape model; Dreambooth + StableDiffusion](https://www.reddit.com/r/StableDiffusion/comments/10a55pz/animal_statues_at_torres_del_paine_ppain/)
## Description
DreamBooth model for the ppaine concept, trained by alkzar90 on the alkzar90/torres-del-paine dataset.
This is a Stable Diffusion model fine-tuned with DreamBooth on `landscape` images for the hackathon's landscape theme. To use it, keep the `instance_prompt` in your text prompt and modify the rest: **a photo of ppaine landscape**
This model was created as part of the DreamBooth Hackathon 🔥. Visit the [organisation page](https://huggingface.co/dreambooth-hackathon) for instructions on how to take part!
## Cinematographic Rendering & Object/Artifact Insertion
<figure>
<img src="https://huggingface.co/alkzar90/ppaine-landscape/resolve/main/assets/dreambooth-hackaton-patagonia-cinematographics.png" alt="Torres del Paine Landscape Model - Cinematographic Renderings/Artifact Immersion">
<figcaption>Figure 1: <b>Cinematographic renderings and object/artifact insertions in the Chilean Torres del Paine national park</b>. Text prompts for the generated images, top to bottom and left to right: (i) <i>"The ppaine landscape in the middle earth, cinematic light, lord of the ring style, epic"</i>,
(ii) <i>"The ppaine landscape in the middle earth, a visible dragon skeleton bones, cinematic light, lord of the ring style, epic"</i>,
(iii) <i>"A long branches forest in the ppaine landscape, mountain peaks at the background, cinematic light, realistic, lord of the ring style, epic"</i>,
(iv) <i>"A futuristic jeep riding in ppaine landscape, cinematic light, technology"</i>,
(v) <i>"A futuristic tensor airship flying over the ppaine landscape at night, NIKON-Z-FX"</i>,
(vi) <i>"A huge tensor bridge in the ppaine landscape, cinematic light, majestic, architecture"</i>.
</figcaption>
</figure>
### Animal Statues
<figure>
<img src="https://huggingface.co/alkzar90/ppaine-landscape/resolve/main/assets/animal-statues/a-photo-of-an-ancient-stone-condor-statue-in-the-ppaine-landscape%2C-michaelangelo%2C-majestic%2C-NIKON-Z-FX%2C-28mm.png" alt="Condor statue in Torres del Paine landscape">
<figcaption>Figure 2-a: <b>Animal statues in the Chilean Torres del Paine national park</b>. Text prompt for the image: <i>"A photo of an ancient stone condor statue in the ppaine landscape, michaelangelo, majestic, NIKON-Z-FX, 28mm"</i>.
</figcaption>
</figure>
<figure>
<img src="https://huggingface.co/alkzar90/ppaine-landscape/resolve/main/assets/animal-statues/a-photo-of-a-marble-huemul-statue-in-the-ppaine-landscape%2C-majestic%2C-michaelangelo%2C-NIKON-Z-FX%2C-28mm%20(1).png" alt="Huemul marble statue in Torres del Paine landscape">
<figcaption>Figure 2-b: <b>Animal statues in the Chilean Torres del Paine national park</b>. Text prompt for the image: <i>"A photo of a marble huemul statue in the ppaine landscape, majestic, michaelangelo, NIKON-Z-FX, 28mm"</i>.
</figcaption>
</figure>
## Director's eye view
What does the concept of a director's cut mean? The [Merriam-Webster dictionary](https://www.merriam-webster.com/dictionary/director%27s%20cut#:~:text=noun,version%20created%20for%20general%20distribution) defines it as _"a version of a motion picture that is edited according to the director's wishes and that usually includes scenes cut from the version created for general distribution"_.
<figure>
<img src="https://huggingface.co/alkzar90/ppaine-landscape/resolve/main/assets/dreambooth-hackaton-patagonia-wes-anderson-cut.png" alt="Torres del Paine Landscape Model - Wes Anderson's cut">
<figcaption>Figure 3: <b>Illustration of a director's cut of the Chilean Torres del Paine national park, through Wes Anderson's eyes</b>. Text prompts for the generated images, left to right:
(i) <i>"The ppaine landscape, Wes Anderson style, cinematic light"</i>,
(ii) <i>"The ppaine landscape with a small house in the middle, Wes Anderson style, fish eye"</i>,
(iii) <i>"The ppaine landscape with a small house in the middle, Wes Anderson style, fish eye"</i>.
</figcaption>
</figure>
## Artistic Style Transfer
One way to monitor the fine-tuning process is to check how well the model transfers well-known artistic styles onto the Torres del Paine landscape.
<figure>
<img src="https://huggingface.co/alkzar90/ppaine-landscape/resolve/main/assets/dreambooth-hackaton-patagonia-landscape-painting.png" alt="Torres del Paine Landscape Model - Artist Style Painting">
<figcaption>Figure 4: <b>Artistic renderings of the Chilean Torres del Paine national park in the style of famous painters</b>. Text prompts for the generated images, top to bottom and left to right:
(i) <i>"A painting of the ppaine landscape, Vincent Van Gogh style"</i>,
(ii) <i>"A painting of the ppaine landscape, Michelangelo style"</i>,
(iii) <i>"A painting of the ppaine landscape, Botero style"</i>,
(iv) <i>"A painting of the ppaine landscape, Pierre-Auguste Renoir style"</i>,
(v) <i>"A painting of the ppaine landscape, Leonardo Da Vinci style"</i>,
(vi) <i>"A painting of the ppaine landscape, Rembrandt style"</i>.
</figcaption>
</figure>
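
The prompts in Figure 4 all follow one template, so a style check like this can be scripted. The snippet below is a minimal sketch, not the author's original code: `style_prompts` and `render_styles` are hypothetical helpers, and `pipeline` is assumed to be any callable that behaves like a loaded diffusers `StableDiffusionPipeline` (it returns an object with an `images` list).

```python
# Painter names taken from the Figure 4 caption.
PAINTERS = [
    "Vincent Van Gogh",
    "Michelangelo",
    "Botero",
    "Pierre-Auguste Renoir",
    "Leonardo Da Vinci",
    "Rembrandt",
]


def style_prompts(painters, concept="ppaine landscape"):
    """Build one 'A painting of the <concept>, <painter> style' prompt per painter."""
    return [f"A painting of the {concept}, {painter} style" for painter in painters]


def render_styles(pipeline, painters):
    """Hypothetical helper: map each style prompt to the first image generated for it.

    `pipeline` is assumed to accept a prompt string and return an object
    exposing an `images` list, as diffusers pipelines do.
    """
    return {prompt: pipeline(prompt).images[0] for prompt in style_prompts(painters)}
```

Running `render_styles` after each training checkpoint would give a comparable grid of images per checkpoint, assuming the same pipeline settings are used every time.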
## Usage
```python
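from diffusers import StableDiffusionPipeline

# Load the fine-tuned weights from the Hugging Face Hub
pipeline = StableDiffusionPipeline.from_pretrained('alkzar90/ppaine-landscape')
# The pipeline needs a text prompt; keep the "ppaine landscape" concept in it
prompt = "a photo of ppaine landscape, NIKON Z FX, cinematic light, galaxy sky"
image = pipeline(prompt).images[0]
image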
```
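
To compare prompts fairly, it can help to fix the random seed so that the prompt is the only thing that changes between runs. The following is a sketch under stated assumptions rather than part of the original card: `check_prompt` and `generate` are hypothetical helpers, the seed is arbitrary, and the `steps` and `guidance_scale` defaults are common Stable Diffusion settings, not values documented for this model.

```python
INSTANCE_TOKEN = "ppaine"  # concept token from the model's instance prompt


def check_prompt(prompt):
    """Return the prompt unchanged, or raise ValueError if it omits the concept token."""
    if INSTANCE_TOKEN not in prompt.lower():
        raise ValueError(f"prompt should mention '{INSTANCE_TOKEN}': {prompt!r}")
    return prompt


def generate(pipeline, prompt, seed=0, steps=50, guidance_scale=7.5):
    """Return the first image generated for `prompt` with a fixed random seed."""
    check_prompt(prompt)
    import torch  # lazy import: only needed when an image is actually generated

    # A seeded generator makes the initial latent noise repeatable.
    generator = torch.Generator(device="cpu").manual_seed(seed)
    result = pipeline(
        prompt,
        generator=generator,
        num_inference_steps=steps,
        guidance_scale=guidance_scale,
    )
    return result.images[0]
```

With a loaded `pipeline`, a call such as `generate(pipeline, "a photo of ppaine landscape, galaxy sky", seed=42)` should give the same image when repeated on the same setup; results may still differ across hardware or library versions.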
## References
* [DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation (Ruiz et al., 2022)](https://arxiv.org/abs/2208.12242)
* [High-Resolution Image Synthesis with Latent Diffusion Models (Rombach et al., 2022)](https://arxiv.org/abs/2112.10752)
* [Training Stable Diffusion with Dreambooth using 🧨 Diffusers (Post)](https://huggingface.co/blog/dreambooth)
* [Hugging Face DreamBooth Hackathon](https://github.com/huggingface/diffusion-models-class/tree/main/hackathon)
## Thanks to John Whitaker and Lewis Tunstall
Thanks to [John Whitaker](https://github.com/johnowhitaker) and [Lewis Tunstall](https://github.com/lewtun) for writing out and describing the initial hackathon parameters at https://huggingface.co/dreambooth-hackathon.