Probing ViTs
non-profit
AI & ML interests
We are interested in studying the representations learned by Vision Transformers.
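One lightweight way to probe learned representations is to capture intermediate activations with forward hooks. Here is a minimal sketch in plain PyTorch, run on a small stand-in encoder rather than an actual ViT — the module names and shapes below are illustrative only:

```python
import torch
import torch.nn as nn

def capture_activations(model: nn.Module, x: torch.Tensor, layer_names):
    """Run a forward pass and capture the outputs of the named submodules."""
    captured, hooks = {}, []
    modules = dict(model.named_modules())
    for name in layer_names:
        # bind `name` per iteration so each hook stores under its own key
        def save(module, inputs, output, name=name):
            captured[name] = output.detach()
        hooks.append(modules[name].register_forward_hook(save))
    try:
        model(x)
    finally:
        for h in hooks:
            h.remove()
    return captured

# Stand-in for a ViT encoder: two transformer layers, d_model=16
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=16, nhead=2, batch_first=True),
    num_layers=2,
)
reps = capture_activations(encoder, torch.randn(1, 4, 16), ["layers.0", "layers.1"])
```

The same pattern applies to a real ViT: pick the block names from `model.named_modules()` and inspect the captured tensors.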
Post · 2002
Introducing a high-quality open-preference dataset to further this line of research for image generation.
Despite being such an integral component of modern image generation, open preference datasets are a rarity!
So, we decided to work on one with the community!
Check it out here:
https://huggingface.co/blog/image-preferences
Post · 2084
The Control family of Flux from @black-forest-labs should be discussed more!
It enables structural controls like ControlNets while being significantly less expensive to run!
So, we're working on a Control LoRA training script 🤗
It's still WIP, so go easy:
https://github.com/huggingface/diffusers/pull/10130
sayakpaul authored a paper 14 days ago
Post · 1255
We are blessed with another iteration of PaliGemma: Google launches PaliGemma 2.
google/paligemma-2-release-67500e1e1dbfdd4dee27ba48
merve/paligemma2-vqav2
Post · 1465
Let 2024 be the year of video model fine-tunes!
Check it out here:
https://github.com/a-r-r-o-w/cogvideox-factory/tree/main/training/mochi-1
Post · 2586
It's been a while since we shipped native quantization support in diffusers 🧨
We currently support bitsandbytes as the official backend, but using others like torchao is already very simple.
This post is just a reminder of what's possible:
1. Loading a model with a quantization config
2. Saving a model with a quantization config
3. Loading a pre-quantized model
4. enable_model_cpu_offload()
5. Training and loading LoRAs into quantized checkpoints
Docs:
https://huggingface.co/docs/diffusers/main/en/quantization/bitsandbytes
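For intuition only (this is not the diffusers or bitsandbytes API — see the docs linked above for real usage), here is a toy sketch of the per-tensor absmax int8 scheme that such backends build on:

```python
import torch

def quantize_absmax_int8(w: torch.Tensor):
    """Per-tensor absmax int8 quantization: w ~= scale * q."""
    scale = w.abs().max() / 127.0
    q = torch.clamp((w / scale).round(), -127, 127).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor):
    return q.to(torch.float32) * scale

w = torch.randn(256, 256)
q, scale = quantize_absmax_int8(w)
w_hat = dequantize(q, scale)
# int8 storage is 4x smaller per element than float32
assert q.element_size() == 1 and w.element_size() == 4
```

The rounding error is bounded by half the scale, which is why large models tolerate it well relative to the 4x memory savings.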
Post · 1575
Cohere drops two new multilingual models!
CohereForAI/aya-expanse-8b
CohereForAI/aya-expanse-32b
Try them out here
CohereForAI/aya_expanse
Post · 2752
Did some little experimentation to resize pre-trained LoRAs on Flux. I explored two themes:
* Decrease the rank of a LoRA
* Increase the rank of a LoRA
The first one is helpful in reducing memory requirements if the LoRA is of a high rank, while the second one is merely an experiment. Another implication of this study is in the unification of LoRA ranks when you would like to torch.compile() them.
Check it out here:
sayakpaul/flux-lora-resizing
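One way to resize a LoRA — an SVD-based sketch in plain PyTorch, which may differ from the exact recipe in the report above — is to reconstruct the low-rank update and re-factorize it at the target rank:

```python
import torch

def resize_lora(A: torch.Tensor, B: torch.Tensor, new_rank: int):
    """Resize a LoRA update delta_W = B @ A to a new rank via truncated SVD.

    A: (rank, in_features), B: (out_features, rank).
    Returns factors of shapes (new_rank, in_features) and (out_features, new_rank).
    """
    delta_w = B @ A
    U, S, Vh = torch.linalg.svd(delta_w, full_matrices=False)
    U, S, Vh = U[:, :new_rank], S[:new_rank], Vh[:new_rank, :]
    # Split each singular value evenly between the two factors
    B_new = U * S.sqrt()
    A_new = S.sqrt().unsqueeze(1) * Vh
    return A_new, B_new

A, B = torch.randn(8, 32), torch.randn(64, 8)
A_small, B_small = resize_lora(A, B, new_rank=4)   # rank reduction (lossy)
A_same, B_same = resize_lora(A, B, new_rank=8)     # same rank: lossless
```

Reducing the rank keeps the top singular directions of the update; resizing to the original rank (or higher) reconstructs it exactly, which is what makes rank unification safe.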
sayakpaul authored a paper 4 months ago
Post · 1610
You can now use DoRA for your embedding layers!
PR: https://github.com/huggingface/peft/pull/2006
I have documented my journey of this specific PR in a blog post for everyone to read. The highlight of the PR was when the first author of DoRA reviewed my code.
Blog Post: https://huggingface.co/blog/ariG23498/peft-dora
Huge thanks to @BenjaminB for all the help I needed.
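For readers new to DoRA: it decomposes a weight into direction and magnitude, taking the direction from the LoRA-updated weight and learning a per-column magnitude. A plain-PyTorch sketch of the merged update (not the PEFT API; shapes and names here are illustrative):

```python
import torch

def dora_merge(W0: torch.Tensor, A: torch.Tensor, B: torch.Tensor, m: torch.Tensor):
    """DoRA-style merge (sketch): W' = m * (W0 + B @ A) / ||W0 + B @ A||_column.

    W0: (out, in) frozen weight; B @ A: low-rank update; m: (1, in) learned
    per-column magnitude. Every column of the result ends up with norm m.
    """
    W = W0 + B @ A
    column_norm = W.norm(dim=0, keepdim=True)   # (1, in)
    return m * (W / column_norm)

W0 = torch.randn(16, 8)
A, B = torch.randn(4, 8), torch.randn(16, 4)
m = torch.full((1, 8), 2.0)
W_merged = dora_merge(W0, A, B, m)
```

The same decomposition carries over to embedding matrices, which is what the PR above enables in PEFT.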
Post · 2945
Here is a hackable and minimal implementation showing how to perform distributed text-to-image generation with Diffusers and Accelerate.
Full snippet is here: https://gist.github.com/sayakpaul/cfaebd221820d7b43fae638b4dfa01ba
With @JW17
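The core idea can be sketched independently of Diffusers and Accelerate: every process runs the same pipeline but only on its own shard of the prompts (in the real snippet, Accelerate's process index plays the role of the rank — the names below are illustrative):

```python
def shard_prompts(prompts, num_processes, process_index):
    """Give each process a strided slice of the prompt list."""
    return prompts[process_index::num_processes]

prompts = [f"prompt {i}" for i in range(10)]
# Simulate 3 processes; in practice each rank computes only its own shard
shards = [shard_prompts(prompts, 3, rank) for rank in range(3)]
```

Each rank then generates images for its shard and writes results to rank-specific paths, so no coordination is needed beyond the initial split.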
Post · 4477
Flux.1-Dev-like images but in fewer steps.
Merging code (very simple), inference code, merged params: sayakpaul/FLUX.1-merged
Enjoy the Monday 🤗
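The exact recipe lives in sayakpaul/FLUX.1-merged; the "very simple" merge of two compatible checkpoints can be sketched as a linear interpolation of their state dicts (an assumption about the method — names illustrative):

```python
import torch

def merge_state_dicts(sd_a: dict, sd_b: dict, alpha: float = 0.5) -> dict:
    """Linearly interpolate two compatible state dicts: alpha*a + (1-alpha)*b."""
    return {k: alpha * sd_a[k] + (1.0 - alpha) * sd_b[k] for k in sd_a}

sd_a = {"w": torch.zeros(2, 2)}
sd_b = {"w": torch.ones(2, 2)}
merged = merge_state_dicts(sd_a, sd_b, alpha=0.25)
```

The merged weights are then loaded back into the model class as usual; the interesting part is choosing alpha so the merge keeps Dev-level quality at Schnell-level step counts.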
Post · 3793
With larger and larger diffusion transformers coming up, it's becoming increasingly important to have some good quantization tools for them.
We present our findings from a series of experiments on quantizing different diffusion pipelines based on diffusion transformers.
We demonstrate excellent memory savings with a small sacrifice in inference latency, which is expected to improve in the coming days.
Diffusers 🤗 Quanto ❤️
This was a juicy collaboration between @dacorvo and myself.
Check out the post to learn all about it
https://huggingface.co/blog/quanto-diffusers
sayakpaul updated a Space 5 months ago
Post · 2206
Were you aware that we have a dedicated guide on different prompting mechanisms to improve the image generation quality? 🧨
Takes you through simple prompt engineering, prompt weighting, prompt enhancement using GPT-2, and more.
Check out the guide here 🦯
https://huggingface.co/docs/diffusers/main/en/using-diffusers/weighted_prompts
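The guide covers the real mechanisms (e.g. via libraries like Compel); as a toy sketch of the prompt-weighting idea only — not the Diffusers API — weighting amounts to scaling per-token embeddings and renormalizing so the overall magnitude stays stable:

```python
import torch

def weight_prompt_embeddings(token_embeds: torch.Tensor, weights: torch.Tensor):
    """Scale per-token embeddings by weights, then restore the original
    mean token norm (one common prompt-weighting recipe)."""
    original_mean = token_embeds.norm(dim=-1).mean()
    weighted = token_embeds * weights.unsqueeze(-1)
    return weighted * (original_mean / weighted.norm(dim=-1).mean())

embeds = torch.randn(5, 8)                          # (tokens, hidden), toy shapes
weights = torch.tensor([1.0, 1.5, 1.0, 0.5, 1.0])   # emphasize token 1, mute token 3
out = weight_prompt_embeddings(embeds, weights)
```

Renormalizing matters: without it, heavily up-weighted prompts shift the embedding scale the text encoder was trained on, degrading generations.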
Post · 3130
What is your favorite part of our Diffusers integration of Stable Diffusion 3?
My personal favorite is the ability to run it on a variety of different GPUs with minimal code changes.
Learn more about them here:
https://huggingface.co/blog/sd3
sayakpaul authored a paper 6 months ago
Post · 1866
🧨 Diffusers 0.28.0 is out 🔥
It features the first non-generative pipeline of the library -- Marigold 🔥
Marigold shines at performing Depth Estimation and Surface Normal Estimation. It was contributed by @toshas, one of the authors of Marigold.
This release also features a massive refactor (led by @DN6) of the from_single_file() method, highlighting our efforts for making our library more amenable to community features 🤗
Check out the release notes here:
https://github.com/huggingface/diffusers/releases/tag/v0.28.0