Personal Coding Assistant
AI & ML interests
Code language models.
Recent Activity
coding-assistant-custom's activity
chansung authored a paper 10 days ago
Post
2020
Introducing a high-quality open-preference dataset to further this line of research for image generation.
Despite being such an inseparable component for modern image generation, open preference datasets are a rarity!
So, we decided to work on one with the community!
Check it out here:
https://huggingface.co/blog/image-preferences
Post
2087
The Control family of Flux from @black-forest-labs should be discussed more!
It enables structural controls like ControlNets while being significantly less expensive to run!
So, we're working on a Control LoRA training script.
It's still WIP, so go easy:
https://github.com/huggingface/diffusers/pull/10130
sayakpaul authored a paper 16 days ago
Post
1465
Let 2024 be the year of video model fine-tunes!
Check it out here:
https://github.com/a-r-r-o-w/cogvideox-factory/tree/main/training/mochi-1
Post
2593
It's been a while since we shipped native quantization support in diffusers 🧨
We currently support bitsandbytes as the official backend, but using others like torchao is already very simple.
This post is just a reminder of what's possible:
1. Loading a model with a quantization config
2. Saving a model with quantization config
3. Loading a pre-quantized model
4. enable_model_cpu_offload()
5. Training and loading LoRAs into quantized checkpoints
Docs:
https://huggingface.co/docs/diffusers/main/en/quantization/bitsandbytes
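As a rough intuition for what a quantization backend does to a checkpoint's weights, here is a toy absmax int8 round trip in plain Python. It is an illustrative sketch only: it is not how bitsandbytes or torchao are implemented, and the function names are made up for this example.

```python
def quantize_absmax_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale (toy example)."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale works.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the integers and the stored scale."""
    return [q * scale for q in quantized]


weights = [0.5, -1.0, 0.25, 0.0]  # hypothetical weight values
q, scale = quantize_absmax_int8(weights)
restored = dequantize(q, scale)

for original, approx in zip(weights, restored):
    print(f"{original:+.4f} -> {approx:+.4f}")
```

Each weight is stored as a small integer plus one shared scale, which is where the memory savings come from; the price is a rounding error of at most half a quantization step per weight.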
Post
1795
Listen to an audio "podcast" of every single Hugging Face Daily Paper.
The "AI Paper Reviewer" project can now automatically generate audio podcasts for any paper published on arXiv, and this is integrated into the GitHub Actions pipeline. It sounds pretty similar to NotebookLM, in my opinion.
Try it out yourself at https://deep-diver.github.io/ai-paper-reviewer/
The audio podcast is powered by Google technologies: 1) the Google DeepMind Gemini 1.5 Flash model generates the podcast script, then 2) Google Cloud Vertex AI's Text-to-Speech model synthesizes natural-sounding voices from the script (with the recently added "Journey" voice style).
"AI Paper Reviewer" is also an open-source project. Anyone can use it to build and own a personal blog on any papers of interest, so check out the project repository below if you are interested!
Repository: https://github.com/deep-diver/paper-reviewer
This project will soon support other models, including open-weight ones, for both text-based content generation and voice synthesis for the podcast. The only reason I chose the Gemini model is that it offers a free tier, which is enough to get this project into shape with non-realtime batch generation. I'm excited to see how others will use this tool to explore the world of AI research, so feel free to share your feedback and suggestions!
Post
4631
Effortlessly stay up to date with AI research trends using a new AI tool, "AI Paper Reviewer"!
It analyzes a list of Hugging Face Daily Papers (w/ @akhaliq) and turns them into insightful blog posts. This project leverages Gemini models (1.5 Pro, 1.5 Flash, and 1.5 Flash-8B) for content generation and Upstage Document Parse for parsing the layout and contents.
Blog link: https://deep-diver.github.io/ai-paper-reviewer/
Also, here is the link to the GitHub repository for the parsing and generation pipeline. With it, you can easily build your own GitHub static pages based on any arXiv papers that interest you!
Repository: https://github.com/deep-diver/paper-reviewer
lvwerra authored a paper about 2 months ago
Post
2752
I did a little experimentation on resizing pre-trained LoRAs on Flux. I explored two themes:
* Decrease the rank of a LoRA
* Increase the rank of a LoRA
The first one helps reduce memory requirements if the LoRA is of a high rank, while the second one is merely an experiment. Another implication of this study is the unification of LoRA ranks when you would like to torch.compile() them.
Check it out here:
sayakpaul/flux-lora-resizing
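The post does not spell out how the ranks are changed, so the following is only a sketch of one simple way to raise a LoRA's rank to a common target: zero-padding the two low-rank factors, which leaves their product (the weight update) unchanged. The helper names and the tiny matrices are hypothetical and plain Python lists stand in for tensors; the linked repository may use a different method.

```python
def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def pad_lora_rank(lora_a, lora_b, new_rank):
    """Zero-pad LoRA factors to new_rank without changing lora_b @ lora_a.

    lora_a has shape (rank, in_features); lora_b has shape (out_features, rank).
    """
    rank = len(lora_a)
    if new_rank < rank:
        raise ValueError("new_rank must be >= the current rank")
    in_features = len(lora_a[0])
    extra = new_rank - rank
    # Extra rows of A and extra columns of B are all zeros, so they add nothing.
    padded_a = [row[:] for row in lora_a] + [[0.0] * in_features for _ in range(extra)]
    padded_b = [row[:] + [0.0] * extra for row in lora_b]
    return padded_a, padded_b


lora_a = [[1.0, 2.0, 3.0]]  # rank 1, in_features 3
lora_b = [[0.5], [-1.0]]    # out_features 2, rank 1
padded_a, padded_b = pad_lora_rank(lora_a, lora_b, new_rank=4)

print(matmul(padded_b, padded_a) == matmul(lora_b, lora_a))
```

Padding every LoRA to the same rank gives all of them identical tensor shapes, which is the property that matters when unifying ranks. Decreasing the rank is not covered here; it would typically need a low-rank approximation of the product rather than padding.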
sayakpaul authored a paper 4 months ago
chansung authored a paper 4 months ago
Post
2945
Here is a hackable and minimal implementation showing how to perform distributed text-to-image generation with Diffusers and Accelerate.
Full snippet is here: https://gist.github.com/sayakpaul/cfaebd221820d7b43fae638b4dfa01ba
With @JW17
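The core idea behind distributed generation is that each process receives its own slice of the prompts and runs the pipeline on that slice only. The sketch below shows just that partitioning step in plain Python; `split_between_ranks` is a hypothetical helper written for illustration and is not taken from the linked gist. In a real script, Accelerate can handle this for you (for example via `PartialState.split_between_processes`).

```python
def split_between_ranks(items, num_processes):
    """Return one contiguous chunk of items per process, as evenly as possible."""
    base, remainder = divmod(len(items), num_processes)
    chunks, start = [], 0
    for rank in range(num_processes):
        # The first `remainder` ranks take one extra item each.
        size = base + (1 if rank < remainder else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


prompts = ["a dog", "a cat", "a bird", "a fish", "a horse"]
chunks = split_between_ranks(prompts, num_processes=2)

for rank, chunk in enumerate(chunks):
    print(f"rank {rank} generates images for: {chunk}")
```

Each rank would then load the pipeline on its own device and generate images only for its chunk.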
Post
4477
Flux.1-Dev like images but in fewer steps.
Merging code (very simple), inference code, merged params: sayakpaul/FLUX.1-merged
Enjoy the Monday
Post
3793
With larger and larger diffusion transformers coming up, it's becoming increasingly important to have some good quantization tools for them.
We present our findings from a series of experiments on quantizing different diffusion pipelines based on diffusion transformers.
We demonstrate excellent memory savings at a small cost in inference latency, which is expected to improve in the coming days.
Diffusers 🤝 Quanto ❤️
This was a juicy collaboration between @dacorvo and myself.
Check out the post to learn all about it
https://huggingface.co/blog/quanto-diffusers
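To get a feel for where the memory savings come from, here is a back-of-the-envelope estimate of weight storage at different precisions. The 8B parameter count is a hypothetical figure chosen for illustration, not a number from the blog post, and the estimate ignores activations, quantization scales, and any layers kept in higher precision.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gib(num_params, dtype):
    """Approximate memory for the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


num_params = 8_000_000_000  # hypothetical 8B-parameter diffusion transformer
estimates = {dtype: weight_memory_gib(num_params, dtype) for dtype in BYTES_PER_PARAM}

for dtype, gib in estimates.items():
    print(f"{dtype}: {gib:.2f} GiB")
```

Going from fp16 to int8 halves the weight footprint, and int4 halves it again, which is why quantization matters more as diffusion transformers grow.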
lvwerra authored a paper 6 months ago
Post
2206
Were you aware that we have a dedicated guide on different prompting mechanisms to improve the image generation quality? 🧨
Takes you through simple prompt engineering, prompt weighting, prompt enhancement using GPT-2, and more.
Check out the guide here 🦯
https://huggingface.co/docs/diffusers/main/en/using-diffusers/weighted_prompts
Post
3130
What is your favorite part of our Diffusers integration of Stable Diffusion 3?
My personal favorite is the ability to run it on a variety of different GPUs with minimal code changes.
Learn more here:
https://huggingface.co/blog/sd3
sayakpaul authored a paper 6 months ago