philipp-zettl

AI & ML interests

NLP/CV/Multimodal learning

Recent Activity

liked a model about 8 hours ago
shuttleai/shuttle-3-diffusion
liked a Space about 8 hours ago
huggingface-projects/repo_duplicator
reacted to merve's post with ❤️ about 8 hours ago

philipp-zettl's activity

reacted to merve's post with ❤️ about 8 hours ago
your Hugging Face profile now has your recent activities 🤗
replied to their post about 1 month ago

I think you got me wrong there. I'm mostly concerned about image-generation LoRAs that are trained on your person or, for instance, on pictures of children.
Gatekeeping the secret sauce for base models is different, and I totally agree with you on that part.

replied to their post about 1 month ago

I'm more concerned about bad actors using them to create visual content with your face that might harm you or put you in a bad spot, for instance to blackmail you or damage your reputation.

I am for sure a big supporter of open source and publish all the things I have the rights to. Yet, I wouldn't publish a LoRA that is trained on my face.

posted an update about 1 month ago
This is probably a very hot take, but here goes nothing.

With the incredibly accurate LoRAs we see emerging for high-quality models like FLUX, and services like fal.ai offering training within single-digit minutes (e.g. 2 min per 1,000 iterations)...

Why the hell are people publishing private LoRAs as public models?!
Take a look at this listing: https://huggingface.co/models?other=base_model:adapter:black-forest-labs%2FFLUX.1-dev&sort=created

I would expect that people who hold an HF account have some kind of forward thinking. Heck, do you really want to give anyone the power to create ultra-realistic images of yourself?!

Didn't we learn anything from social media?
I am puzzled...
reacted to clem's post with ❤️ about 1 month ago
Open-source AI creates healthy competition in a field where natural tendencies lead to extreme concentration of power. Imagine a world where only one or two companies could build software. This is the biggest risk and ethical challenge of them all IMO. Let's fight this!
reacted to reach-vb's post with 🔥 about 1 month ago
Multimodal Ichigo Llama 3.1 - Real Time Voice AI 🔥

> WhisperSpeech X Llama 3.1 8B
> Trained on 50K hours of speech (7 languages)
> Continually trained for 45 hrs on 10x A1000s
> MLS -> WhisperVQ tokens -> Llama 3.1
> Instruction tuned on 1.89M samples
> 70% speech, 20% transcription, 10% text
> Apache 2.0 licensed ⚡

Architecture:
> WhisperSpeech/ VQ for Semantic Tokens
> Llama 3.1 8B Instruct for Text backbone
> Early fusion (Chameleon)

I'm super bullish on HomeBrew/Jan and on early-fusion, audio-and-text multimodal models!

(P.S. Play with the demo on Hugging Face: jan-hq/Ichigo-llama3.1-s-instruct)
reacted to tomaarsen's post with 🔥 about 1 month ago
📣 Sentence Transformers v3.2.0 is out, marking the biggest release for inference in 2 years! 2 new backends for embedding models: ONNX (+ optimization & quantization) and OpenVINO, allowing for speedups of up to 2x-3x, AND Static Embeddings for 500x speedups at a 10-20% accuracy cost.

1️⃣ ONNX Backend: This backend uses the ONNX Runtime to accelerate model inference on both CPU and GPU, reaching up to 1.4x-3x speedup depending on the precision. We also introduce 2 helper methods for optimizing and quantizing models for (much) faster inference.
2️⃣ OpenVINO Backend: This backend uses Intel's OpenVINO instead, outperforming ONNX in some situations on CPU.

Usage is as simple as SentenceTransformer("all-MiniLM-L6-v2", backend="onnx"). Does your model not have an ONNX or OpenVINO file yet? No worries - it'll be autoexported for you. Thank me later 😉
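As a quick illustration of that one-liner, a minimal sketch (the model and sentences are just placeholders):

```python
# Minimal sketch of the new ONNX backend; needs the ONNX extras,
# e.g. `pip install "sentence-transformers[onnx]"`.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2", backend="onnx")

embeddings = model.encode([
    "Sentence Transformers v3.2.0 adds ONNX and OpenVINO backends.",
    "Static Embeddings trade a little accuracy for huge speedups.",
])
print(embeddings.shape)  # (2, 384) for this model
```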

🔒 Another major new feature is Static Embeddings: think word embeddings like GloVe and word2vec, but modernized. Static Embeddings are bags of token embeddings that are summed together to create text embeddings, allowing for lightning-fast embeddings that don't require any neural networks. They're initialized in one of 2 ways:

1️⃣ via Model2Vec, a new technique for distilling any Sentence Transformer model into static embeddings. Either via a pre-distilled model with from_model2vec or with from_distillation, where you do the distillation yourself. It'll only take 5 seconds on GPU & 2 minutes on CPU, no dataset needed.
2️⃣ Random initialization. This requires finetuning, but finetuning is extremely quick (e.g. I trained with 3 million pairs in 7 minutes). My final model was 6.6% worse than bge-base-en-v1.5, but 500x faster on CPU.
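A rough sketch of these initialization routes (repo IDs are illustrative and exact arguments may differ slightly):

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.models import StaticEmbedding

# Route 1: load a pre-distilled Model2Vec model (repo ID is illustrative)...
static = StaticEmbedding.from_model2vec("minishlab/M2V_base_output")

# ...or distill one yourself from an existing Sentence Transformer:
# static = StaticEmbedding.from_distillation("BAAI/bge-base-en-v1.5", device="cuda")

model = SentenceTransformer(modules=[static])
embeddings = model.encode(["Static embeddings need no neural network at inference time."])
print(embeddings.shape)
```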

Full release notes: https://github.com/UKPLab/sentence-transformers/releases/tag/v3.2.0
Documentation on Speeding up Inference: https://sbert.net/docs/sentence_transformer/usage/efficiency.html
posted an update about 2 months ago
🚀 Finishing up the prototype of my weekend project called ChessPT 🚀

- The game state is now being rendered. This simplifies coming up with your own new moves.
- The model space philipp-zettl/ChessPT was updated to provide an interactive mode.
- The space is currently running v0.4 of philipp-zettl/chessPT
- New updates will come this week.
- Training runs will be logged under https://wandb.ai/philipp-zettl/chessPT/

**Note**: The model is still not performing at the level I want it to. It predicts invalid moves (according to the game state) too frequently. In addition, the post-processing step is a little faulty, so you might end up in a state where the model didn't provide a next move.
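For reference, a minimal sketch (not the actual Space code) of how a predicted move can be checked against the game state with the python-chess package:

```python
import chess

def is_legal_continuation(history_san: list[str], candidate_san: str) -> bool:
    """Replay the moves so far, then check whether the candidate SAN move is legal."""
    board = chess.Board()
    for san in history_san:
        board.push_san(san)  # raises ValueError if the history itself is invalid
    try:
        board.parse_san(candidate_san)
        return True
    except ValueError:
        return False

print(is_legal_continuation(["e4", "g6"], "d4"))   # True
print(is_legal_continuation(["e4", "g6"], "Bg7"))  # False: it's White's turn
```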
posted an update about 2 months ago
Version 0.2a of ChessPT is currently training.

I decided to hold off on the actual v1.0 until I have a better understanding of where I want to go and have successfully trained the first fine-tune.

I'm playing around with a loss that is highly influenced by the idea of reinforcement.

Basically, I'm punishing the model for generating invalid PGN strings.
The current approach aims for simplicity:

-2: wrong characters in output
-1: invalid PGN string, but valid charset
0: valid PGN string, incl. valid moves
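Roughly, the scoring could look something like this sketch (not the actual implementation; the charset regex and the python-chess check here are just illustrative):

```python
import io
import re

import chess.pgn

# Characters allowed in PGN movetext: files, piece letters, castling, digits,
# capture/check/promotion symbols, move numbers, results, and whitespace.
PGN_CHARSET = re.compile(r"^[a-hKQRBNOx0-9+#=./\s-]*$")

def pgn_penalty(pgn_text: str) -> int:
    """-2: wrong characters, -1: valid charset but invalid PGN/moves, 0: valid."""
    if not PGN_CHARSET.match(pgn_text):
        return -2
    game = chess.pgn.read_game(io.StringIO(pgn_text))
    if game is None or game.errors:  # parse errors are collected in game.errors
        return -1
    return 0

print(pgn_penalty("1. e4 g6 2. d4 Bg7"))  # 0
print(pgn_penalty("1. e4 e4 2. d4 Bg7"))  # -1 (illegal move for Black)
print(pgn_penalty("hello world"))         # -2 (characters outside the PGN charset)
```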


GPT-4o helped me with the implementation, so I'm expecting some errors in it.

The training should finish in roughly 14 h; I will upload the new weights then.
But I still need to run extensive tests on this loss before I can happily call it v0.2 ✌️

BTW, I'm also building a Space for the model, which will be published tonight after I add descriptions and a nice interface. ♟️

philipp-zettl/chessPT
philipp-zettl/ChessPT
posted an update about 2 months ago
This is my first post, so I need to start with a bang!

The people over at https://huggingface.co/Lichess published some amazing datasets over the past weeks, including a collection of >1M standard chess games (Lichess/standard-chess-games).

Finally, it's time to revive my chess buddy project from back in 2021 🎉

So without any further ado... I'm currently training my first character-level LLM, and to be quite frank, I'm pretty astonished by the quality of my testing samples.
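To make the character-level setup concrete, here's a toy sketch of the encoding (illustrative only, not the actual training code):

```python
# Character-level tokenization of PGN movetext: every character is a token.
sample = "1. e4 g6 2. d4 Bg7"

vocab = sorted(set(sample))                    # in practice: built from the whole corpus
stoi = {ch: i for i, ch in enumerate(vocab)}   # char -> token id
itos = {i: ch for ch, i in stoi.items()}       # token id -> char

encoded = [stoi[ch] for ch in sample]          # what the model is trained on
decoded = "".join(itos[i] for i in encoded)    # maps back to the PGN string
assert decoded == sample
```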

I'm using e4 g6, the Modern Defense (https://en.wikipedia.org/wiki/Modern_Defense), as a validation sample.
My model currently predicts mostly d4 Bg7, which are the strongest next moves for White and Black.

In between, I also see some samples that pick lower-ranked moves, which makes me very excited.

Once the pre-training is done for the base model, I want to run some fine-tuning on more specific datasets:
Lichess/chess-openings
Lichess/chess-puzzles

Here are some intermediate examples

Step 6000: 
1. e4 g6 13. Rb1 d5 14. Bd3 Nxf3 15. Nxf3 Nxe3+ 16. Rxd3 Rxd3 17. Rxd6 Rhe8 18. Nd6 Rxd4 19. Rxd7+ Kxd7 20. Re7 Rxe7 21. Qxe7 1-0

Step 12000:
1. e4 g6 22. Be2 Re8 23. Kg2 1-0
1. d4 d5 2. c4 c6 3. Nf3 e6 4. dxe6 Qe7 5. Bb5+ Be8 6. Bxb7# 1-0
1. d4 d5 2. dxe5 Bd6 3. Nc3 h6 4. e4 Bf5 5. exf5 Nd7 6. exd5 Nxd5 7. Bxc4 Bxe2 8. f4 d4 9. Ng3 Bb4+ 10. Bxd4 Qxd4 11. Nfxe2 O-O-O 12. Ne6 Qf5 13. fxg4 Nxe5

Step 30000:
1. e4 g6 2. d4 Bg7 3. Nf3 d6 4. b3 e6 5. Bb2 f5 6. e5 c5 7. dxc5 dxc5 8. Nbd2 Nf6 9. Nce2 O-O 10. Qe2 c4 11. Na4 Bd6 12. f3 Ng4 13. fxg4 1-0
1. c4 c5 2. a3 Nc6 3. cxd5 Nxd5 4. Bf4 g6 5. Be2 Bg7 6. Nf3 Bg4 7. b4 Nf6 8. h3 Bxf3 9. Bxf3 a6 10. Nc3 O-O 11. Qc2 e

(each line starting with 1. is one generated sequence of moves)

You can find a first pre-trained version here:
philipp-zettl/chessPT
reacted to Wauplin's post with 🤗🔥 about 2 months ago
What a great milestone to celebrate! The huggingface_hub library is slowly becoming a cornerstone of the Python ML ecosystem when it comes to interacting with the @huggingface Hub. It wouldn't be there without the hundreds of community contributions and feedback! No matter if you are loading a model, sharing a dataset, running remote inference or starting jobs on our infra, you are for sure using it! And this is only the beginning so give a star if you wanna follow the project 👉 https://github.com/huggingface/huggingface_hub
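For instance, two of the everyday operations the post alludes to, as a minimal sketch (the repo ID is just an example):

```python
from huggingface_hub import HfApi, hf_hub_download

# Download a single file from a repo on the Hub.
config_path = hf_hub_download(
    repo_id="sentence-transformers/all-MiniLM-L6-v2",
    filename="config.json",
)

# Query repo metadata programmatically.
api = HfApi()
info = api.model_info("sentence-transformers/all-MiniLM-L6-v2")
print(config_path, info.downloads)
```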
reacted to clem's post with ❤️ 2 months ago
This isn't a goal of ours because we have plenty of money in the bank, but quite excited to see that @huggingface is profitable these days, with 220 team members and most of our platform being free (like model hosting) and open-source for the community!

Especially noteworthy at a time when most AI startups wouldn’t survive a year or two without VC money. Yay!
reacted to Jofthomas's post with 🔥 3 months ago
Everchanging Quest is out!

It is an LLM-controlled rogue-like in which the LLM gets a Markdown representation of the map and should generate a JSON object with the objective to fulfill on the map, as well as the necessary objects and their placements.

Come test it on the Space:
Jofthomas/Everchanging-Quest
reacted to gokaygokay's post with 🔥 3 months ago
I've built a space for creating prompts for FLUX

gokaygokay/FLUX-Prompt-Generator

You can create long prompts from images or simple words. Enhance your short prompts with the prompt enhancer. You can configure various settings such as artform, photo type, character details, scene details, style, and artist to create tailored prompts.

And you can combine all of them with custom prompts using LLMs (Mixtral, Mistral, Llama 3, and Mistral-Nemo).

The UI is a bit complex, but it includes almost everything you need. Choosing the random option is the most fun!

And I've created some other Spaces for using FLUX models with captioners and enhancers.

- gokaygokay/FLUX.1-dev-with-Captioner
reacted to davanstrien's post with ❤️ 4 months ago
Is your summer reading list still empty? Curious if an LLM can generate a book blurb you'd enjoy and help build a KTO preference dataset at the same time?

A demo using Hugging Face Spaces and Gradio to collect LLM output preferences: davanstrien/would-you-read-it
reacted to Ameeeee's post with 🔥 4 months ago
❤️‍🔥 Just released version 2.0 of Argilla!

This small revolution includes:

🔌 You can now integrate with the Hugging Face Hub and get started in under five minutes.
🪂 A single Dataset class is now designed to handle multiple tasks.
🔧 It’s 100 times simpler to configure your dataset now with the new SDK!
📖 The documentation has been revamped to be cleaner and more user-friendly.
🍌  A new feature automates splitting annotation tasks among a team.
✍️ The layout has been made more flexible to accommodate many use cases.
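As an illustration of the new SDK configuration mentioned above, a rough sketch (field and label names are made up, and the exact API may differ):

```python
import argilla as rg

# Connect to an Argilla instance (URL and key are placeholders).
client = rg.Argilla(api_url="https://my-argilla.example", api_key="my-api-key")

# One Dataset class, configured through Settings.
settings = rg.Settings(
    fields=[rg.TextField(name="text")],
    questions=[rg.LabelQuestion(name="sentiment", labels=["positive", "negative"])],
)
rg.Dataset(name="demo-dataset", settings=settings, client=client).create()
```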

Check out the release highlights for more details: https://github.com/argilla-io/argilla/releases/tag/v2.0.0
reacted to alvdansen's post with ❤️ 5 months ago
Hey All!

I've been asked a lot to share more about how I train LoRAs. The truth is I don't think my advice is very helpful without also including more contextual, theoretical commentary on how I **think** about training LoRAs for SDXL and other models.

I wrote a first article here about it - let me know what you think.

https://huggingface.co/blog/alvdansen/thoughts-on-lora-training-1

Edit: Also, people kept asking where to start, so I made a list of possible resources:
https://huggingface.co/blog/alvdansen/thoughts-on-lora-training-pt-2-training-services
reacted to dvilasuero's post with 🔥 5 months ago
Today is a huge day in Argilla’s history. We couldn’t be more excited to share this with the community: we’re joining Hugging Face!

We’re embracing a larger mission, becoming part of a brilliant and kind team and a shared vision about the future of AI.

Over the past year, we've been collaborating with Hugging Face on countless projects: being a launch partner for Docker Spaces, empowering the community to clean Alpaca translations into Spanish and other languages, launching argilla/notus-7b-v1 building on Zephyr's learnings, the Data is Better Together initiative with hundreds of community contributors, or releasing argilla/OpenHermesPreferences, one of the largest open preference tuning datasets.

After more than 2,000 Slack messages and over 60 people collaborating for over a year, it already felt like we were part of the same team, pushing in the same direction. After a week of the smoothest transition you can imagine, we’re now the same team.

To those of you who’ve been following us, this won’t be a huge surprise, but it will be a big deal in the coming months. This acquisition means we’ll double down on empowering the community to build and collaborate on high quality datasets, we’ll bring full support for multimodal datasets, and we’ll be in a better place to collaborate with the Open Source AI community. For enterprises, this means that the Enterprise Hub will unlock highly requested features like single sign-on and integration with Inference Endpoints.

As a founder, I am proud of the Argilla team. We're now part of something bigger and a larger team but with the same values, culture, and goals. Grateful to have shared this journey with my beloved co-founders Paco and Amélie.

Finally, huge thanks to the Chief Llama Officer @osanseviero for sparking this and being such a great partner during the acquisition process.

Would love to answer any questions you have so feel free to add them below!