Thomas Wolf PRO

thomwolf

AI & ML interests

NLP and open-source :-)

Recent Activity

liked a dataset 3 days ago
opensourceorg/osaid
replied to nyuuzyou's post 3 days ago
reacted to nyuuzyou's post with 🔥 3 days ago


thomwolf's activity

replied to nyuuzyou's post 3 days ago
reacted to nyuuzyou's post with 🔥 3 days ago
πŸ–ΌοΈ Introducing Public Domain Pictures Dataset - nyuuzyou/publicdomainpictures

Dataset highlights:
- 644,412 public domain images with comprehensive metadata from publicdomainpictures.net
- English language metadata including titles, descriptions, and keywords
- Each entry contains rich metadata, including:
  - Unique image ID and full-size image URLs
  - Detailed titles and descriptions
  - Keyword/tag collections
  - Creator attribution
- Released to the public domain under Creative Commons Zero (CC0) license
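Under the assumption that each record exposes the fields listed above (the dataset's actual column names may differ), a minimal sketch of filtering such metadata in plain Python:

```python
from dataclasses import dataclass

@dataclass
class ImageRecord:
    # Fields mirror the metadata listed above; real column names may differ.
    image_id: int
    image_url: str
    title: str
    description: str
    keywords: list
    creator: str

def filter_by_keyword(records, keyword):
    """Return records whose keyword list contains `keyword` (case-insensitive)."""
    kw = keyword.lower()
    return [r for r in records if kw in (k.lower() for k in r.keywords)]

# Hypothetical sample records for illustration:
records = [
    ImageRecord(1, "https://example.org/1.jpg", "Sunset Beach",
                "A sunset over the ocean", ["sunset", "beach"], "Alice"),
    ImageRecord(2, "https://example.org/2.jpg", "Forest Path",
                "A path through pines", ["forest", "path"], "Bob"),
]
print([r.title for r in filter_by_keyword(records, "Sunset")])  # ['Sunset Beach']
```

With the real dataset you would iterate over its rows the same way after loading it from the Hub.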
posted an update 3 days ago
replied to sequelbox's post 3 days ago
reacted to sequelbox's post with 👍 3 days ago
reacted to LukeNeumann's post with 👍🔥 3 days ago
Hello Hugging Face community!

I wanted to introduce myself and my company @Overlaiapp. We are a collective of filmmakers, photographers, and AI engineers working on high-resolution (8K+) training data.

We plan to share a lot of our datasets with the community and are kicking things off with two curated datasets:

- Overlaiai/OregonCoastin4K

- Overlaiai/SubArcticPolarBear


Overlai.ai Dataset Features

🎥 Oversampled: Every clip is captured in stunning 8K resolution, delivering rich detail ideal for fine-tuning on scenic landscapes and ocean dynamics.

📸 Variance: Includes close-up details, slow-motion footage of crashing waves, sweeping landscapes, and wildlife shots.

📋 Detailed Metadata: Every clip is paired with structured metadata, including creative descriptions, precise camera movements, lens information, field-of-view calculations, and shot settings, ensuring AI models can fully understand and replicate real-world cinematography with accuracy.

⚙️ Consistency: Re-thinking training data at the point of capture by "overshooting" a subject, enabling models to learn more nuanced relationships and views across scenes.

🌅 Light: Shot during early-morning and sunset light for optimal color contrast and dynamic range, maximizing visual quality for color- and lighting-sensitive tasks.

🔍 Curation: Curated specifically for machine learning, providing clean, high-quality data for next-generation model training.
reacted to merve's post with ❤️ 28 days ago
Lotus 🪷 is a new foundation model for monocular depth estimation ✨
Compared to previous diffusion-based MDE models, Lotus is modified for dense prediction tasks
Authors also released a model for normal prediction 🤗
Find everything in this collection merve/lotus-6718fb957dc1c85a47ca1210
reacted to singhsidhukuldeep's post with ❤️ 28 days ago
If you have 300+ GB of VRAM, you can run Mochi from @genmo

A SOTA model that dramatically closes the gap between closed and open video generation models.

Mochi 1 introduces a revolutionary architecture featuring joint reasoning over 44,520 video tokens with full 3D attention. The model implements extended learnable rotary positional embeddings (RoPE) in three dimensions, with network-learned mixing frequencies for space and time axes.

The model incorporates cutting-edge improvements, including:
- SwiGLU feedforward layers
- Query-key normalization for enhanced stability
- Sandwich normalization for controlled internal activations
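A minimal, illustrative sketch of two of these components, SwiGLU gating and query-key L2 normalization, applied to single vectors (this is not Mochi's actual implementation; the shapes and weights below are hypothetical):

```python
import math

def silu(x):
    # SiLU (a.k.a. swish): x * sigmoid(x)
    return x / (1.0 + math.exp(-x))

def swiglu(x, w_gate, w_up):
    # SwiGLU feedforward gate: silu(W_gate @ x) elementwise-times (W_up @ x)
    gate = [silu(sum(wij * xj for wij, xj in zip(row, x))) for row in w_gate]
    up = [sum(wij * xj for wij, xj in zip(row, x)) for row in w_up]
    return [g * u for g, u in zip(gate, up)]

def l2_normalize(v, eps=1e-6):
    # Query-key normalization: rescale q and k to unit L2 norm before attention,
    # which bounds attention logits and stabilizes training.
    n = math.sqrt(sum(x * x for x in v)) + eps
    return [x / n for x in v]

out = swiglu([1.0, 2.0], [[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
q = l2_normalize([3.0, 4.0])
```

In a real transformer these operate on per-token matrices with learned weights; single vectors and identity weights are used here only to keep the sketch readable.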

What is currently available?
The base model delivers impressive 480p video generation with exceptional motion quality and prompt adherence. Released under the Apache 2.0 license, it's freely available for both personal and commercial applications.

What's Coming?
Genmo has announced Mochi 1 HD, scheduled for release later this year, which will feature:
- Enhanced 720p resolution
- Improved motion fidelity
- Better handling of complex scene warping
reacted to fdaudens's post with ❤️ 28 days ago
posted an update 28 days ago
Parents in the 1990s: Teach the kids to code
Parents now: Teach the kids to fix the code when it starts walking around 🤖✨
reacted to singhsidhukuldeep's post with 🔥 2 months ago
Remember when @Google launched MediaPipe in an effort to create efficient on-device pipelines?

They've just unlocked the ability to run 7B+ parameter language models directly in your browser. This is a game-changer for on-device AI!

Yes, they are streaming 8.6 GB model files!

Currently, they have Gemma 2B/7B running, but imagine Dynamic LoRA, multimodal support, quantization, and you never leaving Chrome!

This is a significant technical advancement, especially in Memory Optimization:

- Redesigned the model-loading code to work around WebAssembly's 4 GB memory limit.
- Implemented asynchronous loading of transformer stack layers (28 for Gemma 1.1 7B).
- Reduced peak WebAssembly memory usage to less than 1% of previous requirements.
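The idea behind the asynchronous, layer-by-layer loading can be sketched as a small simulation: only one layer is staged in host (Wasm) memory at a time, so peak usage is bounded by a single layer's size rather than the full model's (the per-layer size and budget below are hypothetical, not MediaPipe's actual figures):

```python
def load_layers_incrementally(num_layers, layer_size_gb, budget_gb):
    """Simulate streaming transformer layers to the GPU one at a time.

    Only the layer currently being transferred is resident in host memory,
    so peak usage stays near one layer's size instead of the whole model's.
    """
    peak = 0.0
    resident = 0.0
    for _ in range(num_layers):
        resident += layer_size_gb   # stage one layer in host memory
        peak = max(peak, resident)
        # ... the upload to the GPU (e.g. via WebGPU) would happen here ...
        resident -= layer_size_gb   # release the host copy after upload
        assert resident <= budget_gb
    return peak

# Hypothetical numbers: 28 layers (as for Gemma 1.1 7B), ~0.3 GB each, 4 GB Wasm limit.
print(load_layers_incrementally(28, 0.3, 4.0))  # 0.3
```

Loading all 28 layers at once would instead need ~8.4 GB, well past the 4 GB WebAssembly limit mentioned above, which is the constraint this scheme works around.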

Cross-Platform Compatibility
- Compiled the C++ codebase to WebAssembly for broad browser support.
- Utilized the WebGPU API for native GPU acceleration in browsers.

Here's why this matters:

1. Privacy: No need to send data to remote servers.
2. Cost-Efficiency: Eliminates server expenses.
3. Offline Capabilities: Use powerful AI without an internet connection.

Blog: https://research.google/blog/unlocking-7b-language-models-in-your-browser-a-deep-dive-with-google-ai-edges-mediapipe/
reacted to alex-abb's post with 👍🔥 5 months ago
Hi everyone!
I'm Alex, I'm 16, and I've been doing an internship at Hugging Face for a little over a week. I've already learned a lot about using and prompting LLMs. With @victor as my tutor, I've just finished a Space that analyzes your feelings by prompting an LLM chat model. The aim is to extend it so that it can categorize Hugging Face posts.

alex-abb/LLM_Feeling_Analyzer
reacted to fdaudens's post with ❤️ 5 months ago
A nice improvement for Hugging Face on Sheets: You can now customize your prompt and select the model of your choice directly on the sheet.

Thanks to @louisbrulenaudet for the contribution. Really cool to see the community improving this tool!

Try it here: JournalistsonHF/huggingface-on-sheets
reacted to yunusserhat's post with 🚀 5 months ago
Hello everyone,

I am pleased to announce that I have founded the University of Glasgow organization on Hugging Face. If you are affiliated with the University of Glasgow, or have a relative who is, you can join through the link below.

https://huggingface.co/UniversityofGlasgow
reacted to KnutJaegersberg's post with 👍 6 months ago
reacted to frimelle's post with ❤️🤗 6 months ago
Wikimedia and Hugging Face seem naturally complementary: both are community-centred and value openness and consent. That's why I'd love to see more Wikipedia and other Wikimedia projects' datasets on Hugging Face to advance machine learning with diverse, community-curated data! See my new article on the Hugging Face Hub for why and how to create more Wikimedia datasets on Hugging Face: https://huggingface.co/blog/frimelle/wikipedias-treasure-trove-ml-data
reacted to Salama1429's post with 😎 6 months ago
📺 Introducing the YouTube-Commons Dataset 📺

🌐 Overview: The YouTube Commons Dataset is a comprehensive collection of 30 billion words from 15,112,121 original and automatically translated transcripts, drawn from 2,063,066 videos on YouTube.

🔗 License: All videos are shared under the CC-BY license, with the majority (71%) in English.

🤖 Applications: This dataset is ideal for training speech-to-text (ASR) and translation models.

📊 Utilization: The text can be used for model training and is republishable for reproducibility purposes.

🤝 Collaboration: This dataset is the result of a collaboration between the state start-up LANGU:IA, the French Ministry of Culture, and DINUM. It will be expanded in the coming months.
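As a quick sanity check on the headline figures above (a back-of-the-envelope calculation, not an official statistic):

```python
# Figures as stated in the post:
total_words = 30_000_000_000   # 30 billion words
transcripts = 15_112_121       # original + automatically translated transcripts
videos = 2_063_066             # source videos on YouTube

words_per_transcript = total_words / transcripts
transcripts_per_video = transcripts / videos
print(round(words_per_transcript))        # roughly 1985 words per transcript
print(round(transcripts_per_video, 1))    # roughly 7.3 transcripts per video
```

The second ratio reflects that each video's transcript appears in several languages, consistent with the "original and automatically translated" description.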

🔗 Explore the dataset here: https://lnkd.in/d_paWKFE

#YouTubeCommons #AIResearch #MachineLearning #OpenData #ArtificialIntelligence #NLP #Dataset #TechCollaboration #Innovation #DigitalTransformation