
aust-t

AI & ML interests

None yet


Organizations

None yet

aust-t's activity

liked a Space 4 days ago

1.68k

Chat With Janus-Pro-7B

🌍

A unified multimodal understanding and generation model.

liked a model 5 days ago

myshell-ai/MeloTTS-Chinese

Text-to-Speech • Updated Mar 1, 2024 • 22.6k • 72

reacted to lewtun's post with 🔥 17 days ago

Post

10048

We are reproducing the full DeepSeek R1 data and training pipeline so everybody can use their recipe. Instead of doing it in secret we can do it together in the open!

🧪 Step 1: replicate the R1-Distill models by distilling a high-quality reasoning corpus from DeepSeek-R1.

🧠 Step 2: replicate the pure RL pipeline that DeepSeek used to create R1-Zero. This will involve curating new, large-scale datasets for math, reasoning, and code.

🔥 Step 3: show we can go from base model -> SFT -> RL via multi-stage training.

Follow along: https://github.com/huggingface/open-r1

5 replies

liked 2 models 22 days ago

deepseek-ai/DeepSeek-R1

Text Generation • Updated 3 days ago • 2.94M • 8.37k

onnx-community/DeepSeek-R1-Distill-Qwen-1.5B-ONNX

Text Generation • Updated 21 days ago • 43.8k • 44

liked a model 26 days ago

hexgrad/Kokoro-82M

Text-to-Speech • Updated 10 days ago • 364k • 3.05k

liked 2 models 28 days ago

openbmb/MiniCPM-o-2_6

Any-to-Any • Updated 1 day ago • 460k • 939

Qwen/Qwen2.5-Math-PRM-7B

Text Classification • Updated 26 days ago • 15k • 51

liked a model about 1 month ago

Alibaba-NLP/gte-Qwen2-1.5B-instruct

reacted to roseking's post with 🚀 about 1 month ago

Post

2648

🤗 Hugging Face Download Tool

The Hugging Face Download Tool is a sophisticated graphical user interface application designed to simplify the process of downloading resources from Hugging Face repositories. This tool addresses common challenges in model and file downloads through its intelligent features and user-friendly interface.

✨ Key Features
- 🖥️ Intuitive graphical interface for easy operation
- 🔄 Advanced retry mechanism with smart error handling
- ⏸️ Resume capability for interrupted downloads
- 📊 Real-time download status monitoring
- 🔐 Secure access to private repositories via token authentication

🛠️ Technical Highlights
The tool implements several advanced features to ensure reliable downloads:
- 📦 Chunk-based downloading with 1MB segments
- ⚡ Adaptive retry intervals (5-300 seconds) based on error types
- 🔌 Connection pooling for optimized performance
- 🛡️ Built-in rate limiting protection
- 🔑 Secure token handling for private repository access

This tool is ideal for researchers, developers, and AI practitioners who regularly work with Hugging Face resources and need a reliable, user-friendly download solution. 💻 It supports all major operating systems and requires minimal setup, making it accessible to users of all technical levels. 🚀

GitHub: https://github.com/2404589803/hf_downloader
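The chunking, retry, and resume behaviour listed in the post can be illustrated with a small sketch. This is not the hf_downloader implementation: `download`, `retry_delay`, and the `read_range` transport hook are hypothetical names, and the exponential backoff policy is an assumption. Only the 1 MB chunk size and the 5-300 second retry bounds come from the post.

```python
import io
import time
from typing import BinaryIO, Callable

CHUNK_SIZE = 1024 * 1024            # 1 MB segments, as stated in the post
MIN_DELAY, MAX_DELAY = 5.0, 300.0   # retry interval bounds from the post


def retry_delay(attempt: int) -> float:
    """Exponential backoff clamped to [MIN_DELAY, MAX_DELAY] (assumed policy)."""
    return min(MAX_DELAY, MIN_DELAY * (2 ** attempt))


def download(
    read_range: Callable[[int, int], bytes],
    out: BinaryIO,
    total_size: int,
    start: int = 0,
    max_retries: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Write bytes [start, total_size) to `out` in CHUNK_SIZE pieces.

    `read_range(offset, length)` is a hypothetical hook that fetches one
    byte range (for example via an HTTP Range request) and raises OSError
    on network failure. Passing a non-zero `start` resumes an interrupted
    download. Returns the number of bytes written by this call.
    """
    offset = start
    while offset < total_size:
        length = min(CHUNK_SIZE, total_size - offset)
        for attempt in range(max_retries + 1):
            try:
                data = read_range(offset, length)
                break
            except OSError:
                if attempt == max_retries:
                    raise  # give up after the last allowed retry
                sleep(retry_delay(attempt))
        if not data:
            raise ConnectionError("empty read before end of file")
        out.write(data)
        offset += len(data)
    return offset - start
```

To resume, open the partial file in append mode and pass its current size as `start`. Injecting `sleep` keeps the retry logic testable without real waiting.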

3 replies

liked 3 models about 2 months ago

reacted to AdinaY's post with 🔥 about 2 months ago

Post

3033

QvQ-72B-Preview 🎄, an open-weight model for visual reasoning, was just released by the Alibaba_Qwen team
Qwen/qvq-676448c820912236342b9888
✨ Combines visual understanding & language reasoning.
✨ Scores 70.3 on MMMU
✨ Outperforms Qwen2-VL-72B-Instruct in complex problem-solving

liked a model about 2 months ago

IamCreateAI/Ruyi-Mini-7B

Image-to-Video • Updated Dec 25, 2024 • 3.68k • 587

reacted to freddyaboulton's post with 🔥 about 2 months ago

Post

2132

Version 0.0.21 of gradio-pdf now properly loads Chinese characters!

liked a model 2 months ago

OuteAI/OuteTTS-0.2-500M

Text-to-Speech • Updated Dec 3, 2024 • 3.32k • 285

liked a model 3 months ago

Qwen/QwQ-32B-Preview

Text Generation • Updated Jan 12 • 205k • 1.61k

reacted to victor's post with 🔥 3 months ago

Post

1838

Qwen2.5-72B is now the default HuggingChat model.
This model is so good that you must try it! I often get better results on rephrasing with it than with Sonnet or GPT-4!!

liked a Space 3 months ago

340

Qwen2.5 Turbo 1M Demo

💻

Upload documents for Q&A with Qwen-Turbo