
aust-t

AI & ML interests

None yet

Recent Activity

Organizations

None yet

aust-t's activity

reacted to lewtun's post with 🔥 17 days ago
Post
10048
We are reproducing the full DeepSeek R1 data and training pipeline so everybody can use their recipe. Instead of doing it in secret we can do it together in the open!

🧪 Step 1: replicate the R1-Distill models by distilling a high-quality reasoning corpus from DeepSeek-R1.

🧠 Step 2: replicate the pure RL pipeline that DeepSeek used to create R1-Zero. This will involve curating new, large-scale datasets for math, reasoning, and code.

🔥 Step 3: show we can go from base model -> SFT -> RL via multi-stage training.

Follow along: https://github.com/huggingface/open-r1
reacted to roseking's post with 🚀 about 1 month ago
Post
2648
🤗 Hugging Face Download Tool

The Hugging Face Download Tool is a sophisticated graphical user interface application designed to simplify the process of downloading resources from Hugging Face repositories. This tool addresses common challenges in model and file downloads through its intelligent features and user-friendly interface.

✨ Key Features
- 🖥️ Intuitive graphical interface for easy operation
- 🔄 Advanced retry mechanism with smart error handling
- ⏸️ Resume capability for interrupted downloads
- 📊 Real-time download status monitoring
- 🔐 Secure access to private repositories via token authentication

🛠️ Technical Highlights
The tool implements several advanced features to ensure reliable downloads:
- 📦 Chunk-based downloading with 1 MB segments
- ⚡ Adaptive retry intervals (5-300 seconds) based on error types
- 🔌 Connection pooling for optimized performance
- 🛡️ Built-in rate limiting protection
- 🔑 Secure token handling for private repository access
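
To make the highlights above concrete, here is a minimal sketch of how 1 MB chunking, resume, and clamped retry intervals could fit together. It is not taken from the hf_downloader source: the function names, the error categories, and the base delays in `BASE_DELAY` are all hypothetical, and only the 1 MB chunk size and the 5-300 second bounds come from the post.

```python
CHUNK_SIZE = 1024 * 1024        # 1 MB segments, as stated in the post
MIN_DELAY, MAX_DELAY = 5, 300   # retry interval bounds in seconds, from the post

# Hypothetical error categories and base delays; the post does not list them.
BASE_DELAY = {"timeout": 5, "connection": 10, "rate_limit": 60}


def retry_delay(error_kind: str, attempt: int) -> int:
    """Exponential backoff from a per-error base, clamped to [MIN_DELAY, MAX_DELAY]."""
    base = BASE_DELAY.get(error_kind, MIN_DELAY)
    return max(MIN_DELAY, min(MAX_DELAY, base * 2 ** attempt))


def resume_headers(bytes_on_disk: int) -> dict:
    """Build an HTTP Range header requesting everything after the bytes already saved."""
    if bytes_on_disk > 0:
        return {"Range": f"bytes={bytes_on_disk}-"}
    return {}


def chunk_ranges(total_size: int, start: int = 0):
    """Yield (first_byte, last_byte) pairs covering [start, total_size) in CHUNK_SIZE steps."""
    for first in range(start, total_size, CHUNK_SIZE):
        yield first, min(first + CHUNK_SIZE, total_size) - 1
```

In this sketch, a download loop would write each chunk to disk, and after a failure would sleep for `retry_delay(kind, attempt)` seconds before re-requesting with `resume_headers(bytes_already_written)`.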

This tool is ideal for researchers, developers, and AI practitioners who regularly work with Hugging Face resources and need a reliable, user-friendly download solution. 💻 It supports all major operating systems and requires minimal setup, making it accessible to users of all technical levels. 🚀

GitHub: https://github.com/2404589803/hf_downloader
reacted to AdinaY's post with 🔥 about 2 months ago
Post
3033
QvQ-72B-Preview 🎄, an open-weight model for visual reasoning, was just released by the Alibaba_Qwen team
Qwen/qvq-676448c820912236342b9888
✨ Combines visual understanding & language reasoning.
✨ Scores 70.3 on MMMU
✨ Outperforms Qwen2-VL-72B-Instruct in complex problem-solving
reacted to freddyaboulton's post with 🔥 about 2 months ago
Post
2132
Version 0.0.21 of gradio-pdf now properly loads Chinese characters!
reacted to victor's post with 🔥 3 months ago
Post
1838
Qwen2.5-72B is now the default HuggingChat model.
This model is so good that you must try it! I often get better results on rephrasing with it than with Sonnet or GPT-4!