Unofficial Mistral Community

AI & ML interests

Unofficial org for community upload of Mistral's Open Source models.

mistral-community's activity

ArthurZ

in mistral-community/pixtral-12b about 23 hours ago

Fastest way for inference?

#28 opened 1 day ago by psycy

nielsr

in mistral-community/pixtral-12b 1 day ago

Fastest way for inference?

#28 opened 1 day ago by psycy

reach-vb

authored a paper 1 day ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published 3 days ago • 131

RaushanTurganbay

in mistral-community/pixtral-12b 3 days ago

Getting shape mismatch while loading saved Pixtral model

#24 opened 4 days ago by ss007

v2ray

posted an update 3 days ago

Post

1812

GPT4chan Series Release

GPT4chan is a series of models I trained on the v2ray/4chan dataset, which is based on lesserfield/4chan-datasets. The dataset contains mostly posts from 2023. Not every board is included; for example, /pol/ is NOT included. To see which boards are included, visit v2ray/4chan.

This release contains two model sizes, 8B and 24B. The 8B model is based on meta-llama/Llama-3.1-8B and the 24B model is based on mistralai/Mistral-Small-24B-Base-2501.

Why did I make these models? For a long time after the original gpt-4chan model, there haven't been any serious fine-tunes on 4chan datasets. 4chan is a good data source since it contains coherent replies and nice topics. It's fun to talk to an AI-generated version of 4chan and get instant replies without needing to actually visit 4chan. You can also sort of analyze the content and behavior of 4chan posts by probing the model's outputs.

Disclaimer: The GPT4chan models should only be used for research purposes. The outputs they generate do not represent my views on the subjects. Moderate the responses before posting them online.

Model links:

Full model:
- v2ray/GPT4chan-8B
- v2ray/GPT4chan-24B

Adapter:
- v2ray/GPT4chan-8B-QLoRA
- v2ray/GPT4chan-24B-QLoRA

AWQ:
- v2ray/GPT4chan-8B-AWQ
- v2ray/GPT4chan-24B-AWQ

FP8:
- v2ray/GPT4chan-8B-FP8
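
The model links above can be kept in a small lookup table when scripting against them. This is a minimal sketch: the repo IDs and base models are copied from the post, while the `pick_repo` helper and the variant labels (`full`, `qlora`, `awq`, `fp8`) are hypothetical names introduced for illustration. It does not download or load anything.

```python
# Repo IDs copied from the post above. The variant labels and the helper
# below are hypothetical, chosen only for this illustration.
GPT4CHAN_REPOS = {
    ("8B", "full"): "v2ray/GPT4chan-8B",
    ("24B", "full"): "v2ray/GPT4chan-24B",
    ("8B", "qlora"): "v2ray/GPT4chan-8B-QLoRA",
    ("24B", "qlora"): "v2ray/GPT4chan-24B-QLoRA",
    ("8B", "awq"): "v2ray/GPT4chan-8B-AWQ",
    ("24B", "awq"): "v2ray/GPT4chan-24B-AWQ",
    ("8B", "fp8"): "v2ray/GPT4chan-8B-FP8",  # the post lists FP8 for 8B only
}

# Base models named in the post for each size.
BASE_MODELS = {
    "8B": "meta-llama/Llama-3.1-8B",
    "24B": "mistralai/Mistral-Small-24B-Base-2501",
}


def pick_repo(size: str, variant: str = "full") -> str:
    """Return the listed repo ID for a size/variant pair, or raise KeyError."""
    key = (size.upper(), variant.lower())
    if key not in GPT4CHAN_REPOS:
        # Collect the variants that are actually listed for this size.
        available = sorted(v for s, v in GPT4CHAN_REPOS if s == key[0])
        raise KeyError(
            f"No {variant!r} variant listed for {size}; available: {available}"
        )
    return GPT4CHAN_REPOS[key]


if __name__ == "__main__":
    print(pick_repo("24B", "awq"))
```

Asking for a combination the post does not list, such as a 24B FP8 model, raises a `KeyError` rather than guessing a repo name.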

JustinLin610

authored a paper 11 days ago

Qwen2.5-1M Technical Report

Paper • 2501.15383 • Published 13 days ago • 54

JustinLin610

authored a paper 12 days ago

RealCritic: Towards Effectiveness-Driven Evaluation of Language Model Critiques

Paper • 2501.14492 • Published 15 days ago • 29

clem

posted an update 12 days ago

Post

6940

AI is not a zero-sum game. Open-source AI is the tide that lifts all boats!

RaushanTurganbay

updated a model 12 days ago

mistral-community/pixtral-12b

Image-Text-to-Text • Updated 12 days ago • 39.1k • 86

clem

posted an update 14 days ago

Post

2273

The 🐳 just crossed 10,000 followers on HF

https://huggingface.co/deepseek-ai

ehartford

in mistral-community/Mixtral-8x22B-v0.1 15 days ago

We are working on creating a single 22b from this model

#5 opened 10 months ago by rombodawg

mrfakename

posted an update 15 days ago

Post

1132

I’m excited to introduce a new leaderboard UI + keyboard shortcuts on the TTS Arena!

The refreshed UI for the leaderboard is smoother and (hopefully) more intuitive. You can now view models based on a simpler win-rate percentage and exclude closed models.

In addition, the TTS Arena now supports keyboard shortcuts. This should make voting much more efficient as you can now vote without clicking anything!

In both the normal Arena and Battle Mode, press "r" to select a random text, Cmd/Ctrl + Enter to synthesize, and "a"/"b" to vote! View more details about keyboard shortcuts by pressing "?" (Shift + /) on the Arena.

Check out all the new updates on the TTS Arena:

TTS-AGI/TTS-Arena
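
The shortcut scheme described in this post can be sketched as a small key-to-action table. This is an illustration only, not the TTS Arena's actual implementation: the action names, the `resolve_shortcut` helper, and the reading of "a"/"b" as votes for the first and second sample are all assumptions.

```python
from typing import Optional

# Plain-key shortcuts as described in the post. The action names are
# hypothetical labels, not identifiers from the TTS Arena code.
PLAIN_KEYS = {
    "r": "random_text",  # select a random text
    "a": "vote_a",       # assumed: vote for the first sample
    "b": "vote_b",       # assumed: vote for the second sample
    "?": "show_help",    # Shift + / opens the shortcut details
}


def resolve_shortcut(key: str, ctrl_or_cmd: bool = False) -> Optional[str]:
    """Map a key press to an action name, or None if it is not a shortcut."""
    if key == "Enter":
        # Synthesis requires the Cmd/Ctrl modifier; plain Enter does nothing.
        return "synthesize" if ctrl_or_cmd else None
    if ctrl_or_cmd:
        # Leave other modifier combinations to the browser.
        return None
    return PLAIN_KEYS.get(key)


if __name__ == "__main__":
    print(resolve_shortcut("Enter", ctrl_or_cmd=True))
```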

RaushanTurganbay

in mistral-community/pixtral-12b 16 days ago

Is the chat template correct? (issue for vLLM)

#22 opened about 1 month ago by MichaelAI23

JustinLin610

authored a paper 17 days ago

Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models

Paper • 2501.11873 • Published 18 days ago • 63

danielhanchen

posted an update 19 days ago

Post

2387

I uploaded DeepSeek R1 GGUFs!

unsloth/DeepSeek-R1-Distill-Llama-8B-GGUF
unsloth/DeepSeek-R1-Distill-Llama-70B-GGUF
2bit for MoE: unsloth/DeepSeek-R1-GGUF
unsloth/DeepSeek-R1-Zero-GGUF

More at unsloth/deepseek-r1-all-versions-678e1c48f5d2fce87892ace5

JustinLin610

authored a paper 25 days ago

The Lessons of Developing Process Reward Models in Mathematical Reasoning

Paper • 2501.07301 • Published 26 days ago • 89

JustinLin610

authored a paper 26 days ago

Enabling Scalable Oversight via Self-Evolving Critic

Paper • 2501.05727 • Published 29 days ago • 70

danielhanchen

posted an update 28 days ago

Post

4643

We fixed many bugs in Phi-4 & uploaded fixed GGUF + 4-bit versions! ✨

Our fixed versions are even higher on the Open LLM Leaderboard than Microsoft's!

GGUFs: unsloth/phi-4-GGUF
Dynamic 4-bit: unsloth/phi-4-unsloth-bnb-4bit

You can also now finetune Phi-4 for free on Colab: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Phi_4-Conversational.ipynb

Read our blog post for more details on the bug fixes: https://unsloth.ai/blog/phi4

JustinLin610

authored a paper 29 days ago

Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey

Paper • 2412.18619 • Published Dec 16, 2024 • 54

danielhanchen

posted an update about 1 month ago

Post

3139

DeepSeek V3, including GGUF + bf16 versions, is now uploaded!

Includes 2, 3, 4, 5, 6 and 8-bit quantized versions.

GGUFs: unsloth/DeepSeek-V3-GGUF
bf16: unsloth/DeepSeek-V3-bf16

Min. hardware requirements to run: 48GB RAM + 250GB of disk space for 2-bit.

See how to run them with examples and the full collection: unsloth/deepseek-v3-all-versions-677cf5cfd7df8b7815fc723c

Team members 23
