Zeroshot Classifiers
These are my current best zeroshot classifiers. Some of my older models are downloaded more often, but the models in this collection are newer and perform better.
MoritzLaurer/deberta-v3-large-zeroshot-v2.0
Note:
- Performance: most performant model in this collection
- Size: 0.43B parameters, 870 MB
- Other: English only; max context length of 512 tokens; can be a bit slower than RoBERTa models
- Alternatives: same model trained only on commercially-friendly data: https://huggingface.co/MoritzLaurer/deberta-v3-large-zeroshot-v2.0-c; longer context & multilingual: https://huggingface.co/MoritzLaurer/bge-m3-zeroshot-v2.0
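A minimal usage sketch with the Hugging Face zeroshot pipeline (the example text and candidate labels are illustrative, not from the model card):

```python
from transformers import pipeline

# Load the flagship model into the standard zeroshot pipeline
classifier = pipeline(
    "zero-shot-classification",
    model="MoritzLaurer/deberta-v3-large-zeroshot-v2.0",
)

text = "Angela Merkel is a politician in Germany and leader of the CDU"
candidate_labels = ["politics", "economy", "entertainment", "environment"]

# multi_label=False: scores are normalized across the candidate labels
output = classifier(text, candidate_labels, multi_label=False)
print(output["labels"][0], output["scores"][0])  # top label and its score
```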
MoritzLaurer/bge-m3-zeroshot-v2.0
Note:
- Performance: most performant multilingual model
- Size: 0.57B parameters, 1.14 GB
- Other: 100+ languages; max context length of 8192 tokens; based on bge-m3-retromae, which is based on XLM-RoBERTa
- Alternatives: same model trained only on commercially-friendly data: https://huggingface.co/MoritzLaurer/bge-m3-zeroshot-v2.0-c
- Note: English-only models combined with machine-translated text can perform better (https://github.com/UKPLab/EasyNMT)
MoritzLaurer/deberta-v3-base-zeroshot-v2.0
Note:
- Performance: most performant base-size model
- Size: 0.18B parameters, 369 MB
- Other: English only; max context length of 512 tokens; faster than RoBERTa-large/BGE-M3 models, but slower than RoBERTa-base
- Alternatives: same model trained only on commercially-friendly data: https://huggingface.co/MoritzLaurer/deberta-v3-base-zeroshot-v2.0-c; longer context & multilingual: MoritzLaurer/bge-m3-zeroshot-v2.0
MoritzLaurer/roberta-large-zeroshot-v2.0-c
Note:
- Performance & speed: less performant than the deberta-v3 variants, but a bit faster and compatible with flash attention and TEI containers
- Size: 0.35B parameters, 711 MB
- Other: trained only on commercially-friendly data; English only; max context length of 512 tokens
- Alternatives: smaller, more efficient version: https://huggingface.co/MoritzLaurer/roberta-base-zeroshot-v2.0
MoritzLaurer/deberta-v3-large-zeroshot-v1.1-all-33
Note [old]: Zeroshot model trained on a mixture of 33 datasets with 389 classes, reformatted into the universal NLI format. It is compatible with the Hugging Face zeroshot pipeline. The model is English only; you can also use it for multilingual zeroshot classification by first machine translating texts to English.
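For illustration, a hedged sketch of how the universal NLI format is applied through the pipeline: each candidate label is slotted into a hypothesis template, and the model scores whether the text entails the resulting hypothesis (the template and labels below are made up for the example):

```python
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="MoritzLaurer/deberta-v3-large-zeroshot-v1.1-all-33",
)

text = "The new graphics card renders 4K games at 120 fps"
labels = ["hardware", "software", "sports"]

# "{}" is replaced by each candidate label in turn, producing one NLI
# hypothesis per label; the model then scores entailment for each pair.
output = classifier(text, labels, hypothesis_template="This text is about {}.")
print(output["labels"])  # labels sorted by entailment score
```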
MoritzLaurer/deberta-v3-base-zeroshot-v1.1-all-33
Note [old]: Essentially the same as its larger sibling MoritzLaurer/deberta-v3-large-zeroshot-v1.1-all-33, only smaller. Use it if you need more speed. The model is English only.
MoritzLaurer/deberta-v3-xsmall-zeroshot-v1.1-all-33
Note [old]: Same as above, just smaller/faster.
MoritzLaurer/xtremedistil-l6-h256-zeroshot-v1.1-all-33
Note [old]: Same as above, just even faster. The model has only 22 million backbone parameters and is 25 MB in size (or 13 MB with ONNX quantization).
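A hedged sketch of how such an ONNX-quantized variant could be produced with Hugging Face Optimum (assuming the optimum[onnxruntime] extra is installed; the quantization config shown is one option among several, and exact size savings will vary):

```python
from optimum.onnxruntime import ORTModelForSequenceClassification, ORTQuantizer
from optimum.onnxruntime.configuration import AutoQuantizationConfig

model_id = "MoritzLaurer/xtremedistil-l6-h256-zeroshot-v1.1-all-33"

# Export the PyTorch checkpoint to ONNX
model = ORTModelForSequenceClassification.from_pretrained(model_id, export=True)
model.save_pretrained("xtremedistil-onnx")

# Apply dynamic int8 quantization to shrink the exported model
quantizer = ORTQuantizer.from_pretrained("xtremedistil-onnx")
qconfig = AutoQuantizationConfig.avx512_vnni(is_static=False, per_channel=False)
quantizer.quantize(save_dir="xtremedistil-onnx-quantized", quantization_config=qconfig)
```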
MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7
Note [old]: This model can do zeroshot classification in ~100 languages. Advice: multilingual models tend to be less performant than English-only models. For maximum performance, it can be better to first machine translate texts to English and then use an English-only model for zeroshot classification. See the other English-only models in this collection. For free open-source machine translation, I recommend https://github.com/UKPLab/EasyNMT.
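A minimal sketch of that translate-then-classify recipe, assuming the easynmt and transformers packages are installed; the model choices, input text, and labels are illustrative:

```python
from easynmt import EasyNMT
from transformers import pipeline

translator = EasyNMT("opus-mt")  # free open-source MT models via EasyNMT
classifier = pipeline(
    "zero-shot-classification",
    model="MoritzLaurer/deberta-v3-base-zeroshot-v1.1-all-33",
)

texts = ["Angela Merkel ist eine Politikerin in Deutschland"]
# Translate to English first, then classify with the English-only model
texts_en = translator.translate(texts, target_lang="en")
print(classifier(texts_en[0], ["politics", "economy", "entertainment"]))
```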
MoritzLaurer/mDeBERTa-v3-base-mnli-xnli
Note [old]: I've received feedback from users that MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7 is weaker on some languages. It might be worth trying this one too.
MoritzLaurer/DeBERTa-v3-large-mnli-fever-anli-ling-wanli
Note: This model is trained on only 5 NLI datasets. It might be better at the NLI task specifically than the "zeroshot" models, and it returns three classes (entailment/contradiction/neutral). I would generally recommend the "zeroshot" models, however, as they were trained on the same 5 NLI datasets plus many additional datasets (they return only entailment/not_entailment).
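A sketch of using this model directly for NLI with plain transformers calls, which exposes all three classes; the premise and hypothesis are illustrative:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "MoritzLaurer/DeBERTa-v3-large-mnli-fever-anli-ling-wanli"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

premise = "I first thought I liked the movie, but on reflection it was disappointing."
hypothesis = "The movie was good."

# Encode premise/hypothesis as a pair and score the three NLI classes
inputs = tokenizer(premise, hypothesis, truncation=True, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
probs = torch.softmax(logits, dim=-1)[0]

# Class names come from the model's config (entailment/neutral/contradiction)
for i, p in enumerate(probs):
    print(model.config.id2label[i], round(p.item(), 3))
```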