TxT360: Trillion Extracted Text • Create a large, deduplicated dataset for LLM pre-training • 100
bunkalab/Phi-3-mini-128k-instruct-LinearBunkaScore-4.6k-DPO Text Generation • Updated May 30, 2024 • 77 • 2
OrdalieTech/Solon-embeddings-large-0.1 Feature Extraction • Updated Mar 26, 2024 • 14.9k • 47
MoritzLaurer/deberta-v3-base-zeroshot-v1 Zero-Shot Classification • Updated Nov 29, 2023 • 477 • 38