DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter Paper • 1910.01108 • Published Oct 2, 2019 • 14
distilbert/distilbert-base-uncased-finetuned-sst-2-english Text Classification • Updated Dec 19, 2023 • 9.47M • • 622
FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design Paper • 2401.14112 • Published Jan 25 • 18
GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation Paper • 2401.04092 • Published Jan 8 • 21
MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT Paper • 2402.16840 • Published Feb 26 • 23
GPTVQ: The Blessing of Dimensionality for LLM Quantization Paper • 2402.15319 • Published Feb 23 • 19
Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping Paper • 2402.14083 • Published Feb 21 • 47
Promptriever: Instruction-Trained Retrievers Can Be Prompted Like Language Models Paper • 2409.11136 • Published Sep 17 • 21
Scaling Smart: Accelerating Large Language Model Pre-training with Small Model Initialization Paper • 2409.12903 • Published Sep 19 • 21
MonoFormer: One Transformer for Both Diffusion and Autoregression Paper • 2409.16280 • Published Sep 24 • 17