view article Article Decoding Strategies in Large Language Models By mlabonne • about 1 month ago • 38
view article Article How to build a custom text classifier without days of human labeling By sdiazlor • Oct 17 • 55
view article Article Llama-3.1-Storm-8B: Improved SLM with Self-Curation + Model Merging By akjindal53244 • Aug 19 • 73
LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs Paper • 2406.15319 • Published Jun 21 • 61
view article Article makeMoE: Implement a Sparse Mixture of Experts Language Model from Scratch By AviSoori1x • May 7 • 40
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models Paper • 2405.01535 • Published May 2 • 118
Judging LLM-as-a-judge with MT-Bench and Chatbot Arena Paper • 2306.05685 • Published Jun 9, 2023 • 30
ShortGPT: Layers in Large Language Models are More Redundant Than You Expect Paper • 2403.03853 • Published Mar 6 • 62
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27 • 603