-
BitNet: Scaling 1-bit Transformers for Large Language Models
Paper • 2310.11453 • Published • 96 -
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
Paper • 2310.11511 • Published • 74 -
In-Context Learning Creates Task Vectors
Paper • 2310.15916 • Published • 41 -
Matryoshka Diffusion Models
Paper • 2310.15111 • Published • 40
Collections
Discover the best community collections!
Collections including paper arxiv:2311.00571
-
MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning
Paper • 2310.09478 • Published • 19 -
Can GPT models be Financial Analysts? An Evaluation of ChatGPT and GPT-4 on mock CFA Exams
Paper • 2310.08678 • Published • 12 -
Llama 2: Open Foundation and Fine-Tuned Chat Models
Paper • 2307.09288 • Published • 241 -
LLaMA: Open and Efficient Foundation Language Models
Paper • 2302.13971 • Published • 13
-
MADLAD-400: A Multilingual And Document-Level Large Audited Dataset
Paper • 2309.04662 • Published • 22 -
Neurons in Large Language Models: Dead, N-gram, Positional
Paper • 2309.04827 • Published • 16 -
Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs
Paper • 2309.05516 • Published • 9 -
DrugChat: Towards Enabling ChatGPT-Like Capabilities on Drug Molecule Graphs
Paper • 2309.03907 • Published • 8
-
meta-llama/Llama-2-7b-chat-hf
Text Generation • Updated • 838k • • 3.95k -
mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration
Paper • 2311.04257 • Published • 20 -
LLaVA-Interactive: An All-in-One Demo for Image Chat, Segmentation, Generation and Editing
Paper • 2311.00571 • Published • 40 -
Understanding LLMs: A Comprehensive Overview from Training to Inference
Paper • 2401.02038 • Published • 61