The Impact of Hyperparameters on Large Language Model Inference Performance: An Evaluation of vLLM and HuggingFace Pipelines Paper • 2408.01050 • Published Aug 2, 2024 • 8
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters Paper • 2408.03314 • Published Aug 6, 2024 • 53
Towards a Unified View of Preference Learning for Large Language Models: A Survey Paper • 2409.02795 • Published Sep 4, 2024 • 71
Paper Copilot: A Self-Evolving and Efficient LLM System for Personalized Academic Assistance Paper • 2409.04593 • Published Sep 6, 2024 • 23
From MOOC to MAIC: Reshaping Online Teaching and Learning through LLM-driven Agents Paper • 2409.03512 • Published Sep 5, 2024 • 27
Political DEBATE: Efficient Zero-shot and Few-shot Classifiers for Political Text Paper • 2409.02078 • Published Sep 3, 2024 • 9
Mini-Omni: Language Models Can Hear, Talk While Thinking in Streaming Paper • 2408.16725 • Published Aug 29, 2024 • 52
TextBoost: Towards One-Shot Personalization of Text-to-Image Models via Fine-tuning Text Encoder Paper • 2409.08248 • Published Sep 12, 2024 • 13
GroUSE: A Benchmark to Evaluate Evaluators in Grounded Question Answering Paper • 2409.06595 • Published Sep 10, 2024 • 37
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning Paper • 2409.12183 • Published Sep 18, 2024 • 36
Preference Tuning with Human Feedback on Language, Speech, and Vision Tasks: A Survey Paper • 2409.11564 • Published Sep 17, 2024 • 19
Enhancing Structured-Data Retrieval with GraphRAG: Soccer Data Case Study Paper • 2409.17580 • Published Sep 26, 2024 • 7
Law of the Weakest Link: Cross Capabilities of Large Language Models Paper • 2409.19951 • Published Sep 30, 2024 • 54
Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models Paper • 2409.18943 • Published Sep 27, 2024 • 28
SLM: Bridge the thin gap between speech and text foundation models Paper • 2310.00230 • Published Sep 30, 2023
ComfyGen: Prompt-Adaptive Workflows for Text-to-Image Generation Paper • 2410.01731 • Published Oct 2, 2024 • 16
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models Paper • 2411.04905 • Published Nov 7, 2024 • 112
LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models Paper • 2410.09732 • Published Oct 13, 2024 • 54
LiveXiv -- A Multi-Modal Live Benchmark Based on Arxiv Papers Content Paper • 2410.10783 • Published Oct 14, 2024 • 26
Intriguing Properties of Large Language and Vision Models Paper • 2410.04751 • Published Oct 7, 2024 • 16
Everything Everywhere All at Once: LLMs can In-Context Learn Multiple Tasks in Superposition Paper • 2410.05603 • Published Oct 8, 2024 • 11
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level Paper • 2411.03562 • Published Nov 5, 2024 • 64
"Give Me BF16 or Give Me Death"? Accuracy-Performance Trade-Offs in LLM Quantization Paper • 2411.02355 • Published Nov 4, 2024 • 46
Survey of Cultural Awareness in Language Models: Text and Beyond Paper • 2411.00860 • Published Oct 30, 2024 • 23
Multi-expert Prompting Improves Reliability, Safety, and Usefulness of Large Language Models Paper • 2411.00492 • Published Nov 1, 2024 • 6
Survey of User Interface Design and Interaction Techniques in Generative AI Applications Paper • 2410.22370 • Published Oct 28, 2024 • 11
Navigating the Unknown: A Chat-Based Collaborative Interface for Personalized Exploratory Tasks Paper • 2410.24032 • Published Oct 31, 2024 • 9
ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference Paper • 2410.21465 • Published Oct 28, 2024 • 11
Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction Paper • 2410.21169 • Published Oct 28, 2024 • 30
Are LLMs Better than Reported? Detecting Label Errors and Mitigating Their Effect on Model Performance Paper • 2410.18889 • Published Oct 24, 2024 • 15
Counting Ability of Large Language Models and Impact of Tokenization Paper • 2410.19730 • Published Oct 25, 2024 • 10
Can Knowledge Editing Really Correct Hallucinations? Paper • 2410.16251 • Published Oct 21, 2024 • 54
Looking Inward: Language Models Can Learn About Themselves by Introspection Paper • 2410.13787 • Published Oct 17, 2024 • 7
JudgeBench: A Benchmark for Evaluating LLM-based Judges Paper • 2410.12784 • Published Oct 16, 2024 • 43
WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines Paper • 2410.12705 • Published Oct 16, 2024 • 30
A Comparative Study on Reasoning Patterns of OpenAI's o1 Model Paper • 2410.13639 • Published Oct 17, 2024 • 16
Remember, Retrieve and Generate: Understanding Infinite Visual Concepts as Your Personalized Assistant Paper • 2410.13360 • Published Oct 17, 2024 • 8
The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio Paper • 2410.12787 • Published Oct 16, 2024 • 31
DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception Paper • 2410.12628 • Published Oct 16, 2024 • 29
Needle Threading: Can LLMs Follow Threads through Near-Million-Scale Haystacks? Paper • 2411.05000 • Published Nov 7, 2024 • 21
BlueLM-V-3B: Algorithm and System Co-Design for Multimodal Large Language Models on Mobile Devices Paper • 2411.10640 • Published Nov 16, 2024 • 44
VideoAutoArena: An Automated Arena for Evaluating Large Multimodal Models in Video Analysis through User Simulation Paper • 2411.13281 • Published Nov 20, 2024 • 17
When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training Paper • 2411.13476 • Published Nov 20, 2024 • 15
Evaluating Tokenizer Performance of Large Language Models Across Official Indian Languages Paper • 2411.12240 • Published Nov 19, 2024 • 6
SageAttention2 Technical Report: Accurate 4 Bit Attention for Plug-and-play Inference Acceleration Paper • 2411.10958 • Published Nov 17, 2024 • 52
Personalized Multimodal Large Language Models: A Survey Paper • 2412.02142 • Published Dec 3, 2024 • 12
Truth or Mirage? Towards End-to-End Factuality Evaluation with LLM-OASIS Paper • 2411.19655 • Published Nov 29, 2024 • 20
Training Large Language Models to Reason in a Continuous Latent Space Paper • 2412.06769 • Published 27 days ago • 66
Evaluating Language Models as Synthetic Data Generators Paper • 2412.03679 • Published Dec 4, 2024 • 46
In Case You Missed It: ARC 'Challenge' Is Not That Challenging Paper • 2412.17758 • Published 13 days ago • 16
Forest-of-Thought: Scaling Test-Time Compute for Enhancing LLM Reasoning Paper • 2412.09078 • Published 24 days ago
B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners Paper • 2412.17256 • Published 14 days ago • 44
Outcome-Refining Process Supervision for Code Generation Paper • 2412.15118 • Published 17 days ago • 19
DRT-o1: Optimized Deep Reasoning Translation via Long Chain-of-Thought Paper • 2412.17498 • Published 13 days ago • 21
MLLM-as-a-Judge for Image Safety without Human Labeling Paper • 2501.00192 • Published 6 days ago • 21