Error Correction by Paying Attention to Both Acoustic and Confidence References for Automatic Speech Recognition 2 days ago
Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models By mikelabs • about 21 hours ago • 2
Robust ASR Error Correction with Conservative Data Filtering By mikelabs • 2 days ago • 2
That Chip Has Sailed: A Critique of Unfounded Skepticism Around AI for Chip Design By mikelabs • 3 days ago • 1
Bridging the Visual Gap: Fine-Tuning Multimodal Models with Knowledge-Adapted Captions By mikelabs • 3 days ago • 1
The Dawn of GUI Agent: A Preliminary Case Study with Claude 3.5 Computer Use By mikelabs • 3 days ago • 1
Modeling AdaGrad, RMSProp, and Adam with Integro-Differential Equations By mikelabs • 3 days ago • 1
StableV2V: Stablizing Shape Consistency in Video-to-Video Editing By mikelabs • 3 days ago • 2
GPTree: Towards Explainable Decision-Making via LLM-powered Decision Trees By mikelabs • 3 days ago • 1
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model By mikelabs • 4 days ago • 1
Part123: Part-aware 3D Reconstruction from a Single-view Image Paper • 2405.16888 • Published May 27 • 11
STT: Stateful Tracking with Transformers for Autonomous Driving Paper • 2405.00236 • Published Apr 30 • 7
Revisiting Text-to-Image Evaluation with Gecko: On Metrics, Prompts, and Human Ratings Paper • 2404.16820 • Published Apr 25 • 15
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding Paper • 2404.16710 • Published Apr 25 • 74
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training Paper • 2403.09611 • Published Mar 14 • 124
HyperFields: Towards Zero-Shot Generation of NeRFs from Text Paper • 2310.17075 • Published Oct 26, 2023 • 14
3D-GPT: Procedural 3D Modeling with Large Language Models Paper • 2310.12945 • Published Oct 19, 2023 • 57
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection Paper • 2310.11511 • Published Oct 17, 2023 • 74
Can GPT models be Financial Analysts? An Evaluation of ChatGPT and GPT-4 on mock CFA Exams Paper • 2310.08678 • Published Oct 12, 2023 • 12
Table-GPT: Table-tuned GPT for Diverse Table Tasks Paper • 2310.09263 • Published Oct 13, 2023 • 39