Papers to Read - an adrish Collection

updated Sep 11

Speculative Streaming: Fast LLM Inference without Auxiliary Models
Paper • 2402.11131 • Published Feb 16 • 42

Generative Representational Instruction Tuning
Paper • 2402.09906 • Published Feb 15 • 53

Chain-of-Thought Reasoning Without Prompting
Paper • 2402.10200 • Published Feb 15 • 102

BitDelta: Your Fine-Tune May Only Be Worth One Bit
Paper • 2402.10193 • Published Feb 15 • 19

Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models
Paper • 2402.13064 • Published Feb 20 • 47

FinTral: A Family of GPT-4 Level Multimodal Financial Large Language Models
Paper • 2402.10986 • Published Feb 16 • 77

2D Matryoshka Sentence Embeddings
Paper • 2402.14776 • Published Feb 22 • 6

RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation
Paper • 2403.05313 • Published Mar 8 • 9

Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Paper • 2312.00752 • Published Dec 1, 2023 • 138

Long-context LLMs Struggle with Long In-context Learning
Paper • 2404.02060 • Published Apr 2 • 35

ReALM: Reference Resolution As Language Modeling
Paper • 2403.20329 • Published Mar 29 • 21

ProAgent: Building Proactive Cooperative AI with Large Language Models
Paper • 2308.11339 • Published Aug 22, 2023

ProAgent: From Robotic Process Automation to Agentic Process Automation
Paper • 2311.10751 • Published Nov 2, 2023 • 8

Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs
Paper • 2404.05719 • Published Apr 8 • 81

General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
Paper • 2409.01704 • Published Sep 3 • 83