CMU-LTI

university

LTIatCMU

cmu-lti's activity

Xuhui

authored a paper about 19 hours ago

TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks

Paper • 2412.14161 • Published 2 days ago • 39

gneubig

authored a paper 1 day ago

TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks

Paper • 2412.14161 • Published 2 days ago • 39

gneubig

authored a paper 12 days ago

MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale

Paper • 2412.05237 • Published 14 days ago • 44

gneubig

authored a paper 14 days ago

Evaluating Language Models as Synthetic Data Generators

Paper • 2412.03679 • Published 16 days ago • 43

seungone

authored a paper 15 days ago

Evaluating Language Models as Synthetic Data Generators

Paper • 2412.03679 • Published 16 days ago • 43

gneubig

authored a paper 29 days ago

OpenScholar: Synthesizing Scientific Literature with Retrieval-augmented LMs

Paper • 2411.14199 • Published 30 days ago • 28

aashiqmuhamed

authored a paper about 2 months ago

Decoding Dark Matter: Specialized Sparse Autoencoders for Interpreting Rare Concepts in Foundation Models

Paper • 2411.00743 • Published Nov 1 • 6

seungone

authored 2 papers about 2 months ago

MM-Eval: A Multilingual Meta-Evaluation Benchmark for LLM-as-a-Judge and Reward Models

Paper • 2410.17578 • Published Oct 23 • 1

Better Instruction-Following Through Minimum Bayes Risk

Paper • 2410.02902 • Published Oct 3

gneubig

authored a paper about 2 months ago

JMMMU: A Japanese Massive Multi-discipline Multimodal Understanding Benchmark for Culture-aware Evaluation

Paper • 2410.17250 • Published Oct 22 • 14

skhanuja

authored a paper about 2 months ago

Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages

Paper • 2410.16153 • Published Oct 21 • 43

gneubig

authored a paper about 2 months ago

Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages

Paper • 2410.16153 • Published Oct 21 • 43

Nyandwi

authored a paper about 2 months ago

Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages

Paper • 2410.16153 • Published Oct 21 • 43

seungone

authored a paper about 2 months ago

Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages

Paper • 2410.16153 • Published Oct 21 • 43

gneubig

authored a paper 2 months ago

NaturalBench: Evaluating Vision-Language Models on Natural Adversarial Samples

Paper • 2410.14669 • Published Oct 18 • 36

skhanuja

authored a paper 2 months ago

NaturalBench: Evaluating Vision-Language Models on Natural Adversarial Samples

Paper • 2410.14669 • Published Oct 18 • 36

zhiqiulin

authored a paper 2 months ago

NaturalBench: Evaluating Vision-Language Models on Natural Adversarial Samples

Paper • 2410.14669 • Published Oct 18 • 36

Nyandwi

authored a paper 2 months ago

NaturalBench: Evaluating Vision-Language Models on Natural Adversarial Samples

Paper • 2410.14669 • Published Oct 18 • 36

gneubig

authored a paper 2 months ago

Harnessing Webpage UIs for Text-Rich Visual Understanding

Paper • 2410.13824 • Published Oct 17 • 29

zhiqings

authored a paper 2 months ago

An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models

Paper • 2408.00724 • Published Aug 1 • 1