Yale LILY Lab

university

https://yale-lily.github.io/

YaleLILY_Lab

Yale-LILY

AI & ML interests

Language, Information, and Learning at Yale

Yale-LILY's activity

yilunzhao

authored 13 papers 19 days ago

Benchmarking Generation and Evaluation Capabilities of Large Language Models for Instruction Controllable Summarization

Paper • 2311.09184 • Published Nov 15, 2023 • 1

Investigating Data Contamination in Modern Benchmarks for Large Language Models

Paper • 2311.09783 • Published Nov 16, 2023 • 2

MedAgents: Large Language Models as Collaborators for Zero-shot Medical Reasoning

Paper • 2311.10537 • Published Nov 16, 2023 • 3

ReasTAP: Injecting Table Reasoning Skills During Pre-training via Synthetic Reasoning Examples

Paper • 2210.12374 • Published Oct 22, 2022

Enhancing Few-shot Text-to-SQL Capabilities of Large Language Models: A Study on Prompt Design Strategies

Paper • 2305.12586 • Published May 21, 2023

Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications

Paper • 2408.11878 • Published Aug 20, 2024 • 52

Revisiting the Gold Standard: Grounding Summarization Evaluation with Robust Human Evaluation

Paper • 2212.07981 • Published Dec 15, 2022

ReIFE: Re-evaluating Instruction-Following Evaluation

Paper • 2410.07069 • Published Oct 9, 2024

TOMATO: Assessing Visual Temporal Reasoning Capabilities in Multimodal Foundation Models

Paper • 2410.23266 • Published Oct 30, 2024 • 20

M3SciQA: A Multi-Modal Multi-Document Scientific QA Benchmark for Evaluating Foundation Models

Paper • 2411.04075 • Published Nov 6, 2024 • 15

henryL7

authored a paper 5 months ago

Understanding Reference Policies in Direct Preference Optimization

Paper • 2407.13709 • Published Jul 18, 2024 • 16

hails

authored a paper 6 months ago

From Decoding to Meta-Generation: Inference-time Algorithms for Large Language Models

Paper • 2406.16838 • Published Jun 24, 2024 • 2

hails

authored a paper 7 months ago

Why Has Predicting Downstream Capabilities of Frontier AI Models with Scale Remained Elusive?

Paper • 2406.04391 • Published Jun 6, 2024 • 7

Simeng

authored 3 papers 7 months ago

FOLIO: Natural Language Reasoning with First-Order Logic

Paper • 2209.00840 • Published Sep 2, 2022

QTSumm: A New Benchmark for Query-Focused Table Summarization

Paper • 2305.14303 • Published May 23, 2023

Benchmarking Generation and Evaluation Capabilities of Large Language Models for Instruction Controllable Summarization

Paper • 2311.09184 • Published Nov 15, 2023 • 1

niansong1996

authored a paper 8 months ago

NExT: Teaching Large Language Models to Reason about Code Execution

Paper • 2404.14662 • Published Apr 23, 2024 • 4

Team members 11