Pu Fanyi

pufanyi

https://pufanyi.github.io

AI & ML interests

CV

Recent Activity

upvoted a paper 3 days ago

Solving math word problems with process- and outcome-based feedback

liked a Space 4 days ago

lmms-lab/LiveBench

upvoted a paper 7 days ago

2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

pufanyi's activity

upvoted a paper 3 days ago

Solving math word problems with process- and outcome-based feedback

Paper • 2211.14275 • Published Nov 25, 2022 • 8

upvoted 2 papers 7 days ago

2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

Paper • 2501.00958 • Published 9 days ago • 91

Proximal Policy Optimization Algorithms

Paper • 1707.06347 • Published Jul 20, 2017 • 4

upvoted 2 papers 11 days ago

Direct Preference Optimization: Your Language Model is Secretly a Reward Model

Paper • 2305.18290 • Published May 29, 2023 • 50

ORPO: Monolithic Preference Optimization without Reference Model

Paper • 2403.07691 • Published Mar 12, 2024 • 64

upvoted a paper 22 days ago

The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only

Paper • 2306.01116 • Published Jun 1, 2023 • 32

upvoted an article about 1 month ago

Fine-tuning Mistral on Your Dataset

Article • Jul 22, 2024 • 19

upvoted 2 papers about 2 months ago

Large Multi-modal Models Can Interpret Features in Large Multi-modal Models

Paper • 2411.14982 • Published Nov 22, 2024 • 16

Oryx MLLM: On-Demand Spatial-Temporal Understanding at Arbitrary Resolution

Paper • 2409.12961 • Published Sep 19, 2024 • 25

upvoted 2 collections 2 months ago

Oryx-1.5

Oryx MLLM: On-Demand Spatial-Temporal Understanding at Arbitrary Resolution • 2 items • Updated Oct 23, 2024 • 3

Oryx

Oryx: One Multi-Modal LLM for On-Demand Spatial-Temporal Understanding • 6 items • Updated about 1 month ago • 15

upvoted a collection 3 months ago

LongVA

Long Context Transfer From Text To Vision: https://lmms-lab.github.io/posts/longva/ • 5 items • Updated Oct 4, 2024 • 13

upvoted 4 papers 3 months ago

Quantifying the Carbon Emissions of Machine Learning

Paper • 1910.09700 • Published Oct 21, 2019 • 13

MixEval-X: Any-to-Any Evaluations from Real-World Data Mixtures

Paper • 2410.13754 • Published Oct 17, 2024 • 75

LLaVA-Critic: Learning to Evaluate Multimodal Models

Paper • 2410.02712 • Published Oct 3, 2024 • 35

Video Instruction Tuning With Synthetic Data

Paper • 2410.02713 • Published Oct 3, 2024 • 38

upvoted a collection 4 months ago

Llama 3.2

This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated Dec 6, 2024 • 554

upvoted 2 papers 5 months ago

MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI

Paper • 2311.16502 • Published Nov 27, 2023 • 35

LLaVA-OneVision: Easy Visual Task Transfer

Paper • 2408.03326 • Published Aug 6, 2024 • 60

upvoted a paper 6 months ago

FunQA: Towards Surprising Video Comprehension

Paper • 2306.14899 • Published Jun 26, 2023 • 1