Today's pick in Interpretability & Analysis of LMs: A Chain-of-Thought Is as Strong as Its Weakest Link: A Benchmark for Verifiers of Reasoning Chains by @alonjacovi @yonatanbitton B. Bohnet J. Herzig @orhonovic M. Tseng M. Collins @roeeaharoni @mega
This work introduces a new methodology for human verification of reasoning chains and applies it to annotate a dataset of chain-of-thought (CoT) reasoning chains produced by three LMs. The resulting annotated dataset, REVEAL, can be used to benchmark automatic verifiers of reasoning in LMs.
In their analysis, the authors find that LM-produced CoTs frequently contain faulty steps, which often lead to incorrect automatic verification. In particular, CoT-generating LMs often produce reasoning steps that are not attributable to the supporting evidence, and reasoning verifiers generally struggle to verify logical correctness.