Leshem Choshen

borgr

https://ktilana.wixsite.com/leshem-choshen

AI & ML interests

Merging models, collaboratively improving pretraining, evaluation, understanding

Recent Activity

liked a dataset 17 days ago

tinyBenchmarks/tinyWinogrande

new activity 20 days ago

baichuan-inc/Baichuan2-7B-Intermediate-Checkpoints: during training data?

new activity 24 days ago

CohereForAI/Global-MMLU: Duplicates for NL

borgr's activity

upvoted a paper 28 days ago

Global MMLU: Understanding and Addressing Cultural and Linguistic Biases in Multilingual Evaluation

Paper • 2412.03304 • Published about 1 month ago • 17

upvoted 3 papers 3 months ago

LiveXiv -- A Multi-Modal Live Benchmark Based on Arxiv Papers Content

Paper • 2410.10783 • Published Oct 14, 2024 • 26

SELECT: A Large-Scale Benchmark of Data Curation Strategies for Image Classification

Paper • 2410.05057 • Published Oct 7, 2024 • 7

Acceptable Use Policies for Foundation Models

Paper • 2409.09041 • Published Aug 29, 2024 • 1

upvoted a paper 4 months ago

The Future of Open Human Feedback

Paper • 2408.16961 • Published Aug 15, 2024 • 21

upvoted 3 papers 5 months ago

The ShareLM Collection and Plugin: Contributing Human-Model Chats for the Benefit of the Community

Paper • 2408.08291 • Published Aug 15, 2024 • 10

Learning from Naturally Occurring Feedback

Paper • 2407.10944 • Published Jul 15, 2024 • 4

Benchmark Agreement Testing Done Right: A Guide for LLM Benchmark Evaluation

Paper • 2407.13696 • Published Jul 18, 2024 • 5

upvoted a paper 6 months ago

Is It Really Long Context if All You Need Is Retrieval? Towards Genuinely Difficult Long Context NLP

Paper • 2407.00402 • Published Jun 29, 2024 • 22

upvoted a paper 7 months ago

Large Language Model Confidence Estimation via Black-Box Access

Paper • 2406.04370 • Published Jun 1, 2024 • 20