9 25 22

Solomatin Roman

Samoed

AI & ML interests

None yet

Recent Activity

updated a dataset 2 days ago

mteb/TenKGnadClassification

updated a dataset 2 days ago

mteb/TweetEmotionClassification

updated a dataset 2 days ago

mteb/HotelReviewSentimentClassification

View all activity

Organizations

Samoed's activity

upvoted a paper 3 days ago

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published 5 days ago • 97

upvoted a collection 27 days ago

Hymba

Collection

A series of Hybrid Small Language Models. • 2 items • Updated Nov 22 • 24

upvoted a collection about 1 month ago

Tulu 3 Models

Collection

All models released with Tulu 3 -- state of the art open post-training recipes. • 7 items • Updated 26 days ago • 29

upvoted a paper about 1 month ago

Cut Your Losses in Large-Vocabulary Language Models

Paper • 2411.09009 • Published Nov 13 • 43

upvoted a paper about 2 months ago

On the Power of Decision Trees in Auto-Regressive Language Modeling

Paper • 2409.19150 • Published Sep 27 • 4

upvoted a paper 2 months ago

AutoTrain: No-code training for state-of-the-art models

Paper • 2410.15735 • Published Oct 21 • 58

upvoted a paper 3 months ago

Instruction Following without Instruction Tuning

Paper • 2409.14254 • Published Sep 21 • 27

upvoted an article 3 months ago

Article

Fine-tuning LLMs to 1.58bit: extreme quantization made easy

Sep 18

• 213

upvoted a paper 3 months ago

PingPong: A Benchmark for Role-Playing Language Models with User Emulation and Multi-Model Evaluation

Paper • 2409.06820 • Published Sep 10 • 63

upvoted 5 papers 4 months ago

OLMoE: Open Mixture-of-Experts Language Models

Paper • 2409.02060 • Published Sep 3 • 77

Building and better understanding vision-language models: insights and future directions

Paper • 2408.12637 • Published Aug 22 • 124

LLM Pruning and Distillation in Practice: The Minitron Approach

Paper • 2408.11796 • Published Aug 21 • 57

To Code, or Not To Code? Exploring Impact of Code in Pre-training

Paper • 2408.10914 • Published Aug 20 • 41

The Russian-focused embedders' exploration: ruMTEB benchmark and Russian embedding model design

Paper • 2408.12503 • Published Aug 22 • 23

upvoted 6 papers 5 months ago