Peter Szemraj PRO

pszemraj

https://pszemraj.carrd.co/

AI & ML interests

metallic intuition

pszemraj's activity

liked a model about 10 hours ago

avsolatorio/GIST-small-Embedding-v0

Sentence Similarity • Updated Feb 28 • 213k • 24

liked a model about 19 hours ago

dunzhang/stella_en_400M_v5

Sentence Similarity • Updated 10 days ago • 668k • 142

reacted to MoritzLaurer's post with 👍 1 day ago

Post

1560

Quite excited by the ModernBERT release! 0.15/0.4B small, 2T modern pre-training data and tokenizer with code, 8k context window, great efficient model for embeddings & classification!

This will probably be the basis for many future SOTA encoders! And I can finally stop using DeBERTav3 from 2021 :D

Congrats @answerdotai, @LightOnIO and collaborators like @tomaarsen!

Paper and models here 👇 https://huggingface.co/collections/answerdotai/modernbert-67627ad707a4acbf33c41deb

New activity in postbot/t5-base-kw2email-v4 2 days ago

Adding `safetensors` variant of this model

#2 opened 2 days ago by

New activity in postbot/t5-small-kw2email-v2 2 days ago

Adding `safetensors` variant of this model

#1 opened 2 days ago by

upvoted 2 papers 2 days ago

How to Synthesize Text Data without Model Collapse?

Paper • 2412.14689 • Published 3 days ago • 34

LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks

Paper • 2412.15204 • Published 3 days ago • 27

updated a dataset 2 days ago

BEE-spoke-data/google_wellformed_query-hf

Viewer • Updated 2 days ago • 25.1k • 4

upvoted a paper 2 days ago

Qwen2.5 Technical Report

Paper • 2412.15115 • Published 3 days ago • 285

liked 4 models 3 days ago

meta-llama/Llama-3.3-70B-Instruct

Text Generation • Updated about 19 hours ago • 244k • 1.23k

Datou1111/shou_xin

Text-to-Image • Updated 13 days ago • 30.3k • 686

answerdotai/ModernBERT-large

Fill-Mask • Updated 3 days ago • 3.22k • 165

answerdotai/ModernBERT-base

Fill-Mask • Updated 3 days ago • 10.7k • 273

upvoted a paper 3 days ago

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published 4 days ago • 92

upvoted a collection 3 days ago

ModernBERT

Bringing BERT into modernity via both architecture changes and scaling • 3 items • Updated 3 days ago • 78

upvoted 3 papers 3 days ago

Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published 9 days ago • 68

OmniPred: Language Models as Universal Regressors

Paper • 2402.14547 • Published Feb 22 • 12

From Words to Numbers: Your Large Language Model Is Secretly A Capable Regressor When Given In-Context Examples

Paper • 2404.07544 • Published Apr 11 • 19

New activity in AtlaAI/judge-arena 3 days ago

Which models do you want to see on here?

#2 opened about 1 month ago by

liked a dataset 3 days ago

cfli/bge-full-data

Updated Oct 11 • 1.35k • 26