MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published Jan 14, 2025 • 268
YuLan-Mini: An Open Data-efficient Language Model Paper • 2412.17743 • Published Dec 23, 2024 • 64
Training Large Language Models to Reason in a Continuous Latent Space Paper • 2412.06769 • Published Dec 9, 2024 • 75
ProcessBench: Identifying Process Errors in Mathematical Reasoning Paper • 2412.06559 • Published Dec 9, 2024 • 78
Gated Delta Networks: Improving Mamba2 with Delta Rule Paper • 2412.06464 • Published Dec 9, 2024 • 10
Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding Paper • 2411.04282 • Published Nov 6, 2024 • 33
GoldFinch: High Performance RWKV/Transformer Hybrid with Linear Pre-Fill and Extreme KV-Cache Compression Paper • 2407.12077 • Published Jul 16, 2024 • 55
In-Context Pretraining: Language Modeling Beyond Document Boundaries Paper • 2310.10638 • Published Oct 16, 2023 • 29
Sparser is Faster and Less is More: Efficient Sparse Attention for Long-Range Transformers Paper • 2406.16747 • Published Jun 24, 2024 • 19
Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence Paper • 2404.05892 • Published Apr 8, 2024 • 33