thiomajid (Abdoul Majid O. Thiombiano)

upvoted 2 papers 4 days ago

Making Text Embedders Few-Shot Learners

Paper • 2409.15700 • Published 6 days ago • 26

MaskBit: Embedding-free Image Generation via Bit Tokens

Paper • 2409.16211 • Published 6 days ago • 13

upvoted a paper 7 days ago

Qwen2.5-Coder Technical Report

Paper • 2409.12186 • Published 12 days ago • 116

upvoted a collection 13 days ago

Minitron

Collection

A family of compressed models obtained via pruning and knowledge distillation • 8 items • Updated 3 days ago • 54

upvoted a paper 18 days ago

MemoRAG: Moving towards Next-Gen RAG Via Memory-Inspired Knowledge Discovery

Paper • 2409.05591 • Published 21 days ago • 26

upvoted a paper 23 days ago

Attention Heads of Large Language Models: A Survey

Paper • 2409.03752 • Published 25 days ago • 85

upvoted a paper 27 days ago

InkubaLM: A small language model for low-resource African languages

Paper • 2408.17024 • Published Aug 30 • 10

upvoted a paper 30 days ago

Law of Vision Representation in MLLMs

Paper • 2408.16357 • Published Aug 29 • 92

upvoted 15 papers about 1 month ago

Building and better understanding vision-language models: insights and future directions

Paper • 2408.12637 • Published Aug 22 • 110

Show-o: One Single Transformer to Unify Multimodal Understanding and Generation

Paper • 2408.12528 • Published Aug 22 • 50

Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications

Paper • 2408.11878 • Published Aug 20 • 49

Towards flexible perception with visual memory

Paper • 2408.08172 • Published Aug 15 • 19

Your Context Is Not an Array: Unveiling Random Access Limitations in Transformers

Paper • 2408.05506 • Published Aug 10 • 8

UniBench: Visual Reasoning Requires Rethinking Vision-Language Beyond Scaling

Paper • 2408.04810 • Published Aug 9 • 22

Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers

Paper • 2408.06195 • Published Aug 12 • 57

Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2

Paper • 2408.05147 • Published Aug 9 • 36

The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery

Paper • 2408.06292 • Published Aug 12 • 114

DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search

Paper • 2408.08152 • Published Aug 15 • 51

VITA: Towards Open-Source Interactive Omni Multimodal LLM

Paper • 2408.05211 • Published Aug 9 • 46

ShortCircuit: AlphaZero-Driven Circuit Design

Paper • 2408.09858 • Published Aug 19 • 16

upvoted an article about 1 month ago

Introduction to ggml

Article • Aug 13 • 95

upvoted an article about 2 months ago

Welcome FalconMamba: The first strong attention-free 7B model

Article • Aug 12 • 98

upvoted 2 articles 2 months ago

History of State Space Models (SSM) in 2022

Article • Apr 11 • 12

Introduction to State Space Models (SSM)

Article • Jul 19 • 81

upvoted 6 papers 3 months ago

Human-like Episodic Memory for Infinite Context LLMs

Paper • 2407.09450 • Published Jul 12 • 56

SpreadsheetLLM: Encoding Spreadsheets for Large Language Models

Paper • 2407.09025 • Published Jul 12 • 123

MAVIS: Mathematical Visual Instruction Tuning

Paper • 2407.08739 • Published Jul 11 • 30

MambaVision: A Hybrid Mamba-Transformer Vision Backbone

Paper • 2407.08083 • Published Jul 10 • 27

Skywork-Math: Data Scaling Laws for Mathematical Reasoning in Large Language Models -- The Story Goes On

Paper • 2407.08348 • Published Jul 11 • 50

Do Vision and Language Models Share Concepts? A Vector Space Alignment Study

Paper • 2302.06555 • Published Feb 13, 2023 • 9

upvoted 2 articles 3 months ago

PaliGemma – Google's Cutting-Edge Open Vision Language Model

Article • May 14 • 199

Vision Language Models Explained

Article • Apr 11 • 182

upvoted 5 papers 3 months ago

Associative Recurrent Memory Transformer

Paper • 2407.04841 • Published Jul 5 • 31

Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion

Paper • 2407.01392 • Published Jul 1 • 39

Agentless: Demystifying LLM-based Software Engineering Agents

Paper • 2407.01489 • Published Jul 1 • 42

Efficient World Models with Context-Aware Tokenization

Paper • 2406.19320 • Published Jun 27 • 7

SOTOPIA-π: Interactive Learning of Socially Intelligent Language Agents

Paper • 2403.08715 • Published Mar 13 • 20

upvoted 2 collections 3 months ago

Agents

Collection

15 items • Updated Aug 23 • 1

RL

Collection

21 items • Updated Jul 19 • 1

upvoted 3 papers 3 months ago

Symbolic Learning Enables Self-Evolving Agents

Paper • 2406.18532 • Published Jun 26 • 10

How Do Large Language Models Acquire Factual Knowledge During Pretraining?

Paper • 2406.11813 • Published Jun 17 • 29

Complexity of Symbolic Representation in Working Memory of Transformer Correlates with the Complexity of a Task

Paper • 2406.14213 • Published Jun 20 • 20

upvoted 15 papers 4 months ago

Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling

Paper • 2406.07522 • Published Jun 11 • 36

Transformers meet Neural Algorithmic Reasoners

Paper • 2406.09308 • Published Jun 13 • 43

Step-aware Preference Optimization: Aligning Preference with Denoising Performance at Each Step

Paper • 2406.04314 • Published Jun 6 • 26

Audio Mamba: Bidirectional State Space Model for Audio Representation Learning

Paper • 2406.03344 • Published Jun 5 • 17

Self-Improving Robust Preference Optimization

Paper • 2406.01660 • Published Jun 3 • 18

Parrot: Multilingual Visual Instruction Tuning

Paper • 2406.02539 • Published Jun 4 • 35

Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration

Paper • 2406.01014 • Published Jun 3 • 30

Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality

Paper • 2405.21060 • Published May 31 • 63

Show, Don't Tell: Aligning Language Models with Demonstrated Feedback

Paper • 2406.00888 • Published Jun 2 • 30

Open-Endedness is Essential for Artificial Superhuman Intelligence

Paper • 2406.04268 • Published Jun 6 • 11

Artificial Generational Intelligence: Cultural Accumulation in Reinforcement Learning

Paper • 2406.00392 • Published Jun 1 • 12

μLO: Compute-Efficient Meta-Generalization of Learned Optimizers

Paper • 2406.00153 • Published May 31 • 9

Self-Exploring Language Models: Active Preference Elicitation for Online Alignment

Paper • 2405.19332 • Published May 29 • 15

Zipper: A Multi-Tower Decoder Architecture for Fusing Modalities

Paper • 2405.18669 • Published May 29 • 11

Offline Regularised Reinforcement Learning for Large Language Models Alignment

Paper • 2405.19107 • Published May 29 • 13

