Matt Mistele

mmistele

https://www.moveworks.com/ai

AI & ML interests

Natural language processing, text classification, reasoning, agents, privacy

Organizations

None yet

mmistele's activity

upvoted a paper 2 months ago

Mechanistic Permutability: Match Features Across Layers

Paper • 2410.07656 • Published Oct 10 • 16

upvoted a paper 3 months ago

Qwen2.5-Coder Technical Report

Paper • 2409.12186 • Published Sep 18 • 138

upvoted 4 papers 6 months ago

Eliminating Position Bias of Language Models: A Mechanistic Approach

Paper • 2407.01100 • Published Jul 1 • 6

TabReD: A Benchmark of Tabular Machine Learning in-the-Wild

Paper • 2406.19380 • Published Jun 27 • 47

On scalable oversight with weak LLMs judging strong LLMs

Paper • 2407.04622 • Published Jul 5 • 11

Learning to (Learn at Test Time): RNNs with Expressive Hidden States

Paper • 2407.04620 • Published Jul 5 • 27

upvoted 7 papers 7 months ago

Aya 23: Open Weight Releases to Further Multilingual Progress

Paper • 2405.15032 • Published May 23 • 27

Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization

Paper • 2405.15071 • Published May 23 • 37

Trans-LoRA: towards data-free Transferable Parameter Efficient Finetuning

Paper • 2405.17258 • Published May 27 • 14

NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models

Paper • 2405.17428 • Published May 27 • 17

Transformers Can Do Arithmetic with the Right Embeddings

Paper • 2405.17399 • Published May 27 • 52

An Introduction to Vision-Language Modeling

Paper • 2405.17247 • Published May 27 • 86

DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data

Paper • 2405.14333 • Published May 23 • 36

upvoted 7 papers 8 months ago

OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework

Paper • 2404.14619 • Published Apr 22 • 126

Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models

Paper • 2404.12387 • Published Apr 18 • 38

From Words to Numbers: Your Large Language Model Is Secretly A Capable Regressor When Given In-Context Examples

Paper • 2404.07544 • Published Apr 11 • 19

Learn Your Reference Model for Real Good Alignment

Paper • 2404.09656 • Published Apr 15 • 82

TransformerFAM: Feedback attention is working memory

Paper • 2404.09173 • Published Apr 14 • 43

Best Practices and Lessons Learned on Synthetic Data for Language Models

Paper • 2404.07503 • Published Apr 11 • 29

RecurrentGemma: Moving Past Transformers for Efficient Open Language Models

Paper • 2404.07839 • Published Apr 11 • 43