Travis King

travisking

AI & ML interests

have you heard of generative AI?

Recent Activity

upvoted a paper 1 day ago

No More Adam: Learning Rate Scaling at Initialization is All You Need

new activity 1 day ago

google/Gemma-Embeddings-v1.0:no weights?

liked a model 1 day ago

Marqo/marqo-ecommerce-embeddings-B

Organizations

None yet

travisking's activity

upvoted 2 papers 1 day ago

No More Adam: Learning Rate Scaling at Initialization is All You Need

Paper • 2412.11768 • Published 5 days ago • 35

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published 3 days ago • 83

upvoted a collection 1 day ago

ModernBERT

Bringing BERT into modernity via both architecture changes and scaling • 3 items • Updated 1 day ago • 67

upvoted 3 papers 2 days ago

Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published 8 days ago • 67

Are Your LLMs Capable of Stable Reasoning?

Paper • 2412.13147 • Published 3 days ago • 78

OmniEval: An Omnidirectional and Automatic RAG Evaluation Benchmark in Financial Domain

Paper • 2412.13018 • Published 4 days ago • 39

upvoted a collection 2 days ago

Falcon3

Falcon3 family of Open Foundation Models is a set of pretrained and instruct LLMs ranging from 1B to 10B parameters. • 40 items • Updated 1 day ago • 68

upvoted a paper 8 days ago

Phi-4 Technical Report

Paper • 2412.08905 • Published 9 days ago • 87

upvoted 2 papers 9 days ago

Evaluating Language Models as Synthetic Data Generators

Paper • 2412.03679 • Published 16 days ago • 43

BlueLM-V-3B: Algorithm and System Co-Design for Multimodal Large Language Models on Mobile Devices

Paper • 2411.10640 • Published Nov 16 • 44

upvoted a collection about 2 months ago

SmolLM2

State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 15 items • Updated 19 days ago • 194

upvoted an article 3 months ago

Article

SetFit: Efficient Few-Shot Learning Without Prompts

Sep 26, 2022

• 20

upvoted 2 collections 4 months ago

Zeroshot Classifiers

These are my current best zeroshot classifiers. Some of my older models are downloaded more often, but the models in this collection are newer/better. • 11 items • Updated Apr 3 • 114

OLMoE

Artifacts for open mixture-of-experts language models. • 13 items • Updated 23 days ago • 27

upvoted 2 articles 4 months ago

Article

A failed experiment: Infini-Attention, and why we should keep trying?

Aug 14

• 52

Article

Chat Templates: An End to the Silent Performance Killer

Oct 3, 2023

• 14

upvoted a paper 5 months ago

Gemma 2: Improving Open Language Models at a Practical Size

Paper • 2408.00118 • Published Jul 31 • 75

upvoted a collection 5 months ago

Bad Data Toolbox

PleIAs collection of models for the data processing of challenging document and data sources. • 5 items • Updated Jul 18 • 15

upvoted an article 5 months ago

Article

Advanced RAG: Fine-Tune Embeddings from HuggingFace for RAG

Jul 5

• 4

upvoted a collection 6 months ago

MatMulfree LM

Pre-trained models for Matmulfree LM. • 4 items • Updated Jun 10 • 25