Collections

Collections including paper arxiv:2412.13663

Papers - Text - Encoders - Bert

about 16 hours ago

Latent Positional Information is in the Self-Attention Variance of Transformer Language Models Without Positional Embeddings

Paper • 2305.13571 • Published May 23, 2023 • 2
BERTs are Generative In-Context Learners

Paper • 2406.04823 • Published Jun 7 • 1
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published 3 days ago • 82

Papers - Embeddings - Text

about 16 hours ago

Gecko: Versatile Text Embeddings Distilled from Large Language Models

Paper • 2403.20327 • Published Mar 29 • 47
2D Matryoshka Sentence Embeddings

Paper • 2402.14776 • Published Feb 22 • 6
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published 3 days ago • 82

Papers - Text - Classification

about 16 hours ago

LLM-Assisted Content Analysis: Using Large Language Models to Support Deductive Coding

Paper • 2306.14924 • Published Jun 23, 2023 • 2
When LLMs are Unfit Use FastFit: Fast and Effective Text Classification with Many Classes

Paper • 2404.12365 • Published Apr 18 • 1
In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering

Paper • 2311.06668 • Published Nov 11, 2023 • 5
Wave Network: An Ultra-Small Language Model

Paper • 2411.02674 • Published Nov 4 • 3

To read... eventually

A collection of papers that I have read or plan to read, all in one place. Covers a wide range of topics.

about 22 hours ago

MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training

Paper • 2403.09611 • Published Mar 14 • 124
Evolutionary Optimization of Model Merging Recipes

Paper • 2403.13187 • Published Mar 19 • 50
MobileVLM V2: Faster and Stronger Baseline for Vision Language Model

Paper • 2402.03766 • Published Feb 6 • 12
LLM Agent Operating System

Paper • 2403.16971 • Published Mar 25 • 65

Papers - Encoders

about 16 hours ago

Functional Interpolation for Relative Positions Improves Long Context Transformers

Paper • 2310.04418 • Published Oct 6, 2023 • 4
SPBERT: An Efficient Pre-training BERT on SPARQL Queries for Question Answering over Knowledge Graphs

Paper • 2106.09997 • Published Jun 18, 2021 • 2
Neural Machine Translation of Rare Words with Subword Units

Paper • 1508.07909 • Published Aug 31, 2015 • 4
A Multimodal Approach to Device-Directed Speech Detection with Large Language Models

Paper • 2403.14438 • Published Mar 21 • 2

Papers - Text - Bidirectional Encoders

about 16 hours ago

BioBERT: a pre-trained biomedical language representation model for biomedical text mining

Paper • 1901.08746 • Published Jan 25, 2019 • 3
Pretraining-Based Natural Language Generation for Text Summarization

Paper • 1902.09243 • Published Feb 25, 2019 • 2
RoBERTa: A Robustly Optimized BERT Pretraining Approach

Paper • 1907.11692 • Published Jul 26, 2019 • 7
DeBERTa: Decoding-enhanced BERT with Disentangled Attention

Paper • 2006.03654 • Published Jun 5, 2020 • 3

Papers - Text - Encoders

about 16 hours ago

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 15
Transformers Can Achieve Length Generalization But Not Robustly

Paper • 2402.09371 • Published Feb 14 • 13
Triple-Encoders: Representations That Fire Together, Wire Together

Paper • 2402.12332 • Published Feb 19 • 2
BERTs are Generative In-Context Learners

Paper • 2406.04823 • Published Jun 7 • 1

Large Language Model (LLM) and NLP related papers.

about 1 hour ago

LoRA+: Efficient Low Rank Adaptation of Large Models

Paper • 2402.12354 • Published Feb 19 • 6
The FinBen: An Holistic Financial Benchmark for Large Language Models

Paper • 2402.12659 • Published Feb 20 • 17
TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization

Paper • 2402.13249 • Published Feb 20 • 11
TrustLLM: Trustworthiness in Large Language Models

Paper • 2401.05561 • Published Jan 10 • 66

about 17 hours ago

DocGraphLM: Documental Graph Language Model for Information Extraction

Paper • 2401.02823 • Published Jan 5 • 35
Understanding LLMs: A Comprehensive Overview from Training to Inference

Paper • 2401.02038 • Published Jan 4 • 62
DocLLM: A layout-aware generative language model for multimodal document understanding

Paper • 2401.00908 • Published Dec 31, 2023 • 181
Attention Where It Matters: Rethinking Visual Document Understanding with Selective Region Concentration

Paper • 2309.01131 • Published Sep 3, 2023 • 1

about 24 hours ago

A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions

Paper • 2312.08578 • Published Dec 14, 2023 • 16
ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks

Paper • 2312.08583 • Published Dec 14, 2023 • 9
Vision-Language Models as a Source of Rewards

Paper • 2312.09187 • Published Dec 14, 2023 • 11
StemGen: A music generation model that listens

Paper • 2312.08723 • Published Dec 14, 2023 • 47
