-
Fast Vocabulary Transfer for Language Model Compression
Paper • 2402.09977 • Published • 2 -
Multi-Word Tokenization for Sequence Compression
Paper • 2402.09949 • Published -
Zero-Shot Tokenizer Transfer
Paper • 2405.07883 • Published • 4 -
Language Model Tokenizers Introduce Unfairness Between Languages
Paper • 2305.15425 • Published • 1
Collections
Discover the best community collections!
Collections including paper arxiv:2405.07883
-
RoFormer: Enhanced Transformer with Rotary Position Embedding
Paper • 2104.09864 • Published • 10 -
Attention Is All You Need
Paper • 1706.03762 • Published • 44 -
Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences
Paper • 2404.03715 • Published • 60 -
Zero-Shot Tokenizer Transfer
Paper • 2405.07883 • Published • 4
-
LoRA+: Efficient Low Rank Adaptation of Large Models
Paper • 2402.12354 • Published • 6 -
The FinBen: An Holistic Financial Benchmark for Large Language Models
Paper • 2402.12659 • Published • 16 -
TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization
Paper • 2402.13249 • Published • 10 -
TrustLLM: Trustworthiness in Large Language Models
Paper • 2401.05561 • Published • 64
-
LLaMA Beyond English: An Empirical Study on Language Capability Transfer
Paper • 2401.01055 • Published • 54 -
YAYI 2: Multilingual Open-Source Large Language Models
Paper • 2312.14862 • Published • 13 -
Order Matters in the Presence of Dataset Imbalance for Multilingual Learning
Paper • 2312.06134 • Published • 2 -
TaCo: Enhancing Cross-Lingual Transfer for Low-Resource Languages in LLMs through Translation-Assisted Chain-of-Thought Processes
Paper • 2311.10797 • Published
-
Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation
Paper • 2310.05737 • Published • 4 -
SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models
Paper • 2308.16692 • Published • 1 -
Towards General Text Embeddings with Multi-stage Contrastive Learning
Paper • 2308.03281 • Published • 1 -
ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings
Paper • 2305.11554 • Published • 2