Collections
Collections including paper arxiv:2405.00732

- Better & Faster Large Language Models via Multi-token Prediction (Paper • 2404.19737 • Published • 73)
- Is Bigger Edit Batch Size Always Better? -- An Empirical Study on Model Editing with Llama-3 (Paper • 2405.00664 • Published • 18)
- LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report (Paper • 2405.00732 • Published • 118)

- Rho-1: Not All Tokens Are What You Need (Paper • 2404.07965 • Published • 83)
- VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time (Paper • 2404.10667 • Published • 15)
- Instruction-tuned Language Models are Better Knowledge Learners (Paper • 2402.12847 • Published • 24)
- DoRA: Weight-Decomposed Low-Rank Adaptation (Paper • 2402.09353 • Published • 25)

- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits (Paper • 2402.17764 • Published • 592)
- BitNet: Scaling 1-bit Transformers for Large Language Models (Paper • 2310.11453 • Published • 96)
- Mixture-of-Depths: Dynamically allocating compute in transformer-based language models (Paper • 2404.02258 • Published • 103)
- TransformerFAM: Feedback attention is working memory (Paper • 2404.09173 • Published • 43)

- Rho-1: Not All Tokens Are What You Need (Paper • 2404.07965 • Published • 83)
- LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders (Paper • 2404.05961 • Published • 63)
- Compression Represents Intelligence Linearly (Paper • 2404.09937 • Published • 27)
- Multi-Head Mixture-of-Experts (Paper • 2404.15045 • Published • 59)

- Communicative Agents for Software Development (Paper • 2307.07924 • Published • 2)
- Self-Refine: Iterative Refinement with Self-Feedback (Paper • 2303.17651 • Published • 2)
- ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent (Paper • 2312.10003 • Published • 34)
- ReAct: Synergizing Reasoning and Acting in Language Models (Paper • 2210.03629 • Published • 14)

- Jamba: A Hybrid Transformer-Mamba Language Model (Paper • 2403.19887 • Published • 103)
- sDPO: Don't Use Your Data All at Once (Paper • 2403.19270 • Published • 38)
- ViTAR: Vision Transformer with Any Resolution (Paper • 2403.18361 • Published • 51)
- Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models (Paper • 2403.18814 • Published • 44)