WildGuard: Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs Paper • 2406.18495 • Published 3 days ago • 10
Octo-planner: On-device Language Model for Planner-Action Agents Paper • 2406.18082 • Published 3 days ago • 42
Adam-mini: Use Fewer Learning Rates To Gain More Paper • 2406.16793 • Published 5 days ago • 50
LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs Paper • 2406.15319 • Published 8 days ago • 52
Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges Paper • 2406.12624 • Published 11 days ago • 34
HARE: HumAn pRiors, a key to small language model Efficiency Paper • 2406.11410 • Published 12 days ago • 37
GLiNER multi-task: Generalist Lightweight Model for Various Information Extraction Tasks Paper • 2406.12925 • Published 15 days ago • 17
RepLiQA: A Question-Answering Dataset for Benchmarking LLMs on Unseen Reference Content Paper • 2406.11811 • Published 12 days ago • 14
Tokenization Falling Short: The Curse of Tokenization Paper • 2406.11687 • Published 12 days ago • 13
From RAGs to rich parameters: Probing how language models utilize external knowledge over parametric information for factual queries Paper • 2406.12824 • Published 11 days ago • 20
Article The Hallucinations Leaderboard, an Open Effort to Measure Hallucinations in Large Language Models Jan 29 • 8
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation Paper • 2406.06525 • Published 19 days ago • 60
sentence-transformers-from-synthetic-data Collection Example of using distilabel to generate synthetic triplets data for fine-tuning a Sentence Transformer model • 4 items • Updated 8 days ago • 20
Article SeeMoE: Implementing a MoE Vision Language Model from Scratch By AviSoori1x • 6 days ago • 28
Dual3D: Efficient and Consistent Text-to-3D Generation with Dual-mode Multi-view Latent Diffusion Paper • 2405.09874 • Published May 16 • 15
Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts Paper • 2405.19893 • Published 30 days ago • 26
Face Adapter for Pre-Trained Diffusion Models with Fine-Grained ID and Attribute Control Paper • 2405.12970 • Published May 21 • 21
Phi-3 Collection Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 22 items • Updated 29 days ago • 346
ALPINE: Unveiling the Planning Capability of Autoregressive Learning in Language Models Paper • 2405.09220 • Published May 15 • 23
Customizing Text-to-Image Models with a Single Image Pair Paper • 2405.01536 • Published May 2 • 17
LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report Paper • 2405.00732 • Published Apr 29 • 116
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models Paper • 2405.01535 • Published May 2 • 106
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models Paper • 2404.18796 • Published Apr 29 • 67
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework Paper • 2404.14619 • Published Apr 22 • 124
Article Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent Apr 22 • 75
How Far Can We Go with Practical Function-Level Program Repair? Paper • 2404.12833 • Published Apr 19 • 6
Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models Paper • 2404.13013 • Published Apr 19 • 27
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published Apr 22 • 240
Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing Paper • 2404.12253 • Published Apr 18 • 51
MeshLRM: Large Reconstruction Model for High-Quality Mesh Paper • 2404.12385 • Published Apr 18 • 24
OpenBezoar: Small, Cost-Effective and Open Models Trained on Mixes of Instruction Data Paper • 2404.12195 • Published Apr 18 • 11
Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs Paper • 2404.05719 • Published Apr 8 • 57
Infinite-ID: Identity-preserved Personalization via ID-semantics Decoupling Paradigm Paper • 2403.11781 • Published Mar 18 • 17
PERL: Parameter Efficient Reinforcement Learning from Human Feedback Paper • 2403.10704 • Published Mar 15 • 56
Fast High-Resolution Image Synthesis with Latent Adversarial Diffusion Distillation Paper • 2403.12015 • Published Mar 18 • 60
RAFT: Adapting Language Model to Domain Specific RAG Paper • 2403.10131 • Published Mar 15 • 65
3D-GPT: Procedural 3D Modeling with Large Language Models Paper • 2310.12945 • Published Oct 19, 2023 • 52
Can GPT models be Financial Analysts? An Evaluation of ChatGPT and GPT-4 on mock CFA Exams Paper • 2310.08678 • Published Oct 12, 2023 • 11
InstaFlow: One Step is Enough for High-Quality Diffusion-Based Text-to-Image Generation Paper • 2309.06380 • Published Sep 12, 2023 • 32
IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models Paper • 2308.06721 • Published Aug 13, 2023 • 25
LAVIE: High-Quality Video Generation with Cascaded Latent Diffusion Models Paper • 2309.15103 • Published Sep 26, 2023 • 42
PhotoVerse: Tuning-Free Image Customization with Text-to-Image Diffusion Models Paper • 2309.05793 • Published Sep 11, 2023 • 50
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis Paper • 2310.00426 • Published Sep 30, 2023 • 60
VideoDirectorGPT: Consistent Multi-scene Video Generation via LLM-Guided Planning Paper • 2309.15091 • Published Sep 26, 2023 • 32
Struc-Bench: Are Large Language Models Really Good at Generating Complex Structured Data? Paper • 2309.08963 • Published Sep 16, 2023 • 9
On the Origin of LLMs: An Evolutionary Tree and Graph for 15,821 Large Language Models Paper • 2307.09793 • Published Jul 19, 2023 • 45