Course-Correction: Safety Alignment Using Synthetic Preferences Paper • 2407.16637 • Published Jul 23 • 24
Data Mixture Inference: What do BPE Tokenizers Reveal about their Training Data? Paper • 2407.16607 • Published Jul 23 • 21
Efficient Inference of Vision Instruction-Following Models with Elastic Cache Paper • 2407.18121 • Published Jul 25 • 15
Dallah: A Dialect-Aware Multimodal Large Language Model for Arabic Paper • 2407.18129 • Published Jul 25 • 11
Wolf: Captioning Everything with a World Summarization Framework Paper • 2407.18908 • Published Jul 26 • 30
Floating No More: Object-Ground Reconstruction from a Single Image Paper • 2407.18914 • Published Jul 26 • 18
Visual Riddles: a Commonsense and World Knowledge Challenge for Large Vision and Language Models Paper • 2407.19474 • Published Jul 28 • 22
MMAU: A Holistic Benchmark of Agent Capabilities Across Diverse Domains Paper • 2407.18961 • Published Jul 18 • 38
MindSearch: Mimicking Human Minds Elicits Deep AI Searcher Paper • 2407.20183 • Published Jul 29 • 37
Knesset-DictaBERT: A Hebrew Language Model for Parliamentary Proceedings Paper • 2407.20581 • Published Jul 30 • 23
A Large Encoder-Decoder Family of Foundation Models For Chemical Language Paper • 2407.20267 • Published Jul 24 • 31
Diffree: Text-Guided Shape Free Object Inpainting with Diffusion Model Paper • 2407.16982 • Published Jul 24 • 40
Integrating Large Language Models into a Tri-Modal Architecture for Automated Depression Classification Paper • 2407.19340 • Published Jul 27 • 56
TAPTRv2: Attention-based Position Update Improves Tracking Any Point Paper • 2407.16291 • Published Jul 23 • 10
Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge Paper • 2407.19594 • Published Jul 28 • 19
SHIC: Shape-Image Correspondences with no Keypoint Supervision Paper • 2407.18907 • Published Jul 26 • 39
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation Paper • 2406.06525 • Published Jun 10 • 64