AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks? Paper • 2407.15711 • Published Jul 22 • 9
WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models Paper • 2401.13919 • Published Jan 25 • 27
Enjoy the Power of Phi-3 with ONNX Runtime on your device Article • By Emma-N • May 22 • 25
InternVL1.0 Collection Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks • 16 items • Updated 2 days ago • 18
Phi-3 Collection Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 26 items • Updated Nov 14 • 534
VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time Paper • 2404.10667 • Published Apr 16 • 18
Vector-io compatible Datasets Collection These datasets can be loaded into your vector database with a single line bash command • 15 items • Updated Sep 19 • 3