RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval Paper • 2409.10516 • Published Sep 16, 2024 • 41
Article Mixedbread 🤝 deepset: Announcing our New German/English Embedding Model By shadeMe • Jul 19, 2024 • 15
MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention Paper • 2407.02490 • Published Jul 2, 2024 • 23
Dataset comparison models Collection 1.8B models trained on 350BT to compare different pretraining datasets • 8 items • Updated Jun 12, 2024 • 37