Embedding Model Datasets Collection • A curated subset of the datasets that work out of the box with Sentence Transformers: https://huggingface.co/datasets?other=sentence-transformers • 67 items • Updated Jul 3 • 87
jina-embeddings-v2 Collection • The V2 family of Jina Embeddings supports encoding large documents with 8k sequence length. • 8 items • Updated Sep 17 • 15
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection Paper • 2310.11511 • Published Oct 17, 2023 • 75
Jina Embeddings: A Novel Set of High-Performance Sentence Embedding Models Paper • 2307.11224 • Published Jul 20, 2023 • 6
Lost in the Middle: How Language Models Use Long Contexts Paper • 2307.03172 • Published Jul 6, 2023 • 37
Finding the Law: Enhancing Statutory Article Retrieval via Graph Neural Networks Paper • 2301.12847 • Published Jan 30, 2023 • 2