Embedding Model Datasets Collection A curated subset of the datasets that work out of the box with Sentence Transformers: https://huggingface.co/datasets?other=sentence-transformers • 67 items • Updated Jul 3, 2024 • 91
FinTral: A Family of GPT-4 Level Multimodal Financial Large Language Models Paper • 2402.10986 • Published Feb 16, 2024 • 77
LayoutLM Collection The LayoutLM series are Transformer encoders useful for document AI tasks such as invoice parsing, document image classification, and DocVQA. • 5 items • Updated Jul 11, 2024 • 14
SpeechT5 Collection The SpeechT5 framework consists of a shared seq2seq and six modal-specific (speech/text) pre/post-nets that can address a few audio-related tasks. • 8 items • Updated Jul 11, 2024 • 23
nielsr/lilt-roberta-en-base-finetuned-funsd Token Classification • Updated 25 days ago • 95 • 2