Sigrid Jin PRO

sigridjineth

https://sigridjin.medium.com

AI & ML interests

Newbie

Recent Activity

liked a model 35 minutes ago

yunguks/walk1009-4bit

liked a model about 1 hour ago

LGAI-EXAONE/EXAONE-3.5-32B-Instruct

liked a model about 12 hours ago

avsolatorio/NoInstruct-small-Embedding-v0

sigridjineth's activity

upvoted 2 papers 8 days ago

Benchmarking and Building Long-Context Retrieval Models with LoCo and M2-BERT

Paper • 2402.07440 • Published Feb 12, 2024 • 1

DistilDIRE: A Small, Fast, Cheap and Lightweight Diffusion Synthesized Deepfake Detection

Paper • 2406.00856 • Published Jun 2, 2024 • 11

upvoted a collection 16 days ago

NeMo Curator - Classifier Models

Classifier models that can be used in NeMo Curator for labelling/filtering datasets. • 9 items • Updated 19 days ago • 10

upvoted a paper 18 days ago

Jina CLIP: Your CLIP Model Is Also Your Text Retriever

Paper • 2405.20204 • Published May 30, 2024 • 34

upvoted a collection 18 days ago

jina-clip

Multimodal text-image embeddings • 4 items • Updated 19 days ago • 10

upvoted a collection 20 days ago

MMTEB

Our contribution to the Massive Multilingual Text Embedding Benchmark (MMTEB). Retrieval and reranking benchmarks in 16 languages. • 4 items • Updated Jun 6, 2024 • 1

upvoted a collection 25 days ago

Arctic-embed

A collection of text embedding models optimized for retrieval accuracy and efficiency • 8 items • Updated 27 days ago • 17

upvoted a collection about 1 month ago

ColPali Models

Pre-trained checkpoints for the ColPali model. • 8 items • Updated 24 days ago • 3

upvoted a collection 3 months ago

Small LMs Text Embedding

Contrastive fine-tuned version of Language Models up to 2B parameters using LoRA • 3 items • Updated May 8, 2024 • 4

upvoted a collection 4 months ago

Papers I want to read

Papers in my to-read list • 254 items • Updated 4 days ago • 27

upvoted 4 collections 5 months ago

Korean Pretraining Dataset

15 items • Updated Nov 19, 2024 • 10

Matryoshka Embedding Models

https://huggingface.co/blog/matryoshka • 14 items • Updated Jun 4, 2024 • 15

🍃 MINT-1T

Data for "MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens" • 13 items • Updated Jul 24, 2024 • 57

GTE models

General Text Embedding Models Released by Tongyi Lab of Alibaba Group • 19 items • Updated 12 days ago • 18

upvoted an article 5 months ago

Fine-tune Llama 3.1 Ultra-Efficiently with Unsloth

Article • Jul 29, 2024 • 260

upvoted an article 6 months ago

Mixture of Experts Explained

Article • Dec 11, 2023 • 235

upvoted 2 papers 6 months ago

SpreadsheetLLM: Encoding Spreadsheets for Large Language Models

Paper • 2407.09025 • Published Jul 12, 2024 • 129

LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs

Paper • 2406.15319 • Published Jun 21, 2024 • 62

upvoted 2 collections 8 months ago

Yi-1.5 (2024/05)

10 items • Updated May 20, 2024 • 91

Base + Language + Instruct (Korean)

8 items • Updated May 24, 2024 • 3