Sinisa Stanivuk

Stopwolf

AI & ML interests

Multilingual LLMs, STT and TTS models

Recent Activity

new activity 23 days ago
stepfun-ai/GOT-OCR2_0: Batch inference
liked a dataset about 1 month ago
CohereForAI/Global-MMLU
liked a model about 1 month ago
Snowflake/snowflake-arctic-embed-m-v2.0

Organizations

Intellya Data Science Team
Data Is Better Together Contributor

Stopwolf's activity

New activity in stepfun-ai/GOT-OCR2_0 23 days ago

Batch inference

#38 opened 23 days ago by Stopwolf
reacted to nataliaElv's post with 👀 about 2 months ago
Would you like to get a high-quality dataset to pre-train LLMs in your language? 🌏

At Hugging Face we're preparing a collaborative annotation effort to build an open-source multilingual dataset as part of the Data is Better Together initiative.

Follow the link below, check if your language is listed and sign up to be a Language Lead!

https://forms.gle/s9nGajBh6Pb9G72J6
reacted to prithivMLmods's post with 🔥🚀 3 months ago
I’ve recently been experimenting with the Flux-Ultra Realism and Real Anime LoRA models, using the Flux.1-dev model as the base. The models and their demo examples are provided in the Flux LoRA DLC collections. 📃

🥳 Demo: 🔗 prithivMLmods/FLUX-LoRA-DLC

🥳 Models:
- prithivMLmods/Canopus-LoRA-Flux-UltraRealism-2.0
- prithivMLmods/Flux-Dev-Real-Anime-LoRA

🥳 For more details, please visit the README.md of the Flux LoRA DLC Space & prithivMLmods/lora-space-collections-6714b72e0d49e1c97fbd6a32
reacted to tomaarsen's post with 🔥 3 months ago
📣 Sentence Transformers v3.2.0 is out, marking the biggest release for inference in 2 years! 2 new backends for embedding models: ONNX (+ optimization & quantization) and OpenVINO, allowing for speedups up to 2x-3x AND Static Embeddings for 500x speedups at 10-20% accuracy cost.

1️⃣ ONNX Backend: This backend uses the ONNX Runtime to accelerate model inference on both CPU and GPU, reaching up to 1.4x-3x speedup depending on the precision. We also introduce 2 helper methods for optimizing and quantizing models for (much) faster inference.
2️⃣ OpenVINO Backend: This backend uses Intel's OpenVINO instead, outperforming ONNX in some situations on CPU.

Usage is as simple as SentenceTransformer("all-MiniLM-L6-v2", backend="onnx"). Does your model not have an ONNX or OpenVINO file yet? No worries - it'll be autoexported for you. Thank me later 😉

🔒 Another major new feature is Static Embeddings: think word embeddings like GloVe and word2vec, but modernized. Static Embeddings are bags of token embeddings that are summed together to create text embeddings, allowing for lightning-fast embeddings that don't require any neural networks. They're initialized in one of 2 ways:

1️⃣ via Model2Vec, a new technique for distilling any Sentence Transformer model into static embeddings. Either use a pre-distilled model with from_model2vec or do the distillation yourself with from_distillation. It'll only take 5 seconds on GPU & 2 minutes on CPU, no dataset needed.
2️⃣ Random initialization. This requires finetuning, but finetuning is extremely quick (e.g. I trained with 3 million pairs in 7 minutes). My final model was 6.6% worse than bge-base-en-v1.5, but 500x faster on CPU.
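The bag-of-token-embeddings idea behind Static Embeddings can be sketched in a few lines of plain Python. The tiny vocabulary and 3-dimensional vectors below are invented for illustration, not real model weights:

```python
# Toy static token embeddings: each token maps to a fixed 3-dimensional vector.
TOKEN_EMBEDDINGS = {
    "the": [0.1, 0.0, 0.2],
    "cat": [0.9, 0.3, 0.1],
    "sat": [0.2, 0.8, 0.4],
}

def embed(text: str) -> list[float]:
    """Sum the static vectors of known tokens; no neural network at inference."""
    dim = 3
    total = [0.0] * dim
    for token in text.lower().split():
        vec = TOKEN_EMBEDDINGS.get(token)
        if vec is None:
            continue  # out-of-vocabulary tokens are simply skipped here
        for i in range(dim):
            total[i] += vec[i]
    return total

print(embed("the cat sat"))  # elementwise sum of the three token vectors
```

Because each lookup and sum is a simple table operation, this is why static embeddings can be orders of magnitude faster than running a transformer.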

Full release notes: https://github.com/UKPLab/sentence-transformers/releases/tag/v3.2.0
Documentation on Speeding up Inference: https://sbert.net/docs/sentence_transformer/usage/efficiency.html
reacted to alielfilali01's post with 👍 3 months ago
Don't you think we should add an "Evaluation" tag for datasets that are meant to be benchmarks rather than training data?

At least then, when someone is collecting a group of datasets from an organization (or, say, from the whole Hub), they can filter on that tag and avoid contaminating their training data.
reacted to MoritzLaurer's post with ❤️ 4 months ago
#phdone - I defended my PhD yesterday! A key lesson: it is amazing how open science and open source can empower beginners with limited resources:

I first learned about instruction-based classifiers like BERT-NLI 3-4 years ago, through the @HuggingFace ZeroShotClassificationPipeline. Digging deeper into this, I found it surprisingly easy to discover new datasets, newer base models, and reusable fine-tuning scripts on the HF Hub to create my own zero-shot models, although I didn't know much about fine-tuning at the time.

Thanks to the community effect of the Hub, my models were downloaded hundreds of thousands of times after a few months. Seeing my research being useful for people motivated me to improve and upload newer models. Leaving my contact details in the model cards led to academic cooperation and consulting contracts (and eventually my job at HF).

That's the power of open science & open source: learning, sharing, improving, collaborating.

I mean every word in my thesis acknowledgments (screenshot). I'm very grateful to my supervisors @vanatteveldt @CasAndreu @KasperWelbers for their guidance; to @profAndreaRenda and @CEPS_thinktank for enabling me to work part-time during the first year; to @huggingface for creating awesome tools and an awesome platform; and to many others who are not active on social media.

Links to the full thesis and the collection of my most recent models are below.

PS: If someone happens to speak Latin, let me know if my diploma contains some hidden Illuminati code or something :D
upvoted an article 4 months ago

Deploy Embedding Models with Hugging Face Inference Endpoints

New activity in DjMel/oz-eval 5 months ago

Add Qwen2-72B-Instruct

#7 opened 5 months ago by Stopwolf

Add Hermes3, Llama3.1 70B

#6 opened 5 months ago by Stopwolf