merve
posted an update Jan 18, 2024
Posting about a very underrated model that tops paperswithcode across different segmentation benchmarks: OneFormer 👑

OneFormer is a "truly universal" model for semantic, instance and panoptic segmentation tasks ⚔️
What makes it truly universal is that it's a single model that is trained only once and can be used across all three tasks.
The enabler here is the text conditioning, i.e. the model is given a text query that states the task type along with the appropriate input, and using contrastive loss, the model learns the difference between different task types 👇 (see the image below)
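To make the contrastive idea concrete, here is a minimal sketch of a generic symmetric (InfoNCE-style) contrastive loss between query embeddings and text embeddings. It is an illustration only, not OneFormer's actual training code: the function names, the toy 2-D embeddings and the temperature of 0.07 are all assumptions chosen for the example.

```python
import math


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _normalize(v):
    # Scale to unit length so the dot product becomes cosine similarity.
    norm = math.sqrt(_dot(v, v)) or 1.0
    return [a / norm for a in v]


def _cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def symmetric_contrastive_loss(queries, texts, temperature=0.07):
    """Pull queries[i] toward texts[i] and push it away from texts[j], j != i.

    Hypothetical helper for illustration; `temperature` is an assumed value.
    """
    q = [_normalize(v) for v in queries]
    t = [_normalize(v) for v in texts]
    q_to_t = [[_dot(a, b) / temperature for b in t] for a in q]
    t_to_q = [list(col) for col in zip(*q_to_t)]  # transpose for the text-to-query direction
    return _cross_entropy_diagonal(q_to_t) + _cross_entropy_diagonal(t_to_q)


# Toy data: each query has one matching text embedding at the same index.
aligned_q = [[1.0, 0.0], [0.0, 1.0]]
aligned_t = [[0.9, 0.1], [0.1, 0.9]]
swapped_t = aligned_t[::-1]  # deliberately mismatched pairs

loss_aligned = symmetric_contrastive_loss(aligned_q, aligned_t)
loss_swapped = symmetric_contrastive_loss(aligned_q, swapped_t)
print(loss_aligned, loss_swapped)
```

The loss is low when each query sits close to its own text embedding and high when the pairs are mismatched, which is the kind of signal that lets task-conditioned queries separate from one another.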

It's also super easy to use with transformers.
from transformers import OneFormerProcessor, OneFormerForUniversalSegmentation

processor = OneFormerProcessor.from_pretrained("shi-labs/oneformer_ade20k_swin_large")
model = OneFormerForUniversalSegmentation.from_pretrained("shi-labs/oneformer_ade20k_swin_large")

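# `image` is assumed to be a PIL image loaded beforehand, e.g. with PIL.Image.open
# swap the postprocessing and task_inputs for different types of segmentation
semantic_inputs = processor(images=image, task_inputs=["semantic"], return_tensors="pt")
semantic_outputs = model(**semantic_inputs)
# target_sizes expects (height, width); PIL's image.size is (width, height), hence the reversal
predicted_semantic_map = processor.post_process_semantic_segmentation(semantic_outputs, target_sizes=[image.size[::-1]])[0]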
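Swapping tasks comes down to changing the task string and the matching post-processing method. A minimal sketch of that swap is below; the `segment` helper and the `POST_PROCESSORS` table are hypothetical names introduced for this example, while the three `post_process_*` methods are the ones exposed by `OneFormerProcessor`.

```python
# Maps each task name to the processor method that decodes the model output.
POST_PROCESSORS = {
    "semantic": "post_process_semantic_segmentation",
    "instance": "post_process_instance_segmentation",
    "panoptic": "post_process_panoptic_segmentation",
}


def segment(image, task, processor, model):
    """Run one segmentation task on a PIL image (hypothetical helper)."""
    if task not in POST_PROCESSORS:
        raise ValueError(f"unknown task {task!r}; expected one of {sorted(POST_PROCESSORS)}")
    # The task name is passed as text, which is what conditions the model.
    inputs = processor(images=image, task_inputs=[task], return_tensors="pt")
    outputs = model(**inputs)
    post_process = getattr(processor, POST_PROCESSORS[task])
    # PIL reports (width, height); the processor expects (height, width).
    return post_process(outputs, target_sizes=[image.size[::-1]])[0]


# Usage, assuming `image`, `processor` and `model` are set up as in the post:
# semantic_map = segment(image, "semantic", processor, model)
# panoptic = segment(image, "panoptic", processor, model)
```

Note that the semantic result is a single label map, while the instance and panoptic results are dicts holding a `"segmentation"` map together with `"segments_info"`.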

I have drafted a notebook for you to try right away ✨ https://colab.research.google.com/drive/1wfJhoTFqUqcTAYAOUc6TXUubBTmOYaVa?usp=sharing
You can also check out the Space without checking out the code itself 👉 shi-labs/OneFormer

AWESOME

Wow, love it, great work... Something like this would be great for bin-picking and workstations at my workplace.

I work as a Mechanical Engineer for a foundry and I would like to learn more, but I don't know anything about coding.

Is there any way I could learn to train a mechanical robot with zero or minor coding abilities?

Thanks, Tay from Italy.

So good! This kind of architecture has been on my mind, thank you very much for sharing. The approach and method used for generating 'pixel tokens' looks great.