UperNet, Swin Transformer large-sized backbone

This model is the UperNet framework for semantic segmentation with a Swin Transformer backbone. UperNet was introduced in the paper Unified Perceptual Parsing for Scene Understanding by Xiao et al.

The combination of UperNet with a Swin Transformer backbone was introduced in the paper Swin Transformer: Hierarchical Vision Transformer using Shifted Windows.

Disclaimer: The team releasing UperNet + Swin Transformer did not write a model card for this model, so this model card has been written by the Hugging Face team.

Model description

UperNet is a framework for semantic segmentation. It consists of several components, including a backbone, a Feature Pyramid Network (FPN) and a Pyramid Pooling Module (PPM).
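
As a rough illustration of the Pyramid Pooling Module named above: pyramid pooling averages a feature map over grids of several sizes, so that coarse scene-level context can be combined with finer detail. The sketch below is plain Python on a single-channel grid, not the model's actual implementation; the function names and the scales (1, 2, 3, 6) are assumptions chosen for illustration.

```python
import math


def adaptive_avg_pool(feature_map, bins):
    """Average-pool a 2D grid (list of rows) down to bins x bins cells."""
    height, width = len(feature_map), len(feature_map[0])
    pooled = []
    for i in range(bins):
        # Row range covered by output cell i (windows may overlap when the
        # input size is not divisible by the number of bins).
        r0, r1 = (i * height) // bins, math.ceil((i + 1) * height / bins)
        row = []
        for j in range(bins):
            c0, c1 = (j * width) // bins, math.ceil((j + 1) * width / bins)
            cells = [feature_map[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cells) / len(cells))
        pooled.append(row)
    return pooled


def pyramid_pool(feature_map, scales=(1, 2, 3, 6)):
    """Pool the same feature map at several grid sizes (assumed scales)."""
    return {scale: adaptive_avg_pool(feature_map, scale) for scale in scales}


# Toy 6x6 "feature map" whose values are 0..35 in row-major order.
grid = [[r * 6 + c for c in range(6)] for r in range(6)]
pyramid = pyramid_pool(grid)
print(pyramid[1])  # [[17.5]]
print(pyramid[2])  # [[7.0, 10.0], [25.0, 28.0]]
```

In a real network each pooled grid would be projected, upsampled back to the feature-map size and concatenated with the original features; that part is omitted here.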

Any visual backbone can be plugged into the UperNet framework. The framework predicts a semantic label per pixel.
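
To make "a semantic label per pixel" concrete: a segmentation head typically outputs one score per class at every pixel, and the predicted label is the class with the highest score. The sketch below shows that step in plain Python on a toy 2x2 image with three classes; the function name `label_map` and the scores are hypothetical.

```python
def label_map(class_scores):
    """Turn class_scores[c][y][x] into labels[y][x] by taking the argmax over c."""
    num_classes = len(class_scores)
    height, width = len(class_scores[0]), len(class_scores[0][0])
    return [
        [max(range(num_classes), key=lambda c: class_scores[c][y][x]) for x in range(width)]
        for y in range(height)
    ]


# Made-up scores for 3 classes on a 2x2 image.
scores = [
    [[0.1, 2.0], [0.3, 0.0]],  # class 0
    [[0.9, 1.0], [0.2, 0.5]],  # class 1
    [[0.0, 0.5], [0.8, 0.4]],  # class 2
]
print(label_map(scores))  # [[1, 0], [2, 1]]
```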

[Figure: UperNet architecture]

Intended uses & limitations

You can use the raw model for semantic segmentation. See the model hub to look for fine-tuned versions (with various backbones) on a task that interests you.

How to use

For code examples, see the documentation.
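
As a starting point, here is a minimal sketch of running the model with the Hugging Face `transformers` library. It assumes `transformers`, `torch` and Pillow are installed and that the checkpoint id is `openmmlab/upernet-swin-large`; the helper name `segment_image` is hypothetical, and the documentation remains the authoritative reference.

```python
MODEL_ID = "openmmlab/upernet-swin-large"  # assumed checkpoint id for this model card


def segment_image(image, model_id=MODEL_ID):
    """Return a (height, width) tensor of predicted class indices for a PIL image."""
    # The heavy dependencies are imported inside the function so that merely
    # defining it does not require them (an illustrative choice).
    import torch
    from transformers import AutoImageProcessor, UperNetForSemanticSegmentation

    processor = AutoImageProcessor.from_pretrained(model_id)
    model = UperNetForSemanticSegmentation.from_pretrained(model_id)
    model.eval()

    # Resize and normalize the image into a batch of tensors.
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (batch, num_labels, H, W)

    # Resize the logits to the original image size, then pick the best class
    # at every pixel.
    logits = torch.nn.functional.interpolate(
        logits, size=(image.height, image.width), mode="bilinear", align_corners=False
    )
    return logits.argmax(dim=1)[0]
```

Calling `segment_image(Image.open(path).convert("RGB"))` on a Pillow image should give one class index per pixel; the index-to-name mapping is stored in the model's `config.id2label`. Loading the processor and model on every call is wasteful, so real code would load them once and reuse them.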

Model size: 234M params (Safetensors; tensor types: I64, F32)
