---
title: Open Remove Background Model (ormbg)
license: apache-2.0
tags:
- segmentation
- remove background
- background
- background-removal
- Pytorch
pretty_name: Open Remove Background Model
models:
- schirrmacher/ormbg
datasets:
- schirrmacher/humans
emoji: 💻
colorFrom: red
colorTo: red
sdk: gradio
sdk_version: 4.29.0
app_file: hf_space/app.py
pinned: false
---
# Open Remove Background Model (ormbg)
[>>> DEMO <<<](https://huggingface.co/spaces/schirrmacher/ormbg)
Join our [Research Discord Group](https://discord.gg/YYZ3D66t)!
![](examples/image/image01_no_background.png)
This model is a **fully open-source background remover** optimized for images with humans. It is based on the [Highly Accurate Dichotomous Image Segmentation](https://github.com/xuebinqin/DIS) (DIS) research.
## Inference
```
python ormbg/inference.py
```
## Training
Install dependencies:
```
conda env create -f environment.yaml
conda activate ormbg
```
Replace the dummy dataset with the [training dataset](https://huggingface.co/datasets/schirrmacher/humans), then start training:
```
python3 ormbg/train_model.py
```
# Research
I started training the model with synthetic images from the [Human Segmentation Dataset](https://huggingface.co/datasets/schirrmacher/humans), crafted with [LayerDiffuse](https://github.com/layerdiffusion/LayerDiffuse). However, I noticed that the model struggled to perform well on real images.
Synthetic data alone limits segmentation quality, because artificial lighting, occlusion, scale, and backgrounds create a gap between synthetic and real images. A "model trained solely on synthetic data generated with naïve domain randomization struggles to generalize on the real domain"; see [PEOPLESANSPEOPLE: A Synthetic Data Generator for Human-Centric Computer Vision (2022)](https://arxiv.org/pdf/2112.09290).
### Next steps:
- Expand the dataset with synthetic and real images
- Research state-of-the-art loss functions
### Latest changes (26/07/2024):
- Created synthetic dataset with 10k images, crafted with [BlenderProc](https://github.com/DLR-RM/BlenderProc)
- Removed training data created with [LayerDiffuse](https://github.com/layerdiffusion/LayerDiffuse), since it lacks the accuracy needed
- Improved model performance (after 100k iterations):
  - F1: 0.9888 -> 0.9932
  - MAE: 0.0113 -> 0.008
  - Scores based on [this validation dataset](https://drive.google.com/drive/folders/1Yy9clZ58xCiai1zYESQkEKZCkslSC8eg)
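
For readers unfamiliar with the two scores above, here is a minimal sketch of how MAE and F1 can be computed for a predicted mask against a ground-truth mask. This README does not state the exact definitions used for the reported numbers, so the fixed threshold of 0.5 and the function names `mae` and `f1_score` are assumptions for illustration, not the evaluation code of this repository.

```python
import numpy as np


def mae(pred, target):
    """Mean absolute error between two masks with values in [0, 1]."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    return float(np.abs(pred - target).mean())


def f1_score(pred, target, threshold=0.5, eps=1e-8):
    """F1 of the foreground class after binarizing both masks.

    The threshold is an assumption; other evaluations sweep thresholds
    or weight precision and recall differently.
    """
    pred_fg = np.asarray(pred) >= threshold
    target_fg = np.asarray(target) >= threshold
    true_positives = np.logical_and(pred_fg, target_fg).sum()
    precision = true_positives / (pred_fg.sum() + eps)
    recall = true_positives / (target_fg.sum() + eps)
    return float(2 * precision * recall / (precision + recall + eps))


# Toy 2x2 example: top row is foreground, bottom row is background.
pred = np.array([[0.9, 0.8], [0.2, 0.1]])
target = np.array([[1.0, 1.0], [0.0, 0.0]])
mae_value = mae(pred, target)
f1_value = f1_score(pred, target)
```

Lower MAE is better and higher F1 is better, which matches the direction of the changes listed above.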
### 05/07/2024
- Added [P3M-10K](https://paperswithcode.com/dataset/p3m-10k) dataset for training and validation
- Added [AIM-500](https://paperswithcode.com/dataset/aim-500) dataset for training and validation
- Added [PPM-100](https://github.com/ZHKKKe/PPM) dataset for training and validation
- Applied [Grid Dropout](https://albumentations.ai/docs/api_reference/augmentations/dropout/grid_dropout/) augmentation during training to help the model generalize
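
To illustrate the idea behind Grid Dropout, the sketch below blanks out a square region in every cell of a regular grid, so the model cannot rely on any single local patch. It is a simplified NumPy illustration, not the Albumentations implementation and not this repository's training code; the function name `grid_dropout` and the parameters `unit_size`, `ratio`, and `fill_value` are hypothetical names chosen for this example.

```python
import numpy as np


def grid_dropout(image, unit_size=32, ratio=0.5, fill_value=0):
    """Fill a square hole in the top-left corner of every grid cell.

    unit_size: side length of one grid cell in pixels (illustrative).
    ratio: hole side length as a fraction of the cell side length.
    """
    out = np.array(image, copy=True)
    hole = int(unit_size * ratio)
    height, width = out.shape[:2]
    for top in range(0, height, unit_size):
        for left in range(0, width, unit_size):
            # Slices that run past the image border are clipped by NumPy.
            out[top:top + hole, left:left + hole] = fill_value
    return out


# A 64x64 RGB image of ones gets four 16x16 holes, one per 32x32 cell.
image = np.ones((64, 64, 3), dtype=np.uint8)
augmented = grid_dropout(image, unit_size=32, ratio=0.5)
```

For segmentation training, the same holes are usually applied to the image only or to both image and mask, depending on the goal; which variant was used here is not stated.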