title: 'XLabel: eXplainable Labeling Assistant'
emoji: 💻
colorFrom: pink
colorTo: gray
sdk: streamlit
sdk_version: 1.15.2
app_file: app.py
pinned: true
license: apache-2.0
XLabel: eXplainable Labeling Assistant
XLabel is an open-source Streamlit app that takes an explainable machine learning approach to visual-interactive data labeling.
This is the official code of the following paper:
An Explainable Machine Learning Approach to Visual-Interactive Labeling: A Case Study on Non-communicable Disease Data
Donlapark Ponnoprat, Parichart Pattarapanitchai, Phimphaka Taninpong, Suthep Suantai
News (01/01/2023)
- Use tabs instead of radio buttons for multiple labels.
- The app now requires `streamlit>=1.16.0` for the tabs and `interpret>=0.3.0` for handling missing data.
Features
XLabel can:
- Predict the most probable labels using Explainable Boosting Machine (EBM).
- Show the contributions of each feature towards the predicted labels.
- Provide an option to write the labels directly into the data file (use `XLabel.py`) or save them in a separate file (use `XLabelDL.py`).
- Support data with multiple labels and multiple classes.
- Support data with missing values (thanks to EBM) and/or non-numeric categorical features.
Usage
Before using XLabel, make sure the data file follows these tabular conventions:
- The file must be in either CSV or Excel format.
- The first row of the file must be the names of the columns.
- The first column must contain a unique identifier (id) for each row.
- The label columns must appear last. In addition, a few instances must have already been labeled, with each class appearing at least once. For example, if a label has five possible classes, then at least five instances must already be labeled.
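The conventions above can be checked before launching the app. The following is a minimal sketch that is not part of XLabel: the function name `check_data_file` and the `label_classes` argument are hypothetical, it handles CSV text only, and it assumes that an empty cell means "not yet labeled".

```python
import csv
import io


def check_data_file(csv_text, label_classes):
    """Return a list of problems; an empty list means all checks passed.

    `label_classes` (hypothetical argument) maps each label column name,
    in file order, to the set of classes that label can take.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ["file is empty"]
    header = rows[0]                        # first row holds the column names
    body = [row for row in rows[1:] if row]  # skip blank lines

    # The label columns must appear last.
    names = list(label_classes)
    if header[-len(names):] != names:
        return ["label columns are not the last columns"]

    problems = []

    # The first column must hold a unique id for each row.
    ids = [row[0] for row in body]
    if len(set(ids)) != len(ids):
        problems.append("ids in the first column are not unique")

    # Every class of every label needs at least one labeled row.
    # Assumption: an empty cell means the row is still unlabeled.
    first_label_col = len(header) - len(names)
    for offset, name in enumerate(names):
        col = first_label_col + offset
        seen = {row[col] for row in body if col < len(row) and row[col].strip()}
        missing = sorted(label_classes[name] - seen)
        if missing:
            problems.append(
                f"label '{name}' has no labeled example of: {', '.join(missing)}"
            )
    return problems


example = "id,age,bmi,diabetes\n1,50,22.1,yes\n2,61,30.5,no\n3,47,27.0,\n"
print(check_data_file(example, {"diabetes": {"yes", "no"}}))  # prints []
```

Returning a list of messages rather than raising on the first failure lets all problems in a file be fixed in one pass.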
With your data file satisfying these conditions, you can now start data labeling with XLabel!
- Copy `XLabel.py` to the directory that contains the data file and run the `streamlit` command: `streamlit run XLabel.py`
  - By design, `XLabel.py` will write the labeled data to the original data file. If you would rather download the labeled data as a separate file, use `XLabelDL.py` instead.
  - You can assign a specific list of input features for each label by editing `configs.json` and copying it along with `XLabel.py`. There are also other sidebar options that you can experiment with. Here is an example of `configs.json`.
- Upload a data file (only on the first run), select the options on the sidebar, and then click "Sample". The samples with the lowest predictive confidence will be shown first on the main screen.
- Check the suggested labels; you can keep the correct ones and change the wrong ones.
- Click the "Submit Labels" button at the bottom of the page to save the labels.
  - If you are using `XLabel.py`, the labels will be saved directly to the original data file.
  - If you are using `XLabelDL.py`, you need to click the `Download labeled data` button in the sidebar to download the labeled data as a new file.
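The sampling order mentioned in the steps above (lowest predictive confidence first) can be illustrated with a short sketch. It is not taken from the XLabel source: the function name is hypothetical, and it assumes that confidence is measured as the highest predicted class probability for a row, which may differ from what the app actually uses.

```python
def order_by_lowest_confidence(probabilities):
    """Return row ids sorted so the least confident predictions come first.

    `probabilities` maps a row id to its list of predicted class
    probabilities. Assumption: confidence is the highest class probability.
    """
    return sorted(probabilities, key=lambda row_id: max(probabilities[row_id]))


predicted = {
    "a": [0.90, 0.10],  # confident prediction
    "b": [0.55, 0.45],  # nearly a coin flip, so it is shown first
    "c": [0.70, 0.30],
}
print(order_by_lowest_confidence(predicted))  # prints ['b', 'c', 'a']
```

Reviewing the least confident rows first sends the labeling effort to the instances where the suggested labels are most likely to be wrong.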