TACDEC Pipeline
This repository provides a simple model, its weights, and code for reproducing the results in the TACDEC paper.
It contains:
- The simple model used in the TACDEC paper
- The weights used in the proof-of-concept section of the TACDEC paper
- A first notebook, feature_extraction.ipynb, which extracts features using DINOv2.
- A second notebook, train_classifier.ipynb, which trains the classifier on features that were either extracted with the first notebook or downloaded directly from the TACDEC repo.
We highly recommend downloading the already extracted and concatenated [features](https://huggingface.co/datasets/SimulaMet-HOST/TACDEC/resolve/main/sorted_cls_tokens_features.pt) and the concatenated [labels](https://huggingface.co/datasets/SimulaMet-HOST/TACDEC/resolve/main/sorted_cls_tokens_labels.npy) if you want to try the dataset or the model. You then only need to run the second notebook.
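If you prefer to fetch the two files from a script rather than the browser, the following is a minimal sketch using only the Python standard library. The URLs are the ones linked above; the helper names `local_path` and `download` and the default `data` directory are hypothetical choices for illustration, not something this repo defines.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlretrieve

# URLs taken from the links above.
FEATURES_URL = "https://huggingface.co/datasets/SimulaMet-HOST/TACDEC/resolve/main/sorted_cls_tokens_features.pt"
LABELS_URL = "https://huggingface.co/datasets/SimulaMet-HOST/TACDEC/resolve/main/sorted_cls_tokens_labels.npy"


def local_path(url, dest_dir="data"):
    """Return where a URL would be stored: dest_dir plus the URL's file name."""
    # "data" is a hypothetical default; use whatever folder the notebook expects.
    return Path(dest_dir) / Path(urlparse(url).path).name


def download(url, dest_dir="data"):
    """Download url into dest_dir unless the file is already there."""
    target = local_path(url, dest_dir)
    if target.exists():
        # Skip the network round trip if a previous run already fetched the file.
        return target
    target.parent.mkdir(parents=True, exist_ok=True)
    urlretrieve(url, str(target))
    return target
```

Calling `download(FEATURES_URL)` and `download(LABELS_URL)` would place both files in the chosen folder; point the second notebook at those paths.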
Before running the notebooks, make sure all dependencies listed in requirements.txt are installed. Once the repository is downloaded, run:
pip install -r requirements.txt
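With the dependencies installed, it can be worth sanity-checking the feature and label files before training. The sketch below is not taken from the notebooks. It assumes the .pt file holds a single 2-D tensor of shape (num_samples, feature_dim) and the .npy file holds a 1-D array of class labels, and the function names are hypothetical.

```python
from pathlib import Path

import numpy as np


def load_features_and_labels(features_path, labels_path):
    """Load the feature file (.pt) and the label file (.npy) from disk."""
    # torch is imported lazily so that summarize() below works without it.
    import torch

    # Assumption: the .pt file stores one tensor, not a dict or a model.
    features = torch.load(Path(features_path), map_location="cpu")
    labels = np.load(Path(labels_path))
    return features, labels


def summarize(features, labels):
    """Check that features and labels line up and count samples per class."""
    num_samples = features.shape[0]
    if num_samples != labels.shape[0]:
        raise ValueError(
            f"{num_samples} feature rows but {labels.shape[0]} labels"
        )
    classes, counts = np.unique(labels, return_counts=True)
    return {
        "num_samples": int(num_samples),
        "feature_dim": int(features.shape[1]),
        "class_counts": dict(zip(classes.tolist(), counts.tolist())),
    }
```

If `summarize` raises or reports an unexpected feature dimension, the files are probably not the ones train_classifier.ipynb expects.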
If you are more interested in DINOv2 itself, feature_extraction.ipynb is the notebook to look at.
Both notebooks are documented inline; should you have any questions, see TACDEC.
More information
For information about the dataset, or anything else not covered here, see TACDEC.