patrickjohncyh committed
Commit 12b28ac
1 Parent(s): 83cb9b6

update model card to reflect new model

Files changed (1): README.md +12 -0
README.md CHANGED
@@ -25,6 +25,18 @@ Disclaimer: The model card adapts the model card from [here](https://huggingface
 
  ## Model Details
 
+ UPDATE (10/03/22): We have updated the model! We found that the [laion/CLIP-ViT-B-32-laion2B-s34B-b79K](https://huggingface.co/laion/CLIP-ViT-B-32-laion2B-s34B-b79K) checkpoint works better than the original OpenAI CLIP on fashion data. We have therefore fine-tuned a newer (and better!) version of FashionCLIP (henceforth FashionCLIP 2.0), keeping the architecture the same. We postulate that the performance gains afforded by `laion/CLIP-ViT-B-32-laion2B-s34B-b79K` are due to its larger training set (5x the OpenAI CLIP data). Our [thesis](https://www.nature.com/articles/s41598-022-23052-9), however, remains the same: fine-tuning `laion/CLIP` on our fashion dataset improves zero-shot performance across our benchmarks. See the table below, which compares weighted macro F1 scores across models.
+
+
+ | Model | FMNIST | KAGL | DEEP |
+ | ------------- | ------------- | ------------- | ------------- |
+ | OpenAI CLIP | 0.66 | 0.63 | 0.45 |
+ | FashionCLIP | 0.74 | 0.67 | 0.48 |
+ | Laion CLIP | 0.78 | 0.71 | 0.58 |
+ | FashionCLIP 2.0 | __0.83__ | __0.73__ | __0.62__ |
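+
+ To make the comparison concrete, here is a minimal zero-shot evaluation sketch (not our official benchmark code): it classifies images by picking the text prompt with the highest image-text similarity and scores the predictions with scikit-learn's weighted-average F1. The Hub id `patrickjohncyh/fashion-clip`, the prompt template, and the toy inputs below are illustrative assumptions.
+
+ ```python
+ # Hedged sketch: zero-shot classification with a CLIP checkpoint via Transformers,
+ # scored with a weighted-average F1. The model id, labels, images and ground truth
+ # are illustrative assumptions, not the official benchmark setup.
+ import torch
+ from PIL import Image
+ from sklearn.metrics import f1_score
+ from transformers import CLIPModel, CLIPProcessor
+
+ model = CLIPModel.from_pretrained("patrickjohncyh/fashion-clip")      # assumed Hub id
+ processor = CLIPProcessor.from_pretrained("patrickjohncyh/fashion-clip")
+
+ class_names = ["t-shirt", "dress", "sneakers"]                  # hypothetical label set
+ images = [Image.open("item_0.jpg"), Image.open("item_1.jpg")]   # hypothetical images
+ true_labels = [0, 1]                                            # hypothetical ground truth
+
+ # One text prompt per class; each image is assigned to the class whose
+ # prompt has the highest image-text similarity.
+ inputs = processor(
+     text=[f"a photo of a {c}" for c in class_names],
+     images=images,
+     return_tensors="pt",
+     padding=True,
+ )
+ with torch.no_grad():
+     outputs = model(**inputs)
+ preds = outputs.logits_per_image.argmax(dim=-1).tolist()
+
+ print("weighted F1:", f1_score(true_labels, preds, average="weighted"))
+ ```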
+
+ ---
+
  FashionCLIP is a CLIP-based model developed to produce general product representations for fashion concepts. Leveraging the pre-trained checkpoint (ViT-B/32) released by [OpenAI](https://github.com/openai/CLIP), we train FashionCLIP on a large, high-quality, novel fashion dataset to study whether domain-specific fine-tuning of CLIP-like models is sufficient to produce product representations that are zero-shot transferable to entirely new datasets and tasks. FashionCLIP was not developed for model deployment; to do so, researchers will first need to carefully study its capabilities in relation to the specific context in which it is to be deployed.
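+
+ As a rough illustration of what "product representations" means in practice, the sketch below extracts image and text embeddings with the standard Transformers CLIP classes and compares them with cosine similarity. The Hub id `patrickjohncyh/fashion-clip` and the example product inputs are assumptions made for illustration.
+
+ ```python
+ # Hedged sketch: extracting FashionCLIP product representations (embeddings).
+ # The model id and the example image/description are illustrative assumptions.
+ import torch
+ from PIL import Image
+ from transformers import CLIPModel, CLIPProcessor
+
+ model = CLIPModel.from_pretrained("patrickjohncyh/fashion-clip")      # assumed Hub id
+ processor = CLIPProcessor.from_pretrained("patrickjohncyh/fashion-clip")
+
+ image = Image.open("red_midi_dress.jpg")            # hypothetical product image
+ text = "a red midi dress with short sleeves"        # hypothetical product description
+
+ with torch.no_grad():
+     image_emb = model.get_image_features(**processor(images=image, return_tensors="pt"))
+     text_emb = model.get_text_features(**processor(text=text, return_tensors="pt", padding=True))
+
+ # L2-normalise so that a dot product gives the cosine similarity between modalities.
+ image_emb = image_emb / image_emb.norm(dim=-1, keepdim=True)
+ text_emb = text_emb / text_emb.norm(dim=-1, keepdim=True)
+ print("cosine similarity:", (image_emb @ text_emb.T).item())
+ ```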
 
  ### Model Date