---
license: mit
datasets:
  - deepghs/anime_classification
metrics:
  - accuracy
pipeline_tag: image-classification
tags:
  - art
---

This model predicts the type of an anime image. It distinguishes the following categories (four in the original models, five in the newer ones):

- 3D: Images rendered in 3D, including MikuMikuDance, Koikatsu, etc.
- Bangumi: Screenshots from anime videos.
- Comic: Manga images that contain a significant amount of text or panel sequences.
- Illustration: General anime illustrations.
- Not Painting: (Only available in newer models) Any content that cannot be called a painting, such as artist promotional posts, game screenshots, chat logs, etc.
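
To make the category mapping concrete, here is a minimal sketch of how a classifier's raw outputs could be turned into one of these labels. It assumes the model emits one logit per category in the order given in the Labels column of the table below; that ordering, and the helper names `softmax` and `predict`, are illustrative assumptions rather than part of the released models.

```python
import math

# Assumed label order (hypothetical); the real order is defined by each
# model's own metadata, not by this sketch.
LABELS = ["3d", "bangumi", "comic", "illustration", "not_painting"]


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits, labels=LABELS):
    """Return the most likely label and the per-label scores."""
    if len(logits) != len(labels):
        raise ValueError("expected one logit per label")
    scores = dict(zip(labels, softmax(logits)))
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    # Made-up logits for a five-category model.
    label, scores = predict([0.1, 2.3, -1.0, 0.7, -0.5])
    print(label, round(scores[label], 3))
```

For the older four-category models, the same helper can be called with `labels=LABELS[:4]` and four logits.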

| Name | FLOPs | Params | Accuracy | AUC | Confusion | Labels |
|:---|:---:|:---:|:---:|:---:|:---:|:---|
| caformer_s36_v1.5_focal | 22.10G | 37.22M | 94.93% | 0.9956 | confusion | `3d`, `bangumi`, `comic`, `illustration`, `not_painting` |
| caformer_s36_v1.4_focal_fixed | 22.10G | 37.22M | 96.21% | 0.9971 | confusion | `3d`, `bangumi`, `comic`, `illustration`, `not_painting` |
| caformer_s36_v1.4_focal_fp32 | 22.10G | 37.22M | 95.98% | 0.9969 | confusion | `3d`, `bangumi`, `comic`, `illustration`, `not_painting` |
| mobilenetv3_v1.4_dist | 0.63G | 4.18M | 94.77% | 0.9950 | confusion | `3d`, `bangumi`, `comic`, `illustration`, `not_painting` |
| caformer_s36_v1.4_focal | 22.10G | 37.22M | 95.82% | 0.9967 | confusion | `3d`, `bangumi`, `comic`, `illustration`, `not_painting` |
| mobilenetv3_v1.3_dist | 0.63G | 4.18M | 96.41% | 0.9973 | confusion | `3d`, `bangumi`, `comic`, `illustration`, `not_painting` |
| caformer_s36_v1.3_focal | 22.10G | 37.22M | 97.16% | 0.9982 | confusion | `3d`, `bangumi`, `comic`, `illustration`, `not_painting` |
| mobilenetv3_v1.2_dist | 0.63G | 4.18M | 96.53% | 0.9972 | confusion | `3d`, `bangumi`, `comic`, `illustration`, `not_painting` |
| caformer_s36_v1.2_focal | 22.10G | 37.22M | 97.23% | 0.9982 | confusion | `3d`, `bangumi`, `comic`, `illustration`, `not_painting` |
| caformer_s36_v1.1_focal | 22.10G | 37.22M | 95.99% | 0.9967 | confusion | `3d`, `bangumi`, `comic`, `illustration`, `not_painting` |
| mobilenetv3_v1_dist | 0.63G | 4.18M | 94.04% | 0.9928 | confusion | `3d`, `bangumi`, `comic`, `illustration`, `not_painting` |
| caformer_s36_v1 | 22.10G | 37.22M | 94.72% | 0.9934 | confusion | `3d`, `bangumi`, `comic`, `illustration`, `not_painting` |
| mobilenetv3_dist | 0.63G | 4.18M | 91.98% | 0.9879 | confusion | `3d`, `bangumi`, `comic`, `illustration` |
| mobilenetv3_sce_dist | 0.63G | 4.18M | 92.35% | 0.9854 | confusion | `3d`, `bangumi`, `comic`, `illustration` |
| caformer_s36_plus | 22.10G | 37.22M | 93.47% | 0.9891 | confusion | `3d`, `bangumi`, `comic`, `illustration` |
| mobilevitv2_150 | 9.09G | 9.79M | 88.21% | N/A | confusion | `3d`, `bangumi`, `comic`, `illustration` |
| mobilenetv3 | 0.63G | 4.18M | 88.96% | N/A | confusion | `3d`, `bangumi`, `comic`, `illustration` |
| mobilenetv3_sce | 0.63G | 4.18M | 89.92% | 0.9786 | confusion | `3d`, `bangumi`, `comic`, `illustration` |
| caformer_s36 | 22.10G | 37.22M | 88.19% | N/A | confusion | `3d`, `bangumi`, `comic`, `illustration` |
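
The table above can be used to pick a model for a given compute budget. The following sketch copies the FLOPs and accuracy figures from the table and returns the most accurate model that fits under a FLOPs limit. The helper `best_under_budget` is hypothetical, and the comparison assumes the accuracy figures are comparable across versions, which the table itself does not state.

```python
# (name, GFLOPs, accuracy in %, number of labels), copied from the table above.
MODELS = [
    ("caformer_s36_v1.5_focal", 22.10, 94.93, 5),
    ("caformer_s36_v1.4_focal_fixed", 22.10, 96.21, 5),
    ("caformer_s36_v1.4_focal_fp32", 22.10, 95.98, 5),
    ("mobilenetv3_v1.4_dist", 0.63, 94.77, 5),
    ("caformer_s36_v1.4_focal", 22.10, 95.82, 5),
    ("mobilenetv3_v1.3_dist", 0.63, 96.41, 5),
    ("caformer_s36_v1.3_focal", 22.10, 97.16, 5),
    ("mobilenetv3_v1.2_dist", 0.63, 96.53, 5),
    ("caformer_s36_v1.2_focal", 22.10, 97.23, 5),
    ("caformer_s36_v1.1_focal", 22.10, 95.99, 5),
    ("mobilenetv3_v1_dist", 0.63, 94.04, 5),
    ("caformer_s36_v1", 22.10, 94.72, 5),
    ("mobilenetv3_dist", 0.63, 91.98, 4),
    ("mobilenetv3_sce_dist", 0.63, 92.35, 4),
    ("caformer_s36_plus", 22.10, 93.47, 4),
    ("mobilevitv2_150", 9.09, 88.21, 4),
    ("mobilenetv3", 0.63, 88.96, 4),
    ("mobilenetv3_sce", 0.63, 89.92, 4),
    ("caformer_s36", 22.10, 88.19, 4),
]


def best_under_budget(models, max_gflops, min_labels=4):
    """Return the highest-accuracy entry within the FLOPs budget, or None."""
    candidates = [
        m for m in models if m[1] <= max_gflops and m[3] >= min_labels
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda m: m[2])


if __name__ == "__main__":
    # Lightweight deployment: stay under 1 GFLOPs.
    print(best_under_budget(MODELS, max_gflops=1.0))
    # No practical compute limit, but require the not_painting label.
    print(best_under_budget(MODELS, max_gflops=25.0, min_labels=5))
```

Passing `min_labels=5` restricts the choice to models that also recognize `not_painting`.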

| Model | FLOPs | Accuracy | Confusion Matrix | Description |
|:---|:---:|:---:|:---:|:---|
| caformer_s36 | 22.10G | 88.19% | Confusion Matrix | Model: caformer_s36 from timm |
| caformer_s36_plus | 22.10G | 93.47% | Confusion Matrix | Model: pretrained caformer_s36.sail_in22k_ft_in1k_384 from timm |
| mobilenetv3 | 0.63G | 88.96% | Confusion Matrix | Model: mobilenetv3_large_100 from timm |
| mobilenetv3_dist | 0.63G | 91.98% | Confusion Matrix | Distilled from caformer_s36_plus, using mobilenetv3_large_100 with focal loss |
| mobilenetv3_sce | 0.63G | 89.92% | Confusion Matrix | Model: mobilenetv3_large_100 from timm, using SCELoss as the loss function |
| mobilenetv3_sce_dist | 0.63G | 92.35% | Confusion Matrix | Distilled from caformer_s36_plus, using mobilenetv3_large_100 with SCELoss |
| mobilevitv2_150 | 9.09G | 88.21% | Confusion Matrix | Model: mobilevitv2_150 from timm |
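
Several of the models listed here are trained with focal loss. As a rough illustration only, the sketch below shows the textbook per-sample focal loss, which scales cross-entropy by `(1 - p_t) ** gamma` so that confidently correct samples contribute less. This is not taken from the training code of these models; the function names and the `gamma=2.0` default are assumptions.

```python
import math

EPS = 1e-12  # guards against log(0)


def cross_entropy(probs, target_index):
    """Plain cross-entropy for one sample, given class probabilities."""
    return -math.log(max(probs[target_index], EPS))


def focal_loss(probs, target_index, gamma=2.0):
    """Textbook focal loss for one sample (gamma value is an assumption).

    With gamma=0 this reduces to plain cross-entropy.
    """
    p_t = min(max(probs[target_index], EPS), 1.0)
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


if __name__ == "__main__":
    easy = [0.05, 0.90, 0.03, 0.02]  # confident and correct (target = 1)
    hard = [0.40, 0.30, 0.20, 0.10]  # uncertain (target = 1)
    for name, probs in (("easy", easy), ("hard", hard)):
        print(name, cross_entropy(probs, 1), focal_loss(probs, 1))
```

In this formulation the easy sample's loss shrinks far more than the hard sample's, which is the behavior that motivates using focal loss on imbalanced or noisy classification data.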