reichenbach committed
Commit 090a35f
Parent(s): 202e839
adding files
- README.md +23 -0
- variables/variables.index +0 -0
README.md
ADDED
@@ -0,0 +1,23 @@
---
language: en
tags:
- ConvMixer
- keras-io
license: apache-2.0
datasets:
- cifar10
---

# ConvMixer model

The ConvMixer model is trained on the CIFAR-10 dataset and is based on [the paper](https://arxiv.org/abs/2201.09792v1) ([GitHub](https://github.com/locuslab/convmixer)).
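A minimal sketch of loading the model for inference via `from_pretrained_keras` from the `huggingface_hub` library; the repo id and the dummy input below are illustrative assumptions, not part of the original card:

```python
import numpy as np
from huggingface_hub import from_pretrained_keras

# Hypothetical repo id; replace with this model's actual Hub path.
model = from_pretrained_keras("keras-io/conv-mixer")

# CIFAR-10-sized dummy input: one 32x32 RGB image, scaled to [0, 1].
image = np.random.rand(1, 32, 32, 3).astype("float32")
probs = model.predict(image)
print(probs.argmax(axis=-1))  # Predicted CIFAR-10 class index.
```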
Disclaimer: This is a demo model for Sayak Paul's Keras [example](https://keras.io/examples/vision/convmixer/). Please refrain from using this model for any other purpose.

## Description

The paper extracts 'patches' (square groups of pixels) from the input image, as is done in Vision Transformers such as [ViT](https://arxiv.org/abs/2010.11929v2). One notable drawback of such architectures is the quadratic runtime of the self-attention layers, which makes them slow and resource-intensive to train to usable quality. ConvMixer instead uses convolutions together with an MLP-Mixer-style design to obtain results similar to those of transformers at a fraction of the cost.
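A short sketch of the core idea, following the structure of the Keras example: a strided convolution embeds patches, then each block mixes spatial information with a depthwise convolution (plus a residual) and channel information with a pointwise convolution. The hyperparameters below are illustrative, not read from this checkpoint:

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

def conv_mixer_block(x, filters: int, kernel_size: int):
    # Spatial mixing: depthwise convolution with a residual connection.
    residual = x
    x = layers.DepthwiseConv2D(kernel_size=kernel_size, padding="same")(x)
    x = layers.Activation("gelu")(x)
    x = layers.BatchNormalization()(x)
    x = layers.Add()([x, residual])

    # Channel mixing: pointwise (1x1) convolution.
    x = layers.Conv2D(filters, kernel_size=1)(x)
    x = layers.Activation("gelu")(x)
    x = layers.BatchNormalization()(x)
    return x

def get_conv_mixer_model(
    image_size=32, filters=256, depth=8, kernel_size=5, patch_size=2, num_classes=10
):
    inputs = keras.Input((image_size, image_size, 3))

    # Patch embedding: a strided convolution splits the image into patches.
    x = layers.Conv2D(filters, kernel_size=patch_size, strides=patch_size)(inputs)
    x = layers.Activation("gelu")(x)
    x = layers.BatchNormalization()(x)

    # A stack of ConvMixer blocks.
    for _ in range(depth):
        x = conv_mixer_block(x, filters, kernel_size)

    x = layers.GlobalAveragePooling2D()(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return keras.Model(inputs, outputs)
```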
### Intended Use

This model is intended to be used as a demo model for keras-io.
variables/variables.index
ADDED
Binary file (17.3 kB)