gyrojeff committed on
Commit
cef303d
•
1 Parent(s): 9e89388

doc: add experiments and badge

Files changed (1)
  1. README.md +12 -9
README.md CHANGED
@@ -16,6 +16,7 @@ app_port: 7860
16
  <img alt="License" src="https://img.shields.io/github/license/JeffersonQin/YuzuMarker.FontDetection"/>
17
  <img alt="Contributors" src="https://img.shields.io/github/contributors/JeffersonQin/YuzuMarker.FontDetection"/>
18
  </p>
 
19
  </div>
20
 
21
  ## Scene Text Font Dataset Generation
@@ -204,7 +205,7 @@ On our synthesized dataset,
204
  | ResNet-18 | ❌ | ❌ | ❌ | ❌ | Sigmoid | 512x512 | I | 18.58% | 5c43f60 | I | float32 |
205
  | ResNet-18 | ❌ | ❌ | ❌ | ❌ | Sigmoid | 512x512 | II<sup>2</sup> | 14.39% | 5a85fd3 | I | bfloat16_3x |
206
  | ResNet-18 | ❌ | ❌ | ❌ | ❌ | Tanh | 512x512 | II | 16.24% | ff82fe6 | I | bfloat16_3x |
207
- | ResNet-18 | ✅*<sup>7</sup> | ❌ | ❌ | ❌ | Tanh | 512x512 | II | 27.71% | a976004 | I | bfloat16_3x |
208
  | ResNet-18 | ✅* | ❌ | ❌ | ❌ | Tanh | 512x512 | I | 29.95% | 8364103 | I | bfloat16_3x |
209
  | ResNet-18 | ✅* | ❌ | ❌ | ❌ | Sigmoid | 512x512 | I | 29.37% [Early stop] | 8d2e833 | I | bfloat16_3x |
210
  | ResNet-18 | ✅* | ❌ | ❌ | ❌ | Sigmoid | 416x416 | I | [Lower Trend] | d5a3215 | I | bfloat16_3x |
@@ -214,11 +215,12 @@ On our synthesized dataset,
214
  | ResNet-50 | ✅* | ❌ | ❌ | ❌ | Sigmoid | 512x512 | I | 34.21% | e980b66 | I | bfloat16_3x |
215
  | ResNet-18 | ✅* | ✅ | ❌ | ❌ | Sigmoid | 512x512 | I | 31.24% | 416c7bb | I | bfloat16_3x |
216
  | ResNet-18 | ✅* | ✅ | ✅ | ❌ | Sigmoid | 512x512 | I | 34.69% | 855e240 | I | bfloat16_3x |
217
- | ResNet-18 | ✔️*<sup>8</sup> | ✅ | ✅ | ❌ | Sigmoid | 512x512 | I | 38.32% | 1750035 | I | bfloat16_3x |
218
  | ResNet-18 | ✔️* | ✅ | ✅ | ❌ | Sigmoid | 512x512 | III<sup>3</sup> | 38.87% | 0693434 | I | bfloat16_3x |
219
  | ResNet-50 | ✔️* | ✅ | ✅ | ❌ | Sigmoid | 512x512 | III | 48.99% | bc0f7fc | II<sup>6</sup> | bfloat16_3x |
220
- | ResNet-50 | ✔️ | ✅ | ✅ | ✅<sup>10</sup> | Sigmoid | 512x512 | III | 46.12% | 0f071a5 | II | bfloat16 |
221
- | ResNet-50 | ❕<sup>9</sup> | ✅ | ✅ | ❌ | Sigmoid | 512x512 | III | 43.86% | 0f071a5 | II | bfloat16 |
 
222
  | ResNet-50 | ❕ | ✅ | ✅ | ✅ | Sigmoid | 512x512 | III | 41.35% | 0f071a5 | II | bfloat16 |
223
 
224
  * \* Bug in implementation
@@ -227,11 +229,12 @@ On our synthesized dataset,
227
  * <sup>3</sup> `learning rate = 0.001, lambda = (2, 0.5, 1)`
228
  * <sup>4</sup> `learning rate = 0.01, lambda = (2, 0.5, 1)`
229
  * <sup>5</sup> Initial version of synthesized dataset
230
- * <sup>6</sup> Doubled synthesized dataset
231
- * <sup>7</sup> Data Augmentation v1: Color Jitter + Random Crop [81%-100%]
232
- * <sup>8</sup> Data Augmentation v2: Color Jitter + Random Crop [30%-130%] + Random Gaussian Blur + Random Gaussian Noise + Random Rotation [-15°, 15°]
233
- * <sup>9</sup> Data Augmentation v3: Color Jitter + Random Crop [30%-130%] + Random Gaussian Blur + Random Gaussian Noise + Random Rotation [-15°, 15°] + Random Horizontal Flip + Random Downsample [1, 2]
234
- * <sup>10</sup> Preserve Aspect Ratio by Random Cropping
 
235
 
236
  ## Pretrained Models
237
 
 
16
  <img alt="License" src="https://img.shields.io/github/license/JeffersonQin/YuzuMarker.FontDetection"/>
17
  <img alt="Contributors" src="https://img.shields.io/github/contributors/JeffersonQin/YuzuMarker.FontDetection"/>
18
  </p>
19
+ <a href="https://www.buymeacoffee.com/gyrojeff" target="_blank"><img src="https://cdn.buymeacoffee.com/buttons/v2/default-yellow.png" alt="Buy Me A Coffee" style="height: 60px !important;width: 217px !important;" ></a>
20
  </div>
21
 
22
  ## Scene Text Font Dataset Generation
 
205
  | ResNet-18 | ❌ | ❌ | ❌ | ❌ | Sigmoid | 512x512 | I | 18.58% | 5c43f60 | I | float32 |
206
  | ResNet-18 | ❌ | ❌ | ❌ | ❌ | Sigmoid | 512x512 | II<sup>2</sup> | 14.39% | 5a85fd3 | I | bfloat16_3x |
207
  | ResNet-18 | ❌ | ❌ | ❌ | ❌ | Tanh | 512x512 | II | 16.24% | ff82fe6 | I | bfloat16_3x |
208
+ | ResNet-18 | ✅*<sup>8</sup> | ❌ | ❌ | ❌ | Tanh | 512x512 | II | 27.71% | a976004 | I | bfloat16_3x |
209
  | ResNet-18 | ✅* | ❌ | ❌ | ❌ | Tanh | 512x512 | I | 29.95% | 8364103 | I | bfloat16_3x |
210
  | ResNet-18 | ✅* | ❌ | ❌ | ❌ | Sigmoid | 512x512 | I | 29.37% [Early stop] | 8d2e833 | I | bfloat16_3x |
211
  | ResNet-18 | ✅* | ❌ | ❌ | ❌ | Sigmoid | 416x416 | I | [Lower Trend] | d5a3215 | I | bfloat16_3x |
 
215
  | ResNet-50 | ✅* | ❌ | ❌ | ❌ | Sigmoid | 512x512 | I | 34.21% | e980b66 | I | bfloat16_3x |
216
  | ResNet-18 | ✅* | ✅ | ❌ | ❌ | Sigmoid | 512x512 | I | 31.24% | 416c7bb | I | bfloat16_3x |
217
  | ResNet-18 | ✅* | ✅ | ✅ | ❌ | Sigmoid | 512x512 | I | 34.69% | 855e240 | I | bfloat16_3x |
218
+ | ResNet-18 | ✔️*<sup>9</sup> | ✅ | ✅ | ❌ | Sigmoid | 512x512 | I | 38.32% | 1750035 | I | bfloat16_3x |
219
  | ResNet-18 | ✔️* | ✅ | ✅ | ❌ | Sigmoid | 512x512 | III<sup>3</sup> | 38.87% | 0693434 | I | bfloat16_3x |
220
  | ResNet-50 | ✔️* | ✅ | ✅ | ❌ | Sigmoid | 512x512 | III | 48.99% | bc0f7fc | II<sup>6</sup> | bfloat16_3x |
221
+ | ResNet-50 | ✔️ | ✅ | ✅ | ❌ | Sigmoid | 512x512 | III | 48.45% | 0f071a5 | II | bfloat16_3x |
222
+ | ResNet-50 | ✔️ | ✅ | ✅ | ✅<sup>11</sup> | Sigmoid | 512x512 | III | 46.12% | 0f071a5 | II | bfloat16 |
223
+ | ResNet-50 | ❕<sup>10</sup> | ✅ | ✅ | ❌ | Sigmoid | 512x512 | III | 43.86% | 0f071a5 | II | bfloat16 |
224
  | ResNet-50 | ❕ | ✅ | ✅ | ✅ | Sigmoid | 512x512 | III | 41.35% | 0f071a5 | II | bfloat16 |
225
 
226
  * \* Bug in implementation
 
229
  * <sup>3</sup> `learning rate = 0.001, lambda = (2, 0.5, 1)`
230
  * <sup>4</sup> `learning rate = 0.01, lambda = (2, 0.5, 1)`
231
  * <sup>5</sup> Initial version of synthesized dataset
232
+ * <sup>6</sup> Doubled synthesized dataset (2x)
233
+ * <sup>7</sup> Quadruple synthesized dataset (4x)
234
+ * <sup>8</sup> Data Augmentation v1: Color Jitter + Random Crop [81%-100%]
235
+ * <sup>9</sup> Data Augmentation v2: Color Jitter + Random Crop [30%-130%] + Random Gaussian Blur + Random Gaussian Noise + Random Rotation [-15°, 15°]
236
+ * <sup>10</sup> Data Augmentation v3: Color Jitter + Random Crop [30%-130%] + Random Gaussian Blur + Random Gaussian Noise + Random Rotation [-15°, 15°] + Random Horizontal Flip + Random Downsample [1, 2]
237
+ * <sup>11</sup> Preserve Aspect Ratio by Random Cropping
238
 
239
  ## Pretrained Models
240