ResNet50 quantization test for accelerated int8-int8 compute on the Apple Neural Engine

ResNet from code:<br>

Int8ANE.ipynb - test notebook to create <br>

<br>
CoreMLTools 8.01b, M4 iPad Pro 16 GB, iPadOS 18.1 beta<br>
<br>

LUT 4-bit FP16 1.03 ms<br>
A4W8 0.58 ms<br>
LUT 4-bit (A8W8) 0.92 ms - no acceleration!<br>


Torch model:<br>
8-bit<br>
resnet50-LUT8-iOS17.mlpackage 1.08 ms<br>
resnet50-W8A8-iOS17.mlpackage 0.81 ms<br>

4-bit<br>
resnet50-LUT4-iOS17.mlpackage 0.93 ms<br>
resnet50-W4A8-iOS18.mlpackage 0.68 ms<br>
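
To make the Torch model latencies easier to compare, the short sketch below computes the relative speedup of each activation-quantized package over the LUT package at the same weight bit width. The latencies are copied from the list above. The `latency_ms` dictionary, the `speedup` helper, and the choice of which packages to pair are illustrative assumptions, not part of the original notebook.

```python
# Latencies in milliseconds, copied from the Torch model results above.
latency_ms = {
    "resnet50-LUT8-iOS17": 1.08,
    "resnet50-W8A8-iOS17": 0.81,
    "resnet50-LUT4-iOS17": 0.93,
    "resnet50-W4A8-iOS18": 0.68,
}


def speedup(baseline: str, candidate: str) -> float:
    """Return how many times faster `candidate` runs than `baseline`."""
    return latency_ms[baseline] / latency_ms[candidate]


# Assumed pairing: LUT (weight-only) package vs. the package that also
# quantizes activations, at the same weight bit width.
pairs = [
    ("resnet50-LUT8-iOS17", "resnet50-W8A8-iOS17"),
    ("resnet50-LUT4-iOS17", "resnet50-W4A8-iOS18"),
]

for baseline, candidate in pairs:
    print(f"{candidate} vs {baseline}: {speedup(baseline, candidate):.2f}x")
```

With these numbers, both activation-quantized packages come out roughly a third faster than their LUT counterparts (about 1.33x for 8-bit and 1.37x for 4-bit).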