
[ESP8266/ESP32] Beetle ESP32 C6 ncnn neural network digit recognition (sequel × quantization speedup)

This post was last edited by nihui on 2024-4-14 14:44

Introducing the Beetle ESP32 C6
The Beetle ESP32-C6 is a miniature, low-power IoT development board built around the ESP32-C6 chip

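Ultra-compact, only 25 × 20.5 mm
Built around the ESP32-C6 chip, with support for Wi-Fi, BLE, Zigbee and Thread
Supports Wi-Fi 6 for lower latency and lower power consumption
Ultra-low power, 14 uA in deep sleep
Integrated lithium battery charging
Battery voltage monitoring, so the device can report its charge level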

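Setting up the build environment, building ncnn, and running inference with ncnn
These steps are omitted here; please refer to the previous article :D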

https://mc.dfrobot.com.cn/thread-318402-1-1.html

https://zhuanlan.zhihu.com/p/690982179

ncnn model quantization process
See the ncnn quantization tool tutorial

https://github.com/Tencent/ncnn/wiki/quantized-int8-inference

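The ncnn quantization tools need a set of images for calibration in order to derive the quantization scale coefficients; the test set images are typically used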

Download the MNIST test set and convert it to PNG images

http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
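
If you convert the raw file yourself, the first step is reading the IDX3 layout: a 16-byte header of four big-endian 32-bit integers (magic 0x00000803, image count, rows, columns) followed by the 8-bit pixels. Below is a minimal sketch, assuming the .gz has already been decompressed into a byte buffer. The names `Idx3Header`, `parse_idx3_header` and `extract_image` are made up for illustration, and writing the PNG itself is left to whatever image library you prefer.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Header of an IDX3 image file (all fields are 32-bit big-endian).
// Hypothetical helper type, not part of ncnn.
struct Idx3Header {
    uint32_t count;  // number of images
    uint32_t rows;   // image height
    uint32_t cols;   // image width
};

// Read a 32-bit big-endian integer starting at data[offset].
static uint32_t read_be32(const std::vector<uint8_t>& data, size_t offset)
{
    return (uint32_t(data[offset]) << 24) | (uint32_t(data[offset + 1]) << 16)
         | (uint32_t(data[offset + 2]) << 8) | uint32_t(data[offset + 3]);
}

// Parse the 16-byte header of a decompressed idx3-ubyte buffer.
Idx3Header parse_idx3_header(const std::vector<uint8_t>& data)
{
    if (data.size() < 16 || read_be32(data, 0) != 0x00000803)
        throw std::runtime_error("not an idx3-ubyte file");
    return Idx3Header{read_be32(data, 4), read_be32(data, 8), read_be32(data, 12)};
}

// Copy the pixels (0~255, row-major) of image `index` out of the buffer.
std::vector<uint8_t> extract_image(const std::vector<uint8_t>& data, size_t index)
{
    const Idx3Header h = parse_idx3_header(data);
    const size_t image_size = size_t(h.rows) * h.cols;
    const size_t begin = 16 + index * image_size;
    if (index >= h.count || begin + image_size > data.size())
        throw std::out_of_range("image index out of range");
    return std::vector<uint8_t>(data.begin() + begin, data.begin() + begin + image_size);
}
```

Each extracted image is a 28×28 grayscale buffer for MNIST, which can then be encoded as PNG.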

For convenience, you can download this PNG package and use it directly

https://github.com/myleott/mnist_png/blob/master/mnist_png.tar.gz?raw=true

Extract it, then generate the image file list with find + shuf

[Figure 1]

find mnist_png/ -type f | grep testing | shuf > imagelist.txt

imagelist.txt looks like this

mnist_png/testing/7/5655.png
mnist_png/testing/5/9234.png
mnist_png/testing/3/1962.png
mnist_png/testing/4/7741.png
mnist_png/testing/0/7231.png
mnist_png/testing/8/4398.png
mnist_png/testing/4/7283.png
...

Use the ncnn2table tool: give it the fp32 model and the calibration dataset, and it outputs the calibration table

The mnist model takes floating-point input in the range 0~1, while a decoded PNG yields integers in 0~255, so set norm to 1/255 as the preprocessing transform

ncnn2table mnist.param mnist.bin imagelist.txt mnist.table mean=[0] norm=[0.00392156862745] shape=[28,28,1] pixel=GRAY method=kl

Output

mean = [0.000000]
norm = [0.003922]
shape = [28,28,1]
pixel = GRAY
thread = 32
method = kl
---------------------------------------
count the absmax 0.00% [ 0 / 10000 ]
count the absmax 1.00% [ 100 / 10000 ]
...
count the absmax 99.00% [ 9900 / 10000 ]
count the absmax 98.00% [ 9800 / 10000 ]
build histogram 0.00% [ 0 / 10000 ]
build histogram 1.00% [ 100 / 10000 ]
...
build histogram 96.00% [ 9600 / 10000 ]
build histogram 97.00% [ 9700 / 10000 ]
conv0                                    : max = 1.000000         threshold = 0.997314         scale = 127.341980
conv1                                    : max = 3.666233         threshold = 3.002981         scale = 42.291306
fc                                       : max = 11.118896        threshold = 10.828436        scale = 11.728379
ncnn int8 calibration table create success, best wish for your int8 inference has a low accuracy loss...\(^0^)/...233...
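
The three scale values in the log are consistent with scale = 127 / threshold (for example 127 / 3.002981 ≈ 42.2913 for conv1). The sketch below reproduces that arithmetic and shows how a float value could then be mapped to int8. The rounding and the clamp to [-127, 127] are my assumptions about a typical symmetric int8 scheme, and both function names are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Symmetric int8 scale: the calibrated threshold is mapped onto 127.
float scale_from_threshold(float threshold)
{
    assert(threshold > 0.f);
    return 127.f / threshold;
}

// Quantize one float to int8: scale, round to nearest, clamp to [-127, 127].
// Values beyond the threshold saturate instead of wrapping around.
int8_t quantize_to_int8(float value, float scale)
{
    const float scaled = std::round(value * scale);
    const float clamped = std::max(-127.f, std::min(127.f, scaled));
    return static_cast<int8_t>(clamped);
}
```

With the conv0 threshold of 0.997314, an input pixel of 1.0 saturates at 127, so the calibration clips only a tiny tail of the input range.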

Use the ncnn2int8 tool: give it the fp32 model and the calibration table, and it generates the quantized int8 model

ncnn2int8 mnist.param mnist.bin mnist-int8.param mnist-int8.bin mnist.table

The output shows that both convolution layers and the one fc layer of the mnist model have been quantized

quantize_convolution conv0
quantize_convolution conv1
quantize_innerproduct fc

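The quantized model shrinks to roughly 1/4 of its original size, since the quantized weights are stored as 8-bit integers instead of 32-bit floats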

[Figure 2]

Loading the quantized model on the esp32c6
As with the fp32 model, use ncnn2mem to convert it into static arrays and load the model from memory

ncnn2mem mnist-int8.param mnist-int8.bin mnist-int8.id.h mnist-int8.mem.h

Only the two lines that load the model need to change; the rest of the code stays as it was

#include "mnist-int8.mem.h"

extern "C" void app_main(void)
{
    ncnn::Net net;

    // net.load_param(mnist_param_bin);
    // net.load_model(mnist_bin);
    net.load_param(mnist_int8_param_bin);
    net.load_model(mnist_int8_bin);
}

Results and performance comparison
A 9.1× speedup on esp32c3 and a 5.4× speedup on esp32c6, a clear improvement!

[Figure 3]

ESP-ROM:esp32c6-20220919
Build:Sep 19 2022
rst:0xc (SW_CPU),boot:0xc (SPI_FAST_FLASH_BOOT)
Saved PC:0x4001975a
SPIWP:0xee
mode:DIO, clock div:2
load:0x40875720,len:0x1804
load:0x4086c110,len:0xe2c
load:0x4086e610,len:0x2e30
entry 0x4086c11a
I (22) boot: ESP-IDF v5.3-dev-2815-gbe06a6f5ff 2nd stage bootloader
I (23) boot: compile time Apr 14 2024 11:52:36
I (24) boot: chip revision: v0.0
I (26) boot.esp32c6: SPI Speed      : 80MHz
I (31) boot.esp32c6: SPI Mode       : DIO
I (36) boot.esp32c6: SPI Flash Size : 2MB
I (41) boot: Enabling RNG early entropy source...
I (46) boot: Partition Table:
I (50) boot: ## Label            Usage          Type ST Offset   Length
I (57) boot:  0 nvs              WiFi data        01 02 00009000 00006000
I (64) boot:  1 phy_init         RF data          01 01 0000f000 00001000
I (72) boot:  2 factory          factory app      00 00 00010000 00100000
I (79) boot: End of partition table
I (83) esp_image: segment 0: paddr=00010020 vaddr=420b0020 size=0a9e0h ( 43488) map
I (110) esp_image: segment 1: paddr=0001aa08 vaddr=40800000 size=05610h ( 22032) load
I (122) esp_image: segment 2: paddr=00020020 vaddr=42000020 size=aa960h (698720) map
I (407) esp_image: segment 3: paddr=000ca988 vaddr=40805610 size=03e74h ( 15988) load
I (416) esp_image: segment 4: paddr=000ce804 vaddr=40809490 size=00f64h (  3940) load
I (424) boot: Loaded app from partition at offset 0x10000
I (425) boot: Disabling RNG early entropy source...
I (436) cpu_start: Unicore app
I (446) cpu_start: Pro cpu start user code
I (446) cpu_start: cpu freq: 160000000 Hz
I (447) app_init: Application information:
I (449) app_init: Project name:     main
I (454) app_init: App version:      a36153d-dirty
I (459) app_init: Compile time:     Apr 14 2024 11:52:33
I (465) app_init: ELF file SHA256:  0a0e007ce...
I (470) app_init: ESP-IDF:          v5.3-dev-2815-gbe06a6f5ff
I (477) efuse_init: Min chip rev:     v0.0
I (482) efuse_init: Max chip rev:     v0.99
I (487) efuse_init: Chip rev:         v0.0
I (491) heap_init: Initializing. RAM available for dynamic allocation:
I (499) heap_init: At 4080B3B0 len 00071260 (452 KiB): RAM
I (505) heap_init: At 4087C610 len 00002F54 (11 KiB): RAM
I (511) heap_init: At 50000000 len 00003FE8 (15 KiB): RTCRAM
I (518) spi_flash: detected chip: generic
I (522) spi_flash: flash io: dio
W (526) spi_flash: Detected size(4096k) larger than the size in the binary image header(2048k). Using the size in the binary image header.
I (539) sleep: Configure to isolate all GPIO pins in sleep state
I (546) sleep: Enable automatic switching of GPIO sleep configuration
I (553) coexist: coex firmware version: d96c1e51f
I (558) coexist: coexist rom version 5b8dcfa
I (563) main_task: Started on CPU0
I (563) main_task: Calling app_main()
Loading ncnn mnist model...Done.
Preparing input...Start Mesuring!
Done!
0: -1.63
1: 3.05
2: 12.36
3: 7.95
4: -17.89
5: -9.99
6: -15.13
7: 16.36
8: 0.45
9: -3.34
I think it is number 7!
Latency, avg: 78.77ms, max: 79.48, min: 78.68. Avg Flops: 9.90MFlops
Restarting now.
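
For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of how avg/min/max latency figures like the ones in the last log line could be collected, and how a speedup factor follows from two averages. This is not the benchmark code used for this post: `LatencyStats`, `measure_latency` and `speedup` are hypothetical names, and the sketch uses portable std::chrono (on ESP-IDF, `esp_timer_get_time()` would be another option).

```cpp
#include <algorithm>
#include <chrono>
#include <functional>

// Latency statistics over several runs, in milliseconds.
struct LatencyStats {
    double avg_ms;
    double min_ms;
    double max_ms;
};

// Run `fn` `runs` times (runs > 0) and collect wall-clock latency statistics.
LatencyStats measure_latency(const std::function<void()>& fn, int runs)
{
    using clock = std::chrono::steady_clock;
    double total = 0.0, lo = 0.0, hi = 0.0;
    for (int i = 0; i < runs; i++) {
        const auto start = clock::now();
        fn();  // e.g. one forward pass of the network
        const std::chrono::duration<double, std::milli> elapsed = clock::now() - start;
        const double ms = elapsed.count();
        total += ms;
        lo = (i == 0) ? ms : std::min(lo, ms);
        hi = (i == 0) ? ms : std::max(hi, ms);
    }
    return LatencyStats{total / runs, lo, hi};
}

// Speedup of the quantized model relative to the fp32 baseline.
double speedup(double fp32_ms, double int8_ms)
{
    return fp32_ms / int8_ms;
}
```

Measuring the fp32 and int8 models with the same input and the same number of runs keeps the two averages comparable.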