2024-4-14 14:42:45

[ESP8266/ESP32] Beetle ESP32 C6 ncnn neural network digit recognition (sequel × quantization speedup)

Last edited by nihui on 2024-4-14 14:44

Introducing the Beetle ESP32 C6
The Beetle ESP32-C6 is a miniature, low-power IoT development board built around the ESP32-C6 chip.

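  • Ultra-small footprint, only 25 × 20.5 mm
  • ESP32-C6 chip with support for the Wi-Fi, BLE, Zigbee, and Thread protocols
  • Wi-Fi 6 support for lower latency and lower power consumption
  • Ultra-low power: 14 µA in deep sleep
  • Integrated lithium battery charging
  • Battery voltage detection, so the device can report its remaining charge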
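
Setting up the build environment, building ncnn, and running inference with ncnn
These steps are omitted here; please refer to the previous article :D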

https://mc.dfrobot.com.cn/thread-318402-1-1.html

https://zhuanlan.zhihu.com/p/690982179


ncnn model quantization process
See the ncnn quantization tool tutorial:

https://github.com/Tencent/ncnn/wiki/quantized-int8-inference

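The ncnn quantization tools need an image dataset for calibration in order to obtain the quantization scale factors. The test-set images are typically used.

Download the mnist test dataset and convert it to png images: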

http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz

For convenience, you can download this ready-made png package and use it directly:

https://github.com/myleott/mnist_png/blob/master/mnist_png.tar.gz?raw=true

Extract the archive, then use find + shuf to generate a shuffled list of the test image files.

[Figure 1]


    find mnist_png/ -type f | grep testing | shuf > imagelist.txt

The resulting imagelist.txt looks like this:

    mnist_png/testing/7/5655.png
    mnist_png/testing/5/9234.png
    mnist_png/testing/3/1962.png
    mnist_png/testing/4/7741.png
    mnist_png/testing/0/7231.png
    mnist_png/testing/8/4398.png
    mnist_png/testing/4/7283.png
    ...
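Use the ncnn2table tool: it takes the fp32 model and the calibration dataset as input and outputs the calibration table.

The mnist model expects floating-point input in the 0~1 range, while decoding a png yields integers in the 0~255 range, so set norm to 1/255 as the preprocessing transform.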

    ncnn2table mnist.param mnist.bin imagelist.txt mnist.table mean=[0] norm=[0.00392156862745] shape=[28,28,1] pixel=GRAY method=kl
ncnn2table output:

    mean = [0.000000]
    norm = [0.003922]
    shape = [28,28,1]
    pixel = GRAY
    thread = 32
    method = kl
    ---------------------------------------
    count the absmax 0.00% [ 0 / 10000 ]
    count the absmax 1.00% [ 100 / 10000 ]
    ...
    count the absmax 99.00% [ 9900 / 10000 ]
    count the absmax 98.00% [ 9800 / 10000 ]
    build histogram 0.00% [ 0 / 10000 ]
    build histogram 1.00% [ 100 / 10000 ]
    ...
    build histogram 96.00% [ 9600 / 10000 ]
    build histogram 97.00% [ 9700 / 10000 ]
    conv0                                    : max = 1.000000         threshold = 0.997314         scale = 127.341980
    conv1                                    : max = 3.666233         threshold = 3.002981         scale = 42.291306
    fc                                       : max = 11.118896        threshold = 10.828436        scale = 11.728379
    ncnn int8 calibration table create success, best wish for your int8 inference has a low accuracy loss...\(^0^)/...233...
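
The three scale values printed by ncnn2table are consistent with scale = 127 / threshold (for example 127 / 0.997314 ≈ 127.342). The sketch below shows that relation together with a symmetric int8 quantization step: round x * scale, then saturate to [-127, 127]. The function names are illustrative only and are not ncnn APIs; treat this as a picture of the arithmetic, not of ncnn's internal code.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Illustrative only: the scale implied by a calibration threshold,
// so that the threshold itself maps to the int8 value 127.
float int8_scale(float threshold)
{
    return 127.0f / threshold;
}

// Illustrative only: symmetric int8 quantization of one activation value.
// Values beyond the threshold saturate at -127 or 127.
std::int8_t quantize_int8(float x, float scale)
{
    const float r = std::round(x * scale);
    const float clamped = std::max(-127.0f, std::min(127.0f, r));
    return static_cast<std::int8_t>(clamped);
}
```

For the fc layer, any activation above the threshold of about 10.83 saturates at 127, which is the small clipping loss the KL calibration method accepts in exchange for finer resolution on the common values.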
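Use the ncnn2int8 tool: it takes the fp32 model and the calibration table as input and generates the quantized int8 model.

    ncnn2int8 mnist.param mnist.bin mnist-int8.param mnist-int8.bin mnist.table

The output shows that the 2 convolution layers and the 1 fc layer in the mnist model were all quantized:

    quantize_convolution conv0
    quantize_convolution conv1
    quantize_innerproduct fc

After quantization, the model shrinks to 1/4 of its original size.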

[Figure 2]


Loading the quantized model on esp32c6
As with the fp32 model, use ncnn2mem to convert the model into static arrays so it can be loaded from memory:

    ncnn2mem mnist-int8.param mnist-int8.bin mnist-int8.id.h mnist-int8.mem.h
Only the 2 lines that load the model need to change; all other code stays as it is:

    #include "mnist-int8.mem.h" // arrays generated by ncnn2mem

    extern "C" void app_main(void)
    {
        ncnn::Net net;

        // fp32 model, as loaded in the previous article:
        // net.load_param(mnist_param_bin);
        // net.load_model(mnist_bin);

        // int8 model: same calls, pointing at the quantized arrays
        net.load_param(mnist_int8_param_bin);
        net.load_model(mnist_int8_bin);
    }



Results and performance comparison

With the int8 model, inference is 9.1× faster on esp32c3 and 5.4× faster on esp32c6, a clear speedup!

[Figure 3]

    ESP-ROM:esp32c6-20220919
    Build:Sep 19 2022
    rst:0xc (SW_CPU),boot:0xc (SPI_FAST_FLASH_BOOT)
    Saved PC:0x4001975a
    SPIWP:0xee
    mode:DIO, clock div:2
    load:0x40875720,len:0x1804
    load:0x4086c110,len:0xe2c
    load:0x4086e610,len:0x2e30
    entry 0x4086c11a
    I (22) boot: ESP-IDF v5.3-dev-2815-gbe06a6f5ff 2nd stage bootloader
    I (23) boot: compile time Apr 14 2024 11:52:36
    I (24) boot: chip revision: v0.0
    I (26) boot.esp32c6: SPI Speed      : 80MHz
    I (31) boot.esp32c6: SPI Mode       : DIO
    I (36) boot.esp32c6: SPI Flash Size : 2MB
    I (41) boot: Enabling RNG early entropy source...
    I (46) boot: Partition Table:
    I (50) boot: ## Label            Usage          Type ST Offset   Length
    I (57) boot:  0 nvs              WiFi data        01 02 00009000 00006000
    I (64) boot:  1 phy_init         RF data          01 01 0000f000 00001000
    I (72) boot:  2 factory          factory app      00 00 00010000 00100000
    I (79) boot: End of partition table
    I (83) esp_image: segment 0: paddr=00010020 vaddr=420b0020 size=0a9e0h ( 43488) map
    I (110) esp_image: segment 1: paddr=0001aa08 vaddr=40800000 size=05610h ( 22032) load
    I (122) esp_image: segment 2: paddr=00020020 vaddr=42000020 size=aa960h (698720) map
    I (407) esp_image: segment 3: paddr=000ca988 vaddr=40805610 size=03e74h ( 15988) load
    I (416) esp_image: segment 4: paddr=000ce804 vaddr=40809490 size=00f64h (  3940) load
    I (424) boot: Loaded app from partition at offset 0x10000
    I (425) boot: Disabling RNG early entropy source...
    I (436) cpu_start: Unicore app
    I (446) cpu_start: Pro cpu start user code
    I (446) cpu_start: cpu freq: 160000000 Hz
    I (447) app_init: Application information:
    I (449) app_init: Project name:     main
    I (454) app_init: App version:      a36153d-dirty
    I (459) app_init: Compile time:     Apr 14 2024 11:52:33
    I (465) app_init: ELF file SHA256:  0a0e007ce...
    I (470) app_init: ESP-IDF:          v5.3-dev-2815-gbe06a6f5ff
    I (477) efuse_init: Min chip rev:     v0.0
    I (482) efuse_init: Max chip rev:     v0.99
    I (487) efuse_init: Chip rev:         v0.0
    I (491) heap_init: Initializing. RAM available for dynamic allocation:
    I (499) heap_init: At 4080B3B0 len 00071260 (452 KiB): RAM
    I (505) heap_init: At 4087C610 len 00002F54 (11 KiB): RAM
    I (511) heap_init: At 50000000 len 00003FE8 (15 KiB): RTCRAM
    I (518) spi_flash: detected chip: generic
    I (522) spi_flash: flash io: dio
    W (526) spi_flash: Detected size(4096k) larger than the size in the binary image header(2048k). Using the size in the binary image header.
    I (539) sleep: Configure to isolate all GPIO pins in sleep state
    I (546) sleep: Enable automatic switching of GPIO sleep configuration
    I (553) coexist: coex firmware version: d96c1e51f
    I (558) coexist: coexist rom version 5b8dcfa
    I (563) main_task: Started on CPU0
    I (563) main_task: Calling app_main()
    Loading ncnn mnist model...Done.
    Preparing input...Start Mesuring!
    Done!
    0: -1.63
    1: 3.05
    2: 12.36
    3: 7.95
    4: -17.89
    5: -9.99
    6: -15.13
    7: 16.36
    8: 0.45
    9: -3.34
    I think it is number 7!
    Latency, avg: 78.77ms, max: 79.48, min: 78.68. Avg Flops: 9.90MFlops
    Restarting now.



The code has been updated at https://github.com/nihui/ncnn_on_esp32



