2024-11-28 17:30:19

[Project] Creative Project: AI Rilakkuma Conversation Companion

Last edited by 云天 on 2024-11-28 17:37

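【Project Background】

        In this fast-paced, highly digital era, people increasingly want more natural and deeper interaction with smart devices. To meet that need, we built the "Rilakkuma Conversation Companion", an interactive device that integrates current AI services and aims to be a conversation partner that is both fun and practical.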

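【Project Design】

        The core of the "Rilakkuma Conversation Companion" is a device built around an ESP32-S3 chip. It is programmed in MicroPython with the Thonny IDE and captures sound from a MEMS microphone array. When the user presses a button, the device records the surrounding sound and uses the urequests library to send the audio data to Baidu AI Cloud for speech recognition.

        The recognized text is then sent to a large model on Baidu Qianfan ModelBuilder, which interprets the user's intent and generates a reply. The reply text is converted to speech with Baidu speech synthesis. Finally, the ESP32-S3 sends the synthesized audio over its I2S interface to a MAX98357 I2S amplifier module, which drives the speaker.

        The whole device is hidden inside a cute Rilakkuma plush toy, which makes the interaction more fun and lets the technology blend into everyday life. Whether for family entertainment, as a learning aid, or for simple daily conversation, the "Rilakkuma Conversation Companion" offers a warm, intelligent way to talk.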

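【Project Hardware】
        ESP32-S3: a high-performance, low-cost Wi-Fi and Bluetooth microcontroller from Espressif (乐鑫科技), aimed at Internet of Things (IoT) applications. It integrates a dual-core 32-bit Xtensa LX7 processor with a rich set of peripheral interfaces. It has plenty of I/O pins and built-in I2S interfaces, which makes it suitable for handling audio data and network communication together.
        MEMS microphone array: a MEMS (micro-electro-mechanical systems) microphone is a miniaturized, integrated microphone that captures sound accurately. An array of MEMS microphones can provide better sound localization and noise suppression. The parts are small, low-power, and fast to respond, which suits voice capture and speech recognition.
        MAX98357 I2S amplifier module: the MAX98357 is an I2S digital audio power amplifier from Maxim Integrated, designed to drive a speaker directly. Because it accepts I2S input, it can be connected straight to a microcontroller such as the ESP32-S3.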
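[Figure 2: AI Rilakkuma Conversation Companion]

[Figure 1: AI Rilakkuma Conversation Companion]

[Figure 8: AI Rilakkuma Conversation Companion]

[Figure 7: AI Rilakkuma Conversation Companion]
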
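【Hardware Wiring】
1. I2S amplifier pins
I2S amplifier module: VCC pin --- (connects to) --- ESP32-S3 board: 3V3;
I2S amplifier module: GND pin --- (connects to) --- ESP32-S3 board: GND;
I2S amplifier module: LRC pin --- (connects to) --- ESP32-S3 board: GPIO 12;
I2S amplifier module: BCLK pin --- (connects to) --- ESP32-S3 board: GPIO 14;
I2S amplifier module: DIN pin --- (connects to) --- ESP32-S3 board: GPIO 15;
I2S amplifier module: SPK+ pin --- (connects to) --- speaker: positive terminal
I2S amplifier module: SPK- pin --- (connects to) --- speaker: negative terminal

2. MEMS microphone I2S pins
MEMS microphone module: VIN pin --- (connects to) --- ESP32-S3 board: VCC/5V;
MEMS microphone module: GND pin --- (connects to) --- ESP32-S3 board: GND;
MEMS microphone module: MIC_WS pin --- (connects to) --- ESP32-S3 board: GPIO 18;
MEMS microphone module: MIC_CK pin --- (connects to) --- ESP32-S3 board: GPIO 17;
MEMS microphone module: MIC_D0 pin --- (connects to) --- ESP32-S3 board: GPIO 16;
MEMS microphone module: LED_CK pin --- (connects to) --- ESP32-S3 board: GPIO 13;
MEMS microphone module: LED_DA pin --- (connects to) --- ESP32-S3 board: GPIO 21;

3. Button pin

Button module: D pin --- (connects to) --- ESP32-S3 board: GPIO 7;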

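[Figure 9: AI Rilakkuma Conversation Companion]
【Getting a Baidu access_token】
        1. Create a Baidu large-model application and obtain its API_KEY and SECRET_KEY.
[Figure 3: AI Rilakkuma Conversation Companion]

        2. Create a Baidu speech technology application and obtain its API_KEY and SECRET_KEY.
[Figure 4: AI Rilakkuma Conversation Companion]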

【Writing the Program】

1. Program initialization

import machine
import time
import struct
from machine import Pin, I2S, SoftSPI
import micropython_dotstar as dotstar
import network
import urequests
import ujson
import ubinascii
import gc

# I2S pins for the amplifier
blck_pin = Pin(14)
lrc_pin = Pin(12)
din_pin = Pin(15)
# Initialize the amplifier I2S output
audio_out = I2S(
    1,
    sck=blck_pin, ws=lrc_pin, sd=din_pin,
    mode=I2S.TX,
    bits=16,
    format=machine.I2S.MONO,
    rate=16000,
    ibuf=4096
)
# I2S input for the microphone
i2s = machine.I2S(
    0,
    sck=machine.Pin(17),
    ws=machine.Pin(18),
    sd=machine.Pin(16),
    mode=machine.I2S.RX,
    bits=16,
    format=machine.I2S.MONO,
    rate=16000,
    ibuf=4096
)
# LEDs on the microphone board (12 DotStar pixels driven over software SPI)
spi = SoftSPI(sck=Pin(13), mosi=Pin(21), miso=Pin(3))
dots = dotstar.DotStar(spi, 12, brightness=0.2)

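To make the I2S settings above concrete, here is a rough sketch of the data rate they imply. The helper names `pcm_byte_rate` and `buffer_duration_ms` are hypothetical and not part of the original program; the code is plain Python, so it should run on a PC as well as on the board.

```python
def pcm_byte_rate(sample_rate, bits_per_sample, channels):
    """Bytes of raw PCM produced (or consumed) per second."""
    return sample_rate * channels * (bits_per_sample // 8)


def buffer_duration_ms(buffer_bytes, byte_rate):
    """How many milliseconds of audio fit in a buffer of the given size."""
    return buffer_bytes * 1000 // byte_rate


rate = pcm_byte_rate(16000, 16, 1)     # 16 kHz, 16-bit, mono
print(rate)                            # 32000 bytes per second
print(buffer_duration_ms(4096, rate))  # 128 ms held by a 4096-byte buffer
print(rate * 10)                       # 320000 bytes for a 10-second recording
```

With these numbers, a 10-second recording is roughly 320 KB of PCM data, which is worth keeping in mind for the flash file system and for any step that loads the whole file into RAM.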
2. WiFi connection functions

def connect_to_wifi(ssid, password):
    wlan = network.WLAN(network.STA_IF)  # create the station-mode WLAN object
    wlan.active(True)  # activate the interface
    if not wlan.isconnected():  # not connected to WiFi yet
        print('Connecting to network...')
        wlan.connect(ssid, password)  # connect to the given WiFi network
        while not wlan.isconnected():  # wait until the connection succeeds
            pass
    print('Network config:', wlan.ifconfig())  # print the network configuration

def connect_wifi():
    # Replace with your own WiFi SSID and password
    #ssid = "TP-LINK_CB88"
    #password = "jiaoyan2"
    ssid = "sxs"
    password = "smj080823"
    connect_to_wifi(ssid, password)

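The wait loop in connect_to_wifi never gives up, so a wrong password or an unreachable access point hangs the device. One possible variation is a generic wait-with-timeout helper, sketched below. The name `wait_for` and its parameters are assumptions for illustration, not part of the original program.

```python
import time


def wait_for(is_ready, timeout_s=15, poll_s=0.1, sleep=time.sleep, now=time.time):
    """Poll is_ready() until it returns True or timeout_s seconds have passed.

    Returns True if the condition became true, False on timeout.
    sleep and now are injectable so the helper can be tested off-device.
    """
    deadline = now() + timeout_s
    while not is_ready():
        if now() >= deadline:
            return False
        sleep(poll_s)
    return True


# Hypothetical use inside connect_to_wifi, after wlan.connect(ssid, password):
#     if not wait_for(wlan.isconnected, timeout_s=15):
#         print('WiFi connection timed out')
```

The caller can then decide whether to retry, blink the LEDs, or play an error prompt instead of blocking forever.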
3. Audio playback function

def play_audio(filename):
    with open(filename, 'rb') as f:
        f.seek(44)  # skip the 44-byte WAV header

        # Play the audio data
        while True:
            data = f.read(1024)  # read 1024 bytes at a time
            if not data:
                break
            audio_out.write(data)

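play_audio assumes that the audio data starts at byte 44, which holds for the minimal header written by this project but not for every WAV file (some contain extra chunks before the data). As a hedged sketch, the hypothetical helper `find_wav_data` below walks the RIFF chunks to locate the data instead of hard-coding the offset.

```python
import io
import struct


def find_wav_data(f):
    """Return (offset, size) of the 'data' chunk of an open binary WAV file."""
    header = f.read(12)
    if len(header) < 12 or header[0:4] != b'RIFF' or header[8:12] != b'WAVE':
        raise ValueError("not a RIFF/WAVE file")
    offset = 12
    while True:
        chunk = f.read(8)
        if len(chunk) < 8:
            raise ValueError("no data chunk found")
        chunk_id = chunk[0:4]
        size = struct.unpack('<I', chunk[4:8])[0]
        offset += 8
        if chunk_id == b'data':
            return offset, size
        # Skip this chunk; chunks are padded to an even number of bytes
        skip = size + (size & 1)
        offset += skip
        f.seek(offset)


# In-memory demo file: RIFF header, 'fmt ' chunk, then 4 bytes of audio data
fmt_chunk = b'fmt ' + struct.pack('<IHHIIHH', 16, 1, 1, 16000, 32000, 2, 16)
demo = b'RIFF' + struct.pack('<I', 40) + b'WAVE' + fmt_chunk \
    + b'data' + struct.pack('<I', 4) + b'\x01\x02\x03\x04'
print(find_wav_data(io.BytesIO(demo)))  # (44, 4)
```

In play_audio, the returned offset could replace the fixed `f.seek(44)`, and the size could bound how many bytes are played.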
4. Audio recording functions

def write_wav_header(file, sample_rate, bits_per_sample, channels):
    num_channels = channels
    bytes_per_sample = bits_per_sample // 8
    byte_rate = sample_rate * num_channels * bytes_per_sample
    block_align = num_channels * bytes_per_sample

    # Write the WAV file header
    file.write(b'RIFF')
    file.write(struct.pack('<I', 36))  # Chunk size (36 + SubChunk2Size), patched after recording
    file.write(b'WAVE')
    file.write(b'fmt ')
    file.write(struct.pack('<I', 16))  # Subchunk1Size (PCM header size)
    file.write(struct.pack('<H', 1))   # AudioFormat (PCM)
    file.write(struct.pack('<H', num_channels))  # NumChannels
    file.write(struct.pack('<I', sample_rate))   # SampleRate
    file.write(struct.pack('<I', byte_rate))     # ByteRate
    file.write(struct.pack('<H', block_align))   # BlockAlign
    file.write(struct.pack('<H', bits_per_sample))  # BitsPerSample
    file.write(b'data')
    file.write(struct.pack('<I', 0))  # Subchunk2Size (to be filled later)

def record(filename, re_time):
    audio_buffer = bytearray(4096)
    sample_rate = 16000
    bits_per_sample = 16
    channels = 1
    duration = re_time  # maximum recording length in seconds
    print("Recording started")
    play_audio("start.wav")  # play the start prompt
    with open(filename, 'wb') as f:
        write_wav_header(f, sample_rate, bits_per_sample, channels)
        subchunk2_size = 0

        start_time = time.ticks_ms()
        end_time = time.ticks_add(start_time, duration * 1000)

        try:
            # Stop at the time limit or when the button is pressed and released again.
            # ticks_diff keeps the comparison correct if the tick counter wraps around.
            while time.ticks_diff(end_time, time.ticks_ms()) > 0 and not button_state():
                num_bytes = i2s.readinto(audio_buffer)
                if num_bytes > 0:
                    f.write(audio_buffer[:num_bytes])
                    subchunk2_size += num_bytes
                time.sleep(0.01)
        except KeyboardInterrupt:
            print("Recording stopped")

        # Go back and update the Subchunk2Size in the WAV header
        f.seek(40)
        f.write(struct.pack('<I', subchunk2_size))
        f.seek(4)
        f.write(struct.pack('<I', 36 + subchunk2_size))
    print("Recording finished")
    play_audio("end.wav")  # play the end prompt

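The two seek offsets used after recording (4 and 40) follow from the 44-byte header layout. The sketch below builds the same header in one expression when the data size is already known; `build_wav_header` is a hypothetical helper for illustration, not a function from the original program.

```python
import struct


def build_wav_header(sample_rate, bits_per_sample, channels, data_size):
    """Return a 44-byte PCM WAV header for data_size bytes of audio."""
    bytes_per_sample = bits_per_sample // 8
    byte_rate = sample_rate * channels * bytes_per_sample
    block_align = channels * bytes_per_sample
    return (
        b'RIFF' + struct.pack('<I', 36 + data_size)   # bytes 0-7, size at offset 4
        + b'WAVE'                                      # bytes 8-11
        + b'fmt ' + struct.pack('<IHHIIHH', 16, 1, channels, sample_rate,
                                byte_rate, block_align, bits_per_sample)  # bytes 12-35
        + b'data' + struct.pack('<I', data_size)       # bytes 36-43, size at offset 40
    )


header = build_wav_header(16000, 16, 1, 320000)
print(len(header))                             # 44
print(struct.unpack('<I', header[4:8])[0])     # 320036
print(struct.unpack('<I', header[40:44])[0])   # 320000
```

When the length is not known in advance, as in record, writing placeholder sizes first and patching offsets 4 and 40 afterwards gives the same result.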
5. URL encoding

def urlencode(params):  # encode a dict as a URL-encoded string
    encoded_pairs = []
    for key, value in params.items():
        # Make sure both key and value are strings
        key_str = str(key)
        value_str = str(value)
        # Minimal hand-written URL encoding: only spaces are escaped
        encoded_key = key_str.replace(" ", "%20")
        encoded_value = value_str.replace(" ", "%20")
        encoded_pairs.append(f"{encoded_key}={encoded_value}")
    return "&".join(encoded_pairs)

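Because the urlencode above escapes only spaces, characters such as `&`, `=`, `%`, or non-ASCII text are sent unescaped in the form body. If that ever causes trouble, a fuller percent-encoding could look like the sketch below. The names `quote` and `urlencode_full` are hypothetical, and whether the Baidu endpoints need this stricter encoding is an assumption that has not been checked here.

```python
SAFE_CHARS = ("ABCDEFGHIJKLMNOPQRSTUVWXYZ"
              "abcdefghijklmnopqrstuvwxyz"
              "0123456789-_.~")


def quote(text):
    """Percent-encode every byte of the UTF-8 form except unreserved characters."""
    out = []
    for byte in str(text).encode('utf-8'):
        char = chr(byte)
        if char in SAFE_CHARS:
            out.append(char)
        else:
            out.append('%{:02X}'.format(byte))
    return ''.join(out)


def urlencode_full(params):
    """Encode a dict as key=value pairs joined by '&', escaping keys and values."""
    return '&'.join('{}={}'.format(quote(k), quote(v)) for k, v in params.items())


print(quote("a b"))     # a%20b
print(quote("a&b=c"))   # a%26b%3Dc
print(quote("你好"))     # %E4%BD%A0%E5%A5%BD
```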
6. Speech synthesis functions

def post_tts(filename, text, token):
    # Request parameters
    params = {
        'tok': token,
        'tex': text,  # the text is passed as-is, without quote_plus
        # Voice. Basic voices: 度小宇=1, 度小美=0, 度逍遥(基础)=3, 度丫丫=4.
        # Premium voices: 度逍遥(精品)=5003, 度小鹿=5118, 度博文=106, 度小童=110,
        # 度小萌=111, 度米朵=103, 度小娇=5
        'per': 5,
        'spd': 5,  # medium speed
        'pit': 5,  # medium pitch
        'vol': 9,  # volume (5 is the medium level)
        'aue': 6,  # 6 = wav (same content as pcm-16k); 3 = mp3 (default); 4 = pcm-16k; 5 = pcm-8k
        'cuid': "ZZloekkfqvZFKhpVtFXGlAopgnHnHCgQ",  # unique user identifier
        'lan': 'zh',
        'ctp': 1  # client type; web clients use the fixed value 1
    }
    headers = {
        'Content-Type': 'application/x-www-form-urlencoded',
        'Accept': '*/*'
    }
    # Encode the parameters and send them in the request body
    data = urlencode(params).encode('utf-8')
    # Send the POST request
    response = urequests.post("http://tsn.baidu.com/text2audio", headers=headers, data=data)
    # Check the response status code
    if response.status_code == 200:
        # Write the returned audio data to a file
        print("Writing synthesized audio")

        gc.collect()  # collect garbage before writing
        with open(filename, "wb") as f:
            f.write(response.content)
        gc.collect()  # collect garbage after writing
        print("Synthesized audio written")
    else:
        print("Failed to retrieve the audio file")

def tts(audio_file, text):  # speech synthesis
    token = '24.65578d0ef206e7de3a028e59691b2f1c.2592000.1735099609.282335-*******'
    post_tts(audio_file, text, token)

7. Large-model conversation

def spark(text):  # large-model conversation function
    #24.a56fc70c4012e7d9482f65b0eb896537.2592000.1735199722.282335-*******
    API_KEY="Yu184c3QPZ2tkH9HSuN9TmsB"
    SECRET_KEY="OcxUg0slEFB3nPrgchjQrBdsxfZUBM5q"
    access_token="24.b4f3fc416c50a274a1f37a3f95083616.2592000.1735201337.282335-******"
    #url = "https://aip.baidubce.com/rpc/2.0/ai_custom/v1/wenxinworkshop/chat/completions?access_token="+get_access_token(API_KEY,SECRET_KEY)
    url = "https://aip.baidubce.com/rpc/2.0/ai_custom/v1/wenxinworkshop/chat/completions?access_token="+access_token

    payload = {
        "user_id": "yuntian365",
        "messages": [
            {
                "role": "user",
                "content": text
            }
        ],
        "temperature": 0.95,
        "top_p": 0.8,
        "penalty_score": 1,
        "enable_system_memory": False,
        "disable_search": False,
        "enable_citation": False,
        "system": "请将回答控制在20字",  # system prompt: "keep the answer within 20 characters"

        "response_format": "text"
    }
    header = {
        'Content-Type': 'application/json'
    }
    data_str = ujson.dumps(payload).encode('utf-8')
    #print(data_str)

    response = urequests.post(url, headers=header, data=data_str)
    response_json = response.json()
    #print(response_json)
    response_content = response_json['result']
    decoded_str = response_content.encode('utf-8').decode('unicode_escape')
    print(decoded_str)
    return decoded_str

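spark indexes `response_json['result']` directly, so any response without that key (for example when the access token has expired) raises a KeyError and stops the main loop. A more defensive variant might look like the sketch below. The helper `extract_reply` is hypothetical, and the `error_code` / `error_msg` field names are an assumption about the shape of the service's error responses rather than something shown in the original post.

```python
def extract_reply(response_json, fallback="Sorry, please say that again."):
    """Return the model's reply text, or a fallback string if the call failed."""
    if 'error_code' in response_json:
        # Assumed error shape: {"error_code": ..., "error_msg": ...}
        print('Chat request failed:',
              response_json.get('error_code'), response_json.get('error_msg'))
        return fallback
    return response_json.get('result', fallback)


print(extract_reply({"result": "hello"}))
print(extract_reply({"error_code": 110, "error_msg": "token invalid"}))
```

Returning a fallback sentence keeps the pipeline going: the text still flows to speech synthesis, so the user hears something rather than silence.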
8. Speech recognition

def base64_encode(data):  # Base64 encoding
    """Base64 encode bytes-like objects"""
    return ubinascii.b2a_base64(data)[:-1]  # drop the trailing newline

def get_access_token(API_KEY, SECRET_KEY):
    url = "https://aip.baidubce.com/oauth/2.0/token"
    params = {
        'grant_type': 'client_credentials',
        'client_id': API_KEY,
        'client_secret': SECRET_KEY
    }
    data = urlencode(params).encode('utf-8')
    response = urequests.post(url, data=data)
    access_token = ujson.loads(response.text)['access_token']
    print(access_token)
    return access_token

def baidu_speech_recognize(audio_file, token):  # Baidu speech recognition
    #url = "http://vop.baidu.com/server_api"  # standard edition
    url = "https://vop.baidu.com/pro_api"     # fast (pro) edition
    headers = {
        'Content-Type': 'application/json',
        'Accept': 'application/json'
    }
    with open(audio_file, "rb") as f:
        audio_data = f.read()
    base64_data = base64_encode(audio_data).decode('utf-8')
    data = {
        'format': 'wav',
        'rate': 16000,
        'channel': 1,
        'token': token,
        'cuid': "ESP32",
        "dev_pid": 80001,  # fast (pro) edition; remove for the standard edition
        'speech': base64_data,
        'len': len(audio_data)
    }
    response = urequests.post(url, headers=headers, json=data)
    return ujson.loads(response.text)

def speech(audio_file):  # speech recognition
    # Replace with your own API Key and Secret Key from the Baidu AI open platform
    API_KEY = "22fDjayW1nVOIP5tIAH8cODd"
    SECRET_KEY = "4qC1wtOfKySgRM80NRId6rZ6CjVNrIWb"
    # Get the access_token
    #token = get_access_token(API_KEY,SECRET_KEY)
    token = '24.65578d0ef206e7de3a028e59691b2f1c.2592000.1735099609.282335-116382488'
    # Send the audio data to the Baidu speech recognition API
    result = baidu_speech_recognize(audio_file, token)
    if result["err_no"] == 0:
        text = result["result"][0].encode('utf-8').decode('unicode_escape')
        print(text)
        return text
    else:
        return False

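baidu_speech_recognize reads the whole recording into RAM and then Base64-encodes it, so both copies exist in memory at once. As a back-of-the-envelope check, the sketch below estimates the sizes involved; `base64_length` is a hypothetical helper, and the figures assume the 16 kHz, 16-bit mono format used throughout this project.

```python
def base64_length(n_bytes):
    """Length of the Base64 text for n_bytes of input (with '=' padding)."""
    return 4 * ((n_bytes + 2) // 3)


wav_bytes = 44 + 16000 * 2 * 10     # header plus 10 seconds of 16-bit mono audio
print(wav_bytes)                    # 320044
print(base64_length(wav_bytes))     # 426728
```

Roughly 320 KB of raw audio plus about 427 KB of Base64 text is a lot for a microcontroller, so a full 10-second recording probably depends on the board having external PSRAM; shorter recordings reduce both numbers proportionally.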
9. Controlling the LEDs on the microphone board

def light_close():
    # Turn all LEDs off
    n_dots = len(dots)
    for dot in range(n_dots):
        dots[dot] = (0, 0, 0)

def light_on():
    # Turn all LEDs red
    n_dots = len(dots)
    for dot in range(n_dots):
        dots[dot] = (255, 0, 0)

micropython_dotstar.py library file:

# The MIT License (MIT)
#
# Copyright (c) 2016 Damien P. George (original Neopixel object)
# Copyright (c) 2017 Ladyada
# Copyright (c) 2017 Scott Shawcroft for Adafruit Industries
# Copyright (c) 2019 Matt Trentini (porting back to MicroPython)
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
# THE SOFTWARE.
"""
`micropython_dotstar` - DotStar strip driver
====================================================
* Author(s): Damien P. George, Limor Fried, Scott Shawcroft, Matt Trentini
"""
__version__ = "0.0.0-auto.0"
__repo__ = "https://github.com/mattytrentini/micropython-dotstar"
START_HEADER_SIZE = 4
LED_START = 0b11100000  # Three "1" bits, followed by 5 brightness bits
# Pixel color order constants
RGB = (0, 1, 2)
RBG = (0, 2, 1)
GRB = (1, 0, 2)
GBR = (1, 2, 0)
BRG = (2, 0, 1)
BGR = (2, 1, 0)

class DotStar:
    """
    A sequence of dotstars.
    :param SPI spi: The SPI object to write output to.
    :param int n: The number of dotstars in the chain
    :param float brightness: Brightness of the pixels between 0.0 and 1.0
    :param bool auto_write: True if the dotstars should immediately change when
        set. If False, `show` must be called explicitly.
    :param tuple pixel_order: Set the pixel order on the strip - different
         strips implement this differently. If you send red, and it looks blue
         or green on the strip, modify this! It should be one of the values above
    Example for TinyPICO:
    .. code-block:: python
        from micropython_dotstar import DotStar
        from machine import Pin, SPI
        spi = SPI(sck=Pin(12), mosi=Pin(13), miso=Pin(18)) # Configure SPI - note: miso is unused
        dotstar = DotStar(spi, 1)
        dotstar[0] = (128, 0, 0) # Red
    """
    def __init__(self, spi, n, *, brightness=1.0, auto_write=True,
                 pixel_order=BGR):
        self._spi = spi
        self._n = n
        # Supply one extra clock cycle for each two pixels in the strip.
        self.end_header_size = n // 16
        if n % 16 != 0:
            self.end_header_size += 1
        self._buf = bytearray(n * 4 + START_HEADER_SIZE + self.end_header_size)
        self.end_header_index = len(self._buf) - self.end_header_size
        self.pixel_order = pixel_order
        # Four empty bytes to start.
        for i in range(START_HEADER_SIZE):
            self._buf[i] = 0x00
        # Mark the beginnings of each pixel.
        for i in range(START_HEADER_SIZE, self.end_header_index, 4):
            self._buf[i] = 0xff
        # 0xff bytes at the end.
        for i in range(self.end_header_index, len(self._buf)):
            self._buf[i] = 0xff
        self._brightness = 1.0
        # Set auto_write to False temporarily so brightness setter does _not_
        # call show() while in __init__.
        self.auto_write = False
        self.brightness = brightness
        self.auto_write = auto_write

    def deinit(self):
        """Blank out the DotStars and release the resources."""
        self.auto_write = False
        for i in range(START_HEADER_SIZE, self.end_header_index):
            if i % 4 != 0:
                self._buf[i] = 0
        self.show()
        if self._spi:
            self._spi.deinit()

    def __enter__(self):
        return self

    def __exit__(self, exception_type, exception_value, traceback):
        self.deinit()

    def __repr__(self):
        return "[" + ", ".join([str(x) for x in self]) + "]"

    def _set_item(self, index, value):
        """
        value can be one of three things:
                a (r,g,b) list/tuple
                a (r,g,b, brightness) list/tuple
                a single, longer int that contains RGB values, like 0xFFFFFF
            brightness, if specified should be a float 0-1
        Set a pixel value. You can set per-pixel brightness here, if it's not passed it
        will use the max value for pixel brightness value, which is a good default.
        Important notes about the per-pixel brightness - it's accomplished by
        PWMing the entire output of the LED, and that PWM is at a much
        slower clock than the rest of the LEDs. This can cause problems in
        Persistence of Vision Applications
        """
        offset = index * 4 + START_HEADER_SIZE
        rgb = value
        if isinstance(value, int):
            rgb = (value >> 16, (value >> 8) & 0xff, value & 0xff)
        if len(rgb) == 4:
            brightness = value[3]
            # Ignore value[3] below.
        else:
            brightness = 1
        # LED startframe is three "1" bits, followed by 5 brightness bits
        # then 8 bits for each of R, G, and B. The order of those 3 are configurable and
        # vary based on hardware
        # same as math.ceil(brightness * 31) & 0b00011111
        # Idea from https://www.codeproject.com/Tips/700780/Fast-floor-ceiling-functions
        brightness_byte = 32 - int(32 - brightness * 31) & 0b00011111
        self._buf[offset] = brightness_byte | LED_START
        self._buf[offset + 1] = rgb[self.pixel_order[0]]
        self._buf[offset + 2] = rgb[self.pixel_order[1]]
        self._buf[offset + 3] = rgb[self.pixel_order[2]]

    def __setitem__(self, index, val):
        if isinstance(index, slice):
            start, stop, step = index.indices(self._n)
            length = stop - start
            if step != 0:
                # same as math.ceil(length / step)
                # Idea from https://fizzbuzzer.com/implement-a-ceil-function/
                length = (length + step - 1) // step
            if len(val) != length:
                raise ValueError("Slice and input sequence size do not match.")
            for val_i, in_i in enumerate(range(start, stop, step)):
                self._set_item(in_i, val[val_i])
        else:
            self._set_item(index, val)
        if self.auto_write:
            self.show()

    def __getitem__(self, index):
        if isinstance(index, slice):
            out = []
            for in_i in range(*index.indices(self._n)):
                out.append(
                    tuple(self._buf[in_i * 4 + (3 - i) + START_HEADER_SIZE] for i in range(3)))
            return out
        if index < 0:
            index += len(self)
        if index >= self._n or index < 0:
            raise IndexError
        offset = index * 4
        return tuple(self._buf[offset + (3 - i) + START_HEADER_SIZE]
                     for i in range(3))

    def __len__(self):
        return self._n

    @property
    def brightness(self):
        """Overall brightness of the pixel"""
        return self._brightness

    @brightness.setter
    def brightness(self, brightness):
        self._brightness = min(max(brightness, 0.0), 1.0)
        if self.auto_write:
            self.show()

    def fill(self, color):
        """Colors all pixels the given ***color***."""
        auto_write = self.auto_write
        self.auto_write = False
        for i in range(self._n):
            self[i] = color
        if auto_write:
            self.show()
        self.auto_write = auto_write

    def show(self):
        """Shows the new colors on the pixels themselves if they haven't already
        been autowritten.
        The colors may or may not be showing after this function returns because
        it may be done asynchronously."""
        # Create a second output buffer if we need to compute brightness
        buf = self._buf
        if self.brightness < 1.0:
            buf = bytearray(self._buf)
            # Four empty bytes to start.
            for i in range(START_HEADER_SIZE):
                buf[i] = 0x00
            for i in range(START_HEADER_SIZE, self.end_header_index):
                buf[i] = self._buf[i] if i % 4 == 0 else int(self._buf[i] * self._brightness)
            # Four 0xff bytes at the end.
            for i in range(self.end_header_index, len(buf)):
                buf[i] = 0xff
        if self._spi:
            self._spi.write(buf)

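Two bits of arithmetic in the driver are easy to miss: the integer trick that rounds the 5-bit per-pixel brightness up, and the size of the SPI frame buffer. The sketch below reproduces both outside the class so they can be checked on a PC; `brightness_bits` and `frame_size` are hypothetical names, and the formulas are copied from the library code above.

```python
import math


def brightness_bits(brightness):
    """5-bit per-pixel brightness, computed as in DotStar._set_item."""
    # '-' binds tighter than '&', so the library expression groups like this
    return (32 - int(32 - brightness * 31)) & 0b00011111


def frame_size(n):
    """Bytes in the SPI buffer: 4-byte start frame, 4 bytes per pixel, end frame."""
    end_header = n // 16 + (1 if n % 16 else 0)
    return 4 + n * 4 + end_header


for b in (0.0, 0.2, 0.5, 1.0):
    print(b, brightness_bits(b), math.ceil(b * 31))

print(frame_size(12))  # 53 bytes for the 12 LEDs used in this project
```

Note that in this driver the overall `brightness=0.2` passed to the constructor does not change these 5 bits; show() applies it by scaling the colour bytes instead.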
10. Button control function

BUTTON_PIN = 7
button = Pin(BUTTON_PIN, Pin.IN, Pin.PULL_UP)
prev_button_state = 0

def button_state():
    # Returns True once each time the pin goes from 1 to 0 (button released)
    global prev_button_state
    state = button.value()
    if prev_button_state == 0 and state == 1:
        print("The button is pressed")
    if prev_button_state == 1 and state == 0:
        print("The button is released")
        prev_button_state = state
        return True
    prev_button_state = state
    #print(prev_button_state)
    return False

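button_state reacts to a single changed sample, so contact bounce on a mechanical button can register as extra presses. One common remedy is to require several identical readings before accepting a change. The class below is a sketch of that idea with a hypothetical name (`EdgeDetector`) and an injected read function, so it can be tried without hardware; it is not part of the original program.

```python
class EdgeDetector:
    """Report a 1 -> 0 transition only after it has been stable for several polls."""

    def __init__(self, read, stable_reads=3):
        self._read = read                  # callable returning 0 or 1
        self._stable_reads = stable_reads
        self._state = read()               # last accepted (debounced) level
        self._candidate = self._state      # level currently being confirmed
        self._count = 0

    def released(self):
        """Poll once; return True when a stable 1 -> 0 change is accepted."""
        level = self._read()
        if level == self._state:
            self._count = 0                # glitch ended, discard the candidate
            return False
        if level != self._candidate or self._count == 0:
            self._candidate = level
            self._count = 1
        else:
            self._count += 1
        if self._count >= self._stable_reads:
            previous = self._state
            self._state = level
            self._count = 0
            return previous == 1 and level == 0
        return False


# Simulated pin readings: press (1), a short bounce, then a clean release (0)
samples = iter([0, 1, 1, 1, 0, 1, 0, 0, 0])
detector = EdgeDetector(lambda: next(samples), stable_reads=3)
print([detector.released() for _ in range(8)])
```

On the board, the read function would be `button.value`, and the main loop would call `detector.released()` where it currently calls `button_state()`.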


11. Main program

if __name__ == "__main__":
    light_close()
    audio_file = "audio.wav"
    connect_wifi()  # connect to WiFi
    play_audio("init.wav")

    while True:
        if button_state():
            light_on()
            record(audio_file, 10)  # record for up to 10 seconds
            light_close()
            result = speech(audio_file)  # speech recognition
            if result != False:
                text = spark(result)  # talk to the large model
                tts(audio_file, text)  # speech synthesis
                play_audio(audio_file)  # play the reply
            else:
                play_audio("again.wav")

【Complete Program】

import machine
import time
import struct
from machine import Pin, I2S, SoftSPI
import micropython_dotstar as dotstar
import network
import urequests
import ujson
import ubinascii
import gc

# I2S pins for the amplifier
blck_pin = Pin(14)
lrc_pin = Pin(12)
din_pin = Pin(15)
# Initialize the amplifier I2S output
audio_out = I2S(
    1,
    sck=blck_pin, ws=lrc_pin, sd=din_pin,
    mode=I2S.TX,
    bits=16,
    format=machine.I2S.MONO,
    rate=16000,
    ibuf=4096
)
# I2S input for the microphone
i2s = machine.I2S(
    0,
    sck=machine.Pin(17),
    ws=machine.Pin(18),
    sd=machine.Pin(16),
    mode=machine.I2S.RX,
    bits=16,
    format=machine.I2S.MONO,
    rate=16000,
    ibuf=4096
)
# LEDs on the microphone board (12 DotStar pixels driven over software SPI)
spi = SoftSPI(sck=Pin(13), mosi=Pin(21), miso=Pin(3))
dots = dotstar.DotStar(spi, 12, brightness=0.2)

def connect_to_wifi(ssid, password):
    wlan = network.WLAN(network.STA_IF)  # create the station-mode WLAN object
    wlan.active(True)  # activate the interface
    if not wlan.isconnected():  # not connected to WiFi yet
        print('Connecting to network...')
        wlan.connect(ssid, password)  # connect to the given WiFi network
        while not wlan.isconnected():  # wait until the connection succeeds
            pass
    print('Network config:', wlan.ifconfig())  # print the network configuration

def read_wav_header(filename):
    with open(filename, 'rb') as f:
        # Read the first 44 bytes (standard WAV header size)
        header = f.read(44)

        # Parse the header fields
        chunk_id = header[0:4]
        chunk_size = int.from_bytes(header[4:8], 'little')
        format = header[8:12]
        subchunk1_id = header[12:16]
        subchunk1_size = int.from_bytes(header[16:20], 'little')
        audio_format = int.from_bytes(header[20:22], 'little')
        num_channels = int.from_bytes(header[22:24], 'little')
        sample_rate = int.from_bytes(header[24:28], 'little')
        byte_rate = int.from_bytes(header[28:32], 'little')
        block_align = int.from_bytes(header[32:34], 'little')
        bits_per_sample = int.from_bytes(header[34:36], 'little')
        subchunk2_id = header[36:40]
        subchunk2_size = int.from_bytes(header[40:44], 'little')

        return {
            'chunk_id': chunk_id,
            'chunk_size': chunk_size,
            'format': format,
            'subchunk1_id': subchunk1_id,
            'subchunk1_size': subchunk1_size,
            'audio_format': audio_format,
            'num_channels': num_channels,
            'sample_rate': sample_rate,
            'byte_rate': byte_rate,
            'block_align': block_align,
            'bits_per_sample': bits_per_sample,
            'subchunk2_id': subchunk2_id,
            'subchunk2_size': subchunk2_size
        }

def write_wav_header(file, sample_rate, bits_per_sample, channels):
    num_channels = channels
    bytes_per_sample = bits_per_sample // 8
    byte_rate = sample_rate * num_channels * bytes_per_sample
    block_align = num_channels * bytes_per_sample

    # Write the WAV file header
    file.write(b'RIFF')
    file.write(struct.pack('<I', 36))  # Chunk size (36 + SubChunk2Size), patched after recording
    file.write(b'WAVE')
    file.write(b'fmt ')
    file.write(struct.pack('<I', 16))  # Subchunk1Size (PCM header size)
    file.write(struct.pack('<H', 1))   # AudioFormat (PCM)
    file.write(struct.pack('<H', num_channels))  # NumChannels
    file.write(struct.pack('<I', sample_rate))   # SampleRate
    file.write(struct.pack('<I', byte_rate))     # ByteRate
    file.write(struct.pack('<H', block_align))   # BlockAlign
    file.write(struct.pack('<H', bits_per_sample))  # BitsPerSample
    file.write(b'data')
    file.write(struct.pack('<I', 0))  # Subchunk2Size (to be filled later)

def record(filename, re_time):
    audio_buffer = bytearray(4096)
    sample_rate = 16000
    bits_per_sample = 16
    channels = 1
    duration = re_time  # maximum recording length in seconds
    print("Recording started")
    play_audio("start.wav")  # play the start prompt
    with open(filename, 'wb') as f:
        write_wav_header(f, sample_rate, bits_per_sample, channels)
        subchunk2_size = 0

        start_time = time.ticks_ms()
        end_time = time.ticks_add(start_time, duration * 1000)

        try:
            # Stop at the time limit or when the button is pressed and released again.
            # ticks_diff keeps the comparison correct if the tick counter wraps around.
            while time.ticks_diff(end_time, time.ticks_ms()) > 0 and not button_state():
                num_bytes = i2s.readinto(audio_buffer)
                if num_bytes > 0:
                    f.write(audio_buffer[:num_bytes])
                    subchunk2_size += num_bytes
                time.sleep(0.01)
        except KeyboardInterrupt:
            print("Recording stopped")

        # Go back and update the Subchunk2Size in the WAV header
        f.seek(40)
        f.write(struct.pack('<I', subchunk2_size))
        f.seek(4)
        f.write(struct.pack('<I', 36 + subchunk2_size))
    print("Recording finished")
    play_audio("end.wav")  # play the end prompt

def play_audio(filename):
    with open(filename, 'rb') as f:
        f.seek(44)  # skip the 44-byte WAV header

        # Play the audio data
        while True:
            data = f.read(1024)  # read 1024 bytes at a time
            if not data:
                break
            audio_out.write(data)

def connect_wifi():
    # Replace with your own WiFi SSID and password
    #ssid = "TP-LINK_CB88"
    #password = "jiaoyan2"
    ssid = "sxs"
    password = "smj080823"
    connect_to_wifi(ssid, password)

def urlencode(params):  # encode a dict as a URL-encoded string
    encoded_pairs = []
    for key, value in params.items():
        # Make sure both key and value are strings
        key_str = str(key)
        value_str = str(value)
        # Minimal hand-written URL encoding: only spaces are escaped
        encoded_key = key_str.replace(" ", "%20")
        encoded_value = value_str.replace(" ", "%20")
        encoded_pairs.append(f"{encoded_key}={encoded_value}")
    return "&".join(encoded_pairs)

def post_tts(filename, text, token):
    # Request parameters
    params = {
        'tok': token,
        'tex': text,  # the text is passed as-is, without quote_plus
        # Voice. Basic voices: 度小宇=1, 度小美=0, 度逍遥(基础)=3, 度丫丫=4.
        # Premium voices: 度逍遥(精品)=5003, 度小鹿=5118, 度博文=106, 度小童=110,
        # 度小萌=111, 度米朵=103, 度小娇=5
        'per': 5,
        'spd': 5,  # medium speed
        'pit': 5,  # medium pitch
        'vol': 9,  # volume (5 is the medium level)
        'aue': 6,  # 6 = wav (same content as pcm-16k); 3 = mp3 (default); 4 = pcm-16k; 5 = pcm-8k
        'cuid': "ZZloekkfqvZFKhpVtFXGlAopgnHnHCgQ",  # unique user identifier
        'lan': 'zh',
        'ctp': 1  # client type; web clients use the fixed value 1
    }
    headers = {
        'Content-Type': 'application/x-www-form-urlencoded',
        'Accept': '*/*'
    }
    # Encode the parameters and send them in the request body
    data = urlencode(params).encode('utf-8')
    # Send the POST request
    response = urequests.post("http://tsn.baidu.com/text2audio", headers=headers, data=data)
    # Check the response status code
    if response.status_code == 200:
        # Write the returned audio data to a file
        print("Writing synthesized audio")

        gc.collect()  # collect garbage before writing
        with open(filename, "wb") as f:
            f.write(response.content)
        gc.collect()  # collect garbage after writing
        print("Synthesized audio written")
    else:
        print("Failed to retrieve the audio file")

def tts(audio_file, text):  # speech synthesis
    token = '24.65578d0ef206e7de3a028e59691b2f1c.2592000.1735099609.282335-*******'
    post_tts(audio_file, text, token)

def spark(text):
    """Chat with the large model and return its reply."""
    #24.a56fc70c4012e7d9482f65b0eb896537.2592000.1735199722.282335-******
    API_KEY = "Yu184c3QPZ2tkH9HSuN9TmsB"
    SECRET_KEY = "OcxUg0slEFB3nPrgchjQrBdsxfZUBM5q"
    access_token = "24.b4f3fc416c50a274a1f37a3f95083616.2592000.1735201337.282335-*******"
    #url = "https://aip.baidubce.com/rpc/2.0/ai_custom/v1/wenxinworkshop/chat/completions?access_token="+get_access_token(API_KEY,SECRET_KEY)
    url = "https://aip.baidubce.com/rpc/2.0/ai_custom/v1/wenxinworkshop/chat/completions?access_token=" + access_token

    payload = {
        "user_id": "yuntian365",
        "messages": [
            {
                "role": "user",
                "content": text
            }
        ],
        "temperature": 0.95,
        "top_p": 0.8,
        "penalty_score": 1,
        "enable_system_memory": False,
        "disable_search": False,
        "enable_citation": False,
        # System prompt: "keep the answer within 20 characters"
        "system": "请将回答控制在20字",
        "response_format": "text"
    }
    header = {
        'Content-Type': 'application/json'
    }
    data_str = ujson.dumps(payload).encode('utf-8')
    #print(data_str)

    response = urequests.post(url, headers=header, data=data_str)
    response_json = response.json()
    response.close()  # release the socket and its buffers
    #print(response_json)
    # response.json() has already decoded any \u escapes, so the text is used as-is
    reply = response_json['result']
    print(reply)
    return reply
def base64_encode(data):
    """Base64-encode a bytes-like object."""
    return ubinascii.b2a_base64(data)[:-1]  # drop the trailing newline
def get_access_token(API_KEY, SECRET_KEY):
    """Exchange an API Key and Secret Key for a Baidu access_token."""
    url = "https://aip.baidubce.com/oauth/2.0/token"
    params = {
        'grant_type': 'client_credentials',
        'client_id': API_KEY,
        'client_secret': SECRET_KEY
    }
    data = urlencode(params).encode('utf-8')
    response = urequests.post(url, data=data)
    access_token = ujson.loads(response.text)['access_token']
    response.close()  # release the socket and its buffers
    print(access_token)
    return access_token
def baidu_speech_recognize(audio_file, token):
    """Send a recorded file to Baidu speech recognition and return the parsed JSON reply."""
    #url = "http://vop.baidu.com/server_api"  # standard version
    url = "https://vop.baidu.com/pro_api"     # fast (pro) version
    headers = {
        'Content-Type': 'application/json',
        'Accept': 'application/json'
    }
    with open(audio_file, "rb") as f:
        audio_data = f.read()
    base64_data = base64_encode(audio_data).decode('utf-8')
    data = {
        'format': 'wav',
        'rate': 16000,
        'channel': 1,
        'token': token,
        'cuid': "ESP32",
        'dev_pid': 80001,  # fast (pro) version; remove this entry for the standard version
        'speech': base64_data,
        'len': len(audio_data)  # length of the raw audio, before Base64 encoding
    }
    response = urequests.post(url, headers=headers, json=data)
    result = ujson.loads(response.text)
    response.close()  # release the socket and its buffers
    return result
def speech(audio_file):
    """Speech recognition: return the recognized text, or False on failure."""
    # Replace with the API Key and Secret Key of your own Baidu AI Open Platform application
    API_KEY = "22fDjayW1nVOIP5tIA******"
    SECRET_KEY = "4qC1wtOfKySgRM80NRId6rZ6C******"
    # Obtain the access_token
    #token = get_access_token(API_KEY,SECRET_KEY)
    token = '24.65578d0ef206e7de3a028e59691b2f1c.2592000.1735099609.282335-*********'
    # Send the audio data to the Baidu speech recognition API
    result = baidu_speech_recognize(audio_file, token)
    if result["err_no"] == 0:
        text = result["result"][0]  # ujson has already decoded any \u escapes
        print(text)
        return text
    else:
        return False
def light_close():
    """Turn off every LED in dots."""
    for i in range(len(dots)):
        dots[i] = (0, 0, 0)
def light_on():
    """Set every LED in dots to red."""
    for i in range(len(dots)):
        dots[i] = (255, 0, 0)
BUTTON_PIN = 7
button = Pin(BUTTON_PIN, Pin.IN, Pin.PULL_UP)
prev_button_state = 0
def button_state():
    """Poll the button; return True once each time the pin goes from 1 to 0 (reported as a release)."""
    global prev_button_state
    state = button.value()
    if prev_button_state == 0 and state == 1:
        print("The button is pressed")
    if prev_button_state == 1 and state == 0:
        print("The button is released")
        prev_button_state = state
        return True
    prev_button_state = state
    #print(prev_button_state)
    return False
if __name__ == "__main__":
    light_close()
    audio_file = "audio.wav"
    connect_wifi()  # connect to Wi-Fi
    play_audio("init.wav")

    while True:
        if button_state():
            light_on()
            record(audio_file, 10)  # record audio (the original comment says 5 seconds, but the argument is 10)
            light_close()
            result = speech(audio_file)  # speech recognition
            if result != False:
                text = spark(result)  # chat with the large model
                tts(audio_file, text)  # speech synthesis
                play_audio(audio_file)  # play the reply
            else:
                play_audio("again.wav")
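The listing hard-codes access tokens in tts(), spark() and speech(), and each token string contains the field 2592000, which looks like a lifetime in seconds (30 days). That reading is an assumption and is not stated in the post. If it holds, a small cache could fetch a token once and renew it shortly before it runs out. The sketch below is one possible shape. `TokenCache`, `fetch`, `lifetime_s`, `margin_s` and `clock` are hypothetical names, not part of the original program.

```python
import time


class TokenCache:
    """Keep one access token in RAM and refetch it shortly before it expires.

    fetch      -- zero-argument callable returning a fresh token (hypothetical hook)
    lifetime_s -- assumed token lifetime in seconds (2592000 = 30 days)
    margin_s   -- renew this many seconds before the assumed expiry
    clock      -- callable returning the current time in seconds
    """

    def __init__(self, fetch, lifetime_s=2592000, margin_s=3600, clock=time.time):
        self._fetch = fetch
        self._lifetime_s = lifetime_s
        self._margin_s = margin_s
        self._clock = clock
        self._token = None
        self._expires_at = 0

    def get(self):
        now = self._clock()
        # Fetch on first use, or when the renewal margin has been reached.
        if self._token is None or now >= self._expires_at - self._margin_s:
            self._token = self._fetch()
            self._expires_at = now + self._lifetime_s
        return self._token
```

On the device this could be wired up as `speech_tokens = TokenCache(lambda: get_access_token(API_KEY, SECRET_KEY))`, with `speech_tokens.get()` replacing the hard-coded string. The cache lives in RAM only, so a reboot triggers one new fetch.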
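post_tts() writes whatever the server returns into the audio file as long as the status code is 200. If the service ever answers with an error message in the body instead of audio (an assumption about its behavior, not something the post confirms), play_audio() would be handed a file that is not a WAV. A minimal sketch of a guard follows. `looks_like_wav` is a hypothetical helper name, and the check only inspects the standard RIFF/WAVE container header.

```python
def looks_like_wav(data):
    """Return True if data starts with a RIFF/WAVE container header.

    A WAV file begins with b"RIFF", a 4-byte size field, then b"WAVE".
    """
    return len(data) >= 12 and data[0:4] == b"RIFF" and data[8:12] == b"WAVE"


# Example with made-up byte strings: a WAV-style header and a JSON-style body.
print(looks_like_wav(b"RIFF\x24\x00\x00\x00WAVEfmt "))
print(looks_like_wav(b'{"err_no": 500}'))
```

Inside post_tts() the test could wrap the file write, for example `if looks_like_wav(response.content):`, with the else branch printing the body for debugging.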
【Thonny IDE】
Creative Project: AI Rilakkuma Conversation Companion, Figure 5

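Note: the ESP32-S3 must be flashed with MicroPython firmware. The build used here is: MicroPython v1.22.0-preview.289.gd014c8282.dirty on 2024-01-02; Generic ESP32S3 module with ESP32S3
Firmware download page: https://micropython.org/download/
Creative Project: AI Rilakkuma Conversation Companion, Figure 6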
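After flashing, it can be worth confirming from the Thonny shell which interpreter the board is running. The sketch below uses only `sys.implementation`, which both MicroPython and desktop Python provide. `firmware_summary` is a hypothetical helper name, not part of the project code.

```python
import sys


def firmware_summary():
    """Return the interpreter name and its major.minor.patch version as one string."""
    impl = sys.implementation
    # Only the first three version fields are used; some builds append extra ones.
    version = ".".join(str(part) for part in impl.version[:3])
    return "{} {}".format(impl.name, version)


print(firmware_summary())
```

If the name printed is not micropython, Thonny is probably still pointed at the local Python interpreter instead of the board.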

【Demo Video】




