Deploying an AI Model with STM32CubeMX

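1. Environment Setup

1.1 Installing X-CUBE-AI

Open STM32CubeMX and select Help on the start screen
In the menu that opens, select Manage embedded software packages
Select the latest version of X-CUBE-AI and install it

Figure: Installation screen 1

Figure: Installation screen 2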

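2. Model Validation

Before the actual deployment, first verify that the model can run on the MCU.

2.1 Creating a New Project

Create a new project and configure the clock tree
Configure a USART peripheral (used to print debug output)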

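2.2 Enabling X-CUBE-AI

Find the X-CUBE-AI entry in CubeMX and enable it:

Figure: Enabling X-CUBE-AI

Figure: Configuration screen

Checking Core under the X-CUBE-AI entry activates the AI feature. Below it, three application modes are available: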

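Mode	Purpose	When to use
SystemPerformance	Measures inference time, CPU load, and other performance metrics	Performance analysis before deployment
Validation	Checks the accuracy of the converted model	Detecting accuracy loss caused by quantization or compression
Application Template	Provides a skeleton for application code	Actual project development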

💡 Recommended workflow: first use SystemPerformance mode to quickly confirm that the model runs, then switch to Application Template to write the application code.

2.3 Configuring SystemPerformance Mode

Select SystemPerformance mode and specify the UART used for output:

Figure: Selecting the UART

2.4 Adding the AI Model

Click the + button to add a new model:

Figure: Adding a model

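Configuration parameters:

Parameter	Description
network	Name of the AI model; can be customized
Model type	Model format: Keras (.h5), TFLite (.tflite), ONNX (.onnx)
Compression	Compression level, used to reduce the model size
Optimization	Optimization level, trading off memory footprint against inference time
Validation inputs	Source of validation data: random data or custom data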

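The model used here is in .tflite format, so select TFLite as the model type and click Browse to choose the model file.

Then click Analyze to check whether the model can run on the MCU:

Figure: Analysis result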

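2.5 Generating Code and Verifying

Generate the code
Build it and flash it to the MCU
Open a serial terminal and check the output

Expected output (the exact numbers depend on the model and the clock configuration):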

Running PerfTest on "sine_model" with random inputs (16 iterations)...
................

Results for "sine_model", 16 inferences @480MHz/240MHz (complexity: 325 MACC)
 duration     : 0.012 ms (average)
 CPU cycles   : 5833 (average)
 CPU Workload : 0% (duty cycle = 1s)
 cycles/MACC  : 17.94 (average for all layers)
 used stack   : DISABLED
 used heap    : DISABLED or NOT YET SUPPORTED
 observer res : -1 bytes used from the heap (5 c-nodes)

 Inference time by c-node
  kernel  : 0.008ms (time passed in the c-kernel fcts)
  user    : 0.001ms (time passed in the user cb)

 c_id  type                id       time (ms)
 ---------------------------------------------------
 0     NL                  0          0.000  10.29 %
 1     DENSE               1          0.002  27.84 %
 2     DENSE               2          0.002  32.58 %
 3     DENSE               3          0.001  21.04 %
 4     NL                  4          0.000   8.26 %
 -------------------------------------------------
                                      0.008 ms
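
The headline numbers in this report can be cross-checked by hand. The sketch below assumes that the 480 MHz figure in the report is the CPU core clock; the helper names are made up for illustration and are not part of X-CUBE-AI. With 5833 cycles it gives roughly 0.012 ms per inference, and 5833 cycles over 325 MACC gives roughly 17.9 cycles/MACC, in line with the report.

```c
#include <stdint.h>

/* Convert a CPU cycle count to milliseconds for a given core clock in Hz. */
double cycles_to_ms(uint32_t cycles, uint32_t cpu_hz)
{
  return (double)cycles * 1000.0 / (double)cpu_hz;
}

/* Average cost of one multiply-accumulate operation, in CPU cycles. */
double cycles_per_macc(uint32_t cycles, uint32_t macc)
{
  return (double)cycles / (double)macc;
}
```

The same arithmetic gives a quick estimate of how inference time would scale at a different core clock, under the assumption that the cycle count stays about the same.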

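3. Model Deployment

Once the model is confirmed to work, start writing the application code.

3.1 Switching to Application Template Mode

Switch the mode to Application Template in the X-CUBE-AI configuration, regenerate the code, and look at the generated file structure:

Figure: Generated files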

File descriptions:

File	Purpose	Modify?
`network.c/.h`	Model structure and interface functions	❌ No
`network_data.c/.h`	Weight and bias parameters	❌ No
`network_data_params.c/.h`	Data store for the AI model	❌ No
`app_x-cube-ai.c`	Application logic	✅ Yes

3.2 Writing the Application Code

Open app_x-cube-ai.c and modify it as described in the next chapter.

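4. Coding Tutorial

4.1 Overall Approach

The template code generated by X-CUBE-AI is built around three key functions, two of which you need to fill in: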

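Function	Purpose	What you need to do
`acquire_and_process_data()`	Prepares the input data	Write your data into the input buffer
`ai_run()`	Runs inference	Nothing (generated automatically)
`post_process()`	Handles the output	Read the result from the output buffer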

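4.2 Understanding the Model

You need to know:

Input shape: what input does the model expect? (e.g. 1 float, a 28×28 image)
Output shape: what does the model produce? (e.g. 1 float, 10 class probabilities)
Data type: float32 or int8-quantized?

For sine_model:

Input: 1 float (the value x)
Output: 1 float (the predicted sin(x))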
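
The data type matters because an int8-quantized model expects quantized values in its buffers rather than raw floats. The sketch below shows the usual affine mapping, real = scale × (q − zero_point), as used by TFLite-style quantization. The helper names are hypothetical, and the scale and zero point must be taken from your own model; sine_model itself uses float and needs none of this.

```c
#include <stdint.h>

/* Quantize a float to int8: q = round(x / scale) + zero_point, saturated.
 * Assumes x / scale is small enough to fit in an int32_t. */
int8_t quantize_i8(float x, float scale, int32_t zero_point)
{
  float scaled = x / scale;
  /* Round half away from zero without depending on <math.h>. */
  int32_t q = (int32_t)(scaled >= 0.0f ? scaled + 0.5f : scaled - 0.5f);
  q += zero_point;
  if (q > 127)  q = 127;
  if (q < -128) q = -128;
  return (int8_t)q;
}

/* Map an int8 value back to a float: x = scale * (q - zero_point). */
float dequantize_i8(int8_t q, float scale, int32_t zero_point)
{
  return scale * (float)((int32_t)q - zero_point);
}
```

For a quantized model, the input function would call something like quantize_i8() before writing to the input buffer, and the output function would call dequantize_i8() after reading from the output buffer.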

4.3 Redirecting printf to the UART

Add the following code to usart.c or main.c:

/* USER CODE BEGIN 0 */
#include <stdio.h>

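// Redirect printf to the USART (huart1: use the handle of the USART you configured)
int fputc(int ch, FILE *f)
{
  uint8_t byte = (uint8_t)ch;  // send exactly one byte, independent of endianness
  (void)f;
  HAL_UART_Transmit(&huart1, &byte, 1, HAL_MAX_DELAY);
  return ch;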
}
/* USER CODE END 0 */

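⚠️ Important: this fputc-based redirection is for Keil MDK. Check Use MicroLIB in the project's target options; otherwise printf will not work properly.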
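
MicroLIB is a Keil option, so projects built with GCC (for example in STM32CubeIDE) need a different hook. A commonly used alternative there is to define __io_putchar(), which the syscalls.c generated by STM32CubeIDE calls from _write(). The sketch below assumes the same huart1 handle as above. The #ifndef USE_HAL_DRIVER branch is a host-only stand-in, not part of the HAL, included only so the sketch compiles off-target.

```c
#include <stddef.h>
#include <stdint.h>

#ifndef USE_HAL_DRIVER
/* Host-only stand-ins (assumption for illustration, not the real HAL). */
typedef struct { int unused; } UART_HandleTypeDef;
#define HAL_MAX_DELAY 0xFFFFFFFFu
static UART_HandleTypeDef huart1;
static uint8_t host_tx_log[64];   /* bytes "transmitted" on the host */
static size_t  host_tx_len;

static int HAL_UART_Transmit(UART_HandleTypeDef *huart, const uint8_t *data,
                             uint16_t size, uint32_t timeout)
{
  (void)huart;
  (void)timeout;
  for (uint16_t i = 0; i < size && host_tx_len < sizeof host_tx_log; ++i) {
    host_tx_log[host_tx_len++] = data[i];
  }
  return 0;
}
#else
#include "main.h"                   /* pulls in the HAL headers on target */
extern UART_HandleTypeDef huart1;   /* defined by the CubeMX-generated code */
#endif

/* Called once per character by the GCC-side _write() in syscalls.c. */
int __io_putchar(int ch)
{
  uint8_t byte = (uint8_t)ch;
  HAL_UART_Transmit(&huart1, &byte, 1, HAL_MAX_DELAY);
  return ch;
}
```

With newlib-nano, printing floats with %f typically also requires enabling float support for printf in the linker settings (the -u _printf_float option); check your toolchain settings if float values print as empty strings.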

4.4 Adding the Required Headers and Variables

Add the following in the /* USER CODE BEGIN includes */ section:

/* USER CODE BEGIN includes */
#include <math.h>           // only if you need math functions (sinf is used below)

#define PI 3.14159265358979323846f

// Data pointers (for convenient access to the buffers)
static float *input_data;
static float *output_data;

// Your test variable
static float test_x = 0.0f;
/* USER CODE END includes */

4.5 Modifying the Input Function `acquire_and_process_data()`

data[0] is a pointer to the input buffer; this is where the input data must be written.

int acquire_and_process_data(ai_i8* data[])
{
  // 1. Cast: convert ai_i8* to the type your model needs
  input_data = (float*)data[0];

  // 2. Write the data
  input_data[0] = test_x;  // sine_model has a single input

  return 0;  // 0 means success
}
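
For a model with more than one input element, the same pattern applies with a loop over the buffer. As a sketch for the 28×28 image case mentioned earlier, the helper below copies 8-bit grayscale pixels into a float input buffer. The function name is hypothetical, and scaling to the range [0, 1] is an assumption; use whatever preprocessing the model was trained with.

```c
#include <stddef.h>
#include <stdint.h>

#define IMG_PIXELS (28u * 28u)  /* element count for a 28x28 grayscale image */

/* Copy 8-bit pixels into a float buffer, scaling each value to [0, 1]. */
void fill_input_from_u8(float *dst, const uint8_t *src, size_t count)
{
  for (size_t i = 0; i < count; ++i) {
    dst[i] = (float)src[i] / 255.0f;
  }
}
```

Inside acquire_and_process_data() this would be called as fill_input_from_u8((float*)data[0], image, IMG_PIXELS), where image is your own pixel array.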

4.6 Modifying the Output Function `post_process()`

data[0] is a pointer to the output buffer, which holds the inference result.

int post_process(ai_i8* data[])
{
  // 1. Cast
  output_data = (float*)data[0];

  // 2. Read the result
  float predicted_sin = output_data[0];

  // 3. Use the result (print, control, store, etc.)
  float actual_sin = sinf(test_x);
  printf("x = %.4f, predicted = %.4f, actual = %.4f, error = %.6f\r\n",
         test_x, predicted_sin, actual_sin, predicted_sin - actual_sin);

  // 4. Decide whether to keep looping
  test_x += 0.1f;
  if (test_x > 2.0f * PI) {
    return -1;  // a non-zero return value stops the loop
  }

  return 0;  // 0 continues the loop
}
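
For a classifier that outputs several class probabilities (the 10-class case mentioned earlier), post-processing usually means picking the index of the largest score. A minimal sketch, with a hypothetical helper name:

```c
#include <stddef.h>

/* Return the index of the largest score. The caller must pass count >= 1;
 * on ties the lowest index wins. */
size_t argmax_f32(const float *scores, size_t count)
{
  size_t best = 0;
  for (size_t i = 1; i < count; ++i) {
    if (scores[i] > scores[best]) {
      best = i;
    }
  }
  return best;
}
```

In post_process() this would be called on the output buffer, for example argmax_f32((float*)data[0], 10), and the returned index is the predicted class.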

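4.7 Modifying the Initialization and Processing Functions

You can tidy up the messages printed by MX_X_CUBE_AI_Init() and MX_X_CUBE_AI_Process(). Keep the name ai_boostrap as it appears in the generated template: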

void MX_X_CUBE_AI_Init(void)
{
    /* USER CODE BEGIN 5 */
  printf("\r\n=== Sine Model Initialization ===\r\n");

  if (ai_boostrap(data_activations0) != 0) {
    printf("AI bootstrap failed!\r\n");
    return;
  }
  
  printf("AI Network initialized successfully!\r\n");
    /* USER CODE END 5 */
}

void MX_X_CUBE_AI_Process(void)
{
    /* USER CODE BEGIN 6 */
  int res = -1;

  if (network) {
    test_x = 0.0f;  // reset the test value
    
    printf("=== Starting Inference ===\r\n");

    do {
      res = acquire_and_process_data(data_ins);
      if (res == 0)
        res = ai_run();
      if (res == 0)
        res = post_process(data_outs);
    } while (res == 0);
    
    printf("=== Inference Complete ===\r\n");
  } else {
    printf("Error: Network not initialized!\r\n");
  }
    /* USER CODE END 6 */
}

4.8 Data Flow Diagram

┌─────────────────┐
│  Your raw data  │  (sensor, image, user input, etc.)
└────────┬────────┘
         ▼
┌─────────────────┐
│ acquire_and_    │  Fill the data_ins buffer
│ process_data()  │  (type cast + preprocessing)
└────────┬────────┘
         ▼
┌─────────────────┐
│    ai_run()     │  Neural network inference (automatic)
└────────┬────────┘
         ▼
┌─────────────────┐
│  post_process() │  Read results from data_outs
│                 │  (parsing + postprocessing)
└────────┬────────┘
         ▼
┌─────────────────┐
│   Use results   │  (display, control, communication, etc.)
└─────────────────┘
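
The control flow in this diagram can be tried on a PC before touching the board. The sketch below is a host-side simulation only: run_stub() is a placeholder that copies the input to the output instead of calling ai_run(), and all names are made up for illustration. It mirrors how the return values chain the three stages and how a non-zero return from the last stage ends the loop.

```c
#define SWEEP_STEP 0.1f
#define SWEEP_END  6.2831853f  /* 2 * pi */

static float in_buf[1];    /* stands in for the network input buffer  */
static float out_buf[1];   /* stands in for the network output buffer */
static float sweep_x;      /* current test input */
static int   sample_count; /* number of completed passes */

/* Stage 1: write the next input value into the input buffer. */
int acquire_stub(float *in)
{
  in[0] = sweep_x;
  return 0;
}

/* Stage 2: placeholder for ai_run(); a real model would run here. */
int run_stub(const float *in, float *out)
{
  out[0] = in[0];
  return 0;
}

/* Stage 3: consume the result, advance the sweep, stop after 2*pi. */
int post_stub(const float *out)
{
  (void)out;
  sample_count++;
  sweep_x += SWEEP_STEP;
  return (sweep_x > SWEEP_END) ? -1 : 0;
}

/* Same do/while structure as the generated process function. */
int run_sweep(void)
{
  int res;
  sweep_x = 0.0f;
  sample_count = 0;
  do {
    res = acquire_stub(in_buf);
    if (res == 0) res = run_stub(in_buf, out_buf);
    if (res == 0) res = post_stub(out_buf);
  } while (res == 0);
  return sample_count;
}
```

With a step of 0.1 and a limit of 2π, run_sweep() completes 63 passes, covering x from 0.0 up to about 6.2.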

4.9 Complete Example Code

Below are the modified parts of app_x-cube-ai.c for sine_model:

/* USER CODE BEGIN includes */
#include <math.h>

#define PI 3.14159265358979323846f

static float *input_data;
static float *output_data;
static float test_x = 0.0f;
/* USER CODE END includes */

/* USER CODE BEGIN 2 */
int acquire_and_process_data(ai_i8* data[])
{
  input_data = (float*)data[0];
  input_data[0] = test_x;
  return 0;
}

int post_process(ai_i8* data[])
{
  output_data = (float*)data[0];
  float predicted_sin = output_data[0];
  float actual_sin = sinf(test_x);
  
  printf("x = %.4f, predicted = %.4f, actual = %.4f, error = %.6f\r\n",
         test_x, predicted_sin, actual_sin, predicted_sin - actual_sin);
  
  test_x += 0.1f;
  if (test_x > 2.0f * PI) {
    test_x = 0.0f;
    return -1;
  }
  
  return 0;
}
/* USER CODE END 2 */

5. Results

5.1 Output

After building and flashing, the serial port prints output similar to the following:

=== Sine Model Initialization ===
AI Network initialized successfully!

=== Starting Inference ===
x = 0.0000, predicted = 0.0012, actual = 0.0000, error = 0.001200
x = 0.1000, predicted = 0.0998, actual = 0.0998, error = 0.000050
x = 0.2000, predicted = 0.1987, actual = 0.1987, error = 0.000030
...
x = 6.2000, predicted = -0.0831, actual = -0.0831, error = 0.000012
=== Inference Complete ===