# AXCL FAQ
## 模型benchmark
Benchmark 是了解硬件平台网络模型运行速度的最佳途径。以下数据基于 Raspberry Pi 5 Host 进行测试获取,仅供社区参考,不代表商业交付最终性能。
### 工况说明
- 更新时间:2024.11.22
- 工具链版本:Pulsar2 3.2-patch2
- 测试工具:axcl_run_model
- Batch Size:1 或 8
- 单位:IPS(Image/Second)
*由于不同 Host 其 memcopy、pcie 性能差异,因此 axcl_run_model 只统计网络模型在 Device 上的推理耗时*
### Vision Model
| Models | Input Size | Batch 1(IPS) | Batch 8(IPS) |
| ------------ | ---------- | ------------ | ------------ |
| Inceptionv1 | 224 | 1073 | 2494 |
| Inceptionv3 | 224 | 478 | 702 |
| MobileNetv1 | 224 | 1508 | 4854 |
| MobileNetv2 | 224 | 1366 | 5073 |
| ResNet18 | 224 | 1066 | 2254 |
| ResNet50 | 224 | 576 | 1045 |
| SqueezeNet11 | 224 | 1560 | 5961 |
| Swin-T | 224 | 342 | 507 |
| ViT-B/16 | 224 | 162 | 207 |
| YOLOv5s | 640 | 326 | 394 |
| YOLOv6s | 640 | 282 | 322 |
| YOLOv8s | 640 | 248 | 279 |
| YOLOv9s | 640 | 237 | |
| YOLOv10s | 640 | 298 | |
| YOLOv11n | 640 | 860 | |
| YOLOv11s | 640 | 305 | |
| YOLOv11m | 640 | 114 | |
| YOLOv11l | 640 | 87 | |
| YOLOv11x | 640 | 41 | |
### Audio Model
| Models | RTF |
| ------------- | ---- |
| Whisper-Tiny | 0.03 |
| Whisper-Small | 0.18 |
| MeloTTS | 0.04 |
### LLM
| Models | Prompt length(tokens) | TTFT(ms) | Generate(tokens/s) |
| ------------ | ----------------------- | ---------- | ------------------- |
| Qwen2.5-0.5B | 128 | 188 | 28 |
### VLM
| Models | Input Image | Image Encoder(ms) | Prompt length(tokens) | TTFT(ms) | Generate(tokens/s) |
| ------------ | ----------- | ------------------- | ----------------------- | ---------- | ------------------- |
| InternVL2-1B | 448*448 | 4200 | 320 | 425 | 29 |
## System Dump
axcl host sysdump 流程嵌入在axcl host fimware load 流程中;sysdump 有三个proc 节点文件:
```bash
/root # ls /proc/ax_proc/pcie/sysdump/
debug path size
/root # cat /proc/ax_proc/pcie/sysdump/debug
0
/root # cat /proc/ax_proc/pcie/sysdump/path
/opt
/root # cat /proc/ax_proc/pcie/sysdump/size
1073741824
```
:::{Note}
- path:用来指定 dump 文件的路径
- size:用来指定需要 dump 的大小
- debug: 用来开关 dump 功能
:::
这三个节点可以根据需要进行修改:
- 开启 dump:`echo 1 >debug` (默认为disabled)
- 指定 path:`echo -n /mnt >path` (默认 path:/opt 注意: 一定要加 `-n` 去除echo自动带的 `/n`;并且保证路径目录有足够大的空间存储 dump 文件,否则部分数据丢失)
- 指定 size:`echo 2097152 >size` (默认 size: 1073741824 注意:echo 时传十进制数值)
当需要进行 system dump 时,可按如下步骤进行操作:
1. 当子卡异常重启时,host 侧手动卸载 axcl_host.ko:
- arm host: `rmmod axcl_host`
- x86 host: `modprobe -r axcl_host`
2. 打开 sysdump debug 开关
```bash
echo 1 >/proc/ax_proc/pcie/sysdump/debug
```
3. 加载 axcl_host.ko
- arm host: insmod /soc/ko/axcl_host.ko
- x86 host: modprobe axcl_host
此时日志已经被 Dump 到 `/proc/ax_proc/pcie/sysdump/path` 指定的路径下了,比如默认的 `/opt`。
:::{Note}
以上操作卸载ko,会reset 所有子卡; 加载ko 会拉启所有子卡,并dump 有异常的子卡。
如果在业务执行过程中,某张子卡异常重启了,如只需dump 异常子卡数据,需要在应用代码中启动指定异常子卡,这个过程只会单独dump这张异常子卡数据,不会影响其它正常子卡运行。
:::
## 调整运行时库日志级别
- Host的AXCL日志默认路径: `/tmp/axcl/axcl_logs.txt`,Device侧的日志默认路径:`/opt/data/axclLog`
- Host和Device侧的AXCL 运行时库日志级别默认为`info`等级,支持通过json文件和API配置。
- json格式
```
{
"log": {
"host": {
"path": "/tmp/axcl/axcl_logs.txt",
"// ": "0: trace, 1: debug, 2: info, 3: warn, 4: error, 5: critical, 6: off",
"level": 2
},
"device": {
"// ": "0: trace, 1: debug, 2: info, 3: warn, 4: error, 5: critical, 6: off",
"level": 2
}
}
}
```
调用`axclInit`接口时,传入json的文件路径生效。
```c
axclError ret = axclInit("./axcl.json");
```
- `axclSetLogLevel`接口更改Host的运行时库日志
```c
axclSetLogLevel(3);
```
:::{Note}
json文件支持更改Host侧的日志存放路径,日志级别以及Device侧的日志级别,而`axclSetLogLevel`接口只能动态更改Host侧的日志级别。
:::
## 调整PCIe传输DMA内存大小
AXCL运行时库(`libaxcl_rt.so`) PCIe传输DMA内存从CMA分配,对每个进程需要3块`dma buf size`的内存,即总大小 = **3 x `dma buf size`** Bytes。
默认`dma buf size`大小为4MBytes,支持通过axcl.json配置,调用`axclInit`接口生效, json格式如下:
```json
{
"log": {
"host": {
"path": "/tmp/axcl/axcl_logs.txt",
"// ": "0: trace, 1: debug, 2: info, 3: warn, 4: error, 5: critical, 6: off",
"level": 2
},
"device": {
"// ": "0: trace, 1: debug, 2: info, 3: warn, 4: error, 5: critical, 6: off",
"level": 2
}
},
"dma buf size": "0x200000"
}
```
如上所示,将`dma buf size`调整为2MBytes。
:::{Note}
- 应用根据实际业务需求和内存容量更改该参数,**最小大小为1MBytes**(0x100000)。
- `dma buf size` **同时**修改Host和Device的PCIe CMA缓存分配。
:::
## 调整 sysdump 时间戳
在 sysdump 后,sysdump 文件名时间戳可能不对,需要进行以下设置:
- 确认 RTC 是否设置为本地时区:
使用以下命令检查 RTC 是否设置为本地时区:`timedatectl | grep RTC`
- 将 RTC 设置为本地时区:
使用以下命令将 RTC 设置为本地时区:`sudo timedatectl set-local-rtc 1`
- 同步系统时间到 RTC:
如果需要,您可以将系统时间同步到 RTC:`sudo hwclock --systohc`
## 驱动安装
### deb
#### 缺少 linux-header
deb 安装时可能出现以下安装失败信息,一般导致安装失败原因是缺少linux-header-$(uname -r) 目录文件,导致安装时pcie driver 编译失败

安装失败后,会有安装文件保存,进入到/usr/src/axcl/drv/pcie/driver 手动编译确认是否编译有问题,
:::{Note}
```bash
# 主控芯片是x86架构的编译方法
make host=x86 clean all install
# 主控芯片是arm64架构的编译方法
make host=arm64 clean all install
```
:::
或者查看/usr/src/axcl 目录下是否有out 目录确认编译是否有问题,如下:

- arm开发板: linux-header-$(uname -r) 一般由开发板厂商提供,可以去厂商官网查找说明,如下:

- x86 pc:linux-header-$(uname -r) 可以通过apt 下载,sudo apt install kernel-headers-$(uname -r)
:::{Note}
如果第一次安装失败,如果不把第一次安装的残留文件清除,可能会导致后面的安装失败,清除步骤如下:
```bash
sudo rm -rf /var/lib/dpkg/info/axclhost.*
sudo rm dpkg -r axclhost
```
:::
## 设备内存布局
### AX650N
```
0x100000000
| Linux OS | ramdisk | CMM |
```
- **DDR地址起始地址**:0x100000000
- **ramdisk**
- **分区大小** (`rootfs/card/Makefile`):
```bash
$(HOME_PATH)/tools/mkext4fs/make_ext4fs -l 128M $(BUILD_PATH)/out/$(PROJECT)/images/rootfs.ext4 $(BUILD_ROOT_DIR)/rootfs
$(HOME_PATH)/tools/mkext4fs/make_ext4fs -l 128M -s $(BUILD_PATH)/out/$(PROJECT)/images/rootfs_sparse.ext4 $(BUILD_ROOT_DIR)/rootfs
```
- **内核DTS ramdisk配置 ** (`kernel/linux/linux-5.15.73/arch/arm64/boot/dts/axera/AX650_card.dts`)
| 字段 | 说明 | 示例 |
| ---- | ------------------------------------------------------------ | ------------------------------------------------------------ |
| reg | <起始地址高32位 起始地址低32位 大小高32位 大小低32位> | `<0x1 0x40000000 0x0 0x8000000>`
起始地址0x140000000, 大小 0x8000000 (128MB) |
| addr | <起始地址高32位 起始地址低32位> | `<0x1 0x40000000>` 起始地址0x140000000 |
| size | <大小高32位 大小低32位> | `<0x0 0x8000000>` 大小 0x8000000 (128MB) |
- **子卡固件rootfs的下载地址** (`tools/mkaxp/AX650X_card_pac.xml`)
Base: ramdisk的地址
```
ROOTFS
CODE
0x140000000
0x0
rootfs.ext4
Download ROOTFS image file
```
- **Makefile** (`build/projects/AX650_card.mak`)
- **OS_MEM**: OS+ramdisk的总大小
- **CMM_POOL_PARAM**: CMM的partiton分区名;flag(= 0);起始地址;partition总大小。 其中起始地址 = LinuxOS + ramdisk的偏移地址
#### 4+4 8G推荐配置
| Linux OS | ramdisk | CMM |
| :------: | :-----: | :----: |
| 1024MB | 128MB | 7040MB |
```
kernel/linux/linux-5.15.73/arch/arm64/boot/dts/axera/AX650_card.dts:
ramdisk_mem@140000000 {
compatible = "axera, ramdisk";
reg = <0x1 0x40000000 0x0 0x8000000>;
addr = <0x1 0x40000000>;
size = <0x0 0x8000000>;
no-map;
};
build/projects/AX650_card.mak:
# OS:RAMDISK:CMM
OS_MEM := mem=1152M
# cmm memory config
CMM_POOL_PARAM := anonymous,0,0x148000000,7040M
tools/mkaxp/AX650X_card_pac.xml:
ROOTFS
CODE
0x140000000
0x0
rootfs.ext4
Download ROOTFS image file
```
#### 2+2 4G推荐配置
| Linux OS | ramdisk | CMM |
| :------: | :-----: | :----: |
| 1024MB | 128MB | 2944MB |
```
kernel/linux/linux-5.15.73/arch/arm64/boot/dts/axera/AX650_card.dts:
ramdisk_mem@140000000 {
compatible = "axera, ramdisk";
reg = <0x1 0x40000000 0x0 0x8000000>;
addr = <0x1 0x40000000>;
size = <0x0 0x8000000>;
no-map;
};
build/projects/AX650_card.mak:
# OS:RAMDISK:CMM
OS_MEM := mem=1152M
# cmm memory config
CMM_POOL_PARAM := anonymous,0,0x148000000,2944M
tools/mkaxp/AX650X_card_pac.xml:
ROOTFS
CODE
0x140000000
0x0
rootfs.ext4
Download ROOTFS image file
```
## SDK 编译
### HOST
cd 进入目标文件夹,比如sample
- x86_x64: `make host=x86 clean && make host=x86 all install -j128`
- arm64 : `make host=arm64 clean && make host=arm64 all install -j128`
- ax650N : `make clean && make all install -j128`
- 编译输出路径:
```bash
/axcl/out$ tree -L 1
.
├── axcl_linux_arm64
├── axcl_linux_ax650
└── axcl_linux_x86
```
**PCIe drvier 编译示例**:
```bash
root@axcnshbussrv06p:~/customer/AX650_SDK_V2.18.0_20241202130208_NO4477/axcl/drv$ make host=x86 clean && make host=x86 all install -j128
In subdir pcie...
make[1]: Entering directory '/home/root/customer/AX650_SDK_V2.18.0_20241202130208_NO4477/axcl/drv/pcie'
In subdir driver...
... ...
In subdir host_dev ...
make[3]: Entering directory '/home/root/customer/AX650_SDK_V2.18.0_20241202130208_NO4477/axcl/drv/pcie/driver/host_dev'
make ARCH=x86 CROSS_COMPILE= KCFLAGS="-DIS_THIRD_PARTY_PLATFORM" -C /lib/modules/5.4.0-150-generic/build M=/home/root/customer/AX650_SDK_V2.18.0_20241202130208_NO4477/axcl/build/out/axcl_linux_x86/objs/drv/pcie/driver/host_dev src=/home/root/customer/AX650_SDK_V2.18.0_20241202130208_NO4477/axcl/drv/pcie/driver/host_dev O=/lib/modules/5.4.0-150-generic/build HOME_PATH=/home/root/customer/AX650_SDK_V2.18.0_20241202130208_NO4477 modules
make[4]: Entering directory '/usr/src/linux-headers-5.4.0-150-generic'
CC [M] /home/root/customer/AX650_SDK_V2.18.0_20241202130208_NO4477/axcl/build/out/axcl_linux_x86/objs/drv/pcie/driver/host_dev/ax_pcie_dev_host.o
CC [M] /home/root/customer/AX650_SDK_V2.18.0_20241202130208_NO4477/axcl/build/out/axcl_linux_x86/objs/drv/pcie/driver/host_dev/ax_pcie_opt.o
CC [M] /home/root/customer/AX650_SDK_V2.18.0_20241202130208_NO4477/axcl/build/out/axcl_linux_x86/objs/drv/pcie/driver/host_dev/ax_pcie_proc.o
CC [M] /home/root/customer/AX650_SDK_V2.18.0_20241202130208_NO4477/axcl/build/out/axcl_linux_x86/objs/drv/pcie/driver/host_dev/ax_pcie_msg_transfer.o
CC [M] /home/root/customer/AX650_SDK_V2.18.0_20241202130208_NO4477/axcl/build/out/axcl_linux_x86/objs/drv/pcie/driver/host_dev/version.o
LD [M] /home/root/customer/AX650_SDK_V2.18.0_20241202130208_NO4477/axcl/build/out/axcl_linux_x86/objs/drv/pcie/driver/host_dev/ax_pcie_host_dev.o
Building modules, stage 2.
MODPOST 1 modules
CC [M]
```
**sample编译示例:**
```bash
root@axcnshbussrv06p:~/customer/AX650_SDK_V2.18.0_20241202130208_NO4477/axcl/sample$ make host=x86 clean && make host=x86 all install -j128
In subdir runtime...
make[1]: Entering directory '/home/root/customer/AX650_SDK_V2.18.0_20241202130208_NO4477/axcl/sample/runtime'
make[1]: Leaving directory '/home/root/customer/AX650_SDK_V2.18.0_20241202130208_NO4477/axcl/sample/runtime'
... ...
In subdir transcode...
make[2]: Entering directory '/home/root/customer/AX650_SDK_V2.18.0_20241202130208_NO4477/axcl/sample/ppl/transcode'
INSTALL /home/root/customer/AX650_SDK_V2.18.0_20241202130208_NO4477/axcl/build/out/axcl_linux_x86/objs/sample/ppl/transcode/axcl_sample_transcode launch_transcode.sh to /home/root/customer/AX650_SDK_V2.18.0_20241202130208_NO4477/axcl/out/axcl_linux_x86/bin/
make[2]: Leaving directory '/home/root/customer/AX650_SDK_V2.18.0_20241202130208_NO4477/axcl/sample/ppl/transcode'
Install /home/root/customer/AX650_SDK_V2.18.0_20241202130208_NO4477/axcl/sample/ppl success!!
make[1]: Leaving directory '/home/root/customer/AX650_SDK_V2.18.0_20241202130208_NO4477/axcl/sample/ppl'
In subdir x86app...
make[1]: Entering directory '/home/root/customer/AX650_SDK_V2.18.0_20241202130208_NO4477/axcl/sample/x86app'
INSTALL /home/root/customer/AX650_SDK_V2.18.0_20241202130208_NO4477/axcl/build/out/axcl_linux_x86/objs/sample/x86app/axcl_demo /home/root/customer/AX650_SDK_V2.18.0_20241202130208_NO4477/axcl/sample/x86app/bin/* to /home/root/customer/AX650_SDK_V2.18.0_20241202130208_NO4477/axcl/out/axcl_linux_x86/bin/axcl_demo
make[1]: Leaving directory '/home/root/customer/AX650_SDK_V2.18.0_20241202130208_NO4477/axcl/sample/x86app'
Install /home/root/customer/AX650_SDK_V2.18.0_20241202130208_NO4477/axcl/sample success!!
```
### DEVICE
1. `cd build` 进入SDK根目录下的build目录,注意不是axcl/build
2. `make p=AX650_card clean all install -j128`
3. deb, rpm生成路径:`build/out`