1: 快速开始
2: 应用部署

2.1: Docker Compose 容器部署
2.2: DaemonSet 云原生集群部署
2.3: Systemd 物理机部署

3: 源码编译
4: 配置指南
5: 核心特性

5.1: 内核全景观测
5.2: 异常事件诊断
5.3: 全自动化追踪
5.4: 持续 Profiling
5.5: 硬件故障诊断

6: 应用实践

6.1: 存储服务
6.2: 内核事件订阅
6.3: 性能剖析
6.4: 网络丢包

7: 开发手册

7.1: 采集模式
7.2: 自定义指标
7.3: 自定义事件
7.4: 自定义追踪
7.5: 集成测试

8: 常见问题

9: 贡献

9.1: 源码贡献

10: 变更日志

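Introduction

HUATUO (华佗) is an operating-system deep-observability project open-sourced by DiDi and incubated under the CCF (China Computer Federation). It focuses on kernel-level observability for complex cloud-native general-purpose computing, AI computing, and bare-metal infrastructure services. Its core members are open-source enthusiasts and systems researchers.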

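Kernel versions

In principle every kernel from 4.18 onward is supported. The kernels and OS distributions tested so far are:

HUATUO	Kernel version	OS distribution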
1.0	4.18.x	CentOS 8.x
1.0	5.4.x	OpenCloudOS V8/Ubuntu 20.04
1.0	5.10.x	OpenEuler 22.03/Anolis OS 8.10
1.0	5.15.x	Ubuntu 22.04
1.0	6.6.x	OpenEuler 24.03/Anolis OS 23.3/OpenCloudOS V9
1.0	6.8.x	Ubuntu 24.04
1.0	6.14.x	Fedora 42

Contact us

WeChat group (please include your name and organization when joining) and official account:

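1 - Quick Start

To help you try out and deploy HUATUO quickly, this document covers three paths: an instant demo, container startup, and building and deploying from source.

1. Instant demo

You can log in to the demo site to browse sample dashboards such as kernel metrics, anomaly events, and flame graphs (account: huatuo, password: huatuo1024).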

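2. Container startup

HUATUO component data flow diagram

2.1 Docker startup

Start the prebuilt container image with docker (note: in this mode, container-information collection and ES storage are disabled by default).

Start the container: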

$ docker run --privileged --cgroupns=host --network=host -v /sys:/sys -v /proc:/proc -v /run:/run huatuo/huatuo-bamai:latest

Fetch metrics: open another terminal and query the endpoint with curl.

$ curl -s localhost:19704/metrics

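View anomaly events (Events, AutoTracing): HUATUO stores the kernel anomaly events it collects both in ES (disabled in this mode) and in the local directory huatuo-local. Note: this directory is usually empty, because a healthy system does not trigger event collection. To produce events, construct an anomaly scenario or lower the thresholds in the configuration file.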

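2.2 Docker Compose startup

Docker Compose quickly brings up a complete local environment. The command below pulls the latest images and starts elasticsearch, prometheus, grafana, huatuo-bamai, and related components. Once it succeeds, open http://localhost:3000 in a browser to view the dashboards (default grafana administrator account: admin, password: admin; a healthy system does not trigger Events or AutoTracing).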

$ docker compose --project-directory ./build/docker up

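HUATUO component: huatuo-bamai runtime diagram

3. Build and deploy

3.1 Build

To isolate the developer's local environment and simplify the build, we provide a containerized build. Running docker build produces a finished image (containing the low-level collector huatuo-bamai, the BPF objects, tools, and so on). Run the following in the project root: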

$ docker build --network host -t huatuo/huatuo-bamai:latest .

3.2 Run

Run the container:

$ docker run --privileged --cgroupns=host --network=host -v /sys:/sys -v /proc:/proc -v /run:/run huatuo/huatuo-bamai:latest

Alternatively, copy all files out of /home/huatuo-bamai in the container and run the binary manually on the host:

$ ./huatuo-bamai --region example --config huatuo-bamai.conf

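Note: the process can be managed by systemd, supervisord, a Kubernetes DaemonSet, or similar.

3.3 Configuration

Container information

HUATUO obtains Pod and container information by calling the kubelet API. Configure the access port and certificates for your environment; setting KubeletAuthorizedPort = 0 and KubeletReadOnlyPort = 0 disables this feature.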
```
  [Pod]
    KubeletClientCertPath = "/etc/kubernetes/pki/apiserver-kubelet-client.crt,/etc/kubernetes/pki/apiserver-kubelet-client.key"
```

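Storage

Metric storage (Metric): all metrics are stored in Prometheus, which scrapes them from the :19704/metrics endpoint; you can also query that endpoint directly.

Event storage (Events, AutoTracing): all kernel events and AutoTracing events are stored in ES. Note: if the ES configuration is empty, ES storage is not started and events are stored only in the local directory huatuo-local.

The ES storage configuration is: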

[Storage.ES]
    Address = "http://127.0.0.1:9200"
    Username = "elastic"
    Password = "huatuo-bamai"
    Index = "huatuo_bamai"

The local storage configuration is:

# tracer's record data
# Path: all but the last element of path for per tracer
# RotationSize: the maximum size in Megabytes of a record file before it gets rotated for per subsystem
# MaxRotation: the maximum number of old log files to retain for per subsystem
[Storage.LocalFile]
    Path = "huatuo-local"
    RotationSize = 100
    MaxRotation = 10

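Event thresholds

Every kernel event collector (Events and AutoTracing) has configurable trigger thresholds. The defaults are empirical values validated repeatedly in production; adjust them in huatuo-bamai.conf as needed.

Resource limits

To keep the host stable, the collector runs under resource limits. LimitInitCPU is the CPU limit during collector startup; LimitCPU and LimitMem are the limits that apply once startup has completed: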
```
[RuntimeCgroup]
    LimitInitCPU = 0.5
    LimitCPU = 2.0
    LimitMem = 2048
```

2 - Deployment

The HUATUO (华佗) community provides several deployment methods:

2.1 - Docker Compose container deployment

Image download

Image repository: https://hub.docker.com/r/huatuo/huatuo-bamai/tags

Start a container with Docker

$ docker run --privileged --cgroupns=host --network=host -v /sys:/sys -v /proc:/proc -v /run:/run huatuo/huatuo-bamai:latest

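⚠️ Note: this mode uses the default configuration file built into the container, which does not connect to kubelet or Elasticsearch.

Start containers with Docker Compose

Docker Compose quickly sets up a complete local environment and manages the collector, Elasticsearch, Prometheus, Grafana, and the other components for you.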

$ docker compose --project-directory ./build/docker up

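For Docker Compose installation, see https://docs.docker.com/compose/install/linux/.

2.2 - DaemonSet cloud-native cluster deployment

This section describes how to deploy the HUATUO collector to a cloud-native cluster with a Kubernetes DaemonSet.

1. Get the configuration file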

$ curl -L -o huatuo-bamai.conf https://github.com/ccfos/huatuo/raw/main/huatuo-bamai.conf

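2. Edit the configuration file

Adjust the configuration for your deployment environment, for example the storage backend and how Pod information is obtained. See the Configuration Guide for details.

3. Create the ConfigMap

$ kubectl delete configmap huatuo-bamai-config --ignore-not-found
$ kubectl create configmap huatuo-bamai-config --from-file=./huatuo-bamai.conf

4. Deploy the collector

$ kubectl apply -f https://github.com/ccfos/huatuo/raw/main/build/huatuo-daemonset.minimal.yaml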

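Notes:

huatuo-daemonset.minimal.yaml uses the huatuo-bamai:latest image tag by default. For production, replace it with a specific release image.
If you test with huatuo-bamai:latest, make sure the tag points to the newest image (remove the old one with docker image rm huatuo/huatuo-bamai:latest and pull again).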

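2.3 - Systemd bare-metal deployment

The HUATUO (华佗) RPM release is available from the OpenCloudOS mirror repository; only v2.1.0 is currently supported.

1. Download the RPM package

The OpenCloudOS mirror provides HUATUO RPM packages; download the one for your architecture: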

wget https://mirrors.opencloudos.tech/epol/9/Everything/x86_64/os/Packages/huatuo-bamai-2.1.0-2.oc9.x86_64.rpm  
wget https://mirrors.opencloudos.tech/epol/9/Everything/aarch64/os/Packages/huatuo-bamai-2.1.0-2.oc9.aarch64.rpm

2. Install the RPM package

sudo rpm -ivh huatuo-bamai*.rpm

3. Edit the configuration

Edit /etc/huatuo-bamai/huatuo-bamai.conf for your deployment environment. See the Configuration Guide for the individual options.

4. Start the HUATUO service

sudo systemctl start huatuo-bamai
sudo systemctl enable huatuo-bamai

For the complete installation guide, see https://mp.weixin.qq.com/s/Gmst4_FsbXUIhuJw1BXNnQ

3 - Building from source

1. Container build

Run the following command to build the project and run the static code checks:

$ sh build/build-run-testing-image.sh

Or run the steps individually:

1. Prepare the build environment

$ docker build --network host -t huatuo/huatuo-bamai-dev:latest -f ./Dockerfile.devel .

2. Start the build container

$ docker run -it --privileged --cgroupns=host --network=host -v $(pwd):/go/huatuo-bamai huatuo/huatuo-bamai-dev:latest sh

3. Build inside the container

$ make

2. Publishing an image

docker build quickly produces a container image with the latest binaries:

$ docker build --network host -t huatuo/huatuo-bamai:latest .

3. Bare-metal build

3.1 Install dependencies

Ubuntu 24.04:

apt install make git clang libbpf-dev linux-tools-common curl capnproto

Fedora 40:

dnf install make git clang libbpf-devel bpftool curl capnproto capnproto-devel glibc-static

go install mvdan.cc/gofumpt@v0.8.0
go install mvdan.cc/sh/v3/cmd/shfmt@v3.11.0
go install golang.org/x/tools/cmd/goimports@v0.36.0
go install github.com/golangci/golangci-lint/cmd/golangci-lint@v1.62.2
go install github.com/vektra/mockery/v2@v2.53.6
go install capnproto.org/go/capnp/v3/capnpc-go@v3.1.0-alpha.2

3.2 Build

$ make

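4. BPF debug build

BPF code can use the bpf_dbg() / bpf_dbg_msg() macros (defined in bpf/include/bpf_dbg.h) to print debug information from kernel space, which helps when tracing the logic of an eBPF program. The feature is gated by two switches, is fully off by default, and adds zero overhead to the production path.

4.1 Instrumenting BPF code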

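#include "bpf_dbg.h"

// Declare the debug map in every .c file that holds a BPF program to debug
// (the map name must match the one used in the calls below).
BPF_DBG_MAP(native_cpu);

SEC("perf_event")
int prog(void *ctx)
{
        // Illustrative values; a real program takes these from its own context.
        __u64 pid = bpf_get_current_pid_tgid() >> 32;
        __u64 addr = 0;

        // Print a message only
        bpf_dbg_msg(ctx, native_cpu, "enter prog");

        // Print a message with up to 3 u64 arguments
        bpf_dbg(ctx, native_cpu, "pid and addr", pid, addr, 0);
        return 0;
}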

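With BPF_DEBUG=0 (the default), these macros expand to no-ops: the debug perf event array, the on-stack event struct, and the bpf_ktime_get_ns and bpf_perf_event_output calls are never generated, so the verifier never sees them, the .o file is smaller, and loading consumes no extra fd.

4.2 Enabling at compile time (level 1: DEBUG_BPF)

BPF_DEBUG=1 passes -DDEBUG_BPF to clang, compiling the debug code into the BPF objects:

$ make BPF_DEBUG=1            # or build only the BPF objects: make BPF_DEBUG=1 bpf-build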

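4.3 Enabling at run time (level 2: log-bpf-debug)

Even when compiled in, debug output is still suppressed at run time by default. It is only turned on when the profiler is started with --log-bpf-debug (currently effective only for the native profiler):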

$ ./profiler --type cpu --language native --log-bpf-debug ...

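How it works: when a BPF object is loaded, the BpfDbg instance created by bpf.NewDbg(true) rewrites the bpf_dbg_enabled constant to 1 before LoadBpf. Without the rewrite, the verifier eliminates if (bpf_dbg_enabled) as dead code. Each BPF object holds its own BpfDbg, so the debug switches are independent of one another.

4.4 Output

Debug events are printed by user space at Debug log level. Each record contains:

file: name of the BPF source file that fired the probe (__FILE_NAME__)
line: source line number
ts: event timestamp (converted from bpf_ktime_get_ns to UTC wall-clock time)
msg: the message string passed at the call site
args: optional, up to 3 u64 arguments (omitted when all are 0)

Example: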

bpf_dbg: file=native_cpu_profiler.c line=120 ts=2026-01-11T08:30:00.123456Z msg=enter prog args=[0x1f4 0xffff8881 0x0 0x0]

Note: both switches must be on for any output to appear: build with BPF_DEBUG=1 and run with --log-bpf-debug.

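4 - Configuration Guide

1. Overview

huatuo-bamai is HUATUO's core collector (bpf-based metrics and anomaly inspector). Its configuration file defines the collection scope, probe enablement, metric output, anomaly detection rules, and logging behavior.

The file contains several sections, including the global blacklist, logging, runtime resource limits, storage, and AutoTracing. Each option carries a comment describing its purpose, default value, and caveats. This guide explains every option so that you can understand and customize the configuration safely.

Note: most parameters appear as # comments showing their default values. To set one, remove the # and adjust the value for your environment. Changes take effect after the huatuo-bamai process is restarted. In production, enable only what you need and avoid turning on high-overhead features unnecessarily.

2. Global blacklist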

# The global blacklist for tracing and metrics
BlackList = ["netdev_hw", "metax_gpu"]

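BlackList: global blacklist for tracing and metrics.

Excludes specific tracing or metric modules, avoiding irrelevant noise and high-overhead probes. For example, ["netdev_hw", "metax_gpu"] globally disables tracing and metrics for the NIC hardware layer (netdev_hw) and for Metax GPUs.

Note: blacklist entries reduce resource consumption, especially on specific hardware. The value is an array and can be extended as needed.

3. Logging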

# Log Configuration
#
# - Level
# The log level for huatuo-bamai: Debug, Info, Warn, Error, Panic.
# Default: Info
#
# - File
# Store logs to where the logging file is. If it is empty, don't write log
# to any file.
# Default: empty
#
[Log]
	# Level = "Info"
	# File = ""

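Level: log level.

Valid values are Debug, Info, Warn, Error, and Panic. The default is Info.

Note: controls how verbose huatuo-bamai's logs are. Info or Warn is recommended in production to limit log volume; Debug produces a large amount of output and is meant only for troubleshooting.

File: log file path.

The file that logs are written to. If it is an empty string, nothing is written to a file (logs go only to standard output or the system log). The default is empty.

Note: in containerized deployments, configure a concrete path so that logs persist.

4. Runtime resource limits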

# Runtime resource limit
#
# - LimitInitCPU
# During the huatuo-bamai startup, the CPU of process are restricted from use.
# Default is 0.5 CPU.
#
# - LimitCPU
# The CPU resource restricted once the process starts.
# Default is 2.0 CPU.
#
# - LimitMem
# The memory resource limited for huatuo-bamai process.
# Default is 2048MB.
#
[RuntimeCgroup]
	# LimitInitCPU = 0.5
	# LimitCPU = 2.0
	# LimitMem = 2048

LimitInitCPU：启动阶段 CPU 限制。

huatuo-bamai 进程启动期间允许使用的 CPU 核数限制。默认值为 0.5 CPU。

说明：防止启动过程占用过多 CPU 资源影响宿主机业务，单位为 CPU 核心数（支持小数）。
LimitCPU：运行时 CPU 限制。

进程正常运行后允许使用的 CPU 资源上限。默认值为 2.0 CPU。

说明：根据节点规模和业务负载调整，推荐在高密度容器环境中适当降低以保障业务稳定性。
LimitMem：内存资源限制。

huatuo-bamai 进程可使用的最大内存量。默认值为 2048 MB。

说明：单位为 MB，用于通过 cgroup 限制内存占用，防止 OOM（Out Of Memory）风险。生产环境可根据实际采集规模适当增加。

5. 存储配置

5.1 ElasticSearch/OpenSearch 存储

# Storage configuration
[Storage]
    # Elasticsearch and OpenSearch Storage
    #
    # Disable ES/OS storage if one of Address, Username, Password is empty.
    # Store the tracing and events data of linux kernel to ES/OS.
    #
    # - Address
    # Default address is :9200 of localhost. Port 9200 is used for all API calls
    # over HTTP. This includes search and aggregations, monitoring and anything
    # else that uses a HTTP or HTTPS request. All client libraries will use this port to
    # talk to Elasticsearch or OpenSearch.
    # e.g.
    # http://127.0.0.1:9200
    # https://127.0.0.1:9200
    #
    # Default: :9200
    #
    # - Index
    # Elasticsearch or OpenSearch index, a logical namespace that holds a collection of
    # documents for huatuo-bamai.
    # Default: huatuo_bamai
    #
    # - Username
    # - Password
    # There is no default username and password.
    #
    [Storage.ES]
        # Address = "http://127.0.0.1:9200"
        # Index = "huatuo_bamai"
        Username = "elastic"
        Password = "huatuo-bamai"

Address：ElasticSearch/OpenSearch 存储服务地址。

默认值为 http://127.0.0.1:9200。

说明：用于存储内核追踪和事件数据。如果 Address、Username 或 Password 中任一项为空，则禁用 ES/OS 存储。支持 HTTP/HTTPS 协议。
Index：索引名称。

默认值为 huatuo_bamai。

说明：索引是 ElasticSearch/OpenSearch 文档的逻辑命名空间，用于组织 huatuo-bamai 产生的追踪与事件数据。
Username：用户名。

无默认值（示例中使用 elastic）。

说明：用于 Basic Auth 认证。
Password：认证密码。

无默认值（示例中使用 huatuo-bamai）。

说明：配合用户名进行安全认证。生产环境强烈建议使用强密码并结合 TLS 加密传输。

整体说明：ES/OS 存储用于持久化内核追踪和事件数据，便于后续检索与分析。如果用户不关心 Linux 内核事件、Autotracing 数据则可以关闭该配置。

5.2 本地文件存储

# LocalFile Storage
#
# Store data to local directory for troubleshooting on the host machine.
#
# - Path
# The directory for storing data. If the Path is empty, LocalFile will be disabled.
# Default: "huatuo-local"
#
# - RotationSize
# The maximum size in Megabytes of a record file before it gets rotated
# for per linux kernel tracer.
# Default: 100MB
#
# - MaxRotation
# The maximum number of old log files to retain for per tracer.
# Default: 10
#
[Storage.LocalFile]
	# Path = "huatuo-local"
	# RotationSize = 100
	# MaxRotation = 10

Path：本地数据存储目录。

默认值为 huatuo-local。若路径为空，则禁用本地文件存储。

说明：用于在宿主机本地保存数据，主要用于现场故障排查。推荐配置为绝对路径。
RotationSize：单文件轮转大小。

每个追踪器记录文件在达到该大小时进行轮转。默认值为 100 MB。

说明：单位为 MB，防止单个文件过大导致磁盘占用失控。
MaxRotation：最大保留轮转文件数。

每个追踪器最多保留的历史文件数量。默认值为 10。

说明：超过数量后自动删除最早文件，控制磁盘空间使用。

6. 自动追踪配置

自动追踪模块是 HUATUO 的智能特性之一，可根据阈值自动触发特定性能追踪，减少人工干预。

6.1 CPUIdle AutoTracing: sudden high CPU usage in containers

# Autotracing configuration 
[AutoTracing]
    # cpuidle
    #
    # For a high cpu usage all of a sudden in containers.
    #
    # - UserThreshold
    # User CPU usage threshold, when cpu usage reaches this threshold, cpu
    # performance tracing will be triggered.
    # Default: 75%
    #
    # - SysThreshold
    # System CPU usage threshold, when reaching this threshold, cpu performance
    # tracing will be triggered.
    # Default: 45%
    #
    # - UsageThreshold
    # The total cpu usage (system + user cpu usage) threshold, when reaching
    # this threshold, cpu performance tracing will be triggered.
    # Default: 90%
    #
    # - DeltaUserThreshold
    # The range of this user cpu changes within a short period of time.
    # Default: 45%
    #
    # - DeltaSysThreshold
    # The range of this system cpu changes within a short period of time.
    # Default: 20%
    #
    # - DeltaUsageThreshold
    # The range of this cpu usage changes within a short period of time.
    # Default: 55%
    #
    # - Interval
    # The sample interval of the cpu usage for all containers.
    # Default: 10s
    #
    # - IntervalTracing
    # Time since last run. Avoid frequently executing this tracing to prevent
    # damage to the system.
    # Default: 1800s
    #
    # - RunTracingToolTimeout
    # The executing time of this tracing program.
    # Default: 10s
    # 
    # NOTE:
    # Running this performance tool, when:
    # 1. UserThreshold and DeltaUserThreshold are true, or
    # 2. SysThreshold and DeltaSysThreshold are true, or
    # 3. UsageThreshold and DeltaUsageThreshold are true.
    #
    [AutoTracing.CPUIdle]
        # UserThreshold = 75
        # SysThreshold = 45
        # UsageThreshold = 90
        # DeltaUserThreshold = 45
        # DeltaSysThreshold = 20
        # DeltaUsageThreshold = 55
        # Interval = 10
        # IntervalTracing = 1800
        # RunTracingToolTimeout = 10

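UserThreshold: user-mode CPU usage threshold (%).

Default 75%. CPU performance tracing may be triggered when a container's user-mode CPU usage reaches this value.

SysThreshold: system-mode CPU usage threshold (%).

Default 45%. Tracing may be triggered when system-mode CPU usage reaches this value.

UsageThreshold: total CPU usage threshold (user + system, %).

Default 90%. Tracing may be triggered when total CPU usage reaches this value.

DeltaUserThreshold: short-term change threshold for user-mode CPU usage (%).

Default 45%. The condition is met when user-mode CPU usage changes by more than this value within a short period.

DeltaSysThreshold: short-term change threshold for system-mode CPU usage (%).

Default 20%. The condition is met when system-mode CPU usage changes by more than this value within a short period.

DeltaUsageThreshold: short-term change threshold for total CPU usage (%).

Default 55%. The condition is met when total CPU usage changes by more than this value within a short period.

Interval: CPU usage sampling interval (seconds).

Default 10s. The period at which CPU usage is sampled for all containers.

IntervalTracing: minimum interval between runs (seconds).

Default 1800s (30 minutes). The minimum time between two automatic traces, which keeps frequent runs from putting pressure on the system.

RunTracingToolTimeout: timeout for a single performance trace (seconds).

Default 10s. Caps how long the tracing program runs so that it does not hold resources for long.

Trigger logic: tracing is triggered when any one of the following holds:

UserThreshold and DeltaUserThreshold are both met; or
SysThreshold and DeltaSysThreshold are both met; or
UsageThreshold and DeltaUsageThreshold are both met.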
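
The trigger rule above can be sketched in Go as follows. This is an illustration under stated assumptions, not HUATUO's implementation: the `Thresholds` and `Sample` types and the `shouldTrace` function are hypothetical names, "reaches" is read as `>=`, and the delta is taken as the increase between two consecutive samples.

```go
package main

import "fmt"

// Thresholds mirrors the [AutoTracing.CPUIdle] percentages documented above.
// The struct layout is illustrative, not HUATUO's internal type.
type Thresholds struct {
	User, Sys, Usage                float64
	DeltaUser, DeltaSys, DeltaUsage float64
}

// Sample is one CPU usage reading for a container, in percent.
type Sample struct {
	User, Sys float64
}

// shouldTrace reports whether moving from prev to cur satisfies any of the
// three documented (level AND delta) pairs.
func shouldTrace(prev, cur Sample, t Thresholds) bool {
	usagePrev, usageCur := prev.User+prev.Sys, cur.User+cur.Sys
	userHit := cur.User >= t.User && cur.User-prev.User >= t.DeltaUser
	sysHit := cur.Sys >= t.Sys && cur.Sys-prev.Sys >= t.DeltaSys
	usageHit := usageCur >= t.Usage && usageCur-usagePrev >= t.DeltaUsage
	return userHit || sysHit || usageHit
}

func main() {
	def := Thresholds{User: 75, Sys: 45, Usage: 90, DeltaUser: 45, DeltaSys: 20, DeltaUsage: 55}
	// User CPU jumps from 20% to 80%: level and delta are both met.
	fmt.Println(shouldTrace(Sample{User: 20, Sys: 5}, Sample{User: 80, Sys: 5}, def))
	// User CPU creeps from 70% to 80%: level is met but the delta is not.
	fmt.Println(shouldTrace(Sample{User: 70, Sys: 5}, Sample{User: 80, Sys: 5}, def))
}
```

Requiring both a level and a delta means a container that runs steadily at high CPU does not trigger a trace; only a sudden rise does.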

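Filter (container filtering): the Included/Excluded rule arrays control which containers are monitored.

    # Each rule has a Field (the field to filter on) and a Pattern (a regular expression)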
    # Field: container_host_namespace | container_hostname | container_qos
    #
    # [[AutoTracing.CPUIdle.Filter.Excluded]]
    #     Field = "container_qos"
    #     Pattern = "besteffort"
    # [[AutoTracing.CPUIdle.Filter.Included]]
    #     Field = "container_host_namespace"
    #     Pattern = "^application-"

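Filter: container filter rules. Multiple rules are defined with the [[double-bracket]] syntax; each has a Field (the field to filter on) and a Pattern (a regular expression). The filter logic is:
- No rules: monitor all containers
- Excluded only: blacklist, skip containers that match
- Included only: whitelist, monitor only containers that match
- Both present: a container must match Included and must not match Excluded
By default there are no rules and all containers are monitored.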

6.2 CPUSys AutoTracing: sudden high system CPU usage on the host

# cpusys
#
# For a high system cpu usage all of a sudden on host machine.
#
# - SysThreshold
# System CPU usage threshold, when reaching this threshold, cpu performance
# tracing will be triggered.
# Default: 45%
#
# - DeltaSysThreshold
# The range of system cpu changes within a short period of time.
# Default: 20%
#
# - Interval
# The sample interval of the cpu usage for host machine.
# Default: 10s
#
# - RunTracingToolTimeout
# The executing time of this tracing program.
# Default: 10s
#
# NOTE:
# Running this performance tool, when:
# SysThreshold and DeltaSysThreshold are true.
#
[AutoTracing.CPUSys]
	# SysThreshold = 45
	# DeltaSysThreshold = 20
	# Interval = 10
	# RunTracingToolTimeout = 10

SysThreshold：系统态 CPU 使用率阈值（%）。

默认 45%。
DeltaSysThreshold：系统态 CPU 短期变化幅度阈值（%）。

默认 20%。
Interval：宿主机 CPU 使用率采样间隔（秒）。

默认 10s。
RunTracingToolTimeout：单次追踪执行超时时间（秒）。默认 10s。

触发逻辑：当 SysThreshold 与 DeltaSysThreshold 同时满足时触发。

6.3 Dload AutoTracing: D-state task profiling for containers

# dload
#
# linux tasks D state profiling for containers.
#
# - ThresholdLoad
# The loadavg threshold value, when reaching this threshold, dload profiling
# is triggered.
# Default: 5
#
# - Interval
# The sample interval of the load for all containers.
# Default: 10s
#
# - IntervalTracing
# Time since last run. Avoid frequently executing this tracing to prevent
# damage to the system.
# Default: 1800s
#
[AutoTracing.Dload]
	# ThresholdLoad = 5
	# Interval = 10
	# IntervalTracing = 1800

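ThresholdLoad: load average (loadavg) threshold for a container.

Default 5. When loadavg reaches this value, D-state (uninterruptible sleep) task profiling is triggered.

Note: used to diagnose containers in which many tasks enter D state.

Interval: monitoring interval (seconds).

Default 10s. The period at which Dload samples the load of all containers.

IntervalTracing: minimum interval between runs (seconds).

Default 1800s (30 minutes). The minimum time between two automatic traces, which keeps frequent runs from putting pressure on the system.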

6.4 IOTracing AutoTracing: IO performance profiling for containers

# iotracing
#
# io profiling for containers.
#
# - WbpsThreshold
# Max write bytes per second, when reaching this threshold, iotracing is triggered.
# Please note that if it is an NVMe device, it must also meet the UtilThreshold.
# Default: 1500 MB/s
#
# - RbpsThreshold
# Max read bytes per second, when reaching this threshold, iotracing is triggered.
# Please note that if it is an NVMe device, it must also meet the UtilThreshold.
# Default: 2000 MB/s
#
# - UtilThreshold
# Disk utilization, Percentage of time the disk is busy. If this is consistently
# above 80-90%, the disk may be a bottleneck.
# Default: 90%
#
# - AwaitThreshold
# Await (Average IO wait time in ms): High values indicate slow disk response times.
# Default: 100ms
#
# - RunTracingToolTimeout
# The executing time of this tracing tool.
# Default: 10s
#
# - MaxProcDump
# The number of processes displayed by iotracing tool.
# Default: 10
#
# - MaxFilesPerProcDump
# The number of files per process displayed by iotracing tool.
# Default: 5
#
[AutoTracing.IOTracing]
	# WbpsThreshold = 1500
	# RbpsThreshold = 2000
	# UtilThreshold = 90
	# AwaitThreshold = 100
	# RunTracingToolTimeout = 10
	# MaxProcDump = 10
	# MaxFilesPerProcDump = 5

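WbpsThreshold: write throughput threshold (MB/s).

Default 1500 MB/s. IO tracing may be triggered when write throughput reaches this value (on NVMe devices, UtilThreshold must also be met).

RbpsThreshold: read throughput threshold (MB/s).

Default 2000 MB/s. Works like the write threshold, for reads (on NVMe devices, UtilThreshold must also be met).

UtilThreshold: disk utilization threshold (%).

Default 90%. The percentage of time the disk is busy; a disk that stays above 80-90% may be a bottleneck.

AwaitThreshold: average IO wait time threshold (ms).

Default 100ms. High values indicate slow disk response.

RunTracingToolTimeout: timeout for the IO tracing tool (seconds).

Default 10s.

MaxProcDump: maximum number of processes shown by IO tracing.

Default 10. Controls how many processes appear in the output.

MaxFilesPerProcDump: maximum number of files shown per process.

Default 5. Controls how many files are listed for each process.

Note: IOTracing diagnoses IO hotspots in containers, with a focus on heavily loaded disks.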

6.5 内存突发自动追踪

该模块用于检测宿主机内存使用量突发增长场景，并在触发时自动捕获内核上下文，便于诊断内存压力事件。

# memory burst
#
# If there is a memory used burst on the host, capture this kernel context.
#
# - Interval
# The sample interval of the memory used.
# Default: 10s
#
# - DeltaMemoryBurst
# A certain percentage of memory burst used. 100% that means, e.g.,
# memory used increased from 200MB to 400MB.
# Default: 100%
#
# - DeltaAnonThreshold
# A certain percentage of anon memory burst used. 100% that means, e.g.,
# anon memory used increased from 200MB to 400MB.
# Default: 70%
#
# - IntervalTracing
# Time since last run. Avoid frequently executing this tracing
# to prevent damage to the system.
# Default: 1800s
#
# - DumpProcessMaxNum
# How many processes to dump when this event is triggered.
# Default: 10
#
[AutoTracing.MemoryBurst]
	# DeltaMemoryBurst = 100
	# DeltaAnonThreshold = 70
	# Interval = 10
	# IntervalTracing = 1800
	# SlidingWindowLength = 60
	# DumpProcessMaxNum = 10

DeltaMemoryBurst：内存使用量突发增长百分比阈值。

默认 100%。表示内存使用量在采样窗口内增长的比例（例如从 200MB 增长到 400MB 即 100%）。达到该阈值时可能触发内存突发追踪。

说明：用于捕获整体内存使用量的急剧上升场景。
DeltaAnonThreshold：匿名页内存突发增长百分比阈值。

默认 70%。匿名内存（anonymous memory）增长比例阈值，匿名页是内存压力诊断的重要指标。

说明：重点监控易导致 OOM 或 swap 的匿名内存突发。
Interval：内存使用量采样间隔（秒）。

默认 10s。对宿主机内存使用情况进行周期性采样的时间间隔。

说明：采样频率影响检测灵敏度与开销。
IntervalTracing：连续运行最小间隔（秒）。

默认 1800s（30 分钟）。两次内存突发追踪之间的冷却时间，避免频繁执行对系统造成额外压力。

说明：防止追踪工具被过度触发。
DumpProcessMaxNum：触发事件时转储的最大进程数。

默认 10。当内存突发事件触发时，最多转储多少个相关进程的详细信息（包括内存占用、调用栈等）。

说明：控制输出数据量，避免单次事件产生过多诊断信息。
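
The percentages used by DeltaMemoryBurst and DeltaAnonThreshold follow the "200MB to 400MB is 100%" example from the configuration comments. A minimal Go sketch of that arithmetic is shown below; `growthPercent` is a hypothetical helper, and how HUATUO chooses the two samples being compared is not specified here.

```go
package main

import "fmt"

// growthPercent returns how much cur exceeds prev, as a percentage of prev.
// It returns 0 when prev is 0 (to avoid dividing by zero) or when usage
// did not grow.
func growthPercent(prev, cur uint64) float64 {
	if prev == 0 || cur <= prev {
		return 0
	}
	return float64(cur-prev) / float64(prev) * 100
}

func main() {
	const mb = 1 << 20
	// The example from the comments: 200MB -> 400MB is a 100% burst.
	fmt.Println(growthPercent(200*mb, 400*mb))
	// Shrinking usage is not a burst.
	fmt.Println(growthPercent(400*mb, 200*mb))
}
```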

6.6 已知问题过滤（IssuesList）

# IssuesList for known issue filtering in autotracing
IssuesList = []

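IssuesList: known-issue filter, in the form [["issue name", "regex"], ...]. A collected stack that matches a regex is tagged with the corresponding issue name. Default [].

Example: IssuesList = [["known_issue1", "softlockup"], ["known_issue2", "alloc_pages.*failed"]]

Note: known-issue filtering is currently supported only for dload tracing; other events are not supported yet.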
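
The tagging described above amounts to matching a stack against a list of named regular expressions. The Go sketch below uses the example entries from this section; the `issue` type, `compileIssues`, `classify`, and the "first match wins" rule are assumptions made for illustration, and the sample stack text is invented.

```go
package main

import (
	"fmt"
	"regexp"
)

// issue is one compiled IssuesList entry: ["name", "regex"].
type issue struct {
	name string
	re   *regexp.Regexp
}

// compileIssues turns the [["name", "regex"], ...] form into matchers.
func compileIssues(list [][2]string) ([]issue, error) {
	out := make([]issue, 0, len(list))
	for _, e := range list {
		re, err := regexp.Compile(e[1])
		if err != nil {
			return nil, fmt.Errorf("issue %q: %w", e[0], err)
		}
		out = append(out, issue{name: e[0], re: re})
	}
	return out, nil
}

// classify returns the name of the first issue whose regex matches stack,
// or "" when nothing matches.
func classify(issues []issue, stack string) string {
	for _, is := range issues {
		if is.re.MatchString(stack) {
			return is.name
		}
	}
	return ""
}

func main() {
	issues, err := compileIssues([][2]string{
		{"known_issue1", "softlockup"},
		{"known_issue2", "alloc_pages.*failed"},
	})
	if err != nil {
		panic(err)
	}
	// Invented stack text, for illustration only.
	fmt.Println(classify(issues, "__alloc_pages_slowpath: alloc_pages order 3 failed"))
}
```

Compiling the patterns once at startup surfaces an invalid regex as a configuration error instead of at match time.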

7. 事件追踪配置

该 section 负责内核关键事件的捕获与延迟监控，包括软中断、内存回收、网络接收延迟、网卡事件及丢包监控等，是 HUATUO 内核级异常上下文采集的核心模块。

7.1 软中断禁用追踪

# linux kernel events capturing configuration
[EventTracing]
	# softirq
	#
	# tracing the softirq disabled events of linux kernel.
	#
	# - DisabledThreshold
	# When the disable duration of softirq exceeds the threshold, huatuo-bamai
	# will collect kernel context.
	# Default: 10000000 in nanoseconds, 10ms
	#
	[EventTracing.Softirq]
		# DisabledThreshold = 10000000

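DisabledThreshold: softirq-disabled duration threshold (nanoseconds).

Default 10000000 ns (10ms). When the kernel keeps softirqs disabled for longer than this threshold, huatuo-bamai collects the kernel context.

Note: long softirq-disabled periods can delay networking, timers, and similar work, so this tracer suits interrupt storms and high-load scenarios.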

7.2 内存回收阻塞追踪

# memreclaim
#
# The memory reclaim may block the process, if one process is blocked
# for a long time, reporting the events to userspace.
#
# - BlockedThreshold
# The blocked time when memory reclaiming.
# Default: 900000000ns, 900ms
#
[EventTracing.MemoryReclaim]
	# BlockedThreshold = 900000000

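BlockedThreshold: memory-reclaim blocking time threshold (nanoseconds).

Default 900000000 ns (900ms). When a single process is blocked by memory reclaim for longer than this time, an event is reported to user space and the context is captured.

Note: blocking in memory reclaim is a common cause of process stalls, especially in memory-constrained cloud-native environments.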
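
Both DisabledThreshold and BlockedThreshold take plain integer nanoseconds, which are easy to mistype. As a small convenience sketch (the helper `thresholdNS` is ours, not a HUATUO tool), Go's standard `time.ParseDuration` can convert a readable duration into the value to put in the file:

```go
package main

import (
	"fmt"
	"time"
)

// thresholdNS converts a human-readable duration such as "10ms" into the
// integer nanosecond value the two options expect.
func thresholdNS(s string) (int64, error) {
	d, err := time.ParseDuration(s)
	if err != nil {
		return 0, err
	}
	return d.Nanoseconds(), nil
}

func main() {
	for _, s := range []string{"10ms", "900ms"} {
		ns, err := thresholdNS(s)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s = %d\n", s, ns)
	}
}
```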

7.3 网络接收延迟追踪

# networking rx latency
#
# linux net stack rx latency for every tcp skbs.
#
# - Driver2NetRx
# The latency from driver to net rx, e.g., netif_receive_skb.
# Default: 5ms
#
# - Driver2TCP
# The latency from driver to tcp rx, e.g., tcp_v4_rcv.
# Default: 10ms
#
# - Driver2Userspace
# The latency from driver to userspace copy data, e.g., skb_copy_datagram_iovec.
# Default: 115ms
#
# - ExcludedContainerQos
# Blacklist: skip containers whose qos level matches.
# Values: "guaranteed", "burstable", "besteffort" (case-insensitive).
# Default: [].
#
# - ExcludedHostNetnamespace
# Don't care the skbs, packets in the host net namespace.
# Default: true
#
[EventTracing.NetRxLatency]
	# Driver2NetRx = 5
	# Driver2TCP = 10
	# Driver2Userspace = 115
	# ExcludedContainerQos = []
	ExcludedContainerQos = ["besteffort"]
	# ExcludedHostNetnamespace = true

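Driver2NetRx: latency threshold from the driver to the network receive layer (milliseconds).

Default 5ms. Covers the path up to functions such as netif_receive_skb.

Driver2TCP: latency threshold from the driver to the TCP stack (milliseconds).

Default 10ms. Covers the path up to functions such as tcp_v4_rcv.

Driver2Userspace: latency threshold from the driver to the copy into user space (milliseconds).

Default 115ms. Covers the path up to functions such as skb_copy_datagram_iovec.

ExcludedContainerQos: container QoS levels to exclude (blacklist).

Default []. Network receive latency is not monitored for containers at the listed QoS levels (the Kubernetes Pod QoS classes Guaranteed, Burstable, and BestEffort; case-insensitive). The sample configuration above sets ["besteffort"].

Note: BestEffort containers are usually excluded to reduce noise.

ExcludedHostNetnamespace: whether to exclude the host network namespace.

Default true. Latency of skb packets in the host net namespace is not monitored.

Note: this keeps the focus on container traffic and avoids unrelated host data.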

7.4 网卡事件监控

# netdev events
#
# monitor the net device events.
#
# - DeviceList
# The net devices we take care of.
# Default: [] is empty, meaning no devices.
#
[EventTracing.Netdev]
	DeviceList = ["eth0", "eth1", "bond4", "lo"]

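DeviceList: list of full-match regular expressions for the network devices to monitor. Literal names such as "eth0" match exactly, while patterns such as "bond[0-9]+" can match several devices.

The sample configuration lists "eth0", "eth1", "bond4", and "lo". An empty list means no device is monitored. The tracer watches events such as physical link state changes on the listed devices.

Note: name the interfaces of interest precisely; bond, lo, and similar devices are supported.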

7.5 Packet drop monitoring (EventTracing.Dropwatch)

# dropwatch
#
# monitor packets dropped events in the Linux kernel.
#
# - ExcludedNeighInvalidate
# Don't care of neigh_invalidate drop events.
# Default: true
#
[EventTracing.Dropwatch]
	# ExcludedNeighInvalidate = true

ExcludedNeighInvalidate：是否排除邻居表无效化（neigh_invalidate）导致的丢包事件。

默认 true。

说明：邻居表相关丢包通常为正常行为，排除可减少误报。

7.6 硬件错误事件追踪（EventTracing.Ras）

# ras
#
# Hardware error event tracing (RAS: Reliability, Availability, Serviceability).
# Captures MCE, EDAC, ACPI/GHES, PCIe AER, and MCE threshold (THR) events via eBPF.
#
# - MceThrBackoff
# Minimum interval in seconds between consecutive MCE threshold (THR) event saves.
# THR events are fired by the local-APIC threshold interrupt and can storm at high
# frequency; this cooldown prevents flooding storage with redundant records.
# Default: 1800s (30 minutes)
#
[EventTracing.Ras]
    # MceThrBackoff = 1800

MceThrBackoff：MCE 阈值中断（THR）事件存储的最小间隔时间（秒）。

默认 1800s（30 分钟）。

说明：THR 事件由 CPU 本地 APIC 阈值中断触发，在硬件出现纠正性错误时可能以极高频率产生。该冷却时间用于防止存储系统被大量重复记录淹没，同时保证关键事件仍能被捕获。调低该值可获得更实时的事件记录，但需注意存储压力；在错误频发的环境中建议适当调高。

7.7 Known-issue filtering (IssuesList)

# IssuesList for known issue filtering in event tracing
IssuesList = []

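IssuesList: known-issue filter. The format and usage are the same as IssuesList under AutoTracing. Patterns are matched against the event context, and matching events are tagged with the corresponding issue name. Default [].

Example: IssuesList = [["known_issue1", "comm=ignored_process"]]

Note: filtering is currently supported only for net_rx_latency events; other events are not supported yet.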

8. 指标采集器配置

该 section 定义各类系统与网络指标的采集规则。所有 Included/Excluded 字段底层共用同一套过滤逻辑（正则表达式）：

无规则：全部采集
仅 Excluded：黑名单，匹配即跳过
仅 Included：白名单，仅采集匹配项
两者并存：必须匹配 Included 且不匹配 Excluded

8.1 网卡统计

# Metric Collector
[MetricCollector]
	# Netdev statistic
	#
	# - EnableNetlink
	# Use netlink instead of procfs net/dev to get netdev statistic.
	# Only support the host environment to use `netlink` now.
	# Default is "false".
	#
	# - DeviceIncluded
	# Accept special devices in netdev statistic.
	# Default: "" (empty), meaning include all.
	#
	# - DeviceExcluded
	# Exclude special devices in netdev statistic.
	# Default: "" (empty), meaning exclude nothing.
	#
	# Filter logic see MetricCollector section header.
	#
	[MetricCollector.NetdevStats]
		# EnableNetlink = false
		# DeviceIncluded = ""
		DeviceExcluded = "^(lo)|(docker\\w*)|(veth\\w*)$"

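EnableNetlink: whether to read NIC statistics through netlink instead of procfs.

Default false. netlink is currently supported only in the host environment.

Note: netlink is usually more efficient but requires kernel support.

DeviceIncluded: regular expression for the devices to include. Default empty (collect all).

DeviceExcluded: regular expression for the devices to exclude. The sample value excludes virtual interfaces such as lo, docker, and veth.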
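
One detail of the sample DeviceExcluded value is worth checking before reusing it: in `^(lo)|(docker\w*)|(veth\w*)$` the anchors bind to the first and last alternatives only, because `|` has the lowest precedence. The Go sketch below compares it with a grouped variant. It assumes unanchored search semantics as in Go's `regexp.MatchString`; whether huatuo-bamai evaluates the pattern this way is an assumption, and `lom1` is a made-up device name.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// Pattern as written in the sample configuration (TOML "\\w" is \w).
	asWritten = regexp.MustCompile(`^(lo)|(docker\w*)|(veth\w*)$`)
	// Variant with the anchors applied to the whole alternation.
	grouped = regexp.MustCompile(`^(lo|docker\w*|veth\w*)$`)
)

func main() {
	for _, dev := range []string{"lo", "docker0", "veth1a2b", "eth0", "lom1"} {
		fmt.Printf("%-9s asWritten=%-5v grouped=%v\n",
			dev, asWritten.MatchString(dev), grouped.MatchString(dev))
	}
}
```

Under these assumptions the two patterns agree on lo, docker0, veth1a2b, and eth0, but the pattern as written also matches any name that merely starts with "lo", such as lom1.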

8.2 网卡 DCB（Data Center Bridging）采集

# netdev dcb, DCB (Data Center Bridging)
#
# Collecting the DCB PFC (Priority-based Flow Control).
#
# - DeviceList
# The net devices we take care of.
# Default: [] is empty, meaning no devices.
#
[MetricCollector.NetdevDCB]
	DeviceList = ["eth0", "eth1"]

DeviceList：需要采集 DCB（优先流控 PFC）信息的网卡完整匹配正则列表。

默认空。

说明：主要用于数据中心网络环境下的优先级流控监控。

8.3 网卡硬件统计

# netdev hardware statistic
#
# Collecting the hardware statistic of net devices, e.g, rx_dropped.
#
# - DeviceList
# The net devices we take care of.
# Default: [] is empty, meaning no devices.
#
[MetricCollector.NetdevHW]
	DeviceList = ["eth0", "eth1"]

DeviceList：需要采集硬件层统计（如 rx_dropped）的网卡完整匹配正则列表。

默认空。

说明：聚焦硬件丢包、错误等底层指标。

8.4 Qdisc（队列规则）采集

# Qdisc
#
# - DeviceIncluded / DeviceExcluded
# Same as above.
#
[MetricCollector.Qdisc]
	# DeviceIncluded = ""
	DeviceExcluded = "^(lo)|(docker\\w*)|(veth\\w*)$"

DeviceIncluded / DeviceExcluded：同 MetricCollector 描述的过滤逻辑。

说明：用于诊断流量整形、调度延迟等问题。

8.5 vmstat 指标采集

# vmstat
#
# This metric supports host vmstat and cgroup vmstat.
# - IncludedOnHost / ExcludedOnHost: same filter logic, for host /proc/vmstat.
# - IncludedOnContainer / ExcludedOnContainer: same, for cgroup containers memory.stat.
#
[MetricCollector.Vmstat]
	IncludedOnHost = "allocstall|nr_active_anon|nr_active_file|nr_boost_pages|nr_dirty|nr_free_pages|nr_inactive_anon|nr_inactive_file|nr_kswapd_boost|nr_mlock|nr_shmem|nr_slab_reclaimable|nr_slab_unreclaimable|nr_unevictable|nr_writeback|numa_pages_migrated|pgdeactivate|pgrefill|pgscan_direct|pgscan_kswapd|pgsteal_direct|pgsteal_kswapd"
	ExcludedOnHost = "total"
	IncludedOnContainer = "active_anon|active_file|dirty|inactive_anon|inactive_file|pgdeactivate|pgrefill|pgscan_direct|pgscan_kswapd|pgsteal_direct|pgsteal_kswapd|shmem|unevictable|writeback|pgscan_globaldirect|pgscan_globalkswapd|pgscan_cswapd|pgsteal_cswapd|pgsteal_globaldirect|pgsteal_globalkswapd"
	ExcludedOnContainer = "total"

IncludedOnHost / ExcludedOnHost：宿主机 /proc/vmstat 的过滤字段正则。
IncludedOnContainer / ExcludedOnContainer：容器 cgroup memory.stat 的过滤字段正则。

说明：精细控制 vmstat 指标采集，支持主机与容器差异化配置，避免采集无关字段。

8.6 其他指标采集

# MemoryEvents/Netstat/MountPointStat
#
# - Included / Excluded: same as above.
# - MountPointsIncluded: whitelist only (no Excluded), same logic.
#
[MetricCollector.MemoryEvents]
	Included = "watermark_inc|watermark_dec"
	# Excluded = ""
[MetricCollector.Netstat]
	# Excluded = ""
	# Included = ""

# MountPointStat
[MetricCollector.MountPointStat]
	MountPointsIncluded = "(^/home$)|(^/$)|(^/boot$)"

Included / Excluded（MemoryEvents、Netstat）：同上过滤逻辑。
MountPointsIncluded：采集挂载点统计的路径正则。默认示例含 /、/home、/boot。

说明：用于监控关键文件系统使用情况。

9. Pod 配置

该 section 用于从 kubelet 获取 Pod 信息，实现容器与 Pod 级别的标签关联和指标隔离。

# Pod Configuration
#
# Configure these parameters for fetching pods from kubelet.
#
# - KubeletReadOnlyPort
# The KubeletReadOnlyPort is kubelet read-only port for the Kubelet to serve on with
# no authentication/authorization. The port number must be between 1 and 65535, inclusive.
# Setting this field to 0 disables fetching pods from kubelet read-only service.
# Default: 10255
#
# - KubeletAuthorizedPort
# The port is the HTTPs port of the kubelet. The port number must be between 1 and 65535,
# inclusive. Setting this field to 0 disables fetching pods from kubelet HTTPS port.
# Default: 10250
#
# - KubeletClientCertPath
# https://kubernetes.io/docs/setup/best-practices/certificates/
#
# Client certificate and private key file name. One file or two files:
# "/path/to/xxx-kubelet-client.crt,/path/to/xxx-kubelet-client.key",
# "/path/to/kubelet-client-current.pem"
#
# You can disable this kubelet fetching pods, for bare metal service, by
# KubeletReadOnlyPort = 0, and KubeletAuthorizedPort = 0.
#
[Pod]
	KubeletClientCertPath = "/etc/kubernetes/pki/apiserver-kubelet-client.crt,/etc/kubernetes/pki/apiserver-kubelet-client.key"

KubeletReadOnlyPort：kubelet 只读端口。

默认 10255。用于无认证方式从 kubelet 获取 Pod 列表。设置为 0 时禁用该方式。

说明：端口范围 1-65535，适合测试或非安全环境。
KubeletAuthorizedPort：kubelet HTTPS 授权端口。

默认 10250。用于安全方式（证书认证）从 kubelet 获取 Pod 信息。设置为 0 时禁用。

说明：生产环境推荐使用该端口结合证书认证。
KubeletClientCertPath：kubelet 客户端证书及私钥路径。

支持格式："/path/to/xxx-kubelet-client.crt,/path/to/xxx-kubelet-client.key" 或单文件 PEM 格式。

说明：参考 Kubernetes 证书最佳实践，用于 HTTPS 端口的 mTLS 认证。在裸金属或非 Kubernetes 环境中可通过将两个端口设为 0 来禁用 Pod 获取功能。

10. 事件监听配置

该 section 用于控制 POST /v1/events/watch SSE 流式接口的运行行为，外部客户端可通过该接口实时订阅内核事件数据流。

# Events Watch Configuration
#
# Controls the behavior of the POST /v1/events/watch SSE streaming API,
# which allows external clients to subscribe to kernel events in real-time.
#
# - MaxClients
# Maximum number of concurrent clients allowed to hold an open /v1/events/watch
# connection. Once the limit is reached, new requests are rejected with HTTP 429
# (Too Many Requests) until an existing client disconnects.
# Default: 100
#
# - KeepAliveInterval
# Interval in seconds at which the server sends an SSE comment ping to each
# connected client. The ping keeps the HTTP connection alive through load
# balancers and proxies that would otherwise time out idle connections.
# If writing the ping fails three consecutive times the server treats the
# client as gone and closes the connection.
# Default: 30s
#
[EventsWatch]
    # MaxClients = 100
    # KeepAliveInterval = 30

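MaxClients: maximum number of concurrent client connections.

Default 100. The upper limit on clients holding an open /v1/events/watch connection at the same time. Once the limit is reached, new requests are rejected with HTTP 429 (Too Many Requests) until an existing client disconnects.

Note: size this for the node's resources and the real number of subscribers. Each long-lived connection uses one goroutine and one subscription channel (buffered to 256 entries), so watch memory pressure when there are many connections.

KeepAliveInterval: keep-alive ping interval (seconds).

Default 30s. At this interval the server sends each connected client an SSE comment line (": ping") to keep the HTTP connection alive, so that load balancers and proxies do not drop it as idle.

Note: if the server fails to write the ping (or event data) three times in a row, it treats the client as gone, closes the connection, and releases its resources. Keep this value no larger than the idle timeout of the upstream proxy; 15 to 60s is common in production.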

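11. Command-line flags

huatuo-bamai supports the following command-line flags:

huatuo-bamai --region <region> [options]

Flag	Description	Default
`--config`	Configuration file name	`huatuo-bamai.conf`
`--config-dir`	Configuration file directory	`conf`
`--bpf-dir`	BPF object file directory	`bpf`
`--tools-bin-dir`	Directory of tracing tool binaries	`bin`
`--region`	Deployment region (required)	-
`--disable-kubelet`	Disable fetching Pods from the kubelet	`false`
`--disable-storage`	Disable the storage backend	`false`
`--disable-cgroup`	Disable the cgroup resource limits on huatuo-bamai itself	`false`
`--disable-tracing`	Disable the named tracing module (may be given multiple times)	-
`--log-debug`	Force the log level to Debug	`false`
`--dry-run`	Load-only test: start up, then exit gracefully	`false`
`--procfs-prefix`	Prefix of the procfs mount point	-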

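12. Configuration override rules

When the same option is set both on the command line and in the configuration file, the following precedence applies:

Command-line flags > configuration file > built-in defaults

Specific rules:

Log level: --log-debug > [Log] Level in the configuration file > built-in default Info
- --log-debug has the highest priority and forces Debug regardless of the configured Level
- An explicit Level in the configuration file overrides the built-in default
- If neither is set, the default Info is used
Tracing blacklist: --disable-tracing is merged with BlackList from the configuration file (the two are additive; neither overrides the other)
Other boolean switches (--disable-kubelet, --disable-storage, --disable-cgroup): an explicit command-line flag overrides the configuration file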
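
The log-level and blacklist rules can be written down as two small functions. This is a minimal sketch of the documented precedence, not the actual huatuo-bamai implementation; the function names and the tracer names used in the test values are hypothetical.

```python
from typing import Iterable, List, Optional


def resolve_log_level(cli_log_debug: bool, conf_level: Optional[str]) -> str:
    """--log-debug > [Log] Level in the configuration file > built-in Info."""
    if cli_log_debug:
        return "Debug"
    if conf_level:
        return conf_level
    return "Info"


def merge_tracing_blacklist(
    cli_disabled: Iterable[str], conf_blacklist: Iterable[str]
) -> List[str]:
    """Union of both sources, keeping first-seen order and dropping duplicates."""
    merged: List[str] = []
    for name in list(conf_blacklist) + list(cli_disabled):
        if name not in merged:
            merged.append(name)
    return merged
```

The blacklist is a union rather than an override, so a tracer disabled in the configuration file stays disabled even when --disable-tracing names a different one.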

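13. Configuration best practices and caveats

Resource control: in production, tune the CPU and memory limits in RuntimeCgroup first, so that huatuo-bamai does not affect business containers.
Storage choice: small deployments can start with LocalFile for local troubleshooting; large clusters should configure Elasticsearch for centralized storage and querying.
AutoTracing tuning: adjust thresholds to the workload. Thresholds that are too low trigger too often; thresholds that are too high can miss problems. Validate changes step by step in a test environment.
Security: use a strong password in the ES configuration and consider enabling HTTPS; avoid hard-coding sensitive information in the configuration file.
Compatibility: configuration parameters depend on the kernel version and hardware, so verify them against the official HUATUO documentation.

A well-tuned huatuo-bamai.conf lets HUATUO make full use of its kernel-level anomaly detection and automatic tracing, improving observability and fault diagnosis in cloud-native systems.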

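5 - Core Features

5.1 - Kernel-Wide Observability

Metrics supported in the current version:

CPU subsystem

Scheduling latency

The metrics below show process scheduling latency, that is, the time from the moment a process becomes runnable (is placed on the run queue) until it actually starts executing on a CPU.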

# HELP huatuo_bamai_runqlat_container_latency cpu run queue latency for the containers
# TYPE huatuo_bamai_runqlat_container_latency gauge
huatuo_bamai_runqlat_container_latency{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev",zone="0"} 226
huatuo_bamai_runqlat_container_latency{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev",zone="1"} 0
huatuo_bamai_runqlat_container_latency{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev",zone="2"} 0
huatuo_bamai_runqlat_container_latency{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev",zone="3"} 0

# HELP huatuo_bamai_runqlat_latency cpu run queue latency for the host
# TYPE huatuo_bamai_runqlat_latency gauge
huatuo_bamai_runqlat_latency{host="hostname",region="dev",zone="0"} 35100
huatuo_bamai_runqlat_latency{host="hostname",region="dev",zone="1"} 0
huatuo_bamai_runqlat_latency{host="hostname",region="dev",zone="2"} 0
huatuo_bamai_runqlat_latency{host="hostname",region="dev",zone="3"} 0

Metric	Description	Unit	Scope	Source	Labels
runqlat_container_latency	Count of process scheduling latencies per zone: zone0, 0-10 ms; zone1, 10-20 ms; zone2, 20-50 ms; zone3, 50+ ms	count	Container	eBPF	container_host, container_hostnamespace, container_level, container_name, container_type, host, region, zone
runqlat_latency	Count of process scheduling latencies per zone: zone0, 0-10 ms; zone1, 10-20 ms; zone2, 20-50 ms; zone3, 50+ ms	count	Host	eBPF	host, region, zone
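
The zone label is a histogram bucket index. The following is a minimal sketch of how a latency maps to a zone and how the four counts can be condensed into a single share; it is illustrative only. The function names are hypothetical, and the bucket boundaries are assumed to be half-open (a latency of exactly 10 ms falls into zone1), which the table does not specify.

```python
from typing import Sequence

# Upper bounds in milliseconds for zone0, zone1 and zone2; zone3 is open-ended.
RUNQLAT_UPPER_BOUNDS_MS = (10, 20, 50)


def runqlat_zone(latency_ms: float) -> int:
    """Return the zone index (0-3) for one scheduling latency sample."""
    if latency_ms < 0:
        raise ValueError("latency must be non-negative")
    for zone, upper in enumerate(RUNQLAT_UPPER_BOUNDS_MS):
        if latency_ms < upper:
            return zone
    return len(RUNQLAT_UPPER_BOUNDS_MS)


def share_at_or_above(zone_counts: Sequence[float], min_zone: int) -> float:
    """Fraction of samples that landed in min_zone or a slower zone."""
    total = sum(zone_counts)
    if total == 0:
        return 0.0
    return sum(zone_counts[min_zone:]) / total
```

With the coredns sample above (226, 0, 0, 0), `share_at_or_above(counts, 1)` is 0.0, meaning every observed wakeup was scheduled within the first bucket.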

Interrupt latency

Response latency of softirqs on each CPU (currently only NET_RX and NET_TX are collected).

# HELP huatuo_bamai_softirq_latency softirq latency
# TYPE huatuo_bamai_softirq_latency gauge
huatuo_bamai_softirq_latency{cpuid="0",host="hostname",region="dev",type="NET_RX",zone="0"} 125
huatuo_bamai_softirq_latency{cpuid="0",host="hostname",region="dev",type="NET_RX",zone="1"} 2
huatuo_bamai_softirq_latency{cpuid="0",host="hostname",region="dev",type="NET_RX",zone="2"} 0
huatuo_bamai_softirq_latency{cpuid="0",host="hostname",region="dev",type="NET_RX",zone="3"} 0
huatuo_bamai_softirq_latency{cpuid="0",host="hostname",region="dev",type="NET_TX",zone="0"} 0
huatuo_bamai_softirq_latency{cpuid="0",host="hostname",region="dev",type="NET_TX",zone="1"} 0
huatuo_bamai_softirq_latency{cpuid="0",host="hostname",region="dev",type="NET_TX",zone="2"} 0
huatuo_bamai_softirq_latency{cpuid="0",host="hostname",region="dev",type="NET_TX",zone="3"} 0
huatuo_bamai_softirq_latency{cpuid="1",host="hostname",region="dev",type="NET_RX",zone="0"} 110
huatuo_bamai_softirq_latency{cpuid="1",host="hostname",region="dev",type="NET_RX",zone="1"} 0
huatuo_bamai_softirq_latency{cpuid="1",host="hostname",region="dev",type="NET_RX",zone="2"} 1
huatuo_bamai_softirq_latency{cpuid="1",host="hostname",region="dev",type="NET_RX",zone="3"} 0
huatuo_bamai_softirq_latency{cpuid="1",host="hostname",region="dev",type="NET_TX",zone="0"} 0
huatuo_bamai_softirq_latency{cpuid="1",host="hostname",region="dev",type="NET_TX",zone="1"} 0
huatuo_bamai_softirq_latency{cpuid="1",host="hostname",region="dev",type="NET_TX",zone="2"} 0

Metric	Description	Unit	Scope	Source	Labels
softirq_latency	Count of softirq response latencies per zone: zone0, 0-10 us; zone1, 10-100 us; zone2, 100-1000 us; zone3, 1+ ms	count	Host	eBPF	cpuid, host, region, type, zone
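
Because the metric is emitted per CPU, a host-level view needs the counts summed over cpuid. The following is a minimal sketch that parses lines in the exposition format shown above and aggregates them by softirq type and zone. The parser is deliberately simplified (it ignores escaped quotes in label values and optional timestamps) and the function names are hypothetical; a real consumer would more likely query Prometheus or use an existing parser library.

```python
import re
from collections import defaultdict
from typing import Dict, Iterator, Tuple

SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>.*)\}\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text: str) -> Iterator[Tuple[str, Dict[str, str], float]]:
    """Yield (metric name, labels, value) for each labelled sample line."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        m = SAMPLE_RE.match(line)
        if m is None:
            continue  # skip lines this simplified parser does not understand
        yield m["name"], dict(LABEL_RE.findall(m["labels"])), float(m["value"])


def softirq_by_type_and_zone(text: str) -> Dict[Tuple[str, str], float]:
    """Sum huatuo_bamai_softirq_latency over all CPUs per (type, zone)."""
    totals: Dict[Tuple[str, str], float] = defaultdict(float)
    for name, labels, value in parse_samples(text):
        if name == "huatuo_bamai_softirq_latency":
            totals[(labels["type"], labels["zone"])] += value
    return dict(totals)
```

Applied to the sample output above, the NET_RX zone0 total is 125 + 110 = 235.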

Utilization

The following metrics show CPU usage of the host and of containers, in Prometheus format:

# HELP huatuo_bamai_cpu_util_sys cpu sys for the host
# TYPE huatuo_bamai_cpu_util_sys gauge
huatuo_bamai_cpu_util_sys{host="hostname",region="dev"} 6.268857848549965e-06
# HELP huatuo_bamai_cpu_util_total cpu total for the host
# TYPE huatuo_bamai_cpu_util_total gauge
huatuo_bamai_cpu_util_total{host="hostname",region="dev"} 1.7736934944144352e-05
# HELP huatuo_bamai_cpu_util_usr cpu usr for the host
# TYPE huatuo_bamai_cpu_util_usr gauge
huatuo_bamai_cpu_util_usr{host="hostname",region="dev"} 1.1468077095594387e-05

# HELP huatuo_bamai_cpu_util_container_sys cpu sys for the containers
# TYPE huatuo_bamai_cpu_util_container_sys gauge
huatuo_bamai_cpu_util_container_sys{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 1.6708593420881415e-07
# HELP huatuo_bamai_cpu_util_container_total cpu total for the containers
# TYPE huatuo_bamai_cpu_util_container_total gauge
huatuo_bamai_cpu_util_container_total{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 3.379584661890774e-07
# HELP huatuo_bamai_cpu_util_container_usr cpu usr for the containers
# TYPE huatuo_bamai_cpu_util_container_usr gauge
huatuo_bamai_cpu_util_container_usr{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 1.7087253017325962e-07

Metric	Description	Unit	Scope	Labels
cpu_util_sys	CPU utilization in kernel mode	%	Host	host, region
cpu_util_usr	CPU utilization in user mode	%	Host	host, region
cpu_util_total	Total CPU utilization	%	Host	host, region
cpu_util_container_sys	CPU utilization in kernel mode	%	Container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
cpu_util_container_usr	CPU utilization in user mode	%	Container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
cpu_util_container_total	Total CPU utilization	%	Container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region

Resource configuration

The following metric shows the CPU resources configured for a container, in Prometheus format:

# HELP huatuo_bamai_cpu_util_container_cores cpu core number for the containers
# TYPE huatuo_bamai_cpu_util_container_cores gauge
huatuo_bamai_cpu_util_container_cores{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="Burstable",container_name="coredns",container_type="Normal",host="hostname",region="dev"} 6

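Metric	Description	Unit	Scope	Labels
cpu_util_container_cores	Number of CPU cores	cores	Container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region

Contention

These metrics show CPU contention and throttling for containers, in Prometheus format: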

# HELP huatuo_bamai_cpu_stat_container_nr_throttled throttle nr for the containers
# TYPE huatuo_bamai_cpu_stat_container_nr_throttled gauge
huatuo_bamai_cpu_stat_container_nr_throttled{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_cpu_stat_container_throttled_time throttle time for the containers
# TYPE huatuo_bamai_cpu_stat_container_throttled_time gauge
huatuo_bamai_cpu_stat_container_throttled_time{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0

Metric	Description	Unit	Scope	Labels
cpu_stat_container_nr_throttled	Number of times the cgroup has been throttled	count	Container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
cpu_stat_container_throttled_time	Total time the cgroup has been throttled	nanoseconds	Container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
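
Both values are most useful as differences between two scrapes. The following is a minimal sketch, assuming the metrics mirror the cumulative counters in the cgroup cpu.stat file (so they only grow while the cgroup exists); the function name and the dictionary keys are hypothetical.

```python
from typing import Dict


def throttle_delta(prev: Dict[str, float], curr: Dict[str, float]) -> Dict[str, float]:
    """Compare two scrapes of nr_throttled and throttled_time (nanoseconds)."""
    d_nr = curr["nr_throttled"] - prev["nr_throttled"]
    d_ns = curr["throttled_time"] - prev["throttled_time"]
    if d_nr < 0 or d_ns < 0:
        # Counters went backwards: assume the cgroup was recreated and
        # treat the current values as the whole delta.
        d_nr, d_ns = curr["nr_throttled"], curr["throttled_time"]
    avg_ms = d_ns / d_nr / 1e6 if d_nr else 0.0
    return {
        "throttled_periods": d_nr,
        "throttled_ms": d_ns / 1e6,
        "avg_ms_per_throttle": avg_ms,
    }
```

A container that gains many throttled periods between scrapes while its CPU utilization looks moderate is typically hitting its CPU quota in short bursts.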

In addition, the DiDi kernel supports the following contention metrics, which will be opened up in the future:

# HELP huatuo_bamai_cpu_stat_container_wait_rate wait rate for the containers
# TYPE huatuo_bamai_cpu_stat_container_wait_rate gauge
huatuo_bamai_cpu_stat_container_wait_rate{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_cpu_stat_container_throttle_wait_rate throttle wait rate for the containers
# TYPE huatuo_bamai_cpu_stat_container_throttle_wait_rate gauge
huatuo_bamai_cpu_stat_container_throttle_wait_rate{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_cpu_stat_container_inner_wait_rate inner wait rate for the containers
# TYPE huatuo_bamai_cpu_stat_container_inner_wait_rate gauge
huatuo_bamai_cpu_stat_container_inner_wait_rate{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_cpu_stat_container_exter_wait_rate exter wait rate for the containers
# TYPE huatuo_bamai_cpu_stat_container_exter_wait_rate gauge
huatuo_bamai_cpu_stat_container_exter_wait_rate{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0

Burst

The following metrics show CPU burst usage by containers:

# HELP huatuo_bamai_cpu_stat_container_nr_bursts burst nr for the containers
# TYPE huatuo_bamai_cpu_stat_container_nr_bursts gauge
huatuo_bamai_cpu_stat_container_nr_bursts{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_cpu_stat_container_burst_time burst time for the containers
# TYPE huatuo_bamai_cpu_stat_container_burst_time gauge
huatuo_bamai_cpu_stat_container_burst_time{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0

Metric	Description	Unit	Scope	Labels
cpu_stat_container_burst_time	Cumulative wall-clock time used above quota across all periods	nanoseconds	Container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
cpu_stat_container_nr_bursts	Number of periods in which usage exceeded the quota	count	Container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region

Load

These metrics show the load of the host and of containers.

# HELP huatuo_bamai_loadavg_load1 system load average, 1 minute
# TYPE huatuo_bamai_loadavg_load1 gauge
huatuo_bamai_loadavg_load1{host="hostname",region="dev"} 0.3
# HELP huatuo_bamai_loadavg_load15 system load average, 15 minutes
# TYPE huatuo_bamai_loadavg_load15 gauge
huatuo_bamai_loadavg_load15{host="hostname",region="dev"} 0.22
# HELP huatuo_bamai_loadavg_load5 system load average, 5 minutes
# TYPE huatuo_bamai_loadavg_load5 gauge
huatuo_bamai_loadavg_load5{host="hostname",region="dev"} 0.2
# HELP huatuo_bamai_loadavg_container_nr_running nr_running of container
# TYPE huatuo_bamai_loadavg_container_nr_running gauge
huatuo_bamai_loadavg_container_nr_running{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 1
# HELP huatuo_bamai_loadavg_container_nr_uninterruptible nr_uninterruptible of container
# TYPE huatuo_bamai_loadavg_container_nr_uninterruptible gauge
huatuo_bamai_loadavg_container_nr_uninterruptible{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0

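Metric	Description	Unit	Scope	Labels	Notes
loadavg_load1	System load average over the past 1 minute	count	Host	host, region
loadavg_load5	System load average over the past 5 minutes	count	Host	host, region
loadavg_load15	System load average over the past 15 minutes	count	Host	host, region
loadavg_container_nr_running	Number of running tasks in the container	count	Container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region	cgroup v1 only
loadavg_container_nr_uninterruptible	Number of uninterruptible tasks in the container	count	Container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region	cgroup v1 only

Memory subsystem

Reclaim

Memory reclaim can block processes. These metrics show the system's memory reclaim state.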

# HELP huatuo_bamai_memory_free_allocpages_stall time stalled in alloc pages
# TYPE huatuo_bamai_memory_free_allocpages_stall gauge
huatuo_bamai_memory_free_allocpages_stall{host="hostname",region="dev"} 0
# HELP huatuo_bamai_memory_free_compaction_stall time stalled in memory compaction
# TYPE huatuo_bamai_memory_free_compaction_stall gauge
huatuo_bamai_memory_free_compaction_stall{host="hostname",region="dev"} 0
# HELP huatuo_bamai_memory_reclaim_container_directstall counter of cgroup reclaim when try_charge
# TYPE huatuo_bamai_memory_reclaim_container_directstall gauge
huatuo_bamai_memory_reclaim_container_directstall{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0

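Metric	Description	Unit	Scope	Source	Labels
memory_free_allocpages_stall	Time the system spent stalled while allocating memory pages	nanoseconds	Host	eBPF	host, region
memory_free_compaction_stall	Time the system spent stalled while compacting memory pages	nanoseconds	Host	eBPF	host, region
memory_reclaim_container_directstall	Number of direct memory reclaim events in the container	count	Container	eBPF	container_host, container_hostnamespace, container_level, container_name, container_type, host, region

Note: the metrics memory_others_container_directstall_time, memory_others_container_asyncreclaim_time and memory_others_container_local_direct_reclaim_time read memory cgroup extension interfaces provided by the customized DiDi Cloud kernel (memory.directstall_stat, memory.asynreclaim_stat, memory.local_direct_reclaim_time). Mainline kernels and common distribution kernels do not provide these interfaces, so these metrics are not emitted there. This is expected behavior and no extra kernel module needs to be loaded. To observe container direct reclaim on a standard kernel, use the eBPF-based memory_reclaim_container_directstall from the table above.

Memory state

The following metrics show the memory state of the whole system and of containers.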

# HELP huatuo_bamai_memory_vmstat_container_active_anon cgroup memory.stat active_anon
# TYPE huatuo_bamai_memory_vmstat_container_active_anon gauge
huatuo_bamai_memory_vmstat_container_active_anon{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 1.47456e+07
# HELP huatuo_bamai_memory_vmstat_container_active_file cgroup memory.stat active_file
# TYPE huatuo_bamai_memory_vmstat_container_active_file gauge
huatuo_bamai_memory_vmstat_container_active_file{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 2.3617536e+07
# HELP huatuo_bamai_memory_vmstat_container_file_dirty cgroup memory.stat file_dirty
# TYPE huatuo_bamai_memory_vmstat_container_file_dirty gauge
huatuo_bamai_memory_vmstat_container_file_dirty{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_memory_vmstat_container_file_writeback cgroup memory.stat file_writeback
# TYPE huatuo_bamai_memory_vmstat_container_file_writeback gauge
huatuo_bamai_memory_vmstat_container_file_writeback{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_memory_vmstat_container_inactive_anon cgroup memory.stat inactive_anon
# TYPE huatuo_bamai_memory_vmstat_container_inactive_anon gauge
huatuo_bamai_memory_vmstat_container_inactive_anon{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_memory_vmstat_container_inactive_file cgroup memory.stat inactive_file
# TYPE huatuo_bamai_memory_vmstat_container_inactive_file gauge
huatuo_bamai_memory_vmstat_container_inactive_file{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 65536
# HELP huatuo_bamai_memory_vmstat_container_pgdeactivate cgroup memory.stat pgdeactivate
# TYPE huatuo_bamai_memory_vmstat_container_pgdeactivate gauge
huatuo_bamai_memory_vmstat_container_pgdeactivate{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_memory_vmstat_container_pgrefill cgroup memory.stat pgrefill
# TYPE huatuo_bamai_memory_vmstat_container_pgrefill gauge
huatuo_bamai_memory_vmstat_container_pgrefill{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_memory_vmstat_container_pgscan_direct cgroup memory.stat pgscan_direct
# TYPE huatuo_bamai_memory_vmstat_container_pgscan_direct gauge
huatuo_bamai_memory_vmstat_container_pgscan_direct{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_memory_vmstat_container_pgscan_kswapd cgroup memory.stat pgscan_kswapd
# TYPE huatuo_bamai_memory_vmstat_container_pgscan_kswapd gauge
huatuo_bamai_memory_vmstat_container_pgscan_kswapd{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_memory_vmstat_container_pgsteal_direct cgroup memory.stat pgsteal_direct
# TYPE huatuo_bamai_memory_vmstat_container_pgsteal_direct gauge
huatuo_bamai_memory_vmstat_container_pgsteal_direct{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_memory_vmstat_container_pgsteal_kswapd cgroup memory.stat pgsteal_kswapd
# TYPE huatuo_bamai_memory_vmstat_container_pgsteal_kswapd gauge
huatuo_bamai_memory_vmstat_container_pgsteal_kswapd{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_memory_vmstat_container_shmem cgroup memory.stat shmem
# TYPE huatuo_bamai_memory_vmstat_container_shmem gauge
huatuo_bamai_memory_vmstat_container_shmem{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_memory_vmstat_container_shmem_thp cgroup memory.stat shmem_thp
# TYPE huatuo_bamai_memory_vmstat_container_shmem_thp gauge
huatuo_bamai_memory_vmstat_container_shmem_thp{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_memory_vmstat_container_unevictable cgroup memory.stat unevictable
# TYPE huatuo_bamai_memory_vmstat_container_unevictable gauge
huatuo_bamai_memory_vmstat_container_unevictable{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0

Metric	Description	Unit	Scope	Labels
memory_vmstat_container_active_file	Active file-backed memory	bytes	Container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
memory_vmstat_container_active_anon	Active anonymous memory	bytes	Container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
memory_vmstat_container_inactive_file	Inactive file-backed memory	bytes	Container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
memory_vmstat_container_inactive_anon	Inactive anonymous memory	bytes	Container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
memory_vmstat_container_file_dirty	File-backed memory that has been modified but not yet written to disk	bytes	Container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
memory_vmstat_container_file_writeback	File-backed memory that has been modified and is waiting to be written to disk	bytes	Container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
memory_vmstat_container_dirty	Memory that has been modified but not yet written to disk	bytes	Container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
memory_vmstat_container_writeback	File-backed and anonymous memory that has been modified and is waiting to be written to disk	bytes	Container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
memory_vmstat_container_pgdeactivate	Number of pages moved from the active LRU to the inactive LRU	pages	Container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
memory_vmstat_container_pgrefill	Total pages scanned on the active LRU list	pages	Container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
memory_vmstat_container_pgscan_direct	Total pages scanned on the inactive LRU during direct reclaim	pages	Container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
memory_vmstat_container_pgscan_kswapd	Total pages scanned on the inactive LRU by kswapd	pages	Container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
memory_vmstat_container_pgsteal_direct	Total pages successfully reclaimed from the inactive LRU during direct reclaim	pages	Container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
memory_vmstat_container_pgsteal_kswapd	Total pages successfully reclaimed from the inactive LRU by kswapd	pages	Container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
memory_vmstat_container_unevictable	Unevictable memory	bytes	Container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region

Host memory metrics:

# HELP huatuo_bamai_memory_vmstat_allocstall_device /proc/vmstat allocstall_device
# TYPE huatuo_bamai_memory_vmstat_allocstall_device gauge
huatuo_bamai_memory_vmstat_allocstall_device{host="hostname",region="dev"} 0
# HELP huatuo_bamai_memory_vmstat_allocstall_dma /proc/vmstat allocstall_dma
# TYPE huatuo_bamai_memory_vmstat_allocstall_dma gauge
huatuo_bamai_memory_vmstat_allocstall_dma{host="hostname",region="dev"} 0
# HELP huatuo_bamai_memory_vmstat_allocstall_dma32 /proc/vmstat allocstall_dma32
# TYPE huatuo_bamai_memory_vmstat_allocstall_dma32 gauge
huatuo_bamai_memory_vmstat_allocstall_dma32{host="hostname",region="dev"} 0
# HELP huatuo_bamai_memory_vmstat_allocstall_movable /proc/vmstat allocstall_movable
# TYPE huatuo_bamai_memory_vmstat_allocstall_movable gauge
huatuo_bamai_memory_vmstat_allocstall_movable{host="hostname",region="dev"} 0
# HELP huatuo_bamai_memory_vmstat_allocstall_normal /proc/vmstat allocstall_normal
# TYPE huatuo_bamai_memory_vmstat_allocstall_normal gauge
huatuo_bamai_memory_vmstat_allocstall_normal{host="hostname",region="dev"} 0
# HELP huatuo_bamai_memory_vmstat_nr_active_anon /proc/vmstat nr_active_anon
# TYPE huatuo_bamai_memory_vmstat_nr_active_anon gauge
huatuo_bamai_memory_vmstat_nr_active_anon{host="hostname",region="dev"} 155449
# HELP huatuo_bamai_memory_vmstat_nr_active_file /proc/vmstat nr_active_file
# TYPE huatuo_bamai_memory_vmstat_nr_active_file gauge
huatuo_bamai_memory_vmstat_nr_active_file{host="hostname",region="dev"} 212425
# HELP huatuo_bamai_memory_vmstat_nr_dirty /proc/vmstat nr_dirty
# TYPE huatuo_bamai_memory_vmstat_nr_dirty gauge
huatuo_bamai_memory_vmstat_nr_dirty{host="hostname",region="dev"} 19047
# HELP huatuo_bamai_memory_vmstat_nr_dirty_background_threshold /proc/vmstat nr_dirty_background_threshold
# TYPE huatuo_bamai_memory_vmstat_nr_dirty_background_threshold gauge
huatuo_bamai_memory_vmstat_nr_dirty_background_threshold{host="hostname",region="dev"} 379858
# HELP huatuo_bamai_memory_vmstat_nr_dirty_threshold /proc/vmstat nr_dirty_threshold
# TYPE huatuo_bamai_memory_vmstat_nr_dirty_threshold gauge
huatuo_bamai_memory_vmstat_nr_dirty_threshold{host="hostname",region="dev"} 760646
# HELP huatuo_bamai_memory_vmstat_nr_free_pages /proc/vmstat nr_free_pages
# TYPE huatuo_bamai_memory_vmstat_nr_free_pages gauge
huatuo_bamai_memory_vmstat_nr_free_pages{host="hostname",region="dev"} 3.20535e+06
# HELP huatuo_bamai_memory_vmstat_nr_inactive_anon /proc/vmstat nr_inactive_anon
# TYPE huatuo_bamai_memory_vmstat_nr_inactive_anon gauge
huatuo_bamai_memory_vmstat_nr_inactive_anon{host="hostname",region="dev"} 0
# HELP huatuo_bamai_memory_vmstat_nr_inactive_file /proc/vmstat nr_inactive_file
# TYPE huatuo_bamai_memory_vmstat_nr_inactive_file gauge
huatuo_bamai_memory_vmstat_nr_inactive_file{host="hostname",region="dev"} 428518
# HELP huatuo_bamai_memory_vmstat_nr_mlock /proc/vmstat nr_mlock
# TYPE huatuo_bamai_memory_vmstat_nr_mlock gauge
huatuo_bamai_memory_vmstat_nr_mlock{host="hostname",region="dev"} 6821
# HELP huatuo_bamai_memory_vmstat_nr_shmem /proc/vmstat nr_shmem
# TYPE huatuo_bamai_memory_vmstat_nr_shmem gauge
huatuo_bamai_memory_vmstat_nr_shmem{host="hostname",region="dev"} 541
# HELP huatuo_bamai_memory_vmstat_nr_shmem_hugepages /proc/vmstat nr_shmem_hugepages
# TYPE huatuo_bamai_memory_vmstat_nr_shmem_hugepages gauge
huatuo_bamai_memory_vmstat_nr_shmem_hugepages{host="hostname",region="dev"} 0
# HELP huatuo_bamai_memory_vmstat_nr_shmem_pmdmapped /proc/vmstat nr_shmem_pmdmapped
# TYPE huatuo_bamai_memory_vmstat_nr_shmem_pmdmapped gauge
huatuo_bamai_memory_vmstat_nr_shmem_pmdmapped{host="hostname",region="dev"} 0
# HELP huatuo_bamai_memory_vmstat_nr_slab_reclaimable /proc/vmstat nr_slab_reclaimable
# TYPE huatuo_bamai_memory_vmstat_nr_slab_reclaimable gauge
huatuo_bamai_memory_vmstat_nr_slab_reclaimable{host="hostname",region="dev"} 22322
# HELP huatuo_bamai_memory_vmstat_nr_slab_unreclaimable /proc/vmstat nr_slab_unreclaimable
# TYPE huatuo_bamai_memory_vmstat_nr_slab_unreclaimable gauge
huatuo_bamai_memory_vmstat_nr_slab_unreclaimable{host="hostname",region="dev"} 24168
# HELP huatuo_bamai_memory_vmstat_nr_unevictable /proc/vmstat nr_unevictable
# TYPE huatuo_bamai_memory_vmstat_nr_unevictable gauge
huatuo_bamai_memory_vmstat_nr_unevictable{host="hostname",region="dev"} 6839
# HELP huatuo_bamai_memory_vmstat_nr_writeback /proc/vmstat nr_writeback
# TYPE huatuo_bamai_memory_vmstat_nr_writeback gauge
huatuo_bamai_memory_vmstat_nr_writeback{host="hostname",region="dev"} 0
# HELP huatuo_bamai_memory_vmstat_nr_writeback_temp /proc/vmstat nr_writeback_temp
# TYPE huatuo_bamai_memory_vmstat_nr_writeback_temp gauge
huatuo_bamai_memory_vmstat_nr_writeback_temp{host="hostname",region="dev"} 0
# HELP huatuo_bamai_memory_vmstat_numa_pages_migrated /proc/vmstat numa_pages_migrated
# TYPE huatuo_bamai_memory_vmstat_numa_pages_migrated gauge
huatuo_bamai_memory_vmstat_numa_pages_migrated{host="hostname",region="dev"} 0
# HELP huatuo_bamai_memory_vmstat_pgdeactivate /proc/vmstat pgdeactivate
# TYPE huatuo_bamai_memory_vmstat_pgdeactivate gauge
huatuo_bamai_memory_vmstat_pgdeactivate{host="hostname",region="dev"} 0
# HELP huatuo_bamai_memory_vmstat_pgrefill /proc/vmstat pgrefill
# TYPE huatuo_bamai_memory_vmstat_pgrefill gauge
huatuo_bamai_memory_vmstat_pgrefill{host="hostname",region="dev"} 0
# HELP huatuo_bamai_memory_vmstat_pgscan_direct /proc/vmstat pgscan_direct
# TYPE huatuo_bamai_memory_vmstat_pgscan_direct gauge
huatuo_bamai_memory_vmstat_pgscan_direct{host="hostname",region="dev"} 0
# HELP huatuo_bamai_memory_vmstat_pgscan_direct_throttle /proc/vmstat pgscan_direct_throttle
# TYPE huatuo_bamai_memory_vmstat_pgscan_direct_throttle gauge
huatuo_bamai_memory_vmstat_pgscan_direct_throttle{host="hostname",region="dev"} 0
# HELP huatuo_bamai_memory_vmstat_pgscan_kswapd /proc/vmstat pgscan_kswapd
# TYPE huatuo_bamai_memory_vmstat_pgscan_kswapd gauge
huatuo_bamai_memory_vmstat_pgscan_kswapd{host="hostname",region="dev"} 0
# HELP huatuo_bamai_memory_vmstat_pgsteal_direct /proc/vmstat pgsteal_direct
# TYPE huatuo_bamai_memory_vmstat_pgsteal_direct gauge
huatuo_bamai_memory_vmstat_pgsteal_direct{host="hostname",region="dev"} 0
# HELP huatuo_bamai_memory_vmstat_pgsteal_kswapd /proc/vmstat pgsteal_kswapd
# TYPE huatuo_bamai_memory_vmstat_pgsteal_kswapd gauge
huatuo_bamai_memory_vmstat_pgsteal_kswapd{host="hostname",region="dev"} 0

Page state & LRU distribution

Metric	Description	Unit	Scope	Labels
nr_free_pages	Total free pages (directly allocatable by the buddy allocator)	pages	Host	host, region
nr_inactive_anon	Inactive anonymous pages	pages	Host	host, region
nr_inactive_file	Inactive file pages	pages	Host	host, region
nr_active_anon	Active anonymous pages	pages	Host	host, region
nr_active_file	Active file pages	pages	Host	host, region
nr_unevictable	Unevictable pages (mlocked, hugetlbfs, and so on)	pages	Host	host, region
nr_mlock	Pages locked by mlock()	pages	Host	host, region
nr_shmem	Pages used by tmpfs / shmem	pages	Host	host, region
nr_slab_reclaimable	Pages used by reclaimable slab caches	pages	Host	host, region
nr_slab_unreclaimable	Pages used by unreclaimable slab caches	pages	Host	host, region
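
These counters are in pages, so comparing them with byte-based container metrics requires a conversion. The following is a minimal sketch; the helper names are hypothetical, and the page size is read from the local system unless passed explicitly (4096 bytes is common on x86_64, but not universal).

```python
import os
from typing import Optional


def pages_to_bytes(pages: float, page_size: Optional[int] = None) -> int:
    """Convert a /proc/vmstat page count to bytes."""
    if page_size is None:
        # Page size of the machine running this code; pass the value
        # explicitly when analysing metrics scraped from another host.
        page_size = os.sysconf("SC_PAGE_SIZE")
    return int(pages) * page_size


def format_gib(num_bytes: int) -> str:
    """Render a byte count as GiB with two decimals."""
    return f"{num_bytes / 2**30:.2f} GiB"
```

For the sample value nr_free_pages = 3.20535e+06 and a 4096-byte page, this gives 13129113600 bytes, or about 12.23 GiB of free memory.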

Dirty & writeback thresholds

Metric	Description	Unit	Scope	Labels
nr_dirty	Current number of dirty pages	pages	Host	host, region
nr_writeback	Pages currently being written back	pages	Host	host, region
nr_dirty_threshold	Dirty-page threshold at which forced writeback starts (derived from dirty_ratio)	pages	Host	host, region
nr_dirty_background_threshold	Dirty-page threshold at which background writeback starts (derived from dirty_background_ratio)	pages	Host	host, region
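
Reading nr_dirty against the two thresholds gives a quick picture of writeback pressure. The following is a simplified sketch: the state names are hypothetical, and the kernel's real throttling logic is more gradual than a pair of hard cut-offs.

```python
from typing import Dict, Union


def dirty_pressure(
    nr_dirty: float, background_threshold: float, dirty_threshold: float
) -> Dict[str, Union[str, float]]:
    """Classify dirty-page pressure from three /proc/vmstat values (in pages)."""
    if nr_dirty >= dirty_threshold:
        state = "forced-writeback"  # writers are expected to be throttled
    elif nr_dirty >= background_threshold:
        state = "background-writeback"  # flusher threads are expected to run
    else:
        state = "below-thresholds"
    ratio = nr_dirty / dirty_threshold if dirty_threshold else 0.0
    return {"state": state, "ratio_of_dirty_threshold": ratio}
```

With the sample values above (nr_dirty 19047, background threshold 379858, dirty threshold 760646) the host is below both thresholds, at roughly 2.5% of the dirty threshold.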

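Page fault & swapping

Metric	Description	Unit	Scope	Labels
pgfault	Total number of page faults	count	Host	host, region
pgmajfault	Number of major page faults	count	Host	host, region
pgpgin	Pages read in from block devices	pages	Host	host, region
pgpgout	Pages written out to block devices	pages	Host	host, region
pswpin/pswpout	Pages swapped in / swapped out	pages	Host	host, region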

Reclaim & scanning

Metric	Description	Unit	Scope	Labels
pgscan_kswapd/direct/khugepaged	Pages scanned by kswapd / direct reclaim / khugepaged	pages	Host	host, region
pgsteal_kswapd/direct/khugepaged	Pages successfully reclaimed	pages	Host	host, region
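
The ratio of reclaimed to scanned pages is a common way to judge how hard reclaim has to work. The following is a minimal sketch; the function name is hypothetical and the inputs are assumed to be deltas of the pgsteal_* and pgscan_* counters over the same interval.

```python
from typing import Optional


def reclaim_efficiency(pgsteal_delta: float, pgscan_delta: float) -> Optional[float]:
    """Fraction of scanned pages that were reclaimed, or None if nothing was scanned."""
    if pgscan_delta <= 0:
        return None
    return pgsteal_delta / pgscan_delta
```

A value close to 1.0 means most scanned pages could be freed; a low value suggests the kernel is scanning many pages it cannot reclaim, which usually comes with higher reclaim latency.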

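Transparent huge pages (THP)

Metric	Description	Unit	Scope	Labels
thp_fault_alloc	Number of times a THP was successfully allocated on a page fault	count	Host	host, region
thp_fault_fallback	Number of times THP allocation failed on a page fault and fell back to regular pages	count	Host	host, region
thp_collapse_alloc	Number of successful khugepaged collapses into a THP	count	Host	host, region
thp_collapse_alloc_failed	Number of failed khugepaged THP collapses	count	Host	host, region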

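NUMA balancing & allocation

Metric	Description	Unit	Scope	Labels
numa_hit	Pages successfully allocated on the node the process intended to allocate from	count	Host	host, region
numa_miss	Pages allocated on this node although the process intended to allocate from another node (for example because the target node was short of memory)	count	Host	host, region
numa_foreign	Pages the process intended to allocate from this node but that were allocated on another node	count	Host	host, region
numa_local	Pages a process successfully allocated on its local node	count	Host	host, region
numa_other	Pages a process allocated on a remote node	count	Host	host, region
numa_pages_migrated	Pages successfully migrated by automatic NUMA balancing	count	Host	host, region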
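
A single hit ratio is often easier to alert on than the raw counters. The following is a minimal sketch; the function name is hypothetical, and the inputs are assumed to be deltas of numa_hit and numa_miss over the same interval.

```python
from typing import Optional


def numa_hit_ratio(numa_hit: float, numa_miss: float) -> Optional[float]:
    """Share of allocations that landed on the intended node, or None if idle."""
    total = numa_hit + numa_miss
    if total <= 0:
        return None
    return numa_hit / total
```

A falling ratio indicates that more allocations are spilling to nodes other than the intended one, which tends to raise memory access latency.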

Memory events

Container-level memory event metrics.

# HELP huatuo_bamai_memory_events_container_high memory events high
# TYPE huatuo_bamai_memory_events_container_high gauge
huatuo_bamai_memory_events_container_high{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_memory_events_container_low memory events low
# TYPE huatuo_bamai_memory_events_container_low gauge
huatuo_bamai_memory_events_container_low{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_memory_events_container_max memory events max
# TYPE huatuo_bamai_memory_events_container_max gauge
huatuo_bamai_memory_events_container_max{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_memory_events_container_oom memory events oom
# TYPE huatuo_bamai_memory_events_container_oom gauge
huatuo_bamai_memory_events_container_oom{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_memory_events_container_oom_group_kill memory events oom_group_kill
# TYPE huatuo_bamai_memory_events_container_oom_group_kill gauge
huatuo_bamai_memory_events_container_oom_group_kill{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_memory_events_container_oom_kill memory events oom_kill
# TYPE huatuo_bamai_memory_events_container_oom_kill gauge
huatuo_bamai_memory_events_container_oom_kill{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0

Metric	Description	Unit	Object	Labels
memory_events_container_low	Number of times the cgroup was reclaimed under high system memory pressure even though its usage was below memory.low. This indicates that memory.low is over-committed.	count	container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
memory_events_container_high	Number of times memory usage exceeded memory.high (the soft limit), causing processes to be throttled and forced into direct reclaim.	count	container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
memory_events_container_max	Number of times memory usage reached or was about to exceed memory.max (the hard limit), triggering the allocation-failure check.	count	container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
memory_events_container_oom	Number of times memory usage reached the memory.max limit, the allocation failed, and the OOM path was entered.	count	container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
memory_events_container_oom_kill	Number of processes in the cgroup killed by the OOM killer because the memory limit was reached.	count	container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
memory_events_container_oom_group_kill	Number of times the entire cgroup was killed by the OOM killer.	count	container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
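
Since each event value only increases over the lifetime of a cgroup, the useful signal is which events grew between two scrapes. The following sketch shows one way to compute that, assuming the per-container values have already been collected into dictionaries keyed by the event suffix; the helper name and the data layout are illustrative assumptions, not a HUATUO API.

```python
from typing import Dict, Mapping

# Event suffixes from the table above (memory_events_container_<suffix>).
EVENTS = ("low", "high", "max", "oom", "oom_kill", "oom_group_kill")


def event_deltas(prev: Mapping[str, int], curr: Mapping[str, int]) -> Dict[str, int]:
    """Return only the events whose count increased between two scrapes."""
    deltas = {}
    for name in EVENTS:
        delta = curr.get(name, 0) - prev.get(name, 0)
        if delta > 0:
            deltas[name] = delta
    return deltas
```

For example, a result that contains "high" but not "oom" would mean the container was throttled at memory.high during the interval without entering the OOM path.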

Buddyinfo

Shows the distribution of free memory blocks held by the buddy allocator (the core algorithm of the kernel page allocator) for each NUMA node and each memory zone.

# HELP huatuo_bamai_memory_buddyinfo_blocks buddy info
# TYPE huatuo_bamai_memory_buddyinfo_blocks gauge
huatuo_bamai_memory_buddyinfo_blocks{host="hostname",node="0",order="0",region="dev",zone="DMA"} 0
huatuo_bamai_memory_buddyinfo_blocks{host="hostname",node="0",order="0",region="dev",zone="DMA32"} 3
huatuo_bamai_memory_buddyinfo_blocks{host="hostname",node="0",order="0",region="dev",zone="Normal"} 7
huatuo_bamai_memory_buddyinfo_blocks{host="hostname",node="0",order="1",region="dev",zone="DMA"} 0
huatuo_bamai_memory_buddyinfo_blocks{host="hostname",node="0",order="1",region="dev",zone="DMA32"} 1
huatuo_bamai_memory_buddyinfo_blocks{host="hostname",node="0",order="1",region="dev",zone="Normal"} 36
huatuo_bamai_memory_buddyinfo_blocks{host="hostname",node="0",order="10",region="dev",zone="DMA"} 2
huatuo_bamai_memory_buddyinfo_blocks{host="hostname",node="0",order="10",region="dev",zone="DMA32"} 743
huatuo_bamai_memory_buddyinfo_blocks{host="hostname",node="0",order="10",region="dev",zone="Normal"} 2265
huatuo_bamai_memory_buddyinfo_blocks{host="hostname",node="0",order="2",region="dev",zone="DMA"} 0
huatuo_bamai_memory_buddyinfo_blocks{host="hostname",node="0",order="2",region="dev",zone="DMA32"} 3
huatuo_bamai_memory_buddyinfo_blocks{host="hostname",node="0",order="2",region="dev",zone="Normal"} 10
huatuo_bamai_memory_buddyinfo_blocks{host="hostname",node="0",order="3",region="dev",zone="DMA"} 0
huatuo_bamai_memory_buddyinfo_blocks{host="hostname",node="0",order="3",region="dev",zone="DMA32"} 2
huatuo_bamai_memory_buddyinfo_blocks{host="hostname",node="0",order="3",region="dev",zone="Normal"} 224
huatuo_bamai_memory_buddyinfo_blocks{host="hostname",node="0",order="4",region="dev",zone="DMA"} 0
huatuo_bamai_memory_buddyinfo_blocks{host="hostname",node="0",order="4",region="dev",zone="DMA32"} 1
huatuo_bamai_memory_buddyinfo_blocks{host="hostname",node="0",order="4",region="dev",zone="Normal"} 376
huatuo_bamai_memory_buddyinfo_blocks{host="hostname",node="0",order="5",region="dev",zone="DMA"} 0
huatuo_bamai_memory_buddyinfo_blocks{host="hostname",node="0",order="5",region="dev",zone="DMA32"} 1
huatuo_bamai_memory_buddyinfo_blocks{host="hostname",node="0",order="5",region="dev",zone="Normal"} 165
huatuo_bamai_memory_buddyinfo_blocks{host="hostname",node="0",order="6",region="dev",zone="DMA"} 0
huatuo_bamai_memory_buddyinfo_blocks{host="hostname",node="0",order="6",region="dev",zone="DMA32"} 3
huatuo_bamai_memory_buddyinfo_blocks{host="hostname",node="0",order="6",region="dev",zone="Normal"} 118
huatuo_bamai_memory_buddyinfo_blocks{host="hostname",node="0",order="7",region="dev",zone="DMA"} 0
huatuo_bamai_memory_buddyinfo_blocks{host="hostname",node="0",order="7",region="dev",zone="DMA32"} 4
huatuo_bamai_memory_buddyinfo_blocks{host="hostname",node="0",order="7",region="dev",zone="Normal"} 172
huatuo_bamai_memory_buddyinfo_blocks{host="hostname",node="0",order="8",region="dev",zone="DMA"} 1
huatuo_bamai_memory_buddyinfo_blocks{host="hostname",node="0",order="8",region="dev",zone="DMA32"} 4
huatuo_bamai_memory_buddyinfo_blocks{host="hostname",node="0",order="8",region="dev",zone="Normal"} 35
huatuo_bamai_memory_buddyinfo_blocks{host="hostname",node="0",order="9",region="dev",zone="DMA"} 2
huatuo_bamai_memory_buddyinfo_blocks{host="hostname",node="0",order="9",region="dev",zone="DMA32"} 4
huatuo_bamai_memory_buddyinfo_blocks{host="hostname",node="0",order="9",region="dev",zone="Normal"} 25

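Metric	Description	Unit	Object	Source	Labels
memory_buddyinfo_blocks	Number of free blocks of each order in the buddy allocator	blocks (2^order pages each)	physical host	procfs	host, node, order, region, zone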
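
A free block of order n spans 2^n contiguous pages, so the per-order counts can be folded into a total number of free pages and into the largest order that can still be satisfied. The sketch below assumes the values for one node and zone have been gathered into a mapping from order to block count; both function names are hypothetical.

```python
from typing import Mapping


def free_pages(blocks_by_order: Mapping[int, int]) -> int:
    """Total free pages: each block of order n contributes 2**n pages."""
    return sum(count << order for order, count in blocks_by_order.items())


def largest_free_order(blocks_by_order: Mapping[int, int]) -> int:
    """Highest order with at least one free block, or -1 if none is free."""
    available = [order for order, count in blocks_by_order.items() if count > 0]
    return max(available) if available else -1


# Values taken from the DMA32 zone in the sample output above.
dma32 = {0: 3, 1: 1, 2: 3, 3: 2, 4: 1, 5: 1, 6: 3, 7: 4, 8: 4, 9: 4, 10: 743}
print(free_pages(dma32), largest_free_order(dma32))
```

A zone with plenty of free pages but a low largest free order is fragmented: small allocations succeed while high-order allocations may have to wait for compaction or reclaim.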

Network subsystem

TCP memory

The following metrics describe the system memory used by the TCP stack.

# HELP huatuo_bamai_tcp_memory_limit_pages tcp memory pages limit
# TYPE huatuo_bamai_tcp_memory_limit_pages gauge
huatuo_bamai_tcp_memory_limit_pages{host="hostname",region="dev"} 380526
# HELP huatuo_bamai_tcp_memory_usage_bytes tcp memory bytes usage
# TYPE huatuo_bamai_tcp_memory_usage_bytes gauge
huatuo_bamai_tcp_memory_usage_bytes{host="hostname",region="dev"} 0
# HELP huatuo_bamai_tcp_memory_usage_pages tcp memory pages usage
# TYPE huatuo_bamai_tcp_memory_usage_pages gauge
huatuo_bamai_tcp_memory_usage_pages{host="hostname",region="dev"} 0
# HELP huatuo_bamai_tcp_memory_usage_percent tcp memory usage percent
# TYPE huatuo_bamai_tcp_memory_usage_percent gauge
huatuo_bamai_tcp_memory_usage_percent{host="hostname",region="dev"} 0

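Metric	Description	Unit	Object	Labels
tcp_memory_limit_pages	Total memory available to TCP on the system	pages	physical host	host, region
tcp_memory_usage_bytes	Memory currently used by TCP on the system	bytes	physical host	host, region
tcp_memory_usage_pages	Memory currently used by TCP on the system	pages	physical host	host, region
tcp_memory_usage_percent	Percentage of TCP memory in use, relative to the total TCP memory limit	%	physical host	host, region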
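
The four metrics are related by simple arithmetic, which the sketch below spells out. It assumes that the percentage is usage in pages divided by the limit in pages, and that bytes are pages multiplied by the system page size; how HUATUO computes the values internally is not stated in this document, and the function names are hypothetical.

```python
import os
from typing import Optional


def tcp_memory_usage_percent(usage_pages: int, limit_pages: int) -> float:
    """Usage as a percentage of the TCP memory limit (both in pages)."""
    if limit_pages <= 0:
        return 0.0
    return 100.0 * usage_pages / limit_pages


def pages_to_bytes(pages: int, page_size: Optional[int] = None) -> int:
    """Convert pages to bytes; defaults to the running system's page size."""
    if page_size is None:
        page_size = os.sysconf("SC_PAGE_SIZE")
    return pages * page_size
```

With the limit of 380526 pages from the sample output, a usage of 190263 pages would correspond to 50 percent.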

Neighbor entries

The following metrics describe neighbor (ARP) table usage.

# HELP huatuo_bamai_arp_container_entries arp entries in container netns
# TYPE huatuo_bamai_arp_container_entries gauge
huatuo_bamai_arp_container_entries{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 1
# HELP huatuo_bamai_arp_entries host init namespace
# TYPE huatuo_bamai_arp_entries gauge
huatuo_bamai_arp_entries{host="hostname",region="dev"} 5
# HELP huatuo_bamai_arp_total all entries in arp_cache for containers and host netns
# TYPE huatuo_bamai_arp_total gauge
huatuo_bamai_arp_total{host="hostname",region="dev"} 12

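Metric	Description	Unit	Object	Labels
arp_entries	Number of ARP entries in the host network namespace	count	host namespace	host, region
arp_total	Total number of ARP entries across all network namespaces on the physical host	count	physical host	host, region
arp_container_entries	Number of ARP entries in the container network namespace	count	container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region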

Qdisc

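Qdisc (queueing discipline) is an important module of the kernel networking subsystem. Observing it gives a clear view of how packets are being processed and how much queuing delay they incur.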

# HELP huatuo_bamai_netdev_qdisc_backlog Number of bytes currently in queue to be sent.
# TYPE huatuo_bamai_netdev_qdisc_backlog gauge
huatuo_bamai_netdev_qdisc_backlog{device="ens2",host="hostname",kind="fq_codel",region="dev"} 0
# HELP huatuo_bamai_netdev_qdisc_bytes_total Number of bytes sent.
# TYPE huatuo_bamai_netdev_qdisc_bytes_total counter
huatuo_bamai_netdev_qdisc_bytes_total{device="ens2",host="hostname",kind="fq_codel",region="dev"} 2.578235443e+09
# HELP huatuo_bamai_netdev_qdisc_current_queue_length Number of packets currently in queue to be sent.
# TYPE huatuo_bamai_netdev_qdisc_current_queue_length gauge
huatuo_bamai_netdev_qdisc_current_queue_length{device="ens2",host="hostname",kind="fq_codel",region="dev"} 0
# HELP huatuo_bamai_netdev_qdisc_drops_total Number of packet drops.
# TYPE huatuo_bamai_netdev_qdisc_drops_total counter
huatuo_bamai_netdev_qdisc_drops_total{device="ens2",host="hostname",kind="fq_codel",region="dev"} 0
# HELP huatuo_bamai_netdev_qdisc_overlimits_total Number of packet overlimits.
# TYPE huatuo_bamai_netdev_qdisc_overlimits_total counter
huatuo_bamai_netdev_qdisc_overlimits_total{device="ens2",host="hostname",kind="fq_codel",region="dev"} 0
# HELP huatuo_bamai_netdev_qdisc_packets_total Number of packets sent.
# TYPE huatuo_bamai_netdev_qdisc_packets_total counter
huatuo_bamai_netdev_qdisc_packets_total{device="ens2",host="hostname",kind="fq_codel",region="dev"} 6.867714e+06
# HELP huatuo_bamai_netdev_qdisc_requeues_total Number of packets dequeued, not transmitted, and requeued.
# TYPE huatuo_bamai_netdev_qdisc_requeues_total counter
huatuo_bamai_netdev_qdisc_requeues_total{device="ens2",host="hostname",kind="fq_codel",region="dev"} 0

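Metric	Description	Unit	Object	Labels
netdev_qdisc_backlog	Bytes currently queued and waiting to be sent	bytes	physical host	device, host, kind, region
netdev_qdisc_current_queue_length	Packets currently queued and waiting to be sent	count	physical host	device, host, kind, region
netdev_qdisc_overlimits_total	Number of overlimit events	count	physical host	device, host, kind, region
netdev_qdisc_requeues_total	Number of packets requeued because the NIC or driver was temporarily unable to transmit	count	physical host	device, host, kind, region
netdev_qdisc_drops_total	Packets dropped by the qdisc (queue full, rate-limiting policy, and so on)	count	physical host	device, host, kind, region
netdev_qdisc_bytes_total	Bytes sent	bytes	physical host	device, host, kind, region
netdev_qdisc_packets_total	Packets sent	count	physical host	device, host, kind, region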
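
The counters in this table are most useful as rates. The sketch below estimates the share of packets dropped by a qdisc between two scrapes. It assumes the packets counter covers only packets that were sent, so sent plus dropped approximates the packets offered to the queue; the dictionary keys and function names are illustrative.

```python
from typing import Mapping


def counter_delta(prev: float, curr: float) -> float:
    """Increase of a counter between two scrapes, tolerating a reset to zero."""
    # If the counter went backwards, assume it restarted from zero.
    return curr - prev if curr >= prev else curr


def qdisc_drop_ratio(prev: Mapping[str, float], curr: Mapping[str, float]) -> float:
    """Dropped packets divided by (sent + dropped) packets over the interval."""
    sent = counter_delta(prev["packets_total"], curr["packets_total"])
    dropped = counter_delta(prev["drops_total"], curr["drops_total"])
    total = sent + dropped
    return dropped / total if total > 0 else 0.0
```

A rising drop ratio together with a non-zero backlog points at the queue itself, whereas drops with an empty backlog are more likely caused by a policing or rate-limiting rule.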

Hardware packet drops

Number of packets dropped by the network device hardware on the receive path.

# HELP huatuo_bamai_netdev_hw_rx_dropped count of packets dropped at hardware level
# TYPE huatuo_bamai_netdev_hw_rx_dropped gauge
huatuo_bamai_netdev_hw_rx_dropped{device="eth0",driver="mlx5_core",host="hostname",region="dev"} 0

Metric	Description	Unit	Object	Source	Labels
netdev_hw_rx_dropped	Packets dropped by the NIC hardware on the receive path	count	physical host	eBPF	device, driver, host, region

Network devices

# HELP huatuo_bamai_netdev_container_receive_bytes_total Network device statistic receive_bytes.
# TYPE huatuo_bamai_netdev_container_receive_bytes_total counter
huatuo_bamai_netdev_container_receive_bytes_total{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",device="eth0",host="hostname",region="dev"} 6.4400018e+07
# HELP huatuo_bamai_netdev_container_receive_compressed_total Network device statistic receive_compressed.
# TYPE huatuo_bamai_netdev_container_receive_compressed_total counter
huatuo_bamai_netdev_container_receive_compressed_total{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",device="eth0",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netdev_container_receive_dropped_total Network device statistic receive_dropped.
# TYPE huatuo_bamai_netdev_container_receive_dropped_total counter
huatuo_bamai_netdev_container_receive_dropped_total{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",device="eth0",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netdev_container_receive_errors_total Network device statistic receive_errors.
# TYPE huatuo_bamai_netdev_container_receive_errors_total counter
huatuo_bamai_netdev_container_receive_errors_total{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",device="eth0",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netdev_container_receive_fifo_total Network device statistic receive_fifo.
# TYPE huatuo_bamai_netdev_container_receive_fifo_total counter
huatuo_bamai_netdev_container_receive_fifo_total{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",device="eth0",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netdev_container_receive_frame_total Network device statistic receive_frame.
# TYPE huatuo_bamai_netdev_container_receive_frame_total counter
huatuo_bamai_netdev_container_receive_frame_total{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",device="eth0",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netdev_container_receive_multicast_total Network device statistic receive_multicast.
# TYPE huatuo_bamai_netdev_container_receive_multicast_total counter
huatuo_bamai_netdev_container_receive_multicast_total{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",device="eth0",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netdev_container_receive_packets_total Network device statistic receive_packets.
# TYPE huatuo_bamai_netdev_container_receive_packets_total counter
huatuo_bamai_netdev_container_receive_packets_total{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",device="eth0",host="hostname",region="dev"} 693155
# HELP huatuo_bamai_netdev_container_transmit_bytes_total Network device statistic transmit_bytes.
# TYPE huatuo_bamai_netdev_container_transmit_bytes_total counter
huatuo_bamai_netdev_container_transmit_bytes_total{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",device="eth0",host="hostname",region="dev"} 6.2347911e+07
# HELP huatuo_bamai_netdev_container_transmit_carrier_total Network device statistic transmit_carrier.
# TYPE huatuo_bamai_netdev_container_transmit_carrier_total counter
huatuo_bamai_netdev_container_transmit_carrier_total{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",device="eth0",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netdev_container_transmit_colls_total Network device statistic transmit_colls.
# TYPE huatuo_bamai_netdev_container_transmit_colls_total counter
huatuo_bamai_netdev_container_transmit_colls_total{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",device="eth0",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netdev_container_transmit_compressed_total Network device statistic transmit_compressed.
# TYPE huatuo_bamai_netdev_container_transmit_compressed_total counter
huatuo_bamai_netdev_container_transmit_compressed_total{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",device="eth0",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netdev_container_transmit_dropped_total Network device statistic transmit_dropped.
# TYPE huatuo_bamai_netdev_container_transmit_dropped_total counter
huatuo_bamai_netdev_container_transmit_dropped_total{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",device="eth0",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netdev_container_transmit_errors_total Network device statistic transmit_errors.
# TYPE huatuo_bamai_netdev_container_transmit_errors_total counter
huatuo_bamai_netdev_container_transmit_errors_total{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",device="eth0",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netdev_container_transmit_fifo_total Network device statistic transmit_fifo.
# TYPE huatuo_bamai_netdev_container_transmit_fifo_total counter
huatuo_bamai_netdev_container_transmit_fifo_total{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",device="eth0",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netdev_container_transmit_packets_total Network device statistic transmit_packets.
# TYPE huatuo_bamai_netdev_container_transmit_packets_total counter
huatuo_bamai_netdev_container_transmit_packets_total{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",device="eth0",host="hostname",region="dev"} 660218

Metric	Description	Unit	Object	Labels
netdev_receive_bytes_total	Total bytes successfully received	bytes	physical host or container	container_host, container_hostnamespace, container_level, container_name, container_type, device, host, region
netdev_receive_packets_total	Total packets successfully received	count	physical host or container	container_host, container_hostnamespace, container_level, container_name, container_type, device, host, region
netdev_receive_compressed_total	Compressed packets received	count	physical host or container	container_host, container_hostnamespace, container_level, container_name, container_type, device, host, region
netdev_receive_frame_total	Receive framing errors	count	physical host or container	container_host, container_hostnamespace, container_level, container_name, container_type, device, host, region
netdev_receive_errors_total	Total receive errors	count	physical host or container	container_host, container_hostnamespace, container_level, container_name, container_type, device, host, region
netdev_receive_dropped_total	Received packets dropped by the kernel or driver for any reason	count	physical host or container	container_host, container_hostnamespace, container_level, container_name, container_type, device, host, region
netdev_receive_fifo_total	Receive FIFO / ring buffer overrun errors	count	physical host or container	container_host, container_hostnamespace, container_level, container_name, container_type, device, host, region
netdev_transmit_bytes_total	Total bytes successfully sent	bytes	physical host or container	container_host, container_hostnamespace, container_level, container_name, container_type, device, host, region
netdev_transmit_packets_total	Total packets successfully sent	count	physical host or container	container_host, container_hostnamespace, container_level, container_name, container_type, device, host, region
netdev_transmit_errors_total	Total transmit errors	count	physical host or container	container_host, container_hostnamespace, container_level, container_name, container_type, device, host, region
netdev_transmit_dropped_total	Packets dropped during transmission (queue full, policy drops, and so on)	count	physical host or container	container_host, container_hostnamespace, container_level, container_name, container_type, device, host, region
netdev_transmit_fifo_total	Transmit FIFO / ring buffer errors	count	physical host or container	container_host, container_hostnamespace, container_level, container_name, container_type, device, host, region
netdev_transmit_carrier_total	Carrier errors	count	physical host or container	container_host, container_hostnamespace, container_level, container_name, container_type, device, host, region
netdev_transmit_compressed_total	Compressed packets sent	count	physical host or container	container_host, container_hostnamespace, container_level, container_name, container_type, device, host, region
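
To work with these samples outside Prometheus, for instance in a quick script run against the /metrics endpoint, each line can be split into a metric name, a label set, and a value. The parser below is a minimal sketch that handles lines shaped like the sample output above; it skips comment lines and does not support the optional trailing timestamp of the exposition format. The function name is hypothetical.

```python
import re
from typing import Dict, Optional, Tuple

# name{label="value",...} value   (labels are optional)
SAMPLE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
# label="value", allowing backslash-escaped characters inside the value
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_sample(line: str) -> Optional[Tuple[str, Dict[str, str], float]]:
    """Parse one metrics line; return None for comments, blanks, or no match."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    match = SAMPLE_RE.match(line)
    if match is None:
        return None
    name, raw_labels, value = match.groups()
    labels = dict(LABEL_RE.findall(raw_labels or ""))
    return name, labels, float(value)
```

Grouping the parsed samples by the container_name and device labels then makes it straightforward to compare, for example, receive_dropped_total against receive_packets_total per container interface.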

TCP

# HELP huatuo_bamai_netstat_container_TcpExt_ArpFilter statistic TcpExtArpFilter.
# TYPE huatuo_bamai_netstat_container_TcpExt_ArpFilter gauge
huatuo_bamai_netstat_container_TcpExt_ArpFilter{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_BusyPollRxPackets statistic TcpExtBusyPollRxPackets.
# TYPE huatuo_bamai_netstat_container_TcpExt_BusyPollRxPackets gauge
huatuo_bamai_netstat_container_TcpExt_BusyPollRxPackets{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_DelayedACKLocked statistic TcpExtDelayedACKLocked.
# TYPE huatuo_bamai_netstat_container_TcpExt_DelayedACKLocked gauge
huatuo_bamai_netstat_container_TcpExt_DelayedACKLocked{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_DelayedACKLost statistic TcpExtDelayedACKLost.
# TYPE huatuo_bamai_netstat_container_TcpExt_DelayedACKLost gauge
huatuo_bamai_netstat_container_TcpExt_DelayedACKLost{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_DelayedACKs statistic TcpExtDelayedACKs.
# TYPE huatuo_bamai_netstat_container_TcpExt_DelayedACKs gauge
huatuo_bamai_netstat_container_TcpExt_DelayedACKs{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 4650
# HELP huatuo_bamai_netstat_container_TcpExt_EmbryonicRsts statistic TcpExtEmbryonicRsts.
# TYPE huatuo_bamai_netstat_container_TcpExt_EmbryonicRsts gauge
huatuo_bamai_netstat_container_TcpExt_EmbryonicRsts{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_IPReversePathFilter statistic TcpExtIPReversePathFilter.
# TYPE huatuo_bamai_netstat_container_TcpExt_IPReversePathFilter gauge
huatuo_bamai_netstat_container_TcpExt_IPReversePathFilter{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_ListenDrops statistic TcpExtListenDrops.
# TYPE huatuo_bamai_netstat_container_TcpExt_ListenDrops gauge
huatuo_bamai_netstat_container_TcpExt_ListenDrops{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_ListenOverflows statistic TcpExtListenOverflows.
# TYPE huatuo_bamai_netstat_container_TcpExt_ListenOverflows gauge
huatuo_bamai_netstat_container_TcpExt_ListenOverflows{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_LockDroppedIcmps statistic TcpExtLockDroppedIcmps.
# TYPE huatuo_bamai_netstat_container_TcpExt_LockDroppedIcmps gauge
huatuo_bamai_netstat_container_TcpExt_LockDroppedIcmps{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_OfoPruned statistic TcpExtOfoPruned.
# TYPE huatuo_bamai_netstat_container_TcpExt_OfoPruned gauge
huatuo_bamai_netstat_container_TcpExt_OfoPruned{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_OutOfWindowIcmps statistic TcpExtOutOfWindowIcmps.
# TYPE huatuo_bamai_netstat_container_TcpExt_OutOfWindowIcmps gauge
huatuo_bamai_netstat_container_TcpExt_OutOfWindowIcmps{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_PAWSActive statistic TcpExtPAWSActive.
# TYPE huatuo_bamai_netstat_container_TcpExt_PAWSActive gauge
huatuo_bamai_netstat_container_TcpExt_PAWSActive{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_PAWSEstab statistic TcpExtPAWSEstab.
# TYPE huatuo_bamai_netstat_container_TcpExt_PAWSEstab gauge
huatuo_bamai_netstat_container_TcpExt_PAWSEstab{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_PFMemallocDrop statistic TcpExtPFMemallocDrop.
# TYPE huatuo_bamai_netstat_container_TcpExt_PFMemallocDrop gauge
huatuo_bamai_netstat_container_TcpExt_PFMemallocDrop{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_PruneCalled statistic TcpExtPruneCalled.
# TYPE huatuo_bamai_netstat_container_TcpExt_PruneCalled gauge
huatuo_bamai_netstat_container_TcpExt_PruneCalled{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_RcvPruned statistic TcpExtRcvPruned.
# TYPE huatuo_bamai_netstat_container_TcpExt_RcvPruned gauge
huatuo_bamai_netstat_container_TcpExt_RcvPruned{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_SyncookiesFailed statistic TcpExtSyncookiesFailed.
# TYPE huatuo_bamai_netstat_container_TcpExt_SyncookiesFailed gauge
huatuo_bamai_netstat_container_TcpExt_SyncookiesFailed{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_SyncookiesRecv statistic TcpExtSyncookiesRecv.
# TYPE huatuo_bamai_netstat_container_TcpExt_SyncookiesRecv gauge
huatuo_bamai_netstat_container_TcpExt_SyncookiesRecv{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_SyncookiesSent statistic TcpExtSyncookiesSent.
# TYPE huatuo_bamai_netstat_container_TcpExt_SyncookiesSent gauge
huatuo_bamai_netstat_container_TcpExt_SyncookiesSent{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPACKSkippedChallenge statistic TcpExtTCPACKSkippedChallenge.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPACKSkippedChallenge gauge
huatuo_bamai_netstat_container_TcpExt_TCPACKSkippedChallenge{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPACKSkippedFinWait2 statistic TcpExtTCPACKSkippedFinWait2.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPACKSkippedFinWait2 gauge
huatuo_bamai_netstat_container_TcpExt_TCPACKSkippedFinWait2{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPACKSkippedPAWS statistic TcpExtTCPACKSkippedPAWS.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPACKSkippedPAWS gauge
huatuo_bamai_netstat_container_TcpExt_TCPACKSkippedPAWS{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPACKSkippedSeq statistic TcpExtTCPACKSkippedSeq.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPACKSkippedSeq gauge
huatuo_bamai_netstat_container_TcpExt_TCPACKSkippedSeq{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPACKSkippedSynRecv statistic TcpExtTCPACKSkippedSynRecv.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPACKSkippedSynRecv gauge
huatuo_bamai_netstat_container_TcpExt_TCPACKSkippedSynRecv{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPACKSkippedTimeWait statistic TcpExtTCPACKSkippedTimeWait.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPACKSkippedTimeWait gauge
huatuo_bamai_netstat_container_TcpExt_TCPACKSkippedTimeWait{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPAOBad statistic TcpExtTCPAOBad.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPAOBad gauge
huatuo_bamai_netstat_container_TcpExt_TCPAOBad{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPAODroppedIcmps statistic TcpExtTCPAODroppedIcmps.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPAODroppedIcmps gauge
huatuo_bamai_netstat_container_TcpExt_TCPAODroppedIcmps{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPAOGood statistic TcpExtTCPAOGood.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPAOGood gauge
huatuo_bamai_netstat_container_TcpExt_TCPAOGood{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPAOKeyNotFound statistic TcpExtTCPAOKeyNotFound.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPAOKeyNotFound gauge
huatuo_bamai_netstat_container_TcpExt_TCPAOKeyNotFound{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPAORequired statistic TcpExtTCPAORequired.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPAORequired gauge
huatuo_bamai_netstat_container_TcpExt_TCPAORequired{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPAbortFailed statistic TcpExtTCPAbortFailed.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPAbortFailed gauge
huatuo_bamai_netstat_container_TcpExt_TCPAbortFailed{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPAbortOnClose statistic TcpExtTCPAbortOnClose.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPAbortOnClose gauge
huatuo_bamai_netstat_container_TcpExt_TCPAbortOnClose{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 1
# HELP huatuo_bamai_netstat_container_TcpExt_TCPAbortOnData statistic TcpExtTCPAbortOnData.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPAbortOnData gauge
huatuo_bamai_netstat_container_TcpExt_TCPAbortOnData{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPAbortOnLinger statistic TcpExtTCPAbortOnLinger.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPAbortOnLinger gauge
huatuo_bamai_netstat_container_TcpExt_TCPAbortOnLinger{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPAbortOnMemory statistic TcpExtTCPAbortOnMemory.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPAbortOnMemory gauge
huatuo_bamai_netstat_container_TcpExt_TCPAbortOnMemory{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPAbortOnTimeout statistic TcpExtTCPAbortOnTimeout.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPAbortOnTimeout gauge
huatuo_bamai_netstat_container_TcpExt_TCPAbortOnTimeout{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPAckCompressed statistic TcpExtTCPAckCompressed.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPAckCompressed gauge
huatuo_bamai_netstat_container_TcpExt_TCPAckCompressed{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPAutoCorking statistic TcpExtTCPAutoCorking.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPAutoCorking gauge
huatuo_bamai_netstat_container_TcpExt_TCPAutoCorking{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPBacklogCoalesce statistic TcpExtTCPBacklogCoalesce.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPBacklogCoalesce gauge
huatuo_bamai_netstat_container_TcpExt_TCPBacklogCoalesce{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 3
# HELP huatuo_bamai_netstat_container_TcpExt_TCPBacklogDrop statistic TcpExtTCPBacklogDrop.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPBacklogDrop gauge
huatuo_bamai_netstat_container_TcpExt_TCPBacklogDrop{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPChallengeACK statistic TcpExtTCPChallengeACK.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPChallengeACK gauge
huatuo_bamai_netstat_container_TcpExt_TCPChallengeACK{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPDSACKIgnoredDubious statistic TcpExtTCPDSACKIgnoredDubious.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPDSACKIgnoredDubious gauge
huatuo_bamai_netstat_container_TcpExt_TCPDSACKIgnoredDubious{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPDSACKIgnoredNoUndo statistic TcpExtTCPDSACKIgnoredNoUndo.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPDSACKIgnoredNoUndo gauge
huatuo_bamai_netstat_container_TcpExt_TCPDSACKIgnoredNoUndo{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 1
# HELP huatuo_bamai_netstat_container_TcpExt_TCPDSACKIgnoredOld statistic TcpExtTCPDSACKIgnoredOld.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPDSACKIgnoredOld gauge
huatuo_bamai_netstat_container_TcpExt_TCPDSACKIgnoredOld{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPDSACKOfoRecv statistic TcpExtTCPDSACKOfoRecv.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPDSACKOfoRecv gauge
huatuo_bamai_netstat_container_TcpExt_TCPDSACKOfoRecv{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPDSACKOfoSent statistic TcpExtTCPDSACKOfoSent.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPDSACKOfoSent gauge
huatuo_bamai_netstat_container_TcpExt_TCPDSACKOfoSent{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPDSACKOldSent statistic TcpExtTCPDSACKOldSent.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPDSACKOldSent gauge
huatuo_bamai_netstat_container_TcpExt_TCPDSACKOldSent{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPDSACKRecv statistic TcpExtTCPDSACKRecv.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPDSACKRecv gauge
huatuo_bamai_netstat_container_TcpExt_TCPDSACKRecv{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 1
# HELP huatuo_bamai_netstat_container_TcpExt_TCPDSACKRecvSegs statistic TcpExtTCPDSACKRecvSegs.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPDSACKRecvSegs gauge
huatuo_bamai_netstat_container_TcpExt_TCPDSACKRecvSegs{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 1
# HELP huatuo_bamai_netstat_container_TcpExt_TCPDSACKUndo statistic TcpExtTCPDSACKUndo.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPDSACKUndo gauge
huatuo_bamai_netstat_container_TcpExt_TCPDSACKUndo{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPDeferAcceptDrop statistic TcpExtTCPDeferAcceptDrop.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPDeferAcceptDrop gauge
huatuo_bamai_netstat_container_TcpExt_TCPDeferAcceptDrop{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPDelivered statistic TcpExtTCPDelivered.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPDelivered gauge
huatuo_bamai_netstat_container_TcpExt_TCPDelivered{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 3.28098e+06
# HELP huatuo_bamai_netstat_container_TcpExt_TCPDeliveredCE statistic TcpExtTCPDeliveredCE.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPDeliveredCE gauge
huatuo_bamai_netstat_container_TcpExt_TCPDeliveredCE{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPFastOpenActive statistic TcpExtTCPFastOpenActive.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPFastOpenActive gauge
huatuo_bamai_netstat_container_TcpExt_TCPFastOpenActive{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPFastOpenActiveFail statistic TcpExtTCPFastOpenActiveFail.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPFastOpenActiveFail gauge
huatuo_bamai_netstat_container_TcpExt_TCPFastOpenActiveFail{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPFastOpenBlackhole statistic TcpExtTCPFastOpenBlackhole.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPFastOpenBlackhole gauge
huatuo_bamai_netstat_container_TcpExt_TCPFastOpenBlackhole{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPFastOpenCookieReqd statistic TcpExtTCPFastOpenCookieReqd.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPFastOpenCookieReqd gauge
huatuo_bamai_netstat_container_TcpExt_TCPFastOpenCookieReqd{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPFastOpenListenOverflow statistic TcpExtTCPFastOpenListenOverflow.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPFastOpenListenOverflow gauge
huatuo_bamai_netstat_container_TcpExt_TCPFastOpenListenOverflow{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPFastOpenPassive statistic TcpExtTCPFastOpenPassive.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPFastOpenPassive gauge
huatuo_bamai_netstat_container_TcpExt_TCPFastOpenPassive{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPFastOpenPassiveAltKey statistic TcpExtTCPFastOpenPassiveAltKey.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPFastOpenPassiveAltKey gauge
huatuo_bamai_netstat_container_TcpExt_TCPFastOpenPassiveAltKey{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPFastOpenPassiveFail statistic TcpExtTCPFastOpenPassiveFail.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPFastOpenPassiveFail gauge
huatuo_bamai_netstat_container_TcpExt_TCPFastOpenPassiveFail{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPFastRetrans statistic TcpExtTCPFastRetrans.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPFastRetrans gauge
huatuo_bamai_netstat_container_TcpExt_TCPFastRetrans{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPFromZeroWindowAdv statistic TcpExtTCPFromZeroWindowAdv.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPFromZeroWindowAdv gauge
huatuo_bamai_netstat_container_TcpExt_TCPFromZeroWindowAdv{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPFullUndo statistic TcpExtTCPFullUndo.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPFullUndo gauge
huatuo_bamai_netstat_container_TcpExt_TCPFullUndo{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPHPAcks statistic TcpExtTCPHPAcks.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPHPAcks gauge
huatuo_bamai_netstat_container_TcpExt_TCPHPAcks{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 616667
# HELP huatuo_bamai_netstat_container_TcpExt_TCPHPHits statistic TcpExtTCPHPHits.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPHPHits gauge
huatuo_bamai_netstat_container_TcpExt_TCPHPHits{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 9913
# HELP huatuo_bamai_netstat_container_TcpExt_TCPHystartDelayCwnd statistic TcpExtTCPHystartDelayCwnd.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPHystartDelayCwnd gauge
huatuo_bamai_netstat_container_TcpExt_TCPHystartDelayCwnd{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPHystartDelayDetect statistic TcpExtTCPHystartDelayDetect.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPHystartDelayDetect gauge
huatuo_bamai_netstat_container_TcpExt_TCPHystartDelayDetect{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPHystartTrainCwnd statistic TcpExtTCPHystartTrainCwnd.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPHystartTrainCwnd gauge
huatuo_bamai_netstat_container_TcpExt_TCPHystartTrainCwnd{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPHystartTrainDetect statistic TcpExtTCPHystartTrainDetect.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPHystartTrainDetect gauge
huatuo_bamai_netstat_container_TcpExt_TCPHystartTrainDetect{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPKeepAlive statistic TcpExtTCPKeepAlive.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPKeepAlive gauge
huatuo_bamai_netstat_container_TcpExt_TCPKeepAlive{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 20
# HELP huatuo_bamai_netstat_container_TcpExt_TCPLossFailures statistic TcpExtTCPLossFailures.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPLossFailures gauge
huatuo_bamai_netstat_container_TcpExt_TCPLossFailures{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPLossProbeRecovery statistic TcpExtTCPLossProbeRecovery.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPLossProbeRecovery gauge
huatuo_bamai_netstat_container_TcpExt_TCPLossProbeRecovery{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPLossProbes statistic TcpExtTCPLossProbes.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPLossProbes gauge
huatuo_bamai_netstat_container_TcpExt_TCPLossProbes{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 1
# HELP huatuo_bamai_netstat_container_TcpExt_TCPLossUndo statistic TcpExtTCPLossUndo.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPLossUndo gauge
huatuo_bamai_netstat_container_TcpExt_TCPLossUndo{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPLostRetransmit statistic TcpExtTCPLostRetransmit.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPLostRetransmit gauge
huatuo_bamai_netstat_container_TcpExt_TCPLostRetransmit{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPMD5Failure statistic TcpExtTCPMD5Failure.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPMD5Failure gauge
huatuo_bamai_netstat_container_TcpExt_TCPMD5Failure{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPMD5NotFound statistic TcpExtTCPMD5NotFound.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPMD5NotFound gauge
huatuo_bamai_netstat_container_TcpExt_TCPMD5NotFound{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPMD5Unexpected statistic TcpExtTCPMD5Unexpected.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPMD5Unexpected gauge
huatuo_bamai_netstat_container_TcpExt_TCPMD5Unexpected{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPMTUPFail statistic TcpExtTCPMTUPFail.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPMTUPFail gauge
huatuo_bamai_netstat_container_TcpExt_TCPMTUPFail{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPMTUPSuccess statistic TcpExtTCPMTUPSuccess.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPMTUPSuccess gauge
huatuo_bamai_netstat_container_TcpExt_TCPMTUPSuccess{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPMemoryPressures statistic TcpExtTCPMemoryPressures.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPMemoryPressures gauge
huatuo_bamai_netstat_container_TcpExt_TCPMemoryPressures{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPMemoryPressuresChrono statistic TcpExtTCPMemoryPressuresChrono.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPMemoryPressuresChrono gauge
huatuo_bamai_netstat_container_TcpExt_TCPMemoryPressuresChrono{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPMigrateReqFailure statistic TcpExtTCPMigrateReqFailure.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPMigrateReqFailure gauge
huatuo_bamai_netstat_container_TcpExt_TCPMigrateReqFailure{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPMigrateReqSuccess statistic TcpExtTCPMigrateReqSuccess.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPMigrateReqSuccess gauge
huatuo_bamai_netstat_container_TcpExt_TCPMigrateReqSuccess{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPMinTTLDrop statistic TcpExtTCPMinTTLDrop.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPMinTTLDrop gauge
huatuo_bamai_netstat_container_TcpExt_TCPMinTTLDrop{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPOFODrop statistic TcpExtTCPOFODrop.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPOFODrop gauge
huatuo_bamai_netstat_container_TcpExt_TCPOFODrop{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPOFOMerge statistic TcpExtTCPOFOMerge.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPOFOMerge gauge
huatuo_bamai_netstat_container_TcpExt_TCPOFOMerge{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPOFOQueue statistic TcpExtTCPOFOQueue.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPOFOQueue gauge
huatuo_bamai_netstat_container_TcpExt_TCPOFOQueue{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPOrigDataSent statistic TcpExtTCPOrigDataSent.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPOrigDataSent gauge
huatuo_bamai_netstat_container_TcpExt_TCPOrigDataSent{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 2.675557e+06
# HELP huatuo_bamai_netstat_container_TcpExt_TCPPLBRehash statistic TcpExtTCPPLBRehash.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPPLBRehash gauge
huatuo_bamai_netstat_container_TcpExt_TCPPLBRehash{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPPartialUndo statistic TcpExtTCPPartialUndo.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPPartialUndo gauge
huatuo_bamai_netstat_container_TcpExt_TCPPartialUndo{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPPureAcks statistic TcpExtTCPPureAcks.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPPureAcks gauge
huatuo_bamai_netstat_container_TcpExt_TCPPureAcks{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 2.095262e+06
# HELP huatuo_bamai_netstat_container_TcpExt_TCPRcvCoalesce statistic TcpExtTCPRcvCoalesce.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPRcvCoalesce gauge
huatuo_bamai_netstat_container_TcpExt_TCPRcvCoalesce{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 3
# HELP huatuo_bamai_netstat_container_TcpExt_TCPRcvCollapsed statistic TcpExtTCPRcvCollapsed.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPRcvCollapsed gauge
huatuo_bamai_netstat_container_TcpExt_TCPRcvCollapsed{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPRcvQDrop statistic TcpExtTCPRcvQDrop.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPRcvQDrop gauge
huatuo_bamai_netstat_container_TcpExt_TCPRcvQDrop{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPRenoFailures statistic TcpExtTCPRenoFailures.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPRenoFailures gauge
huatuo_bamai_netstat_container_TcpExt_TCPRenoFailures{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPRenoRecovery statistic TcpExtTCPRenoRecovery.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPRenoRecovery gauge
huatuo_bamai_netstat_container_TcpExt_TCPRenoRecovery{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPRenoRecoveryFail statistic TcpExtTCPRenoRecoveryFail.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPRenoRecoveryFail gauge
huatuo_bamai_netstat_container_TcpExt_TCPRenoRecoveryFail{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPRenoReorder statistic TcpExtTCPRenoReorder.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPRenoReorder gauge
huatuo_bamai_netstat_container_TcpExt_TCPRenoReorder{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPReqQFullDoCookies statistic TcpExtTCPReqQFullDoCookies.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPReqQFullDoCookies gauge
huatuo_bamai_netstat_container_TcpExt_TCPReqQFullDoCookies{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPReqQFullDrop statistic TcpExtTCPReqQFullDrop.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPReqQFullDrop gauge
huatuo_bamai_netstat_container_TcpExt_TCPReqQFullDrop{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPRetransFail statistic TcpExtTCPRetransFail.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPRetransFail gauge
huatuo_bamai_netstat_container_TcpExt_TCPRetransFail{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPSACKDiscard statistic TcpExtTCPSACKDiscard.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPSACKDiscard gauge
huatuo_bamai_netstat_container_TcpExt_TCPSACKDiscard{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPSACKReneging statistic TcpExtTCPSACKReneging.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPSACKReneging gauge
huatuo_bamai_netstat_container_TcpExt_TCPSACKReneging{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPSACKReorder statistic TcpExtTCPSACKReorder.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPSACKReorder gauge
huatuo_bamai_netstat_container_TcpExt_TCPSACKReorder{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPSYNChallenge statistic TcpExtTCPSYNChallenge.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPSYNChallenge gauge
huatuo_bamai_netstat_container_TcpExt_TCPSYNChallenge{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPSackFailures statistic TcpExtTCPSackFailures.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPSackFailures gauge
huatuo_bamai_netstat_container_TcpExt_TCPSackFailures{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPSackMerged statistic TcpExtTCPSackMerged.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPSackMerged gauge
huatuo_bamai_netstat_container_TcpExt_TCPSackMerged{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPSackRecovery statistic TcpExtTCPSackRecovery.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPSackRecovery gauge
huatuo_bamai_netstat_container_TcpExt_TCPSackRecovery{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPSackRecoveryFail statistic TcpExtTCPSackRecoveryFail.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPSackRecoveryFail gauge
huatuo_bamai_netstat_container_TcpExt_TCPSackRecoveryFail{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPSackShiftFallback statistic TcpExtTCPSackShiftFallback.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPSackShiftFallback gauge
huatuo_bamai_netstat_container_TcpExt_TCPSackShiftFallback{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPSackShifted statistic TcpExtTCPSackShifted.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPSackShifted gauge
huatuo_bamai_netstat_container_TcpExt_TCPSackShifted{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPSlowStartRetrans statistic TcpExtTCPSlowStartRetrans.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPSlowStartRetrans gauge
huatuo_bamai_netstat_container_TcpExt_TCPSlowStartRetrans{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPSpuriousRTOs statistic TcpExtTCPSpuriousRTOs.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPSpuriousRTOs gauge
huatuo_bamai_netstat_container_TcpExt_TCPSpuriousRTOs{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPSpuriousRtxHostQueues statistic TcpExtTCPSpuriousRtxHostQueues.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPSpuriousRtxHostQueues gauge
huatuo_bamai_netstat_container_TcpExt_TCPSpuriousRtxHostQueues{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPSynRetrans statistic TcpExtTCPSynRetrans.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPSynRetrans gauge
huatuo_bamai_netstat_container_TcpExt_TCPSynRetrans{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPTSReorder statistic TcpExtTCPTSReorder.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPTSReorder gauge
huatuo_bamai_netstat_container_TcpExt_TCPTSReorder{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPTimeWaitOverflow statistic TcpExtTCPTimeWaitOverflow.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPTimeWaitOverflow gauge
huatuo_bamai_netstat_container_TcpExt_TCPTimeWaitOverflow{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPTimeouts statistic TcpExtTCPTimeouts.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPTimeouts gauge
huatuo_bamai_netstat_container_TcpExt_TCPTimeouts{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPToZeroWindowAdv statistic TcpExtTCPToZeroWindowAdv.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPToZeroWindowAdv gauge
huatuo_bamai_netstat_container_TcpExt_TCPToZeroWindowAdv{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPWantZeroWindowAdv statistic TcpExtTCPWantZeroWindowAdv.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPWantZeroWindowAdv gauge
huatuo_bamai_netstat_container_TcpExt_TCPWantZeroWindowAdv{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPWinProbe statistic TcpExtTCPWinProbe.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPWinProbe gauge
huatuo_bamai_netstat_container_TcpExt_TCPWinProbe{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPWqueueTooBig statistic TcpExtTCPWqueueTooBig.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPWqueueTooBig gauge
huatuo_bamai_netstat_container_TcpExt_TCPWqueueTooBig{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TCPZeroWindowDrop statistic TcpExtTCPZeroWindowDrop.
# TYPE huatuo_bamai_netstat_container_TcpExt_TCPZeroWindowDrop gauge
huatuo_bamai_netstat_container_TcpExt_TCPZeroWindowDrop{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TW statistic TcpExtTW.
# TYPE huatuo_bamai_netstat_container_TcpExt_TW gauge
huatuo_bamai_netstat_container_TcpExt_TW{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 720624
# HELP huatuo_bamai_netstat_container_TcpExt_TWKilled statistic TcpExtTWKilled.
# TYPE huatuo_bamai_netstat_container_TcpExt_TWKilled gauge
huatuo_bamai_netstat_container_TcpExt_TWKilled{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TWRecycled statistic TcpExtTWRecycled.
# TYPE huatuo_bamai_netstat_container_TcpExt_TWRecycled gauge
huatuo_bamai_netstat_container_TcpExt_TWRecycled{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 2461
# HELP huatuo_bamai_netstat_container_TcpExt_TcpDuplicateDataRehash statistic TcpExtTcpDuplicateDataRehash.
# TYPE huatuo_bamai_netstat_container_TcpExt_TcpDuplicateDataRehash gauge
huatuo_bamai_netstat_container_TcpExt_TcpDuplicateDataRehash{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_netstat_container_TcpExt_TcpTimeoutRehash statistic TcpExtTcpTimeoutRehash.
# TYPE huatuo_bamai_netstat_container_TcpExt_TcpTimeoutRehash gauge
huatuo_bamai_netstat_container_TcpExt_TcpTimeoutRehash{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0

Metric	Meaning	Unit	Scope	Labels
netstat_TcpExt_ArpFilter	Number of packets dropped by ARP filter rules	count	host, container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
netstat_TcpExt_BusyPollRxPackets	Number of packets received through the busy-polling mechanism	count	host, container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
netstat_TcpExt_DelayedACKLocked	Number of times a delayed ACK could not be sent because a user-space process held the socket lock	count	host, container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
netstat_TcpExt_DelayedACKLost	Number of retransmissions caused by a lost delayed ACK	count	host, container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
netstat_TcpExt_DelayedACKs	Number of attempts to send a delayed ACK, including unsuccessful attempts	count	host, container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
netstat_TcpExt_EmbryonicRsts	Number of packets carrying the RST/SYN flag received in the SYN_RECV state	count	host, container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
netstat_TcpExt_ListenDrops	Total number of connections dropped because the accept queue was full (includes ListenOverflows)	count	host, container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
netstat_TcpExt_ListenOverflows	Number of overflows of the TCP listen queue	count	host, container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
netstat_TcpExt_OfoPruned	Number of times the out-of-order queue was pruned because of insufficient memory	count	host, container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
netstat_TcpExt_OutOfWindowIcmps	Number of ICMP error messages received that do not match the current TCP window	count	host, container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
netstat_TcpExt_PruneCalled	Number of times buffer pruning was triggered by insufficient memory	count	host, container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
netstat_TcpExt_RcvPruned	Number of times the receive queue was pruned (packets dropped) because of insufficient memory	count	host, container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
netstat_TcpExt_SyncookiesFailed	Number of SYN cookies that failed validation	count	host, container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
netstat_TcpExt_SyncookiesRecv	Number of SYN cookies received	count	host, container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
netstat_TcpExt_SyncookiesSent	Number of SYN cookies sent	count	host, container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
netstat_TcpExt_TCPACKSkippedChallenge	Number of ACKs skipped while handling challenge ACKs	count	host, container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
netstat_TcpExt_TCPACKSkippedFinWait2	Number of ACKs skipped in the FIN-WAIT-2 state	count	host, container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
netstat_TcpExt_TCPACKSkippedPAWS	Number of ACKs skipped because the PAWS check failed	count	host, container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
netstat_TcpExt_TCPACKSkippedSeq	Number of ACKs skipped because of the sequence-number check	count	host, container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
netstat_TcpExt_TCPACKSkippedTimeWait	Number of ACKs skipped in the TIME-WAIT state	count	host, container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
netstat_TcpExt_TCPAbortOnClose	Number of times a user-space program closed a connection while data remained in the buffer	count	host, container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
netstat_TcpExt_TCPAbortOnData	Number of connections closed because unexpected data was received	count	host, container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
netstat_TcpExt_TCPAbortOnLinger	Number of connections aborted after the wait in the LINGER state timed out	count	host, container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
netstat_TcpExt_TCPAbortOnMemory	Number of connections closed because of memory problems	count	host, container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
netstat_TcpExt_TCPAbortOnTimeout	Number of connections closed because the retransmission count of a timer exceeded its limit	count	host, container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
netstat_TcpExt_TCPLossFailures	Number of failed recoveries after packet loss	count	host, container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
netstat_TcpExt_TCPLossProbeRecovery	Number of times a detected packet loss was recovered	count	host, container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
netstat_TcpExt_TCPLossProbes	Number of TCP loss probes sent to detect lost packets; commonly used as a signal of network congestion or packet loss	count	host, container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
netstat_TcpExt_TCPLossUndo	Number of times loss recovery was undone	count	host, container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region
netstat_TcpExt_TCPLostRetransmit	Number of retransmissions caused by packet loss	count	host, container	container_host, container_hostnamespace, container_level, container_name, container_type, host, region

Note: TcpExt exposes a large number of extended counters; consult the official documentation as needed.

Ref:

https://www.kernel.org/doc/html/latest/networking/snmp_counter.html
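
With this many TcpExt counters per container, it is often easier to look only at the ones that are non-zero. The sketch below is one possible way to do that from the text returned by the metrics endpoint. It is a minimal parser written for illustration, not part of HUATUO: the function names are hypothetical, the metric prefix is copied from the sample output above, and sample lines that carry a trailing timestamp are simply ignored.

```python
import re

# One sample line: metric name, optional {labels}, then a numeric value.
SAMPLE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Yield (name, labels, value) for each sample line; skip comments."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, raw_value = match.groups()
        try:
            value = float(raw_value)
        except ValueError:
            continue
        yield name, dict(LABEL_RE.findall(raw_labels or "")), value


def nonzero_tcpext(text, prefix="huatuo_bamai_netstat_container_TcpExt_"):
    """Return {(counter, container_name): value} for non-zero TcpExt samples."""
    result = {}
    for name, labels, value in parse_samples(text):
        if name.startswith(prefix) and value != 0:
            key = (name[len(prefix):], labels.get("container_name", ""))
            result[key] = value
    return result
```

Applied to the sample output above, only counters such as TW and TWRecycled would remain, which makes it quicker to see where to look next.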

Socket

# HELP huatuo_bamai_sockstat_container_FRAG_inuse Number of FRAG sockets in state inuse.
# TYPE huatuo_bamai_sockstat_container_FRAG_inuse gauge
huatuo_bamai_sockstat_container_FRAG_inuse{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_sockstat_container_FRAG_memory Number of FRAG sockets in state memory.
# TYPE huatuo_bamai_sockstat_container_FRAG_memory gauge
huatuo_bamai_sockstat_container_FRAG_memory{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_sockstat_container_RAW_inuse Number of RAW sockets in state inuse.
# TYPE huatuo_bamai_sockstat_container_RAW_inuse gauge
huatuo_bamai_sockstat_container_RAW_inuse{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_sockstat_container_TCP_alloc Number of TCP sockets in state alloc.
# TYPE huatuo_bamai_sockstat_container_TCP_alloc gauge
huatuo_bamai_sockstat_container_TCP_alloc{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 171
# HELP huatuo_bamai_sockstat_container_TCP_inuse Number of TCP sockets in state inuse.
# TYPE huatuo_bamai_sockstat_container_TCP_inuse gauge
huatuo_bamai_sockstat_container_TCP_inuse{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 1
# HELP huatuo_bamai_sockstat_container_TCP_orphan Number of TCP sockets in state orphan.
# TYPE huatuo_bamai_sockstat_container_TCP_orphan gauge
huatuo_bamai_sockstat_container_TCP_orphan{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_sockstat_container_TCP_tw Number of TCP sockets in state tw.
# TYPE huatuo_bamai_sockstat_container_TCP_tw gauge
huatuo_bamai_sockstat_container_TCP_tw{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 75
# HELP huatuo_bamai_sockstat_container_UDPLITE_inuse Number of UDPLITE sockets in state inuse.
# TYPE huatuo_bamai_sockstat_container_UDPLITE_inuse gauge
huatuo_bamai_sockstat_container_UDPLITE_inuse{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_sockstat_container_UDP_inuse Number of UDP sockets in state inuse.
# TYPE huatuo_bamai_sockstat_container_UDP_inuse gauge
huatuo_bamai_sockstat_container_UDP_inuse{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 0
# HELP huatuo_bamai_sockstat_container_sockets_used Number of IPv4 sockets in use.
# TYPE huatuo_bamai_sockstat_container_sockets_used gauge
huatuo_bamai_sockstat_container_sockets_used{container_host="coredns-855c4dd65d-8v5kg",container_hostnamespace="kube-system",container_level="burstable",container_name="coredns",container_type="normal",host="hostname",region="dev"} 7
# HELP huatuo_bamai_sockstat_sockets_used Number of IPv4 sockets in use.
# TYPE huatuo_bamai_sockstat_sockets_used gauge
huatuo_bamai_sockstat_sockets_used{host="hostname",region="dev"} 409

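Metric	Meaning	Unit	Scope
sockstat_sockets_used	Total number of socket descriptors currently in use system-wide	count	system
sockstat_TCP_inuse	Number of sockets currently in a TCP connection state (ESTABLISHED, LISTEN, and so on, excluding TIME_WAIT)	count	host, container
sockstat_TCP_orphan	Number of orphaned TCP sockets; this usually means the application has closed the socket but the TCP connection has not yet terminated	count	host, container
sockstat_TCP_tw	Number of TCP sockets currently in the TIME_WAIT state	count	host, container
sockstat_TCP_alloc	Total number of TCP socket objects currently allocated	count	host, container
sockstat_TCP_mem	Number of kernel memory pages currently used by TCP sockets	pages	system
sockstat_UDP_inuse	Number of UDP sockets currently bound to a local port	count	host, container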
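
The metric names above (sockets_used, TCP_inuse, FRAG_memory, and so on) mirror the layout of the kernel's /proc/net/sockstat file. Assuming the values come from that file, which this page does not state explicitly, the following sketch shows how such text can be turned into name/value pairs. The function name is hypothetical and the input text is illustrative.

```python
def parse_sockstat(text):
    """Parse /proc/net/sockstat-style text into {"TCP_inuse": 1, ...}."""
    stats = {}
    for line in text.splitlines():
        proto, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        # Fields come in "name value" pairs, e.g. "inuse 1 orphan 0".
        for key, value in zip(fields[0::2], fields[1::2]):
            stats[f"{proto.strip()}_{key}"] = int(value)
    return stats


# Illustrative input in the /proc/net/sockstat layout (values are made up).
EXAMPLE = """sockets: used 409
TCP: inuse 1 orphan 0 tw 75 alloc 171 mem 3
UDP: inuse 0 mem 0
FRAG: inuse 0 memory 0
"""

if __name__ == "__main__":
    for name, value in sorted(parse_sockstat(EXAMPLE).items()):
        print(name, value)
```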

IO

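iolatency reports the distribution of disk I/O latency. You can think of it as splitting one disk request into several stages and measuring how long each stage takes.

q2c: from the request entering the queue to its completion; reflects the latency of the whole I/O life cycle
d2c: from dispatch to the driver to completion; closer to the time spent in the disk and the driver themselves
freeze: number of disk freeze events

Queue

All of these metrics automatically carry the common labels host and region. Container-level metrics additionally always carry the container_host, container_name, container_type, container_level, and container_hostnamespace labels.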

# HELP huatuo_bamai_iolatency_blkdisk_d2c the disk d2c latency
# TYPE huatuo_bamai_iolatency_blkdisk_d2c gauge
huatuo_bamai_iolatency_blkdisk_d2c{disk="253:1",host="hostname",region="dev",zone="0"} 3
# HELP huatuo_bamai_iolatency_blkdisk_q2c the disk q2c latency
# TYPE huatuo_bamai_iolatency_blkdisk_q2c gauge
huatuo_bamai_iolatency_blkdisk_q2c{disk="253:1",host="hostname",region="dev",zone="0"} 3
# HELP huatuo_bamai_iolatency_container_blkdisk_d2c container blkio d2c latency
# TYPE huatuo_bamai_iolatency_container_blkdisk_d2c gauge
huatuo_bamai_iolatency_container_blkdisk_d2c{container_host="etcd-hostname",container_hostnamespace="kube-system",container_level="burstable",container_name="etcd",container_type="normal",disk="253:1",host="hostname",region="dev",zone="5"} 2
# HELP huatuo_bamai_iolatency_container_blkdisk_q2c container blkio q2c latency
# TYPE huatuo_bamai_iolatency_container_blkdisk_q2c gauge
huatuo_bamai_iolatency_container_blkdisk_q2c{container_host="etcd-hostname",container_hostnamespace="kube-system",container_level="burstable",container_name="etcd",container_type="normal",disk="253:1",host="hostname",region="dev",zone="5"} 2

Metric	Meaning	Unit	Scope	Labels
iolatency_blkdisk_q2c	Latency statistics for the full I/O life cycle on host disks, from queueing to completion. Buckets: zone0 20-30 ms, zone1 30-50 ms, zone2 50-100 ms, zone3 100-200 ms, zone4 200-400 ms, zone5 400 ms+	count	host	host, region, disk, zone
iolatency_blkdisk_d2c	Latency statistics for the driver-to-completion stage on host disks; closer to the device's own processing time. Buckets: zone0 20-30 ms, zone1 30-50 ms, zone2 50-100 ms, zone3 100-200 ms, zone4 200-400 ms, zone5 400 ms+	count	host	host, region, disk, zone
iolatency_container_blkdisk_q2c	Latency statistics for the full I/O life cycle of I/O issued by a container, from queueing to completion. Buckets: zone0 20-30 ms, zone1 30-50 ms, zone2 50-100 ms, zone3 100-200 ms, zone4 200-400 ms, zone5 400 ms+	count	container	host, region, container_host, container_name, container_type, container_level, container_hostnamespace, disk, zone
iolatency_container_blkdisk_d2c	Latency statistics for the driver-to-completion stage of I/O issued by a container. Buckets: zone0 20-30 ms, zone1 30-50 ms, zone2 50-100 ms, zone3 100-200 ms, zone4 200-400 ms, zone5 400 ms+	count	container	host, region, container_host, container_name, container_type, container_level, container_hostnamespace, disk, zone
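
To make the zone buckets concrete, the sketch below maps a latency in milliseconds to the zone index used in the zone label. The bucket bounds come from the table above. Two details are assumptions made for illustration: each lower bound is treated as inclusive, and latencies below 20 ms are treated as not counted. The function name is hypothetical.

```python
import bisect

# Lower bounds (ms) of zone0..zone5, taken from the table above.
ZONE_LOWER_BOUNDS_MS = [20, 30, 50, 100, 200, 400]


def latency_zone(latency_ms):
    """Return the zone index for a latency, or None if it is below 20 ms."""
    index = bisect.bisect_right(ZONE_LOWER_BOUNDS_MS, latency_ms) - 1
    return index if index >= 0 else None


if __name__ == "__main__":
    for sample_ms in (5, 25, 30, 150, 450):
        print(sample_ms, "ms ->", latency_zone(sample_ms))
```

Because the metric value is a count per zone, a growing zone5 series for one disk points to requests that took 400 ms or longer.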

Hardware

# HELP huatuo_bamai_iolatency_blkdisk_freeze the disk freeze event count
# TYPE huatuo_bamai_iolatency_blkdisk_freeze gauge
huatuo_bamai_iolatency_blkdisk_freeze{disk="253:1",host="hostname",region="dev"} 0

Metric	Meaning	Unit	Scope	Labels
iolatency_blkdisk_freeze	Number of freeze events on host disks	count	host	host, region, disk

General system

Soft Lockup

# HELP huatuo_bamai_softlockup_total softlockup counter
# TYPE huatuo_bamai_softlockup_total counter
huatuo_bamai_softlockup_total{host="hostname",region="dev"} 0

Metric	Meaning	Unit	Scope	Source	Labels
softlockup_total	Number of system softlockup events	count	physical host	BPF	host, region

HungTask

# HELP huatuo_bamai_hungtask_total hungtask counter
# TYPE huatuo_bamai_hungtask_total counter
huatuo_bamai_hungtask_total{host="hostname",region="dev"} 0

Metric	Meaning	Unit	Scope	Source	Labels
hungtask_total	Number of system hungtask events	count	physical host	BPF	host, region

GPU

GPU platforms supported by the current version:

MetaX

Metric	Description	Unit	Dimensions	Source
metax_gpu_sdk_info	GPU SDK information	-	version	sml.GetSDKVersion
metax_gpu_driver_info	GPU driver information	-	version	sml.GetGPUVersion with driver unit
metax_gpu_info	Basic GPU information	-	gpu
metax_gpu_board_power_watts	GPU board power consumption	watts (W)	gpu	sml.ListGPUBoardWayElectricInfos
metax_gpu_pcie_link_speed_gt_per_second	Current GPU PCIe link speed	GT/s	gpu	sml.GetGPUPcieLinkInfo
metax_gpu_pcie_link_width_lanes	Current GPU PCIe link width	lanes	gpu	sml.GetGPUPcieLinkInfo
metax_gpu_pcie_receive_bytes_per_second	GPU PCIe receive throughput	Bps	gpu	sml.GetGPUPcieThroughputInfo
metax_gpu_pcie_transmit_bytes_per_second	GPU PCIe transmit throughput	Bps	gpu	sml.GetGPUPcieThroughputInfo
metax_gpu_metaxlink_link_speed_gt_per_second	Current GPU MetaXLink link speed	GT/s	gpu, metaxlink	sml.ListGPUMetaXLinkLinkInfos
metax_gpu_metaxlink_link_width_lanes	Current GPU MetaXLink link width	lanes	gpu, metaxlink	sml.ListGPUMetaXLinkLinkInfos
metax_gpu_metaxlink_receive_bytes_per_second	GPU MetaXLink receive throughput	Bps	gpu, metaxlink	sml.ListGPUMetaXLinkThroughputInfos
metax_gpu_metaxlink_transmit_bytes_per_second	GPU MetaXLink transmit throughput	Bps	gpu, metaxlink	sml.ListGPUMetaXLinkThroughputInfos
metax_gpu_metaxlink_receive_bytes_total	Total data received over GPU MetaXLink	bytes	gpu, metaxlink	sml.ListGPUMetaXLinkTrafficStatInfos
metax_gpu_metaxlink_transmit_bytes_total	Total data transmitted over GPU MetaXLink	bytes	gpu, metaxlink	sml.ListGPUMetaXLinkTrafficStatInfos
metax_gpu_metaxlink_aer_errors_total	Number of GPU MetaXLink AER errors	count	gpu, metaxlink, error_type	sml.ListGPUMetaXLinkAerErrorsInfos
metax_gpu_status	GPU status	-	gpu, die	sml.GetDieStatus
metax_gpu_temperature_celsius	GPU temperature	degrees Celsius	gpu, die	sml.GetDieTemperature
metax_gpu_utilization_percent	GPU utilization (0–100)	%	gpu, die, ip	sml.GetDieUtilization
metax_gpu_memory_total_bytes	Total GPU memory capacity	bytes	gpu, die	sml.GetDieMemoryInfo
metax_gpu_memory_used_bytes	GPU memory in use	bytes	gpu, die	sml.GetDieMemoryInfo
metax_gpu_clock_mhz	GPU clock frequency	MHz	gpu, die, ip	sml.ListDieClocks
metax_gpu_clocks_throttling	Reason for GPU clock throttling	-	gpu, die, reason	sml.GetDieClocksThrottleStatus
metax_gpu_dpm_performance_level	GPU DPM performance level	-	gpu, die, ip	sml.GetDieDPMPerformanceLevel
metax_gpu_ecc_memory_errors_total	Number of GPU ECC memory errors	count	gpu, die, memory_type, error_type	sml.GetDieECCMemoryInfo
metax_gpu_ecc_memory_retired_pages_total	Number of GPU memory pages retired because of ECC errors	count	gpu, die	sml.GetDieECCMemoryInfo
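
The table has no ready-made memory utilization metric, but one can be derived from metax_gpu_memory_used_bytes and metax_gpu_memory_total_bytes, which share the gpu and die dimensions. The sketch below shows the arithmetic on already-collected samples. The function name, the dictionary layout, and the label values are hypothetical and chosen only for illustration.

```python
def memory_usage_percent(used, total):
    """Combine per-die samples keyed by (gpu, die) into a usage percentage.

    used / total: {(gpu, die): bytes}. Dies without both samples are skipped.
    """
    usage = {}
    for key, total_bytes in total.items():
        if total_bytes > 0 and key in used:
            usage[key] = 100.0 * used[key] / total_bytes
    return usage


if __name__ == "__main__":
    GIB = 2 ** 30
    total = {("0", "0"): 64 * GIB, ("1", "0"): 64 * GIB}
    used = {("0", "0"): 16 * GIB}
    print(memory_usage_percent(used, total))
```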

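5.2 - Anomaly Event Diagnosis

🎯 About HUATUO (华佗)

HUATUO (华佗) is an operating-system deep-observability project open-sourced by DiDi and incubated under the CCF (China Computer Federation). It focuses on providing kernel-level deep observability for cloud-native general-purpose computing, AI computing, cloud services, and infrastructure services.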

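📖 Overview

Built on eBPF, HUATUO observes anomaly events in real time across core Linux kernel subsystems: CPU scheduling, the memory subsystem, the network protocol stack, and hardware errors. When the kernel enters an abnormal state such as a softlockup, an OOM, or a hardware MCE, eBPF programs attached to kernel functions (kprobes) or kernel tracepoints capture on-site data such as process information, kernel call stacks, and network context at the moment the event occurs. The data is passed through a perf event ring buffer to a user-space handler and is finally persisted to Elasticsearch or to files on the local disk.

Compared with traditional collection based on kernel logs (dmesg/syslog), eBPF event observation carries a lower risk of data loss, because key events are not lost when the kernel log buffer overflows. It can also capture transient anomalies that are never written to the kernel log (for example, softirqs being disabled for too long), and it attaches container-level correlation information to each event, which supports precise fault localization in cloud-native environments.

Eleven event types are currently observed continuously, covering CPU scheduling health (softirq_tracing, softlockup, hungtask), memory pressure (oom, memory_reclaim_events), the network protocol stack (dropwatch, net_rx_latency, netdev_events, netdev_bonding_lacp, netdev_txqueue_timeout), and hardware reliability (ras).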

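🎯 Scenarios

Locating container memory faults in Kubernetes: when containers are repeatedly OOM-killed and restarted, the oom event records the memcg cgroup pointer and container ID of both the process terminated by the OOM killer (victim) and the process that triggered the OOM (trigger). Combined with time-series data, this quickly identifies the container at the root of the memory contention and reduces the time spent manually combing through container logs.

Hardware fault awareness in AI training clusters: on GPU training servers, the ras event continuously collects MCE (Machine Check Exception) errors, EDAC memory-controller errors, and PCIe AER (Advanced Error Reporting) errors, and grades them by severity (Corrected / UncorrectedRecoverable / UncorrectedFatal). Hardware aging or a single point of failure can be detected before a training job is interrupted, which reduces training losses caused by hardware faults.

Network latency-spike analysis: dropwatch observes packet drops in the TCP protocol stack (including the syn_flood and listen_overflow types). net_rx_latency measures the latency of a single packet along the full receive path from the NIC driver to user space, with a separate threshold per stage (NIC to kernel, kernel to TCP, TCP to user space; defaults 5 ms / 10 ms / 115 ms). This pinpoints the network layer responsible for an application timeout and speeds up root-cause analysis of network problems.

Host scheduling health: softirq_tracing (softirq-disabled time, default threshold 10 ms), softlockup (a CPU that cannot schedule, about 1 second), and hungtask (tasks hung in the D state) together cover abnormal states on the CPU scheduling path. When the system stalls or responses time out, diagnostic information such as kernel call stacks is preserved automatically, which allows offline analysis after the fault has disappeared.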

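🚀 Usage

Configuration parameters

Each event can be tuned with the parameters below. All parameters have default values, so the events run without any configuration:

Parameter	Default	Description
`softirq.disabled_threshold`	`10000000` (10 ms, in nanoseconds)	Trigger threshold for softirq-disabled time
`memory_reclaim.blocked_threshold`	`900000000` (900 ms, in nanoseconds)	Trigger threshold for direct memory reclaim time
`net_rx_latency.driver2net_rx`	`5` (ms)	Latency threshold from the NIC driver to `__netif_receive_skb`
`net_rx_latency.driver2tcp`	`10` (ms)	Latency threshold from the NIC driver to `tcp_v4_rcv`
`net_rx_latency.driver2userspace`	`115` (ms)	Latency threshold from the NIC driver to the user-space copy (`skb_copy_datagram_iovec`)
`net_rx_latency.excluded_host_netnamespace`	`true`	Whether to filter out the host network namespace (by default only containers are observed)
`net_rx_latency.excluded_container_qos`	`[]`	List of container QoS levels to exclude
`dropwatch.excluded_neigh_invalidate`	`true`	Whether to filter out neighbor-table drop noise caused by `neigh_invalidate`
`netdev.device_list`	`[]`	List of NIC device names whose link state should be monitored
`ras.mce_thr_backoff`	`1800` (seconds)	Cooldown for reporting MCE threshold-interrupt (THR) events, to prevent interrupt storms
`issues_list`	`[]`	List of filter rules for known issues (used by net_rx_latency)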
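
A cooldown such as `ras.mce_thr_backoff` can be pictured as a gate that lets one report through and then suppresses repeats until the cooldown has elapsed. The sketch below illustrates that idea only and is not HUATUO's implementation. The class name, the per-key bookkeeping, and the choice that suppressed events do not extend the cooldown are all assumptions.

```python
class CooldownGate:
    """Allow one report per key, then suppress repeats during the cooldown."""

    def __init__(self, cooldown_seconds=1800):
        self.cooldown_seconds = cooldown_seconds
        self._last_report = {}  # key -> timestamp of the last allowed report

    def allow(self, key, now):
        """Return True if an event for `key` at time `now` may be reported."""
        last = self._last_report.get(key)
        if last is not None and now - last < self.cooldown_seconds:
            return False  # still cooling down; a suppressed event resets nothing
        self._last_report[key] = now
        return True


if __name__ == "__main__":
    gate = CooldownGate()
    # Hypothetical key: the CPU on which the threshold interrupt fired.
    for timestamp in (0, 60, 1799, 1800, 1900):
        print(timestamp, gate.allow("cpu0", timestamp))
```

With the default of 1800 seconds, a storm of threshold interrupts from one source produces at most one report every 30 minutes.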

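Event list

Event name (tracer_name)	Probe type	Trigger condition	Typical scenarios
`softirq_tracing`	kprobe	Softirq-disabled time > threshold (default 10 ms)	System stalls, network latency, scheduling latency
`softlockup`	kprobe	A CPU cannot schedule for a long time (about 1 second)	Soft lockups, abnormal responsiveness
`hungtask`	kprobe	Tasks hung in the D state	Bursts of D-state processes, blocked I/O
`oom`	kprobe	OOM killer invoked	Container or host memory exhaustion
`memory_reclaim_events`	kprobe	Direct reclaim time of a container process > threshold (default 900 ms)	Memory pressure causing application stalls
`ras`	tracepoint	CPU/memory/PCIe hardware errors	Hardware fault awareness
`dropwatch`	kprobe	Packet drops in the TCP protocol stack	Protocol-stack drops causing application latency spikes
`net_rx_latency`	kprobe	Protocol-stack receive latency exceeds a per-stage threshold	Receive latency causing application timeouts
`netdev_events`	netlink	NIC link state changes	Physical NIC link failures
`netdev_bonding_lacp`	kprobe	LACP protocol state changes (802.3ad mode only)	Determining whether a fault lies with the host or the switch
`netdev_txqueue_timeout`	kprobe	NIC transmit queue timeout	Hardware faults in the NIC transmit queue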

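Common fields

All event data contains the following common fields:

hostname: hostname of the physical host
region: availability zone in which the physical host is located
uploaded_time: time at which the data was uploaded
container_id: container ID, recorded when the event is associated with a container
container_hostname: container hostname, recorded when the event is associated with a container
container_host_namespace: Kubernetes namespace of the container, recorded when the event is associated with a container
container_type: container type, for example normal for an ordinary container or sidecar for a sidecar container
container_qos: container QoS level
tracer_name: event name (such as softirq_tracing or oom)
tracer_id: ID of this tracing run
tracer_time: time at which tracing was triggered
tracer_type: trigger type (manual or automatic)
tracer_data: event-specific data (see the description of each event)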

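1. softirq_tracing: softirqs disabled

Description: Triggered when the kernel keeps softirqs disabled for too long. It records key data such as the kernel call stack during the softirq-disabled period and information about the current process, which helps analyze interrupt-related latency. A filter automatically excludes noise events produced by the ksoftirqd and swapper processes.

Data storage: Event data is stored automatically in Elasticsearch or in files on the physical host's disk.

Example data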

{
    "uploaded_time": "2025-06-11T16:05:16.251152703+08:00",
    "hostname": "***",
    "tracer_data": {
        "offtime": 237328905,
        "threshold": 10000000,
        "comm": "***-agent",
        "pid": 688073,
        "cpu": 1,
        "now": 5532940660025295,
        "stack": "scheduler_tick/..."
    },
    "tracer_time": "2025-06-11 16:05:16.251 +0800",
    "tracer_type": "auto",
    "time": "2025-06-11 16:05:16.251 +0800",
    "region": "***",
    "tracer_name": "softirq_tracing"
}

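Field descriptions

comm: name of the process that triggered the event
stack: kernel call stack during the softirq-disabled period
now: monotonic clock timestamp at which the event occurred (nanoseconds)
offtime: how long softirqs were disabled (nanoseconds)
cpu: number of the CPU on which the event occurred
threshold: trigger threshold (nanoseconds); an event is recorded when it is exceeded
pid: ID of the process that triggered the event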
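
Because offtime and threshold are both in nanoseconds, a small helper makes stored events easier to read. The sketch below takes an event shaped like the example above, converts both values to milliseconds, and reports how far the threshold was exceeded. The function name and the output keys are hypothetical, and the embedded event is a trimmed copy of the example.

```python
import json

# Trimmed copy of the example event above.
EVENT_JSON = """
{
    "tracer_name": "softirq_tracing",
    "tracer_data": {
        "offtime": 237328905,
        "threshold": 10000000,
        "comm": "***-agent",
        "pid": 688073,
        "cpu": 1
    }
}
"""


def summarize_softirq(event):
    """Convert the nanosecond fields to milliseconds and compare them."""
    data = event["tracer_data"]
    return {
        "comm": data["comm"],
        "cpu": data["cpu"],
        "offtime_ms": data["offtime"] / 1e6,
        "threshold_ms": data["threshold"] / 1e6,
        # How many times longer than the threshold softirqs stayed disabled.
        "ratio": data["offtime"] / data["threshold"],
    }


if __name__ == "__main__":
    print(summarize_softirq(json.loads(EVENT_JSON)))
```

For the example event this gives roughly 237 ms of softirq-disabled time against a 10 ms threshold.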

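2. dropwatch: protocol-stack packet drops

Description: Detects packet drops in the kernel network protocol stack and outputs the kernel call stack, the network five-tuple, the TCP state, and other information at the time of the drop. Four drop types are recognized: common_drop (generic drop), syn_flood (SYN flood), listen_overflow_handshake1 (half-open connection queue overflow), and listen_overflow_handshake3 (accept queue overflow). By default a filter excludes known noise such as drops caused by neigh_invalidate neighbor-table expiry and transmit-side drops in the bnxt driver.

Data storage: Stored automatically in Elasticsearch or in files on the physical host's disk.

Example data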

{
    "tracer_data": {
        "type": "common_drop",
        "comm": "kubelet",
        "pid": 1687046,
        "saddr": "10.79.68.62",
        "daddr": "10.134.72.4",
        "sport": 8080,
        "dport": 49000,
        "src_hostname": "<nil>",
        "dest_hostname": "<nil>",
        "max_ack_backlog": 128,
        "seq": 1009085774,
        "ack_seq": 689410995,
        "pkt_len": 1460,
        "sk_state": "ESTABLISHED",
        "stack": "kfree_skb/...",
        "netdev_queue_mapping": 3,
        "netdev_linkstatus": ["linkStatusUp"],
        "netdev_name": "eth0",
        "netdev_ifindex": 2,
        "net_cookie": 123456789
    }
}

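Field descriptions

type: drop type (common_drop / syn_flood / listen_overflow_handshake1 / listen_overflow_handshake3)
comm: name of the process that triggered the drop
pid: process ID
saddr / daddr: source / destination IP address
sport / dport: source / destination port
src_hostname / dest_hostname: reverse DNS lookup result for the source / destination IP
max_ack_backlog: maximum accept-queue length of the socket
seq / ack_seq: TCP sequence number / acknowledgment number
pkt_len: packet length (bytes)
sk_state: TCP connection state at the time of the drop
stack: kernel call stack at the time of the drop
netdev_queue_mapping: NIC queue index
netdev_linkstatus: list of NIC link status flags
netdev_name: NIC device name
netdev_ifindex: NIC interface index
net_cookie: network namespace identifier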
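
When many dropwatch events have accumulated, grouping them by drop type and connection helps separate a single noisy flow from a widespread problem. The sketch below is one way to do that over events that have already been loaded as dictionaries. The function names are hypothetical, and the second event in the demo is invented for illustration.

```python
from collections import Counter


def flow(data):
    """Format the connection of a dropwatch record as 'src:port -> dst:port'."""
    return f'{data["saddr"]}:{data["sport"]} -> {data["daddr"]}:{data["dport"]}'


def count_drops(events):
    """Count events per (drop type, flow, TCP state)."""
    counts = Counter()
    for event in events:
        data = event["tracer_data"]
        counts[(data["type"], flow(data), data["sk_state"])] += 1
    return counts


if __name__ == "__main__":
    first = {"tracer_data": {
        "type": "common_drop", "sk_state": "ESTABLISHED",
        "saddr": "10.79.68.62", "sport": 8080,
        "daddr": "10.134.72.4", "dport": 49000,
    }}
    # Invented second record, only to show a different drop type.
    second = {"tracer_data": {
        "type": "syn_flood", "sk_state": "LISTEN",
        "saddr": "10.79.68.62", "sport": 8080,
        "daddr": "10.134.72.4", "dport": 49001,
    }}
    for key, count in count_drops([first, first, second]).most_common():
        print(count, key)
```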

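3. net_rx_latency: protocol-stack receive latency

Description: Detects per-stage latency events in the receive direction of the protocol stack (NIC driver → kernel protocol stack → active receive by user space). Three observation points are placed on the receive path, and an event is triggered when the latency at any of them exceeds its threshold (defaults: 5 ms at the NIC-to-kernel point, 10 ms at the TCP point, 115 ms at the user-space copy point). The event records the network five-tuple, the TCP sequence number, the stage at which the latency occurred, and the latency value. Packets in all TCP states are observed, not only ESTABLISHED, so receive latency in states such as SYN, FIN, and TIME_WAIT is also captured. By default the host network namespace is filtered out and only container networks are observed.

Data storage: Stored automatically in Elasticsearch or in files on the physical host's disk.

Example data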

{
    "tracer_data": {
        "comm": "nginx",
        "pid": 2921092,
        "lat_stage": "RX_STAGE_USERCOPY",
        "lat_ms": 95973,
        "tcp_state": "ESTABLISHED",
        "tcp_saddr": "10.156.248.76",
        "tcp_daddr": "10.134.72.4",
        "tcp_sport": 9213,
        "tcp_dport": 49000,
        "tcp_seq": 1009085774,
        "tcp_ack_seq": 689410995,
        "net_namespace_cookie": 123456789,
        "net_namespace_inode": 402653184,
        "pkt_len": 26064
    }
}

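Field descriptions

comm: name of the process that triggered the event
pid: ID of the process that triggered the event
lat_stage: stage at which the latency occurred (RX_STAGE_NETIF NIC to kernel / RX_STAGE_TCPV4 kernel to TCP / RX_STAGE_USERCOPY TCP to user space)
lat_ms: actual latency (milliseconds)
tcp_state: TCP connection state (all states are supported, such as ESTABLISHED, SYN_SENT, FIN_WAIT, and TIME_WAIT)
tcp_saddr / tcp_daddr: source / destination IP address
tcp_sport / tcp_dport: source / destination port
tcp_seq / tcp_ack_seq: TCP sequence number / acknowledgment number
net_namespace_cookie: network namespace cookie (available on kernels ≥ 5.14; used for efficient container correlation)
net_namespace_inode: network namespace inode
pkt_len: packet length (bytes)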
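
To judge how severe a recorded latency is, lat_ms can be compared with the default threshold for the stage named in lat_stage. The sketch below does this for a record shaped like the example above. The pairing of stage names with the three default thresholds (5, 10, and 115 ms) is an assumption based on the order in which they are listed on this page, and the function name is hypothetical.

```python
# Assumed pairing of lat_stage values with the documented default thresholds.
DEFAULT_THRESHOLDS_MS = {
    "RX_STAGE_NETIF": 5,
    "RX_STAGE_TCPV4": 10,
    "RX_STAGE_USERCOPY": 115,
}


def excess_latency_ms(data, thresholds=DEFAULT_THRESHOLDS_MS):
    """Return how many ms lat_ms exceeds its stage threshold (0 if it does not)."""
    threshold = thresholds[data["lat_stage"]]
    return max(0, data["lat_ms"] - threshold)


if __name__ == "__main__":
    record = {"comm": "nginx", "lat_stage": "RX_STAGE_USERCOPY", "lat_ms": 95973}
    print(record["comm"], record["lat_stage"], excess_latency_ms(record), "ms over")
```

If the thresholds were changed in the configuration, pass the configured values through the `thresholds` argument instead of relying on the defaults.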

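4. oom: out of memory

Description: Detects OOM (Out of Memory) events on the host or inside containers. It records information about the process terminated by the OOM killer (victim) and the process that triggered the OOM (trigger), together with details of the corresponding containers and memory cgroups, which gives a complete snapshot of the fault. OOM counter metrics are also maintained for the host and for each container.

Data storage: Stored automatically in Elasticsearch or in files on the physical host's disk.

Example data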

{
    "tracer_data": {
        "trigger_memcg_css": "0xff4b8d8be3818000",
        "trigger_container_id": "***",
        "trigger_container_hostname": "***.docker",
        "trigger_pid": 3218804,
        "trigger_process_name": "java",
        "victim_memcg_css": "0xff4b8d8be3818000",
        "victim_container_id": "***",
        "victim_container_hostname": "***.docker",
        "victim_pid": 3218745,
        "victim_process_name": "java",
        "cgroup_memory_limit": 2147483648,
        "cgroup_memory_usage": 2143289344,
        "memory_snapshot": {
            "top_processes": [
                {
                    "pid": 3218745,
                    "process_name": "java",
                    "vm_rss": 1604321280,
                    "rss_anon": 1509949440,
                    "rss_file": 83886080,
                    "rss_shmem": 0,
                    "vm_swap": 0,
                    "total": 1593835520
                }
            ],
            "host_meminfo": {
                "MemAvailable": 3355443200,
                "Cached": 1073741824,
                "Slab": 268435456
            },
            "victim_cgroup": {
                "container_id": "***",
                "cgroup_path": "kubepods.slice/...",
                "current": 2143289344,
                "max": 2147483648,
                "stat": {
                    "anon": 1509949440,
                    "file": 83886080
                },
                "events": {
                    "oom": 1,
                    "oom_kill": 1
                }
            }
        }
    }
}

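Field descriptions

victim_process_name / victim_pid: name and PID of the process terminated by the OOM killer
victim_container_hostname / victim_container_id: hostname and ID of the container that owned the terminated process
victim_memcg_css: memory cgroup pointer of the terminated process (hexadecimal)
trigger_process_name / trigger_pid: name and PID of the process that triggered the OOM
trigger_container_hostname / trigger_container_id: hostname and ID of the container that owned the triggering process
trigger_memcg_css: memory cgroup pointer of the triggering process (hexadecimal)
cgroup_memory_limit / cgroup_memory_usage: cgroup memory limit and usage carried in the kernel event
memory_snapshot.top_processes: top processes at the time of the OOM, sorted by RSS/swap, with RssAnon, RssFile, RssShmem, VmRSS and VmSwap
memory_snapshot.host_meminfo: key host /proc/meminfo fields at the time of the OOM, such as MemAvailable, Cached, Slab, swap, and active anon/file pages
memory_snapshot.trigger_cgroup / victim_cgroup: cgroup path, current/max, memory.stat and memory.events of the triggering and victim containers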
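As a reading aid for the fields above, here is a minimal Python sketch that condenses an oom event into a few derived values. The helper name `summarize_oom` and the trimmed `SAMPLE_OOM_EVENT` are illustrative assumptions, not part of HUATUO; only the JSON keys come from the sample data.

```python
# Hypothetical helper; only the JSON keys are taken from the sample above.
SAMPLE_OOM_EVENT = {
    "tracer_data": {
        "trigger_memcg_css": "0xff4b8d8be3818000",
        "trigger_pid": 3218804,
        "trigger_process_name": "java",
        "victim_memcg_css": "0xff4b8d8be3818000",
        "victim_pid": 3218745,
        "victim_process_name": "java",
        "cgroup_memory_limit": 2147483648,
        "cgroup_memory_usage": 2143289344,
        "memory_snapshot": {
            "top_processes": [
                {"pid": 3218745, "process_name": "java", "vm_rss": 1604321280}
            ]
        },
    }
}


def summarize_oom(event: dict) -> dict:
    """Derive a short summary from one oom event document."""
    data = event["tracer_data"]
    limit = data["cgroup_memory_limit"]
    usage = data["cgroup_memory_usage"]
    top = data.get("memory_snapshot", {}).get("top_processes", [])
    # Largest resident process in the snapshot, if a snapshot was recorded.
    biggest = max(top, key=lambda p: p["vm_rss"], default=None)
    return {
        "victim": f'{data["victim_process_name"]}({data["victim_pid"]})',
        "trigger": f'{data["trigger_process_name"]}({data["trigger_pid"]})',
        # Equal css pointers mean victim and trigger share one memory cgroup.
        "same_memcg": data["victim_memcg_css"] == data["trigger_memcg_css"],
        "usage_ratio": usage / limit if limit else None,
        "largest_rss_pid": biggest["pid"] if biggest else None,
    }
```

A usage ratio close to 1.0 together with `same_memcg` being true suggests the container hit its own memory limit rather than the host running out of memory.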

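5. softlockup (soft lockup)

Description: Detects softlockup events (a CPU on which tasks cannot be scheduled for an extended period, about 1 second) and reports the process responsible for the lockup, the CPU it ran on, and the NMI backtraces of all CPUs. A backoff policy is applied: within one event storm the reporting interval grows from 10 minutes up to a maximum of 3 hours, which prevents duplicate reports. A counter of softlockup occurrences is also maintained.

Data storage: Stored automatically in Elasticsearch or in files on the host's local disk.

Sample data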

{
    "tracer_data": {
        "cpu": 15,
        "pid": 12345,
        "comm": "kworker/15:0",
        "cpus_stack": "2025-06-10 14:30:22 sysrq: Show backtrace of all active CPUs\nNMI backtrace for cpu 15\n..."
    }
}

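Field descriptions

cpu: number of the CPU on which the softlockup occurred
pid: PID of the process that triggered the softlockup
comm: name of the process that triggered the softlockup
cpus_stack: NMI backtraces of all CPUs (multi-line text with timestamps and call stacks)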
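The backoff policy described for softlockup reporting (interval growing from 10 minutes to at most 3 hours within one event storm) can be sketched as follows. The doubling factor and the rule that a quiet period longer than the maximum interval ends the storm are assumptions made for illustration; the text does not state how HUATUO grows or resets the interval.

```python
class ReportBackoff:
    """Illustrative backoff gate: 10 min initial interval, capped at 3 h."""

    def __init__(self, initial=600.0, maximum=10800.0, factor=2.0):
        self.initial = initial
        self.maximum = maximum
        self.factor = factor          # assumed growth factor
        self.interval = initial
        self.next_allowed = None      # earliest time the next report may go out
        self.last_event = None

    def should_report(self, now: float) -> bool:
        # Assumption: a gap longer than `maximum` means the storm is over.
        if self.last_event is not None and now - self.last_event > self.maximum:
            self.interval = self.initial
            self.next_allowed = None
        self.last_event = now
        if self.next_allowed is not None and now < self.next_allowed:
            return False
        self.next_allowed = now + self.interval
        self.interval = min(self.interval * self.factor, self.maximum)
        return True
```

With an event arriving every minute, this gate lets reports through at gaps of 10, 20, 40, 80 and 160 minutes, and every 3 hours after that.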

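6. hungtask (hung task)

Description: Detects hungtask events and captures the kernel stacks of all processes currently in the D state (uninterruptible sleep) along with the backtraces of all CPUs, preserving the state at the time of the fault. A backoff policy is applied: within one event storm the reporting interval grows from 10 minutes up to a maximum of 3 hours. A counter of hungtask occurrences is also maintained. Note: some Linux distributions (such as Fedora 42) disable hungtask detection by default, in which case this observer does not start.

Data storage: Stored automatically in Elasticsearch or in files on the host's local disk.

Sample data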

{
    "tracer_data": {
        "pid": 2567042,
        "comm": "kworker/u48:2",
        "cpus_stack": "2025-06-10 09:57:14 sysrq: Show backtrace of all active CPUs\nNMI backtrace for cpu 33\n...",
        "blocked_processes_stack": "task:java            state:D stack:    0 pid: 12345 ..."
    }
}

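Field descriptions

pid: PID of the process that triggered hungtask detection
comm: name of the process that triggered hungtask detection
cpus_stack: NMI backtraces of all CPUs (multi-line text with timestamps and call stacks)
blocked_processes_stack: kernel stacks of the processes in the D state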

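7. memory_reclaim_events (memory reclaim)

Description: Detects direct memory reclaim (direct reclaim) in container processes. An event is triggered when the time a single process spends in direct reclaim within 1 second exceeds the threshold (900 ms by default); it records the reclaim duration together with process and container information. Note: this observer records reclaim events for container processes only; events from host processes are filtered out.

Data storage: Stored automatically in Elasticsearch or in files on the host's local disk.

Sample data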

{
    "tracer_data": {
        "pid": 1896137,
        "comm": "java",
        "deltatime": 1412702917
    }
}

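Field descriptions

comm: name of the process that triggered direct reclaim
pid: PID of the triggering process
deltatime: time spent in direct reclaim (nanoseconds)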

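8. ras (hardware errors)

Description: Detects CPU, memory, PCIe and other hardware errors through kernel tracepoints. Five error sources are supported: MCE (Machine Check Exception), EDAC (memory controller), ACPI/GHES (non-standard hardware errors), PCIe AER (Advanced Error Reporting), and MCE threshold interrupts (THR). Errors are classified by severity, for example Corrected, UncorrectedRecoverable and UncorrectedFatal; the full set of values is listed under the type field below. MCE threshold interrupt events use a cooldown policy (30 minutes by default) so that an interrupt storm does not produce a flood of duplicate reports.

Data storage: Stored automatically in Elasticsearch or in files on the host's local disk.

MCE sample data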

{
    "tracer_data": {
        "dev": "CPU/MEM",
        "event": "MCE",
        "type": "UncorrectedRecoverable",
        "timestamp": 1749600000000000000,
        "info": "{\"mcg_cpu_cap\":4096,\"banks_msr_status\":9295429630892703744,\"cpu\":2,\"socketid\":0,\"bank\":5}"
    }
}

PCIe AER sample data

{
    "tracer_data": {
        "dev": "PCIe 0000:3b:00.0",
        "event": "AER",
        "type": "UncorrectedRecoverable",
        "timestamp": 1749600000000000000,
        "info": "{\"dev_name\":\"0000:3b:00.0\",\"err_type\":\"UncorrectedRecoverable\",\"err_reason\":\"Completion Timeout\",\"tlp_header\":\"not available\"}"
    }
}

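Field descriptions

dev: hardware device on which the error occurred (e.g. CPU/MEM, PCIe 0000:3b:00.0)
event: error source (MCE / EDAC / NON_STANDARD / AER / MCE_THRESHOLD)
type: error severity (Corrected / UncorrectedRecoverable / UncorrectedDeferred / UncorrectedFatal / Info)
timestamp: timestamp at which the hardware error occurred
info: detailed error information as a JSON-formatted string; its content depends on the event type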
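Because info is itself a JSON document serialized into a string, consumers need a second decoding step. A minimal Python sketch, using the MCE sample above (the function name `decode_ras_event` is a hypothetical helper):

```python
import json

# The MCE sample event from above, as it would arrive from storage.
RAW_EVENT = r'''
{
    "tracer_data": {
        "dev": "CPU/MEM",
        "event": "MCE",
        "type": "UncorrectedRecoverable",
        "timestamp": 1749600000000000000,
        "info": "{\"mcg_cpu_cap\":4096,\"banks_msr_status\":9295429630892703744,\"cpu\":2,\"socketid\":0,\"bank\":5}"
    }
}
'''


def decode_ras_event(raw: str) -> dict:
    """Parse a ras event and expand the nested JSON string in `info`."""
    data = dict(json.loads(raw)["tracer_data"])
    data["info"] = json.loads(data["info"])  # second pass: info is a string
    return data


event = decode_ras_event(RAW_EVENT)
print(event["event"], "on CPU", event["info"]["cpu"], "bank", event["info"]["bank"])
```

Python integers are arbitrary precision, so 64-bit register values such as banks_msr_status survive the round trip; in languages whose JSON parser maps numbers to doubles, they would need to be read as big integers or strings.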

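9. netdev_events (network devices)

Description: Subscribes to kernel netlink RTM_NEWLINK messages to detect NIC link state changes (down/up, MTU change, AdminDown, CarrierDown, and so on), and outputs the interface name, link state, MAC address and driver information. At startup the observer scans the current state of the NICs configured in device_list as a baseline; afterwards only state changes are reported.

Data storage: Stored automatically in Elasticsearch or in files on the host's local disk.

Sample data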

{
    "tracer_data": {
        "ifname": "eth1",
        "index": 3,
        "linkstatus": "linkStatusAdminDown, linkStatusCarrierDown",
        "mac": "5c:6f:69:34:dc:72",
        "start": false,
        "driver": "ixgbe",
        "driver_version": "5.1.0-k",
        "firmware_version": "3.25 0x80000421 1.2163.0"
    }
}

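Field descriptions

ifname: network interface name (e.g. eth1)
index: interface index
linkstatus: description of the link state change (may contain several states)
mac: MAC address of the NIC
start: whether this is a baseline event from the startup scan (true: startup scan, false: live change event)
driver: NIC driver name
driver_version: NIC driver version
firmware_version: NIC firmware version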

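10. netdev_bonding_lacp (LACP protocol)

Description: Detects state changes of LACP (Link Aggregation Control Protocol, IEEE 802.3ad) in bonding mode. It reads and records the complete state of every bonding interface under /proc/net/bonding/, including mode, MII status, Actor/Partner negotiation parameters and slave link status. The observer is enabled automatically only when the system has an interface in IEEE 802.3ad bonding mode.

Data storage: Stored automatically in Elasticsearch or in files on the host's local disk.

Sample data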

{
    "tracer_data": {
        "content": "/proc/net/bonding/bond0\nEthernet Channel Bonding Driver: v4.18.0...\nBonding Mode: IEEE 802.3ad Dynamic link aggregation\nMII Status: down\n..."
    }
}

Field descriptions

content: complete state of the bonding interfaces (multi-line text with the LACP negotiation details of every slave, identical to the content of /proc/net/bonding/bondX)

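11. netdev_txqueue_timeout (TX queue timeout)

Description: Detects NIC transmit queue timeout (TX queue timeout) events and records the index of the timed-out queue, the device name and the driver name, which helps locate hardware faults on the transmit path.

Data storage: Stored automatically in Elasticsearch or in files on the host's local disk.

Sample data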

{
    "tracer_data": {
        "queue_index": 3,
        "device_name": "eth0",
        "driver_name": "ixgbe"
    }
}

Field descriptions

queue_index: index of the transmit queue that timed out
device_name: NIC device name
driver_name: NIC driver name

⚙️ How it works

Overall architecture

HUATUO's abnormal-event observation is built on eBPF. Event context is collected in kernel space with very low overhead, and a user-space daemon then formats and filters the events, associates them with container information, and persists them.

graph TB
    subgraph "Linux Kernel"
        direction TB
        K1["kprobe hooks\n(softirq_tracing / softlockup / hungtask\n oom / memory_reclaim_events / dropwatch\n net_rx_latency / netdev_txqueue_timeout)"]
        K2["tracepoint hooks\n(ras: MCE / EDAC / AER / ACPI)"]
        K3["netlink subscription\n(netdev_events: RTM_NEWLINK)"]
        K4["kprobe hooks\n(netdev_bonding_lacp: 802.3ad)"]
        PEB["Perf event ring buffer\n(8192 pages)"]
    end

    subgraph "HUATUO user space"
        direction TB
        EH["Go event-handling goroutines\n(one per event type)"]
        CF["Filters\n(thresholds / noise reduction / known-issue filtering)"]
        CM["Container association\n(CSS → ContainerID\n NetNS → ContainerID)"]
    end

    subgraph "Storage"
        ES["Elasticsearch"]
        DISK["Local disk files"]
    end

    K1 --> PEB
    K2 --> PEB
    K4 --> PEB
    PEB --> EH
    K3 --> EH
    EH --> CF
    CF --> CM
    CM --> ES
    CM --> DISK

Event processing flow

sequenceDiagram
    participant K as Linux Kernel
    participant B as eBPF Program
    participant P as Perf Event Buffer
    participant H as Go event handler
    participant F as Filter
    participant S as Storage

    K->>B: kprobe / tracepoint fires
    B->>B: Collect context<br/>(process info / kernel stack / network context)
    B->>P: Write to the perf event ring buffer
    H->>P: Read event data (blocking wait)
    H->>F: Format and filter<br/>(thresholds / noise reduction / known issues)
    F->>H: Events that passed the filters
    H->>H: Associate container info<br/>(CSS / NetNS mapping)
    H->>S: Persist<br/>(Elasticsearch / local files)

🌟 Stars are welcome: https://github.com/ccfos/huatuo

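5.3 - AutoTracing

🎯 About HUATUO (华佗)

HUATUO (华佗) is an operating system deep-observability project open-sourced by DiDi and incubated under CCF (China Computer Federation). It provides kernel-level deep observability for cloud-native general-purpose computing, AI computing, cloud services and infrastructure services.

📖 Overview

HUATUO AutoTracing is an event-driven automatic diagnosis mechanism. When a host or container shows a performance anomaly such as a CPU spike, a pile-up of D-state processes, saturated disk IO or a burst of memory allocation, the system triggers on-site data collection automatically based on preset thresholds, so a complete diagnostic snapshot is kept without manual intervention.

The collected data includes flame graphs (system-level or container-level CPU call-stack sampling with perf), kernel call stacks of D-state processes, disk IO call stacks, and a ranking of process memory usage. To avoid redundant data from continuous triggering, every event has a built-in cooldown policy (30 minutes by default), so only the key snapshots are kept during an event storm.

Five event types are currently supported: cpusys (host CPU sys spike), cpuidle (container CPU usage spike), dload (container D-state load spike), iotracing (disk IO anomaly) and memburst (memory allocation burst).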

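🎯 Scenarios

Locating CPU hotspots in AI training jobs: in GPU training clusters, occasional stalls of training jobs are often caused by spikes in kernel-mode CPU usage (cpusys). AutoTracing triggers a system-level perf flame graph capture the moment sys usage crosses the threshold and persists the kernel call-stack hotspots as a flame graph data structure (flamedata). This allows offline analysis after the fault has disappeared and avoids the difficulty of reproducing it by hand.

Analysing CPU glitches in Kubernetes containers: in a microservice architecture, a brief spike in container CPU usage (cpuidle) can cause response timeouts, yet the problem has often recovered before anyone reacts to the alert. AutoTracing triggers container-level perf sampling when container CPU crosses the threshold and produces a flame graph scoped to the container's cgroup, so hot functions can be located quickly and less time is spent digging through logs.

Investigating D-state process pile-ups in cloud-native environments: under heavy IO load or storage jitter, a container may accumulate many processes in the D state (uninterruptible sleep), making the system sluggish. The dload event computes an exponentially weighted moving average (EMA) of the container load and, when the D-state load exceeds the threshold, captures the kernel call stacks of the relevant processes inside the container and on the host to pinpoint the root cause of the blocking.

Finding the root cause of disk IO bottlenecks: in big-data or log-intensive workloads, saturated disk IO utilisation or write bandwidth causes application requests to queue up. iotracing polls /proc/diskstats continuously and triggers when a disk IO metric exceeds its threshold twice in a row. It collects the list of high-IO processes (with per-process read/write bytes and open-file details) and the kernel call stacks of processes waiting for IO scheduling, which quickly narrows down the processes responsible for heavy disk IO.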

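🚀 Usage

Configuration parameters

Each event can be tuned with the parameters below. All parameters have defaults, so AutoTracing runs without any configuration:

Parameter	Default	Description
`cpuidle.user_threshold`	`75` (%)	Trigger threshold for container CPU user usage
`cpuidle.sys_threshold`	`45` (%)	Trigger threshold for container CPU sys usage
`cpuidle.usage_threshold`	`90` (%)	Trigger threshold for container total CPU usage
`cpuidle.delta_user_threshold`	`45` (%)	Trigger threshold for the increase in container CPU user usage
`cpuidle.delta_sys_threshold`	`20` (%)	Trigger threshold for the increase in container CPU sys usage
`cpuidle.delta_usage_threshold`	`55` (%)	Trigger threshold for the increase in container total CPU usage
`cpuidle.interval`	`10` (s)	Detection interval
`cpuidle.interval_tracing`	`1800` (s)	Cooldown between triggers for the same container
`cpuidle.run_tracing_tool_timeout`	`10` (s)	Timeout for perf flame graph capture
`cpusys.sys_threshold`	`45` (%)	Trigger threshold for host CPU sys usage
`cpusys.delta_sys_threshold`	`20` (%)	Trigger threshold for the increase in host CPU sys usage
`cpusys.interval`	`10` (s)	Detection interval
`cpusys.run_tracing_tool_timeout`	`10` (s)	Timeout for perf flame graph capture
`dload.threshold_load`	`5`	Trigger threshold for the EMA of the container's uninterruptible-process load
`dload.interval`	`10` (s)	Detection interval
`dload.interval_tracing`	`1800` (s)	Cooldown between triggers for the same container
`iotracing.rbps_threshold`	`2000` (MB/s)	Trigger threshold for disk read throughput
`iotracing.wbps_threshold`	`1500` (MB/s)	Trigger threshold for disk write throughput
`iotracing.util_threshold`	`90` (%)	Trigger threshold for disk IO utilisation
`iotracing.await_threshold`	`100` (ms)	Trigger threshold for average disk IO wait time
`iotracing.run_tracing_tool_timeout`	`10` (s)	Timeout for IO call-stack capture
`iotracing.max_proc_dump`	`10`	Maximum number of high-IO processes to collect
`iotracing.max_files_per_proc_dump`	`5`	Maximum number of open files to collect per process
`memburst.delta_memory_burst`	`100` (%)	Growth threshold of anonymous memory relative to the oldest sample in the sliding window (100% means triggering at ≥ 2x)
`memburst.delta_anon_threshold`	`70` (%)	Threshold for anonymous memory as a share of total host memory
`memburst.interval`	`10` (s)	Detection interval
`memburst.interval_tracing`	`1800` (s)	Cooldown between triggers
`memburst.sliding_window_length`	`60`	Number of samples in the sliding window (600 seconds of history)
`memburst.dump_process_max_num`	`10`	Maximum number of memory-consuming processes to collect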

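Event list

Event name (tracer_name)	Observed object	Trigger condition	Typical scenario
`cpusys`	Host	sys > 45% or delta_sys > 20%	Kernel-mode CPU spike, system-call hotspots
`cpuidle`	Container	(user>75% and delta_user>45%) or (sys>45% and delta_sys>20%) or (total>90% and delta_total>55%)	Container CPU usage spike, hot-function analysis
`dload`	Container	EMA of uninterruptible-process load > 5	D-state process pile-up, IO blocking
`iotracing`	Host	A disk IO metric exceeds its threshold twice in a row	Saturated disk IO, high IO wait latency
`memburst`	Host	Anonymous memory ≥ 2x the oldest value in the window and ≥ 70% of total memory	Memory allocation burst, precursor of OOM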

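Common fields

All event data contains the following common fields:

hostname: hostname of the host
region: availability zone of the host
uploaded_time: time the data was uploaded
container_id: container ID, recorded if the event is associated with a container
container_hostname: container hostname, recorded if the event is associated with a container
container_host_namespace: Kubernetes namespace of the container, recorded if the event is associated with a container
container_type: container type
container_qos: container QoS level
tracer_name: event name (e.g. cpusys, memburst)
tracer_id: ID of this tracing run
tracer_time: time the tracing was triggered
tracer_type: trigger type (manual or automatic)
tracer_data: event-specific data (see the description of each event)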

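1. cpusys

Description: Periodically reads /proc/stat and computes the host's CPU sys usage and its increase between two consecutive samples. When sys usage exceeds the threshold (45% by default) or the increase exceeds its threshold (20% by default), system-level perf sampling is triggered and flame graph data for the whole machine is generated.

Data storage: Event data is stored automatically in Elasticsearch or in files on the host's local disk.

Sample data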

{
    "tracer_name": "cpusys",
    "tracer_data": {
        "now_sys": 52,
        "sys_threshold": 45,
        "deltasys": 25,
        "deltasys_threshold": 20,
        "flamedata": [
            {"level": 0, "value": 1000, "self": 0, "label": "all"},
            {"level": 1, "value": 350, "self": 350, "label": "do_syscall_64"}
        ]
    }
}

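Field descriptions

now_sys: host CPU sys usage at trigger time (%)
sys_threshold: trigger threshold for sys usage (%)
deltasys: increase in sys usage between two consecutive samples (%)
deltasys_threshold: trigger threshold for the sys increase (%)
flamedata: list of flame graph frames produced by perf sampling; each frame contains:
- level: depth of the frame in the call stack
- value: sample count of the frame including its children
- self: sample count of the frame itself, excluding children
- label: function or process name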
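To make the level/value/self fields concrete, the sketch below turns a flamedata list into folded stacks ("a;b;c count" pairs). It assumes the frames are listed in depth-first pre-order, so that a frame's parent is the nearest preceding frame one level up; the source does not state the ordering, so treat this as an assumption. The function name and the extra sample frames are illustrative.

```python
def stacks_from_flamedata(frames):
    """Rebuild folded call stacks from level-ordered flame graph frames.

    Assumes pre-order (depth-first) listing of frames.
    """
    path = []      # labels of the current ancestor chain
    folded = []
    for frame in frames:
        del path[frame["level"]:]   # drop frames at this depth or deeper
        path.append(frame["label"])
        if frame["self"] > 0:       # only frames with own samples end a stack
            folded.append((";".join(path), frame["self"]))
    return folded


frames = [
    {"level": 0, "value": 1000, "self": 0, "label": "all"},
    {"level": 1, "value": 350, "self": 50, "label": "do_syscall_64"},
    {"level": 2, "value": 300, "self": 300, "label": "ksys_write"},
    {"level": 1, "value": 650, "self": 650, "label": "cpu_idle"},
]
for stack, count in stacks_from_flamedata(frames):
    print(stack, count)
```

Under that ordering assumption, the self counts of all frames add up to the value of the root frame, which is a quick sanity check on a captured flamedata list.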

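2. cpuidle

Description: Periodically reads container cgroup CPU statistics and computes the container's CPU user, sys and total usage together with the increase of each metric between consecutive samples. When any one of the threshold groups holds (user>75% and delta_user>45%, or sys>45% and delta_sys>20%, or total>90% and delta_total>55%), container-level perf sampling is triggered to produce a flame graph. The same container has a 30-minute cooldown by default to avoid repeated triggering. Specific containers can be excluded with a container filter.

Data storage: Event data is stored automatically in Elasticsearch or in files on the host's local disk.

Sample data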

{
    "tracer_name": "cpuidle",
    "tracer_data": {
        "user": 80,
        "user_threshold": 75,
        "deltauser": 48,
        "deltauser_threshold": 45,
        "sys": 12,
        "sys_threshold": 45,
        "deltasys": 5,
        "deltasys_threshold": 20,
        "usage": 92,
        "usage_threshold": 90,
        "deltausage": 53,
        "deltausage_threshold": 55,
        "flamedata": [
            {"level": 0, "value": 1000, "self": 0, "label": "all"},
            {"level": 1, "value": 800, "self": 800, "label": "java/com.example.App.main"}
        ]
    }
}

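Field descriptions

user / user_threshold: container CPU user usage at trigger time (%) and its threshold
deltauser / deltauser_threshold: increase in user usage (%) and its threshold
sys / sys_threshold: container CPU sys usage at trigger time (%) and its threshold
deltasys / deltasys_threshold: increase in sys usage (%) and its threshold
usage / usage_threshold: container total CPU usage at trigger time (%) and its threshold
deltausage / deltausage_threshold: increase in total usage (%) and its threshold
flamedata: flame graph frames from container-level perf sampling; fields as in cpusys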
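The cpuidle trigger rule pairs each usage value with its increase, and both must exceed their thresholds. A small Python sketch of that rule, applied to the sample data above (the function name `cpuidle_reasons` is a hypothetical helper, not a HUATUO API):

```python
def cpuidle_reasons(data: dict) -> list:
    """Return which threshold groups (user / sys / usage) hold for an event."""
    groups = [("user", "deltauser"), ("sys", "deltasys"), ("usage", "deltausage")]
    reasons = []
    for level, delta in groups:
        level_hit = data[level] > data[level + "_threshold"]
        delta_hit = data[delta] > data[delta + "_threshold"]
        if level_hit and delta_hit:   # both conditions of a group must hold
            reasons.append(level)
    return reasons


sample = {
    "user": 80, "user_threshold": 75, "deltauser": 48, "deltauser_threshold": 45,
    "sys": 12, "sys_threshold": 45, "deltasys": 5, "deltasys_threshold": 20,
    "usage": 92, "usage_threshold": 90, "deltausage": 53, "deltausage_threshold": 55,
}
print(cpuidle_reasons(sample))
```

For the sample, only the user group fires: total usage is above 90% but its increase of 53% stays below the 55% threshold.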

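3. dload

Description: Reads the state of processes inside containers through netlink and cgroup, and computes an exponentially weighted moving average (EMA) of the load contributed by uninterruptible (D-state) processes. When a container's D-state load EMA exceeds the threshold (5 by default), the kernel call stacks of all D-state processes inside the container and on the host are collected. Known-issue filtering (issues_list) is supported to reduce false positives. The same container has a 30-minute cooldown by default.

Data storage: Event data is stored automatically in Elasticsearch or in files on the host's local disk.

Sample data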

{
    "tracer_name": "dload",
    "tracer_data": {
        "threshold": 5,
        "nr_sleeping": 120,
        "nr_running": 4,
        "nr_stopped": 0,
        "nr_uninterruptible": 8,
        "nr_iowait": 3,
        "load_avg": 7.23,
        "dload_avg": 6.81,
        "known_issue": "",
        "stack": "task:java            state:D stack:    0 pid: 12345 tgid: 12345 ...\n  io_schedule+0x18/0x40\n  ext4_file_write_iter+0x..."
    }
}

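Field descriptions

threshold: trigger threshold for the D-state load EMA
nr_sleeping: number of sleeping processes in the container
nr_running: number of running processes in the container
nr_stopped: number of stopped processes in the container
nr_uninterruptible: number of uninterruptible (D-state) processes in the container
nr_iowait: number of processes waiting for IO in the container
load_avg: container load average at trigger time
dload_avg: container D-state load EMA at trigger time
known_issue: description of the matched known issue (empty if none matched)
stack: kernel call stacks of the D-state processes (multi-line text covering multiple processes)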
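The EMA smoothing behind dload_avg can be illustrated with a load-average style update. The exact formula and time constant HUATUO uses are not given in the text; the sketch below assumes an exponential decay over a 60-second period with the documented 10-second detection interval, similar in spirit to the Linux 1-minute load average.

```python
import math


def ema_update(previous: float, sample: float,
               interval_s: float = 10.0, period_s: float = 60.0) -> float:
    """One EMA step; period_s is an assumed smoothing period."""
    decay = math.exp(-interval_s / period_s)
    return previous * decay + sample * (1.0 - decay)


def steps_until_trigger(nr_uninterruptible: int, threshold: float = 5.0) -> int:
    """Count samples until a constant D-state count pushes the EMA over the threshold."""
    ema, steps = 0.0, 0
    while ema <= threshold:
        ema = ema_update(ema, nr_uninterruptible)
        steps += 1
        if steps > 1000:   # never reached, e.g. count below the threshold
            return -1
    return steps


print(steps_until_trigger(8))
```

The smoothing means a short burst of D-state processes does not fire the event; with these assumed constants, 8 blocked processes must persist for about a minute before the EMA crosses 5.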

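4. iotracing

Description: Polls /proc/diskstats every 5 seconds and computes read/write throughput, IO utilisation and IO wait time for each disk device. The event triggers when any metric exceeds its threshold in two consecutive samples (md devices are ignored automatically). It collects the list of high-IO processes (with per-process read/write bytes and open-file statistics) and the kernel call stacks of processes waiting for IO scheduling.

Data storage: Event data is stored automatically in Elasticsearch or in files on the host's local disk.

Sample data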

{
    "tracer_name": "iotracing",
    "tracer_data": {
        "reason_snapshot": {
            "type": "ioutil",
            "device": "sda",
            "iostatus": {
                "read_bps": 120,
                "read_iops": 450,
                "read_await": 12,
                "write_bps": 2100,
                "write_iops": 890,
                "write_await": 145,
                "io_util": 95,
                "queue_size": 32
            }
        },
        "process_io_data": [
            {
                "pid": 12345,
                "comm": "java",
                "container_hostname": "app-pod-xxx",
                "fs_read": 0,
                "fs_write": 52428800,
                "disk_read": 0,
                "disk_write": 49152000,
                "file_stat": ["/data/logs/app.log"],
                "file_count": 1
            }
        ],
        "timeout_io_stack": [
            {
                "pid": 12345,
                "comm": "java",
                "container_hostname": "app-pod-xxx",
                "latency_us": 250000,
                "stack": {
                    "back_trace": [
                        "io_schedule+0x18/0x40",
                        "ext4_file_write_iter+0x2a0/0x4c0"
                    ]
                }
            }
        ]
    }
}

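Field descriptions

reason_snapshot: snapshot of the reason that triggered IO collection
- type: trigger type (ioutil: IO utilisation / read_bps: read throughput / write_bps: write throughput / read_await: read wait time / write_await: write wait time)
- device: name of the disk device that crossed the threshold
- iostatus: snapshot of the disk IO metrics at trigger time (read_bps/write_bps in MB/s, read_await/write_await in ms, io_util in %, queue_size is the queue depth)
process_io_data: list of high-IO processes; each record contains:
- pid / comm: process PID and name
- container_hostname: hostname of the container the process belongs to (empty for host processes)
- fs_read / fs_write: bytes read/written by the process at the file-system level
- disk_read / disk_write: bytes actually read/written by the process at the disk level
- file_stat: list of paths of files the process currently has open
- file_count: total number of files opened by the process
timeout_io_stack: list of call stacks of processes waiting for IO scheduling; each record contains:
- pid / comm: process PID and name
- container_hostname: hostname of the container the process belongs to
- latency_us: IO wait time (microseconds)
- stack.back_trace: list of kernel call-stack frames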
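The "two consecutive samples over threshold" rule can be sketched as a small streak counter per device and metric. The class name and the choice to track each metric's streak separately are assumptions for illustration; the thresholds are the documented defaults and the metric names follow the iostatus keys.

```python
from collections import defaultdict

# Documented default thresholds, keyed by the iostatus field they apply to.
DEFAULT_THRESHOLDS = {
    "read_bps": 2000,    # MB/s
    "write_bps": 1500,   # MB/s
    "io_util": 90,       # %
    "read_await": 100,   # ms
    "write_await": 100,  # ms
}


class ConsecutiveDetector:
    """Fire when a metric exceeds its threshold `required` samples in a row."""

    def __init__(self, thresholds=None, required=2):
        self.thresholds = thresholds or DEFAULT_THRESHOLDS
        self.required = required
        self.streaks = defaultdict(int)

    def observe(self, device: str, iostatus: dict) -> list:
        if device.startswith("md"):   # md devices are ignored
            return []
        fired = []
        for metric, limit in self.thresholds.items():
            key = (device, metric)
            if iostatus.get(metric, 0) > limit:
                self.streaks[key] += 1
            else:
                self.streaks[key] = 0   # streak broken, start over
            if self.streaks[key] >= self.required:
                fired.append(metric)
        return fired
```

Requiring two samples in a row filters out single-sample spikes, at the cost of one extra polling interval before collection starts.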

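5. memburst

Description: Periodically samples the host's anonymous memory usage and keeps a sliding window of 60 samples (600 seconds). The event triggers when current anonymous memory is ≥ 2x the oldest sample in the window and anonymous memory accounts for ≥ 70% of total host memory. It collects the PID, name and RSS of the top N memory-consuming processes (10 by default). The default cooldown is 30 minutes.

Data storage: Event data is stored automatically in Elasticsearch or in files on the host's local disk.

Sample data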

{
    "tracer_name": "memburst",
    "tracer_data": {
        "top_memory_usage": [
            {
                "pid": 3456,
                "process_name": "java",
                "memory_size": 8589934592
            },
            {
                "pid": 3789,
                "process_name": "python3",
                "memory_size": 2147483648
            }
        ]
    }
}

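Field descriptions

top_memory_usage: list of the processes using the most memory (sorted by RSS in descending order); each record contains:
- pid: process PID
- process_name: process name
- memory_size: RSS memory usage of the process (bytes)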
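The memburst condition combines a growth check against the oldest sample in the sliding window with an absolute share-of-memory check. A minimal sketch under the documented defaults; the class name and the decision to evaluate before appending the new sample are assumptions.

```python
from collections import deque


class MemBurstDetector:
    """Sliding-window burst check on anonymous memory (illustrative)."""

    def __init__(self, total_bytes, window=60, burst_pct=100, anon_pct=70):
        self.total = total_bytes
        self.burst_pct = burst_pct      # memburst.delta_memory_burst
        self.anon_pct = anon_pct        # memburst.delta_anon_threshold
        self.samples = deque(maxlen=window)

    def observe(self, anon_bytes: int) -> bool:
        triggered = False
        if self.samples:
            oldest = self.samples[0]
            # Growth of 100% relative to the oldest sample means >= 2x.
            grew = oldest > 0 and (anon_bytes - oldest) * 100 >= self.burst_pct * oldest
            large = anon_bytes * 100 >= self.anon_pct * self.total
            triggered = grew and large
        self.samples.append(anon_bytes)  # deque drops the oldest when full
        return triggered
```

Requiring both conditions keeps small hosts with tiny baselines from firing on harmless doublings, and keeps steadily full hosts from firing without a burst.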

⚙️ How it works

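Overall architecture

HUATUO AutoTracing is based on periodic polling, combined with eBPF call-stack collection and perf flame graph generation, to collect diagnostic data at the kernel level with low overhead.

graph TB
    subgraph "Data sources"
        P1["/proc/stat\n(host CPU usage)"]
        P2["cgroup CPU statistics\n(container CPU usage)"]
        P3["netlink / cgroup\n(container process state / load average)"]
        P4["/proc/diskstats\n(disk IO metrics)"]
        P5["/proc/meminfo\n+ cgroup memory statistics"]
    end

    subgraph "HUATUO AutoTracing"
        DT["Threshold detection\n(sliding window / EMA / two consecutive breaches)"]
        BO["Cooldown policy\n(30-minute backoff)"]
        PERF["perf flame graph capture\n(system-level / container-level)"]
        BPF["eBPF kprobe\n(IO scheduling latency tracing)"]
        CM["Container association\n(cgroup → ContainerID)"]
    end

    subgraph "Storage"
        ES["Elasticsearch"]
        DISK["Local disk files"]
    end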

    P1 --> DT
    P2 --> DT
    P3 --> DT
    P4 --> DT
    P5 --> DT
    DT --> BO
    BO --> PERF
    BO --> BPF
    PERF --> CM
    BPF --> CM
    CM --> ES
    CM --> DISK

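Event processing flow

sequenceDiagram
    participant M as Periodic metric collection
    participant D as Threshold detector
    participant B as Cooldown policy (backoff)
    participant C as On-site data collector
    participant S as Storage

    M->>D: Push metrics (every 10 seconds)
    D->>D: Threshold check (sliding window / EMA / two consecutive)
    alt Threshold exceeded
        D->>B: Check cooldown state
        alt Trigger allowed
            B->>C: Trigger on-site collection<br/>(perf flame graph / D-state process stacks / IO process list)
            C->>C: Associate container info (cgroup → ContainerID)
            C->>S: Persist (Elasticsearch / local files)
        else Within cooldown
            B-->>D: Skip this trigger
        end
    end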

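5.4 - Continuous Profiling

Overview

Continuous Profiling samples the performance of the operating system and applications over long periods. It covers three profile types (CPU, memory and lock) and produces flame graph data in the standard pprof format. Samples are persisted to a storage backend and can be viewed in Grafana aggregated by time window, providing the data foundation for performance regression analysis and incident review.

Architecture

Continuous Profiling is carried out by three cooperating components:

Component	Role	Description
huatuo-apiserver	Control plane	Accepts profiling tasks and dispatches them to target nodes; provides a Pyroscope-compatible flame graph query API
huatuo-bamai	Data plane	Runs the collection on the target node, sampling call stacks with eBPF (C/C++/Go) or third-party tools (Java/Python)
Grafana	Visualisation	Connects directly to the apiserver through the pyroscope data source plugin and renders flame graphs

Supported languages and underlying implementations:

Language	Profile types	Implementation
C / C++ / Go	CPU / memory / lock	eBPF (perf_event + stack maps)
Java	CPU / memory / lock	async-profiler
Python	CPU / memory	py-spy / memray

Profile type identifiers (used in Grafana queries):

Type	profile_type
CPU	`process_cpu:cpu:nanoseconds:cpu:nanoseconds`
Memory	`memory:alloc_space:bytes:space:bytes`
Lock	`process_lock:lock:count:lock:count` `process_lock:lock:nanoseconds:lock:nanoseconds`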

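Running

The simplest way is to bring up Elasticsearch, Prometheus, Grafana, huatuo-apiserver and huatuo-bamai in one step with Docker Compose:

$ docker compose --project-directory ./build/docker up

Component addresses after startup:

Component	Address
huatuo-apiserver	`http://127.0.0.1:12740`
huatuo-bamai metrics	`http://127.0.0.1:19704/metrics`
Grafana	`http://localhost:3000` (admin / admin)
Elasticsearch	`http://127.0.0.1:9200`

Profiling settings live in the [Profiling] section of huatuo-apiserver.conf:

Parameter	Default	Description
`CPUProfilingInterval`	10	Duration of a single CPU sampling run (seconds)
`MemoryProfilingInterval`	10	Duration of a single memory sampling run (seconds)
`CPUSingleTraceTimeout`	20	Timeout of a single CPU sampling run (seconds)
`MemorySingleTraceTimeout`	20	Timeout of a single memory sampling run (seconds)
`ThirdPartyToolLimit`	10	Maximum concurrency of third-party tools (async-profiler and others)
`FlameGraphBaseURL`	`http://localhost:8006/d`	Base URL of the flame graph dashboard, used to build the result link of a task

If you want the results.url returned by a task to point straight at the Grafana dashboard, change FlameGraphBaseURL to the actual Grafana address (e.g. http://localhost:3000/d).

Calls to the apiserver API must carry a user ID in the Authorization request header (users are configured under [[Auth.users]] in huatuo-apiserver.conf).

The default conf does not configure any user. In that case the authentication middleware is not enabled and Authorization may be any non-empty value. In production, be sure to configure real users under [[Auth.users]] and replace <user-id> with the corresponding ID.

Collection: host CPU as an example

Take CPU profiling of a host as an example. Host-level collection does not set the container field; target_process_language is set to go (or c/c++) to select the native eBPF collector: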

$ curl -X POST http://127.0.0.1:12740/v1/profiles \
    -H "Content-Type: application/json" \
    -H "Authorization: <user-id>" \
    -d '{
        "type": "cpu",
        "target_process_language": "go",
        "hostname": "<target-host>",
        "duration": 30
    }'

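Request fields:

Field	Description
`type`	Profile type: `cpu` / `memory`
`target_process_language`	Target language: `go`, `c`, `c++`, `java`, `python`
`hostname`	Required. Target host name; the apiserver uses it to dispatch the task to the huatuo-bamai agent at `http://{hostname}:19704` (it must match the hostname reported by the agent)
`duration`	Total collection time (seconds), during which the agent samples periodically according to `CPUProfilingInterval`
`container`	Container hostname for container-level collection; leave empty for host-level collection
`target_exec_path`	Optional; filters target processes by executable path

The task ID is returned:

{ "id": "<task-id>" }

Collection workflow:

The apiserver creates the task and dispatches it to huatuo-bamai on the target host.
huatuo-bamai loads the eBPF program (perf_event_sw_cpu_clock) and samples kernel and user stacks at 99 Hz by default.
The samples are symbolized, converted to pprof format and written to Elasticsearch (the index name is the [ElasticSearch].Index setting in huatuo-apiserver.conf, huatuo_bamai by default).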
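For scripted use, the same task-creation request can be assembled with the Python standard library. This is a sketch that mirrors the curl call and the request fields documented above; the function name `build_profile_request` is a hypothetical helper, and the host name and user ID in the example are placeholders.

```python
import json
import urllib.request


def build_profile_request(base_url, user_id, hostname, profile_type="cpu",
                          language="go", duration=30, container=None):
    """Build (but do not send) a POST /v1/profiles request."""
    body = {
        "type": profile_type,
        "target_process_language": language,
        "hostname": hostname,
        "duration": duration,
    }
    if container:                      # omit for host-level collection
        body["container"] = container
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/profiles",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json", "Authorization": user_id},
    )


req = build_profile_request("http://127.0.0.1:12740", "<user-id>", "<target-host>")
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen(req)` would submit the task; the response body is expected to carry the task ID as shown above.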

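Querying task status and stopping a task:

# Query task status
$ curl -H "Authorization: <user-id>" \
    http://127.0.0.1:12740/v1/profiles/<task-id>

# Stop the task
$ curl -X PATCH http://127.0.0.1:12740/v1/profiles/<task-id> \
    -H "Content-Type: application/json" \
    -H "Authorization: <user-id>" \
    -d '{"status":"stopped"}'

When the task completes, the results.url field of the status response contains the flame graph link (built from FlameGraphBaseURL).

Viewing

Flame graphs are viewed on Grafana dashboards, which are pre-provisioned and loaded automatically with Docker Compose:

Dashboard	UID	Applies to
Continuous Profiling(host)	`continuous-profiling-host`	Hosts
Continuous Profiling(container)	`continuous-profiling-container`	Containers

Open http://localhost:3000/d/continuous-profiling-host and select hostname and type (profile_type) to view the aggregated flame graph. The time series panel at the top of the dashboard shows the distribution of samples; the flame graph panel below it supports aggregation by time range.

How it works: Grafana forwards flame graph requests through the grafana-pyroscope-datasource plugin to the /v1/profiles/flamegraph/ path of the apiserver. The apiserver implements the Pyroscope Querier protocol (SelectMergeStacktraces and others), retrieves pprof data from Elasticsearch, merges it and returns the result.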

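5.5 - Hardware Fault Diagnosis

Overview

HUATUO listens continuously, non-intrusively and with low overhead, for hardware error events reported by the Linux kernel. It persists structured fault records and exposes summary counters as Prometheus metrics for alerting and visualisation systems.

Use cases

General-purpose computing

In large server fleets, correctable memory ECC errors (CE) are a common low-level fault signal. A single CE is repaired automatically by hardware, but a steadily rising CE rate on the same DIMM indicates that the module is about to fail. HUATUO observes these events in real time through the EDAC/MCE tracepoints, so engineering teams can replace the part preventively before the memory fails completely and avoid unexpected downtime.

AI computing

AI training jobs demand very high hardware reliability; a single faulty PCIe device can cause the whole training job to fail. HUATUO monitors PCIe AER events and reports link-layer errors (Data Link Protocol Error, ECRC Error, and so on) of GPUs, NVLink bridges and RDMA NICs (such as InfiniBand HCAs) in real time. This gives the AI cluster scheduler hardware health data to support fast isolation of faulty nodes and task migration.

Storage services

Storage servers are usually equipped with many PCIe NVMe SSDs and HBA cards. PCIe AER errors such as Completion Timeout and Malformed TLP are precursors of performance jitter or device drop-off. HUATUO's monitoring data can be correlated with storage IO latency metrics to support root cause analysis.

Security and compliance

Industries with strict compliance requirements, such as finance and government, need a complete record of all hardware faults. The structured event store (with timestamp, device identifier, error type and raw register values) can serve directly as compliance evidence for hardware health logs.

Monitoring principles

HUATUO uses eBPF to observe the kernel's MCE / EDAC / ACPI GHES / PCIe AER subsystems. When a tracepoint fires, the eBPF program writes the raw event to the BPF perf event buffer. The user-space program reads the event, parses the struct fields, builds a structured record and stores it locally or remotely.

RAS principles

The Linux kernel's RAS framework consists of several relatively independent subsystems that together cover the full spectrum of hardware faults, from errors inside the CPU to PCIe link errors.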

graph TB
    subgraph HW["Hardware layer"]
        CPU["CPU\nx86 / x86-64"]
        MEM["Memory\nDDR4/DDR5 DIMM ECC"]
        Platform["Platform hardware\nSoC / PCH"]
        PCIeDev["PCIe devices\nGPU / NVMe / HCA / FPGA"]
    end

    subgraph FW["Firmware layer"]
        BIOS["BIOS / UEFI\nCPER buffer (APEI)"]
    end

    subgraph Kernel["Linux kernel RAS subsystems"]
        MCE["MCE subsystem\narch/x86/kernel/cpu/mce"]
        EDAC["EDAC subsystem\ndrivers/edac"]
        GHES["ACPI GHES subsystem\ndrivers/acpi/apei"]
        AER["PCIe AER subsystem\ndrivers/pci/pcie/aer"]
    end

    subgraph TP["Kernel tracepoints"]
        TP1["tracepoint/mce/mce_record"]
        TP2["tracepoint/ras/mc_event"]
        TP3["tracepoint/ras/non_standard_event"]
        TP4["tracepoint/ras/aer_event"]
    end

    CPU -->|"MCE exception (#MC) + THR interrupt"| MCE
    MEM -->|ECC error| EDAC
    Platform -->|APEI error record| BIOS
    BIOS -->|CPER buffer| GHES
    PCIeDev -->|AER interrupt| AER

    MCE --> TP1
    EDAC --> TP2
    GHES --> TP3
    AER --> TP4

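MCE

MCA (Machine Check Architecture) is the hardware fault-tolerance mechanism built into the processor, defined by Intel and AMD in their respective architecture specifications. The processor contains a number of banks (Machine Check Banks), each corresponding to one kind of hardware resource (such as the L1 cache, L2 cache, memory controller or TLB). When a hardware error is detected, the MSRs of the corresponding bank (MCi_STATUS, MCi_ADDR, MCi_MISC) are filled with error information and a machine check exception (MCE) is raised.

MCE THR

MCE supports a threshold interrupt mechanism. When the count of a class of correctable errors exceeds a preset threshold, a dedicated APIC interrupt (THR) is raised instead of escalating to a full MCE exception. This lets the operating system raise an early warning when the error rate climbs abnormally, rather than stepping in only once errors have become uncorrectable.

EDAC

EDAC (Error Detection And Correction) is the Linux kernel subsystem dedicated to memory and hardware ECC errors. Its goal is "to detect and report hardware errors that occur in computer systems running under Linux". EDAC drivers talk to the memory controller directly and resolve the physical location of an ECC error (memory controller number, channel, slot, row and column address).

ACPI GHES

ACPI GHES (Generic Hardware Error Source) is a platform-independent hardware error reporting mechanism defined by BIOS/UEFI through the APEI (ACPI Platform Error Interface) specification. The BIOS firmware writes hardware errors that cannot be handled by a specific driver (such as SoC-internal errors or platform-specific memory errors) into the CPER (Common Platform Error Record) buffer referenced by the GHES descriptor. The Linux kernel reads the CPER records and reports the "non-standard" error sections that the standard subsystems cannot parse.

PCIe AER

PCIe AER (Advanced Error Reporting) is the error reporting mechanism defined by the PCIe specification. It allows PCIe devices to report the exact type of link-layer and transaction-layer errors to the operating system.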

Metrics overview

RAS metrics

# HELP huatuo_bamai_ras_hw_total total RAS hardware error events by source type
# TYPE huatuo_bamai_ras_hw_total counter
huatuo_bamai_ras_hw_total{host="hostname",region="dev",type="acpi"} 0
huatuo_bamai_ras_hw_total{host="hostname",region="dev",type="aer"} 0
huatuo_bamai_ras_hw_total{host="hostname",region="dev",type="edac"} 0
huatuo_bamai_ras_hw_total{host="hostname",region="dev",type="mce"} 0
huatuo_bamai_ras_hw_total{host="hostname",region="dev",type="thr"} 0
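As a hedged illustration of how an alerting script might consume this counter, the sketch below parses the Prometheus text exposition shown above and returns the count per error source. It handles only simple `name{labels} value` lines and is not a full exposition-format parser; the function name is a hypothetical helper.

```python
import re

SAMPLE_LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$'
)
LABEL_PAIR = re.compile(r'(\w+)="([^"]*)"')


def ras_counts(metrics_text: str) -> dict:
    """Map the `type` label of huatuo_bamai_ras_hw_total to its value."""
    counts = {}
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):   # skip HELP/TYPE comments
            continue
        match = SAMPLE_LINE.match(line)
        if not match or match.group("name") != "huatuo_bamai_ras_hw_total":
            continue
        labels = dict(LABEL_PAIR.findall(match.group("labels")))
        counts[labels.get("type", "")] = float(match.group("value"))
    return counts


text = '''
# TYPE huatuo_bamai_ras_hw_total counter
huatuo_bamai_ras_hw_total{host="hostname",region="dev",type="aer"} 0
huatuo_bamai_ras_hw_total{host="hostname",region="dev",type="edac"} 3
'''
print(ras_counts(text))
```

In a Prometheus deployment the same question is usually answered with a query over the counter's rate; the script form is mainly useful for ad hoc checks against the /metrics endpoint.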

NIC packet drops

huatuo_bamai_netdev_hw_rx_dropped_total{host="hostname",region="dev",device="eth0",driver="ixgbe"} 0

RDMA PFC

# HELP huatuo_bamai_netdev_dcb_pfc_received_total count of the received pfc frames
# TYPE huatuo_bamai_netdev_dcb_pfc_received_total counter
huatuo_bamai_netdev_dcb_pfc_received_total{device="enp6s0f0np0",host="hostname",prio="0",region="dev"} 0
huatuo_bamai_netdev_dcb_pfc_received_total{device="enp6s0f0np0",host="hostname",prio="1",region="dev"} 0
huatuo_bamai_netdev_dcb_pfc_received_total{device="enp6s0f0np0",host="hostname",prio="2",region="dev"} 0
huatuo_bamai_netdev_dcb_pfc_received_total{device="enp6s0f0np0",host="hostname",prio="3",region="dev"} 0
huatuo_bamai_netdev_dcb_pfc_received_total{device="enp6s0f0np0",host="hostname",prio="4",region="dev"} 0
huatuo_bamai_netdev_dcb_pfc_received_total{device="enp6s0f0np0",host="hostname",prio="5",region="dev"} 0
huatuo_bamai_netdev_dcb_pfc_received_total{device="enp6s0f0np0",host="hostname",prio="6",region="dev"} 0
huatuo_bamai_netdev_dcb_pfc_received_total{device="enp6s0f0np0",host="hostname",prio="7",region="dev"} 0
# HELP huatuo_bamai_netdev_dcb_pfc_send_total count of the sent pfc frames
# TYPE huatuo_bamai_netdev_dcb_pfc_send_total counter
huatuo_bamai_netdev_dcb_pfc_send_total{device="enp6s0f0np0",host="hostname",prio="0",region="dev"} 0
huatuo_bamai_netdev_dcb_pfc_send_total{device="enp6s0f0np0",host="hostname",prio="1",region="dev"} 0
huatuo_bamai_netdev_dcb_pfc_send_total{device="enp6s0f0np0",host="hostname",prio="2",region="dev"} 0
huatuo_bamai_netdev_dcb_pfc_send_total{device="enp6s0f0np0",host="hostname",prio="3",region="dev"} 0
huatuo_bamai_netdev_dcb_pfc_send_total{device="enp6s0f0np0",host="hostname",prio="4",region="dev"} 0
huatuo_bamai_netdev_dcb_pfc_send_total{device="enp6s0f0np0",host="hostname",prio="5",region="dev"} 0
huatuo_bamai_netdev_dcb_pfc_send_total{device="enp6s0f0np0",host="hostname",prio="6",region="dev"} 0
huatuo_bamai_netdev_dcb_pfc_send_total{device="enp6s0f0np0",host="hostname",prio="7",region="dev"} 0

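Structured storage

In addition, every hardware error event is persisted in structured form (in the local huatuo-local directory or in a remote store such as Elasticsearch/OpenSearch) with the following common fields: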

{
    "hostname": "hostname",
    "region": "dev",
    "uploaded_time": "2026-03-05T18:28:39.153438921+08:00",
    "time": "2026-03-05 18:28:39.153 +0800",
    "tracer_name": "netdev_event",
    "tracer_time": "2026-03-05 18:28:39.153 +0800",
    "tracer_type": "auto",
    "tracer_data": {
        "ifname": "eth0",
        "index": 2,
        "linkstatus": "linkstatus_admindown",
        "mac": "5c:6f:11:11:11:11",
        "start": false
    }
}

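Possible values of the linkstatus field:

linkstatus_adminup: the NIC was enabled by an administrator, e.g. ip link set dev eth0 up
linkstatus_admindown: the NIC was disabled by an administrator, e.g. ip link set dev eth0 down
linkstatus_carrierup: the physical link recovered
linkstatus_carrierdown: the physical link failed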

{
    "hostname": "localhost",
    "region": "xxx",
    "uploaded_time": "2026-05-11T16:58:47.328548319+08:00",
    "time": "2026-05-11 16:58:47.328 +0800",
    "tracer_name": "ras",
    "tracer_time": "2026-05-11 16:58:47.328 +0800",
    "tracer_type": "auto",
    "tracer_data": {
        "dev": "MEM",
        "event": "EDAC",
        "type": "Corrected",
        "timestamp": 537792166031,
        "info": "{\"err_count\":0,\"err_type\":\"Corrected\",\"err_msg\":\"memory read error\",\"label\":\"CPU_SrcID#0_Ha#0_Chan#0_DIMM#0\",\"mc_index\":0,\"top_layer\":0,\"mid_layer\":0,\"low_layer\":-1,\"addr\":7860269056,\"grain\":128,\"syndrome\":0,\"driver\":\" area:DRAM err_code:0000:009f socket:0 ha:0 channel_mask:1 rank:0\"}"
    }
}

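Field	Meaning
`Device`	Identifier of the hardware component where the error occurred (JSON key `dev`; e.g. `CPU/MEM`, `MEM`, `ACPI`, `PCIe 0000:01:00.0`)
`Event`	Event subtype (JSON key `event`; `MCE`, `EDAC`, `APIC`, `AER`)
`ErrType`	Error severity (JSON key `type`; see the table below)
`Timestamp`	Timestamp (JSON key `timestamp`)
`Info`	Detailed fields of the specific event (JSON key `info`)

Error type	Meaning	Typical source
`Corrected`	Corrected automatically by hardware, transparent to the system	MCE CE, EDAC CE, ACPI Sev=1, AER Severity=2
`UncorrectedRecoverable`	Cannot be corrected by hardware but can be repaired by system software	MCE UE, EDAC UE, ACPI Sev=2, AER Severity=0
`UncorrectedDeferred`	Cannot be corrected by hardware and requires deferred handling	MCE MCI_STATUS_DEFERRED, EDAC HW_EVENT_ERR_DEFERRED
`UncorrectedFatal`	Fatal error that hardware cannot correct; an immediate reboot is required	EDAC FATAL, ACPI Sev=3, AER Severity=1
`Info`	Error type that the system is expected to record as a log entry	EDAC HW_EVENT_ERR_INFO, ACPI Sev=0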
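The raw severity numbers in the "typical source" column can be mapped to the error types as sketched below. The AER values follow the Linux kernel constants AER_NONFATAL = 0, AER_FATAL = 1 and AER_CORRECTABLE = 2, and the ACPI values follow the kernel's GHES severity enumeration (0 to 3); the function name and the "Unknown" fallback are assumptions, and HUATUO's own mapping code may differ.

```python
# Raw AER severity as reported by the kernel's aer_event tracepoint.
AER_SEVERITY = {
    2: "Corrected",               # AER_CORRECTABLE
    0: "UncorrectedRecoverable",  # AER_NONFATAL
    1: "UncorrectedFatal",        # AER_FATAL
}

# GHES severity: NO, CORRECTED, RECOVERABLE, PANIC.
GHES_SEVERITY = {
    0: "Info",
    1: "Corrected",
    2: "UncorrectedRecoverable",
    3: "UncorrectedFatal",
}


def classify(source: str, raw_severity: int) -> str:
    """Translate a raw severity into the error type used in stored events."""
    table = {"aer": AER_SEVERITY, "acpi": GHES_SEVERITY}[source]
    return table.get(raw_severity, "Unknown")


print(classify("aer", 2), classify("acpi", 3))
```

Note that the two sources number their severities differently: 0 is the mildest level for ACPI GHES but means a non-fatal uncorrected error for AER.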

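Details

MCE

Monitored components: CPU cores, L1/L2/L3 caches, TLB, memory controller (IMC), interconnects (QPI/UPI/Infinity Fabric).

Field	MSR source	Meaning
`mcg_cpu_cap`	`MCG_CAP`	Machine check global capability register. The low 8 bits (`Count`) give the number of MC banks in the system.
`mcg_msr_status`	`MCG_STATUS`	Machine check global status register.
`banks_msr_status`	`MCi_STATUS`	Bank status register (the most important field). The low 16 bits are the MCA error code (classifying the error, e.g. memory hierarchy error or bus error); the high bits contain control bits such as `UC` (uncorrected), `EN` (enabled), `MISCV` (MISC valid), `ADDRV` (ADDR valid) and `PCC` (processor context corrupt).
`banks_msr_addr`	`MCi_ADDR`	Physical memory address of the error (valid only when `MCi_STATUS.ADDRV=1`). Can be used to locate the faulty DIMM or cache line.
`banks_msr_misc`	`MCi_MISC`	Supplementary information register (valid only when `MCi_STATUS.MISCV=1`).
`mca_synd_msr`	`MCA_SYND`	Syndrome register (AMD only).
`mca_ipid_msr`	`MCA_IPID`	Instance ID register (AMD only).
`instr_pointer`	RIP register	Instruction pointer at the time of the MCE (reliable only when `MCG_STATUS.EIPV=1`).
`tsc_timestamp`	TSC	CPU timestamp counter value at the time of the error (can be converted to absolute time using the kernel clock).
`walltime`	Kernel time	Unix timestamp of the error (seconds).
`cpu`	—	Logical CPU number on which the MCE occurred.
`cpuid`	CPUID	CPUID value of the CPU on which the MCE occurred (contains Family/Model/Stepping).
`apicid`	APIC ID	APIC ID of the CPU on which the MCE occurred (can be mapped to a physical core/hyperthread).
`socketid`	—	CPU socket number (Socket ID). Distinguishes physical CPUs on multi-socket servers.
`code_seg`	CS register	Code segment register value at the time of the MCE (used to determine the privilege level).
`bank`	—	Bank number (typically Bank 0=L1I, Bank 1=L1D, Bank 2=L2, Bank 4+=memory controller, but the numbering varies by platform).
`cpuvendor`	—	CPU vendor identifier: `0`=Intel, `1`=unknown, `2`=AMD.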
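The control bits of `banks_msr_status` can be decoded with a few shifts. The sketch below assumes the IA32_MCi_STATUS layout from the Intel SDM (VAL at bit 63, OVER 62, UC 61, EN 60, MISCV 59, ADDRV 58, PCC 57, MCA error code in bits 15:0); the VAL and OVER bits are not mentioned in the table above and are included as an assumption. AMD parts define some of the upper bits differently.

```python
# Assumed bit positions (Intel SDM, IA32_MCi_STATUS).
MCI_STATUS_FLAGS = {
    "VAL": 63,    # register contents are valid
    "OVER": 62,   # an earlier error was overwritten
    "UC": 61,     # uncorrected error
    "EN": 60,     # error reporting enabled
    "MISCV": 59,  # MCi_MISC holds valid data
    "ADDRV": 58,  # MCi_ADDR holds a valid address
    "PCC": 57,    # processor context corrupt
}


def decode_mci_status(status: int) -> dict:
    """Split an MCi_STATUS value into flag bits and the MCA error code."""
    decoded = {name: bool((status >> bit) & 1)
               for name, bit in MCI_STATUS_FLAGS.items()}
    decoded["mca_error_code"] = status & 0xFFFF
    return decoded


# Constructed example value, not taken from a real machine.
example = (1 << 63) | (1 << 61) | (1 << 60) | (1 << 58) | 0x009F
print(decode_mci_status(example))
```

A typical triage order is to check VAL first, then UC and PCC to judge severity, and only read `banks_msr_addr` when ADDRV is set.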

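EDAC

Monitored component: memory ECC errors.

Field	Meaning
`err_count`	Number of errors accumulated in this event.
`err_type`	Error severity.
`err_msg`	Human-readable error description (e.g. `"CE memory read error on CPU#0Channel#0_DIMM#0 (channel:0 slot:0 page:0x12345 offset:0x0 grain:8 syndrome:0x0)"`).
`label`	Physical location label of the memory module (e.g. `"CPU_SrcID#0_Ha#0_Chan#0_DIMM#0"`), generated by the EDAC driver from the DIMM topology; it maps directly to the memory slot inside the machine.
`mc_index`	Memory controller number (0-based). Distinguishes IMCs on servers with multiple memory controllers.
`top_layer`	Top-level index in the memory hierarchy (usually the channel number; -1 means invalid).
`mid_layer`	Mid-level index in the memory hierarchy (usually the slot/rank number; -1 means invalid).
`low_layer`	Low-level index in the memory hierarchy (usually the bank/row number; -1 means invalid).
`addr`	Physical memory address of the error (64-bit unsigned integer; 0 means the address is invalid).
`grain`	Error granularity (grain size in bytes): the smallest memory unit that may be affected.
`syndrome`	ECC syndrome value.
`driver`	EDAC driver name (e.g. `"amd64_edac"`, `"sb_edac"`).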

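ACPI GHES

Monitored component: platform-specific hardware errors.

Field	Meaning
`severity`	Raw ACPI/CPER error severity value.
`sec_type`	Error section type GUID (16 bytes, hexadecimal string). Defined by the UEFI specification and by hardware vendors; identifies the hardware category of the error record (e.g. memory error section, PCIe error section, ARM processor error section).
`fru_id`	FRU (Field Replaceable Unit) identifier GUID (16 bytes, hexadecimal string). Uniquely identifies the replaceable hardware component where the error occurred (e.g. a specific memory module or PCIe card).
`fru_text`	Human-readable FRU description (e.g. `"CPU0_DIMM_A1"`).
`data_len`	Length of the raw error payload (bytes).
`raw_data`	Hex dump of the raw error data (space-separated bytes). Intended for deep diagnosis; it must be interpreted with the hardware vendor's documentation.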

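PCIe AER

Monitored devices include GPUs, NVMe SSDs, RDMA NICs/HCAs, FPGA accelerator cards, PCIe switches, and so on.

Field	Meaning
`dev_name`	PCIe device name in BDF format, e.g. `"0000:03:00.0"`, corresponding to Domain:Bus:Device.Function.
`err_type`	Error severity (`Corrected` / `Uncorrected` / `Fatal`).
`err_reason`	Description of the specific error cause, decoded from the bits of the AER status register (see the two tables below).
`tlp_header`	Header of the TLP (Transaction Layer Packet) that triggered the error, as four dwords (format: `{dword0, dword1, dword2, dword3}`, hexadecimal). The TLP header contains the transaction type, address, requester ID and other information, and is key data for locating the root cause. Shows `"not available"` if `TlpHeaderValid=0`.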

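PCIe correctable error types

Bit mask	Meaning
`0x00000001`	Receiver Error. The physical layer received a symbol that violates the specification, usually caused by signal-integrity problems (e.g. overly long traces or impedance mismatch).
`0x00000040`	Bad TLP (Transaction Layer Packet). The LCRC (link CRC) check of the packet failed, meaning transaction-layer data was corrupted in transit; the PCIe data link layer retransmits the TLP automatically.
`0x00000080`	Bad DLLP (Data Link Layer Packet). The CRC check of a link-layer control packet (such as ACK/NAK or a flow-control update) failed.
`0x00000100`	Replay number rollover. The `REPLAY_NUM` field tracks the number of replays; this error means too many replays have occurred since the last ACK, which usually indicates persistently poor link quality.
`0x00001000`	Replay timer timeout. The transmitter did not receive an ACK within the specified time and replayed the TLP. Persistent occurrences indicate abnormal link latency or insufficient processing capacity at the receiver.
`0x00002000`	Advisory non-fatal error. Essentially an uncorrectable error that is downgraded and handled as correctable (requires the ANFE feature of the AER capability); commonly seen when an Unsupported Request completion is received.
`0x00004000`	Corrected internal error. An internal ECC or parity error that the device corrected by itself.
`0x00008000`	Header log overflow. The AER header log register is full, so the TLP headers of subsequent errors cannot be recorded (the errors themselves are still counted).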

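PCIe uncorrectable error types

Bitmask	Meaning
`0x00000001`	Undefined. A reserved bit is set, which usually indicates non-compliant firmware or hardware behavior.
`0x00000010`	Data Link Protocol Error. A packet that violates the DLLP protocol rules was received; a serious link-layer fault.
`0x00000020`	Surprise Down Error. The physical link went down without a Hot-Plug notification (for example, unexpected device power loss or poor contact); a high-risk error in hot-plug scenarios.
`0x00001000`	Poisoned TLP. A TLP was received with the EP (Error Poisoned) bit in its header set to 1, meaning the upstream sender knows the payload is corrupted. The mechanism propagates and isolates errors so that data corruption does not go unnoticed.
`0x00002000`	Flow Control Protocol Error. A packet that violates the PCIe flow-control credit rules was received; a serious protocol violation.
`0x00004000`	Completion Timeout. After issuing a non-posted transaction (such as a Memory Read), the requester did not receive a Completion within the timeout. Common causes include NVMe firmware faults, RDMA NIC driver bugs, and PCIe link interruptions.
`0x00008000`	Completer Abort. The completer explicitly returned a Completion with CA (Completer Abort) status, meaning it rejected the request.
`0x00010000`	Unexpected Completion. A Completion was received that matches no outstanding request (tag mismatch), usually caused by device firmware bugs or data-path errors.
`0x00020000`	Receiver Overflow. The receiver's buffer overflowed even though its flow-control credits indicated free space; a serious flow-control violation.
`0x00040000`	Malformed TLP. A header field violates the specification (for example, an illegal length, a reserved field that is set, or an invalid address range), which usually indicates a serious device firmware defect.
`0x00080000`	ECRC Error. The end-to-end CRC at the tail of the TLP failed (both endpoints must support ECRC), meaning the data was corrupted somewhere along the whole path, including inside PCIe switches. A key indicator in high-reliability deployments.
`0x00100000`	Unsupported Request. The receiver returned UR (Unsupported Request) status, meaning the device does not support the requested transaction type or address range.
`0x00200000`	ACS (Access Control Services) Violation. ACS prevents peer-to-peer DMA between PCIe devices from bypassing the IOMMU; this error means an access violated the ACS policy and deserves close attention in virtualization security scenarios.
`0x00400000`	Uncorrectable Internal Error. The device hit an internal ECC or parity error it could not correct (for example, a double-bit SRAM error), which usually means the hardware is damaged.
`0x00800000`	MC Blocked TLP. A PCIe multicast TLP was blocked by ACS or by the multicast control mechanism.
`0x01000000`	AtomicOp Egress Blocked. An AtomicOp request (such as FetchAdd, Swap, or CAS) was blocked on egress by ACS control; commonly seen in RDMA/GPU direct-attach scenarios.
`0x02000000`	TLP Prefix Blocked. A packet carrying an End-End TLP Prefix was blocked from forwarding by ACS or another mechanism.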

Summary

Deploying HUATUO in production is recommended for comprehensive hardware error monitoring and proactive operations.

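6 - Application Practices

6.1 - Storage Services

🎯 About HUATUO

HUATUO (华佗) is an operating-system deep-observability project open-sourced by DiDi and incubated under the CCF (China Computer Federation). It provides kernel-level observability for cloud-native general-purpose computing, AI computing, cloud services, and infrastructure services.

📖 Overview

HUATUO can persist the Linux kernel events and AutoTracing data it collects to an external storage backend. Elasticsearch and OpenSearch are currently supported.

Each collected event is serialized to JSON and written both to a node-local directory (huatuo-local/) and to the configured remote backend. The local directory keeps a local copy of every event; the remote backend provides durable storage and structured queries.

This section covers configuration and verification for Elasticsearch and OpenSearch. The examples use Docker; in production, replace the addresses with those of your actual services. The configuration is otherwise identical.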

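🎯 Use cases

Root-cause tracing for Kubernetes cloud-native failures

In containerized environments, kernel events such as Pod OOMs and node hung tasks are short-lived, and logs are often cleaned up soon after the event. Once events are written to Elasticsearch or OpenSearch, operations teams can query the historical anomaly timeline by time range and pinpoint the root cause of intermittent failures during post-incident review.

Stability auditing for AI compute clusters

On long-running GPU training clusters, the historical distribution of events such as ras hardware errors and iotracing I/O latency matters for capacity planning and hardware health assessment. With the data persisted, aggregation queries can establish a per-node stability baseline that supports proactive maintenance.

Compliance and event retention

China's Multi-Level Protection Scheme (等保) requires that abnormal system events be traceable. Writing the kernel events collected by HUATUO to OpenSearch and configuring an index lifecycle policy satisfies requirements on retention period and query capability.

Observability platform integration

Both Elasticsearch and OpenSearch can be used as native Grafana data sources. Once HUATUO events are stored, you can build kernel event trend panels in Grafana and overlay them on application-level metrics for historical analysis and alert review.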

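💎 Value

Dimension	Local storage only	With an external storage backend
Durability	Limited by node disk capacity; data may be lost after a restart	Data is persisted to distributed storage and can be retained long term
Query capability	No structured queries; relies on file search	Full-text search, field filters, and time-range aggregations
Visualization	Not supported	Connects directly to Grafana, Kibana, and similar platforms
Multi-node aggregation	Data is scattered across node-local directories	Written centrally to one store, so cross-node queries are possible
Compliance retention	Hard to meet retention-period requirements	Index lifecycle policies can be configured to meet retention requirements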

🚀 Usage

OpenSearch V2

1. Deploy OpenSearch

docker pull opensearchproject/opensearch:2.6.0
docker run -d --name opensearch --network host \
    -e "discovery.type=single-node" \
    opensearchproject/opensearch:2.6.0

2. Verify the service

curl -k -u admin:admin https://localhost:9200

Example response:

{
  "name" : "22ca72df78c0",
  "cluster_name" : "docker-cluster",
  "cluster_uuid" : "yxb3foceQVKzXXO6bHpPHQ",
  "version" : {
    "distribution" : "opensearch",
    "number" : "2.6.0",
    "build_type" : "tar",
    "build_hash" : "7203a5af21a8a009aece1474446b437a3c674db6",
    "build_date" : "2023-02-24T18:57:04.388618985Z",
    "build_snapshot" : false,
    "lucene_version" : "9.5.0",
    "minimum_wire_compatibility_version" : "7.10.0",
    "minimum_index_compatibility_version" : "7.0.0"
  },
  "tagline" : "The OpenSearch Project: https://opensearch.org/"
}

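If verification fails, inspect the container logs:

docker logs opensearch

3. Configure huatuo-bamai

Add the following to huatuo-bamai.conf. The default username and password of the OpenSearch container image are both admin. See the Configuration Guide chapter for a full description of the storage settings.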

[Storage.ES]
    Address = "https://127.0.0.1:9200"
    Index = "huatuo_bamai"
    Username = "admin"
    Password = "admin"

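4. Start huatuo-bamai

Use --config-dir to point at the directory that contains the configuration file:

./_output/bin/huatuo-bamai --region dev --config-dir .

Once files (for example net_rx_latency) appear in the local storage directory huatuo-local/, kernel events are being collected. Query the data from OpenSearch with: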

curl -k -u admin:admin \
    -X GET "https://localhost:9200/huatuo_bamai/_search?pretty" \
    -H "Content-Type: application/json" \
    -d '{"query": {"match_all": {}}}'

Example response (one hit from the result set):

{
    "_index" : "huatuo_bamai",
    "_id" : "yjPG_50Bu_OF-hukxKR7",
    "_score" : 1.0,
    "_source" : {
      "hostname" : "hostname",
      "region" : "dev",
      "uploaded_time" : "2026-05-07T00:11:49.753166222Z",
      "time" : "2026-05-07 00:11:49.753 +0000",
      "tracer_name" : "net_rx_latency",
      "tracer_time" : "2026-05-07 00:11:49.753 +0000",
      "tracer_type" : "auto",
      "tracer_data" : {
        "comm" : "<nil>",
        "pid" : 0,
        "where" : "RX_STAGE_NETIF",
        "latency_ms" : 1776078133565,
        "saddr" : "127.0.0.1",
        "daddr" : "127.0.0.1",
        "sport" : 37736,
        "dport" : 9200,
        "seq" : 1080592402,
        "ack_seq" : 2465063876,
        "pkt_len" : 781
      }
    }
}

To get only the total number of documents, without listing them:

curl -k -u admin:admin -X GET "https://localhost:9200/huatuo_bamai/_count?pretty"

Example response, where count is the total number of records written:

{
  "count" : 2680,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  }
}
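
When the count check is automated, for example in a deployment smoke test, the `_count` response can be decoded with a small struct. This is a minimal sketch based only on the response shape shown above; `countResponse` and `parseCount` are hypothetical names, and treating any failed shard as an error is a design choice of the sketch.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// countResponse mirrors the fields of the _count response shown above.
type countResponse struct {
	Count  int64 `json:"count"`
	Shards struct {
		Total      int `json:"total"`
		Successful int `json:"successful"`
		Skipped    int `json:"skipped"`
		Failed     int `json:"failed"`
	} `json:"_shards"`
}

// parseCount returns the document count and reports failed shards as an
// error, because the count may then be incomplete.
func parseCount(body []byte) (int64, error) {
	var r countResponse
	if err := json.Unmarshal(body, &r); err != nil {
		return 0, fmt.Errorf("decode _count response: %w", err)
	}
	if r.Shards.Failed > 0 {
		return r.Count, fmt.Errorf("%d of %d shards failed; count may be incomplete",
			r.Shards.Failed, r.Shards.Total)
	}
	return r.Count, nil
}

func main() {
	body := []byte(`{"count":2680,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0}}`)
	n, err := parseCount(body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("documents:", n)
}
```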

Elasticsearch V8

1. Deploy Elasticsearch

docker pull docker.elastic.co/elasticsearch/elasticsearch:8.15.5
docker run -d --name elasticsearch --network host \
    -e "discovery.type=single-node" \
    -e "ES_JAVA_OPTS=-Xms1g -Xmx1g" \
    -e "ELASTIC_PASSWORD=123456" \
    docker.elastic.co/elasticsearch/elasticsearch:8.15.5

2. Verify the service

curl -k -u elastic:123456 https://localhost:9200

Example response:

{
  "name" : "ab0b562f8dbd",
  "cluster_name" : "docker-cluster",
  "cluster_uuid" : "aVfOVgJTQXuhZ3HGotK3ww",
  "version" : {
    "number" : "8.15.5",
    "build_flavor" : "default",
    "build_type" : "docker",
    "build_hash" : "b10896bcfe167cce44a84ba2771d101fb596d40d",
    "build_date" : "2024-11-21T22:06:13.985834967Z",
    "build_snapshot" : false,
    "lucene_version" : "9.11.1",
    "minimum_wire_compatibility_version" : "7.17.0",
    "minimum_index_compatibility_version" : "7.0.0"
  },
  "tagline" : "You Know, for Search"
}

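3. Configure huatuo-bamai

Add the following to huatuo-bamai.conf. The default username of the Elasticsearch container image is elastic; the password is set through the ELASTIC_PASSWORD environment variable. See the Configuration Guide chapter for a full description of the storage settings.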

[Storage.ES]
    Address = "https://127.0.0.1:9200"
    Index = "huatuo_bamai"
    Username = "elastic"
    Password = "123456"

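4. Start huatuo-bamai

Use --config-dir to point at the directory that contains the configuration file:

./_output/bin/huatuo-bamai --region dev --config-dir .

Once files (for example net_rx_latency) appear in the local storage directory huatuo-local/, kernel events are being collected. Query the data from Elasticsearch with: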

curl -k -u elastic:123456 \
    -X GET "https://localhost:9200/huatuo_bamai/_search?pretty" \
    -H "Content-Type: application/json" \
    -d '{"query": {"match_all": {}}}'

Example response (one hit from the result set):

{
    "_index" : "huatuo_bamai",
    "_id" : "WtNZAJ4BQ8x-thPHEY1i",
    "_score" : 1.0,
    "_source" : {
      "hostname" : "hostname",
      "region" : "dev",
      "uploaded_time" : "2026-05-07T02:51:37.696263325Z",
      "time" : "2026-05-07 02:51:37.696 +0000",
      "tracer_name" : "net_rx_latency",
      "tracer_time" : "2026-05-07 02:51:37.696 +0000",
      "tracer_type" : "auto",
      "tracer_data" : {
        "comm" : "<nil>",
        "pid" : 0,
        "where" : "RX_STAGE_NETIF",
        "latency_ms" : 1776078133565,
        "saddr" : "127.0.0.1",
        "daddr" : "127.0.0.1",
        "sport" : 2379,
        "dport" : 36706,
        "seq" : 950542706,
        "ack_seq" : 1960972383,
        "pkt_len" : 91
      }
    }
}

To get only the total number of documents, without listing them:

curl -k -u elastic:123456 -X GET "https://localhost:9200/huatuo_bamai/_count?pretty"

Example response, where count is the total number of records written:

{
  "count" : 2680,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  }
}

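Elasticsearch V7

V7 serves plain HTTP by default, so the steps are the same as for V8 except that every URL uses http:// instead of https://. curl's -k flag only affects TLS and is therefore not needed here.

1. Deploy Elasticsearch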

docker pull docker.elastic.co/elasticsearch/elasticsearch:7.10.1
docker run -d --name elasticsearch --network host \
    -e "discovery.type=single-node" \
    -e "ES_JAVA_OPTS=-Xms1g -Xmx1g" \
    -e "ELASTIC_PASSWORD=123456" \
    docker.elastic.co/elasticsearch/elasticsearch:7.10.1

2. Verify the service

curl -u elastic:123456 http://localhost:9200

Example response:

{
  "name" : "d88c9e8df48b",
  "cluster_name" : "docker-cluster",
  "cluster_uuid" : "_ZZefWx4SniAc255t_lIVg",
  "version" : {
    "number" : "7.10.1",
    "build_flavor" : "default",
    "build_type" : "docker",
    "build_hash" : "1c34507e66d7db1211f66f3513706fdf548736aa",
    "build_date" : "2020-12-05T01:00:33.671820Z",
    "build_snapshot" : false,
    "lucene_version" : "8.7.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

3. Configure huatuo-bamai

[Storage.ES]
    Address = "http://127.0.0.1:9200"
    Index = "huatuo_bamai"
    Username = "elastic"
    Password = "123456"

4. Start huatuo-bamai

Use --config-dir to point at the directory that contains the configuration file:

./_output/bin/huatuo-bamai --region dev --config-dir .

Once files (for example net_rx_latency) appear in the local storage directory huatuo-local/, kernel events are being collected. Query the data from Elasticsearch with:

curl -u elastic:123456 \
    -X GET "http://localhost:9200/huatuo_bamai/_search?pretty" \
    -H "Content-Type: application/json" \
    -d '{"query": {"match_all": {}}}'

To get only the total number of documents:

curl -u elastic:123456 \
    -X GET "http://localhost:9200/huatuo_bamai/_count?pretty"

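⚙️ How it works

System architecture

The HUATUO Storage module runs on each node and writes collected kernel events both to the local directory and to Elasticsearch or OpenSearch. The two backends share the same [Storage.ES] configuration interface and are distinguished only by address.

Remote writes use the Elasticsearch/OpenSearch Bulk API (_bulk). Events first enter an in-node batch buffer; background workers aggregate them until a size or time threshold is reached, submit many records in one request, and retry automatically according to policy when the transport layer fails.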

graph TB
    subgraph kernel["Linux 内核"]
        K1[内核事件]
        K2[AutoTracing]
    end

    subgraph huatuo["HUATUO Agent（节点级）"]
        T["采集层"]
        L["本地目录\nhuatuo-local/"]
        S["Storage 模块\nBulkIndexer 缓冲"]
    end

    subgraph backends["存储后端"]
        ES[Elasticsearch]
        OS[OpenSearch]
    end

    kernel --> T
    T --> L
    T --> S
    S -->|Bulk API + 自动重试| ES
    S -->|Bulk API + 自动重试| OS

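Write path

The collection layer returns immediately after calling Save, and the event lands in the BulkIndexer buffer. Background workers submit the batch to the remote backend as soon as any one of three conditions holds: the byte threshold is reached, the time threshold is reached, or the process is exiting. The local directory write is synchronous and independent of the remote Bulk path.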

sequenceDiagram
    participant T as 采集层
    participant L as 本地目录（huatuo-local/）
    participant S as Storage 模块（BulkIndexer）
    participant B as ES / OpenSearch

    T->>S: 采集到内核事件，序列化为 JSON
    par 本地路径（同步）
        S->>L: 写入本地文件
    and 远端路径（异步批量）
        S->>S: 加入 Bulk 缓冲，立即返回
        Note over S: 满足 5 MB / 1 s / 退出 任一条件
        S->>B: POST /_bulk（多条记录）
        B-->>S: 200 OK + per-item 结果
        Note over S: 失败项通过 OnFailure 回调记录日志
    end

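Bulk write mechanism

Buffering and flushing

Parameter	Value	Meaning
`FlushBytes`	5 MB	Flush immediately once the buffer reaches this many bytes
`FlushInterval`	1 s	Force a flush once 1 second has passed since the previous flush
`NumWorkers`	4	Number of background goroutines that submit Bulk requests concurrently
Process exit	`Close(ctx)`	Triggered by SIGTERM/SIGINT; drains the buffer within a 10 s limit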
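
The size-or-time flush rule can be illustrated with a small single-threaded model. This is a sketch of the rule only, not HUATUO's BulkIndexer: the `batcher` type, its methods, and `runDemo` are hypothetical, timestamps are passed in explicitly so the behavior is deterministic, and the thresholds are scaled down to 10 bytes and 1 second.

```go
package main

import (
	"fmt"
	"time"
)

// batcher buffers documents and flushes on a byte or time threshold.
type batcher struct {
	flushBytes    int
	flushInterval time.Duration
	lastFlush     time.Time
	pending       [][]byte
	pendingBytes  int
	flushed       []int // number of documents in each flushed batch
}

// add buffers one document and flushes if the byte threshold is reached.
func (b *batcher) add(doc []byte, now time.Time) {
	b.pending = append(b.pending, doc)
	b.pendingBytes += len(doc)
	if b.pendingBytes >= b.flushBytes {
		b.flush(now)
	}
}

// tick models the periodic timer: flush if the interval has elapsed.
func (b *batcher) tick(now time.Time) {
	if now.Sub(b.lastFlush) >= b.flushInterval {
		b.flush(now)
	}
}

// flush records the batch size; a real client would POST /_bulk here.
func (b *batcher) flush(now time.Time) {
	if len(b.pending) > 0 {
		b.flushed = append(b.flushed, len(b.pending))
	}
	b.pending, b.pendingBytes, b.lastFlush = nil, 0, now
}

// runDemo replays a fixed timeline and returns the flushed batch sizes.
func runDemo() []int {
	start := time.Unix(0, 0)
	b := &batcher{flushBytes: 10, flushInterval: time.Second, lastFlush: start}
	b.add([]byte("aaaaaa"), start)                           // 6 bytes buffered
	b.add([]byte("bbbbbb"), start.Add(100*time.Millisecond)) // 12 bytes >= 10: flush 2 docs
	b.add([]byte("ccc"), start.Add(200*time.Millisecond))    // 3 bytes buffered
	b.tick(start.Add(500 * time.Millisecond))                // 400 ms since last flush: no flush
	b.tick(start.Add(1200 * time.Millisecond))               // 1.1 s since last flush: flush 1 doc
	return b.flushed
}

func main() {
	fmt.Println(runDemo()) // prints [2 1]
}
```

The first batch leaves because of the byte threshold and the second because of the time threshold, which is why a quiet node still delivers its events within roughly one flush interval.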

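Two-level retry policy

Failures of a Bulk request fall into two levels, whole-batch and per-item, and the retry scope differs between them:

Level	Trigger	Handling	Retried?
Whole batch failed	Transport errors (connection failure, timeout, TLS); HTTP status `429 / 502 / 503 / 504`	The client retries automatically with exponential backoff: 100 ms → 200 ms → 400 ms → 800 ms, at most 3 times	✅ Automatic
Whole batch rejected	HTTP status `400 / 401 / 403 / 404 / 413` and similar	No retry; every record in the batch is dropped and an error is logged through `OnError`	❌ Dropped
Single item failed	200 OK but the per-item result failed: version conflict, field mapping error, document too large	No retry; only that item is dropped, and the `OnFailure` callback logs `index/id/status/type/reason`	❌ Dropped
Single item succeeded	200 OK and the per-item result succeeded	Treated as stored	(n/a)

Why it is designed this way: 429/5xx responses and transport errors signal that the remote side is temporarily unavailable, so retrying helps. 4xx responses (other than 429) and per-item errors are client-side semantic problems (data format, permissions); retrying only amplifies them, so they are left for developers and operators to fix after inspecting the logs.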
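
The decision table above can be sketched as a status classifier plus a backoff calculator. Everything here is illustrative: `classify` and `backoffMillis` are hypothetical names, statuses the table does not list (such as 500) are treated as non-retryable by assumption, and the retry cap is left to the caller because the table lists four delays while capping retries at three.

```go
package main

import "fmt"

// classify maps an HTTP status of a Bulk response to an action:
//   "retry"         - transient condition, resend the whole batch
//   "inspect-items" - request accepted, check each per-item result
//   "drop"          - permanent rejection, log and discard the batch
func classify(status int) string {
	switch {
	case status == 429, status == 502, status == 503, status == 504:
		return "retry"
	case status >= 200 && status < 300:
		return "inspect-items"
	default:
		return "drop"
	}
}

// backoffMillis returns the delay before retry number attempt (0-based),
// doubling from 100 ms: 100, 200, 400, 800, ...
func backoffMillis(attempt int) int {
	return 100 << attempt
}

func main() {
	for _, s := range []int{200, 429, 503, 400, 413} {
		fmt.Printf("status %d -> %s\n", s, classify(s))
	}
	for attempt := 0; attempt < 3; attempt++ {
		fmt.Printf("retry %d after %d ms\n", attempt+1, backoffMillis(attempt))
	}
}
```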

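Data loss scenarios

In the following three cases Save returns nil to the caller, yet the event never reaches the index:

Abnormal process exit: on SIGKILL or host power loss, whatever is still in the BulkIndexer memory buffer is lost (only the local directory keeps a copy).
- Mitigation: SIGTERM/SIGINT take the graceful shutdown path, which calls Close to force a flush and waits at most 10 seconds.
Whole batch permanently rejected: 4xx errors (other than 429) drop every record in the batch at once. Common causes: the index is disabled, the password is no longer valid, or a single document exceeds the cluster's http.max_content_length.
- Diagnosis: the OnError log contains the type and reason returned by ES.
Single item permanently failed: mapping conflict, version conflict, or malformed document.
- Diagnosis: the OnFailure log identifies the failed record by index/id.

The local directory always keeps a copy: even when a remote write is lost, the event can still be recovered from huatuo-local/, which serves as the final safety net.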

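Problems solved

Replacing the per-event Index API with BulkIndexer batching plus automatic retry addresses four classes of problems:

Problem	Bottleneck of the old approach	Improvement with Bulk
TLS handshake CPU cost	One HTTPS request per event; under FIPS / RSA-PSS the handshakes saturate the CPU	Many events share one connection and one handshake; TLS session tickets (PSK) are cached and reused
Remote RTT and throughput	One round trip per event, so node-level write rate is bounded by RTT	A single Bulk request carries up to 5 MB, so throughput grows with batch size
Transient remote jitter / throttling (429)	A single failure drops the event immediately, with no retry	The client retries automatically and absorbs transient faults
Decoupling collection from the storage backend	A slow remote applies back-pressure to collection and delays kernel event capture	The asynchronous buffer decouples collection from remote writes, so the collection path is not blocked by the network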

🌟 Closing

🌟 Stars are welcome: https://github.com/ccfos/huatuo

6.2 - Kernel Event Subscription

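📖 Overview

/v1/events/watch is HUATUO's real-time kernel event subscription endpoint. With a single long-lived HTTP POST connection, a client continuously receives the kernel anomaly events that occur on the node. Events are wrapped as CloudEvents 1.0 and pushed over Server-Sent Events (SSE).

🎯 Use cases

Kernel event subscription exposes operating-system-level anomaly signals directly to upper-layer systems and removes the latency and overhead of traditional polling. Typical integrations follow.

Self-healing systems

Kernel events are a first-hand signal for self-healing decisions. After subscribing to events/watch, a self-healing controller can act the moment an event occurs instead of waiting for an alert to travel through the monitoring pipeline:

OOM self-healing: on an oom event, immediately scale up, restart, or drain traffic from the affected container, cutting service interruption from minutes to seconds.
Hung task self-healing: on a hungtask event, automatically cordon the node and evict its Pods so that cascading blockage does not spread across the cluster.
Network self-healing: on a netdev_txqueue_timeout or netdev_bonding_lacp event, trigger a NIC reset or traffic switchover to heal the network link within minutes.
I/O storm self-healing: on an iotracing event, dynamically lower the disk I/O quota of the offending container through cgroup blkio throttling to protect other services on the node.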

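Observability platforms

Feeding HUATUO kernel events into an observability platform adds the kernel perspective that application metrics and logs lack:

Event timeline correlation: overlay kernel events such as softlockup and oom on Grafana timelines, aligned with application error-rate and latency curves, to locate root causes quickly.
Anomaly-driven alerting: use kernel events instead of fixed-threshold alerts to reduce false positives. For example, raise a high-priority alert directly on a ras hardware error event rather than waiting for a CPU error rate to cross a threshold.
Capacity and stability analysis: subscribe long term to AutoTracing events such as memburst and dload to build a node stability baseline that informs capacity planning.
Multi-dimensional drill-down: events carry container ID, namespace, region, and other context, so alert links can drill straight down to the matching Pod, Node, or Region view.

Security auditing and compliance

Abnormal behavior detection: a burst of oom, hungtask, or softlockup events outside business peak hours may indicate resource abuse or a malicious workload and can trigger a security review.
Event retention and traceability: write the CloudEvents stream to a message queue (Kafka, Pulsar) or object storage to meet the Multi-Level Protection Scheme (等保) requirements on retaining abnormal system events.

Chaos engineering and load-test validation

Fault injection validation: after a chaos platform injects network latency, memory pressure, or similar faults, subscribe to net_rx_latency and memburst events in real time to confirm the fault took effect, replacing manual observation.
Load-test baselining: subscribe to all events during a load test and record when the first kernel anomaly appears, which marks the system's load limit precisely.

AIOps

Event-driven root cause analysis: feed kernel events as features into AI/ML models and combine them with application metrics for multi-dimensional root cause inference, reducing manual investigation time.
Predictive maintenance: model hardware-level events such as ras errors and netdev_bonding_lacp to warn and trigger migration before a device fails completely.
Intelligent suppression and aggregation: aggregate events of the same kind within a time window to avoid alert storms and present on-call engineers with a concise root-cause summary.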

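💎 Value

Dimension	Traditional approach	With HUATUO events/watch
Timeliness	Alerts fire 1–5 minutes late	Kernel events are pushed in real time, with latency under 1 second
Signal accuracy	Based on metric thresholds; high false-positive rate	Events come from the kernel's own detection; false-positive rate is close to zero
Context richness	Limited metric dimensions	Carries full container, node, and region context
Integration cost	Build your own eBPF collection or depend on a third-party agent	One HTTP POST to subscribe; standard CloudEvents format
Protocol compatibility	Vendor-specific private formats	Follows the CloudEvents 1.0 standard and plugs into any compatible platform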

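🚀 Usage

1. CloudEvents specification

1.1 CloudEvents 1.0 envelope fields

Every pushed event is a JSON object that conforms to CloudEvents 1.0:

Field	Type	Description
`specversion`	string	Fixed value `"1.0"`
`id`	string	Unique event identifier (UUID v4), generated independently for each event
`source`	string	Event source path, in the form `/huatuo/{hostname}/{tracer_name}`
`type`	string	Fixed value `"tech.huatuo.kernel.event"`
`datacontenttype`	string	Fixed value `"application/json"`
`time`	string	Event collection time (RFC 3339 with nanosecond precision, UTC)
`data`	object	Event payload, that is, the `WatchEventData` struct

1.2 HUATUO event data structure (WatchEventData)

The data field carries HUATUO's standard event record. A complete event looks like this: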

{
  "specversion": "1.0",
  "id": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
  "source": "/huatuo/node-1/oom",
  "type": "tech.huatuo.kernel.event",
  "datacontenttype": "application/json",
  "time": "2026-05-18T10:23:45.123456789Z",
  "data": {
    "hostname": "node-1",
    "region": "cn-beijing",
    "observed_timestamp": "2026-05-18T10:23:45Z",
    "tracer_name": "oom",
    "tracer_id": "abc123",
    "tracer_run_type": "auto",
    "container_id": "d3f1a2b4c5e6",
    "container_hostname": "app-pod",
    "container_host_namespace": "prod",
    "container_type": "docker",
    "container_qos": "Guaranteed"
  }
}

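WatchEventData fields:

Field	Type	Description
`hostname`	string	Node hostname
`region`	string	Region the node belongs to
`observed_timestamp`	string	Time the kernel event occurred (tracer collection time)
`tracer_name`	string	Name of the tracer that produced the event (see the kernel event list below)
`tracer_id`	string	Unique ID of the event instance
`tracer_run_type`	string	Collection mode: `auto` (triggered automatically) or `manual`
`container_id`	string	Container ID (present for container-level events)
`container_hostname`	string	Container hostname
`container_host_namespace`	string	Namespace the container belongs to
`container_type`	string	Container runtime type (docker, containerd, and so on)
`container_qos`	string	Container QoS class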

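2. Supported kernel events

`tracer_name`	Description
`oom`	Out-of-memory (OOM Killer) event
`hungtask`	Kernel task stuck in D state for a long time (Hung Task) detected
`softlockup`	CPU soft lockup detected
`ras`	Hardware reliability (RAS) error, such as an ECC memory error
`dropwatch`	Kernel network packet drop (Drop Watch) event
`netdev_events`	Network device state change event (Link Up/Down and similar)
`netdev_txqueue_timeout`	Network device transmit queue timeout event
`netdev_bonding_lacp`	LACP protocol anomaly on a bond device
`net_rx_latency`	Abnormal network receive latency event
`softirq_tracing`	Abnormal softirq duration tracing event
`memory_reclaim_events`	Abnormal memory reclaim event
`cpuidle`	Abnormal CPU idle rate (triggered automatically by AutoTracing)
`cpusys`	Abnormal CPU system-mode usage (triggered automatically by AutoTracing)
`dload`	Abnormal system load (triggered automatically by AutoTracing)
`iotracing`	Abnormal I/O latency (triggered automatically by AutoTracing)
`memburst`	Abnormal memory burst (triggered automatically by AutoTracing)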

3. POST request

3.1 Endpoint

POST /v1/events/watch

3.2 Request headers

Content-Type: application/json

3.3 Request body

{
  "filters": {
    "tracer_name": "<regex>",
    "hostname": "<regex>",
    "container_hostname": "<regex>",
    "container_host_namespace": "<regex>",
    "region": "<regex>"
  }
}

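filters fields:

Field	Type	Required	Description
`tracer_name`	string	No	Filter by tracer name; regular expressions supported
`hostname`	string	No	Filter by node hostname; regular expressions supported
`container_hostname`	string	No	Filter by container hostname; regular expressions supported
`container_host_namespace`	string	No	Filter by container namespace; regular expressions supported
`region`	string	No	Filter by region; regular expressions supported

All filter fields are optional; an omitted or empty field matches every value.
When several fields are given, all conditions must hold (AND semantics).
Filters are applied on the server, so only matching events are pushed to the client.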
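
The three rules above (empty matches everything, AND across fields, regular expressions per field) can be modelled in a few lines. This is a client-side sketch of the semantics for testing filter expressions before sending them, not the server's implementation; `matches` is a hypothetical helper, and unanchored matching with Go's `regexp` syntax is an assumption suggested by the `^oom$` anchors used in the curl examples.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches reports whether event satisfies every non-empty filter.
// Both maps are keyed by field name, e.g. "tracer_name" or "hostname".
func matches(filters, event map[string]string) (bool, error) {
	for field, pattern := range filters {
		if pattern == "" {
			continue // an omitted or empty filter matches every value
		}
		re, err := regexp.Compile(pattern)
		if err != nil {
			return false, fmt.Errorf("filter %s: %w", field, err)
		}
		if !re.MatchString(event[field]) {
			return false, nil // AND semantics: one miss rejects the event
		}
	}
	return true, nil
}

func main() {
	event := map[string]string{"tracer_name": "oom", "hostname": "node-1"}
	filters := map[string]string{"tracer_name": "^oom$", "hostname": "^node-"}
	ok, err := matches(filters, event)
	fmt.Println(ok, err)
}
```

Because matching is assumed to be unanchored, a pattern such as `netdev` would also match `netdev_events` and `netdev_txqueue_timeout`; add `^` and `$` when an exact name is intended.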

3.4 Response format (SSE stream)

Once the connection is established, the server keeps pushing events in SSE format:

data: {"specversion":"1.0","id":"...","source":"/huatuo/node-1/oom",...}\n\n

The server also sends a heartbeat comment line periodically to keep the connection alive:

: ping\n

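4. EventsWatch configuration

Configure the feature in the [EventsWatch] section of the HUATUO configuration file (huatuo-bamai.conf):

[EventsWatch]
    # Maximum number of concurrent client connections; new connections
    # beyond the limit receive HTTP 429
    # Default: 100
    MaxClients = 100

    # SSE heartbeat interval in seconds; keeps proxies and load balancers
    # from closing the connection as idle.
    # After 3 consecutive failed heartbeat writes the server closes the client connection
    # Default: 30
    KeepAliveInterval = 30

Option	Default	Description
`MaxClients`	100	Upper limit on simultaneous `/v1/events/watch` long-lived connections; HTTP 429 is returned beyond it
`KeepAliveInterval`	30	Heartbeat interval in seconds; keep it below the idle timeout of any upstream proxy. 15–60 seconds is recommended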

5. curl examples

5.1 Subscribe to all kernel events

curl -s -N -X POST http://<node-ip>:19704/v1/events/watch \
  -H "Content-Type: application/json" \
  -H "Accept: text/event-stream" \
  -H "Cache-Control: no-cache" \
  -H "Connection: keep-alive" \
  -d '{}'

5.2 Subscribe to OOM events only

curl -s -N -X POST http://<node-ip>:19704/v1/events/watch \
  -H "Content-Type: application/json" \
  -H "Accept: text/event-stream" \
  -H "Cache-Control: no-cache" \
  -H "Connection: keep-alive" \
  -d '{"filters": {"tracer_name": "^oom$"}}'

5.3 Subscribe to network events on a specific node

curl -s -N -X POST http://<node-ip>:19704/v1/events/watch \
  -H "Content-Type: application/json" \
  -H "Accept: text/event-stream" \
  -H "Cache-Control: no-cache" \
  -H "Connection: keep-alive" \
  -d '{
    "filters": {
      "hostname": "^node-1$",
      "tracer_name": "netdev|dropwatch|net_rx_latency"
    }
  }'

5.4 Subscribe to container events in the prod namespace

curl -s -N -X POST http://<node-ip>:19704/v1/events/watch \
  -H "Content-Type: application/json" \
  -H "Accept: text/event-stream" \
  -H "Cache-Control: no-cache" \
  -H "Connection: keep-alive" \
  -d '{
    "filters": {
      "container_host_namespace": "^prod$"
    }
  }'

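Note: the -N flag disables curl's output buffering so that SSE events appear in the terminal immediately.

6. Go example

The following example shows how a Go program subscribes to the events/watch endpoint and consumes CloudEvents in real time.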

package main

import (
	"bufio"
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"log/slog"
	"net/http"
	"os"
	"strings"
	"time"
)

// WatchRequest is the request body sent to /v1/events/watch.
type WatchRequest struct {
	Filters WatchFilters `json:"filters"`
}

type WatchFilters struct {
	TracerName             string `json:"tracer_name,omitempty"`
	Hostname               string `json:"hostname,omitempty"`
	ContainerHostname      string `json:"container_hostname,omitempty"`
	ContainerHostNamespace string `json:"container_host_namespace,omitempty"`
	Region                 string `json:"region,omitempty"`
}

// WatchEvent is the CloudEvents 1.0 envelope pushed by HUATUO.
type WatchEvent struct {
	SpecVersion     string          `json:"specversion"`
	ID              string          `json:"id"`
	Source          string          `json:"source"`
	Type            string          `json:"type"`
	DataContentType string          `json:"datacontenttype"`
	Time            string          `json:"time"`
	Data            json.RawMessage `json:"data"`
}

func watchEvents(ctx context.Context, endpoint string, filters WatchFilters) error {
	reqBody, err := json.Marshal(WatchRequest{Filters: filters})
	if err != nil {
		return fmt.Errorf("marshal request: %w", err)
	}

	req, err := http.NewRequestWithContext(ctx, http.MethodPost, endpoint, bytes.NewReader(reqBody))
	if err != nil {
		return fmt.Errorf("create request: %w", err)
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Accept", "text/event-stream")

	client := &http.Client{Timeout: 0} // long-lived SSE connection: no client timeout
	resp, err := client.Do(req)
	if err != nil {
		return fmt.Errorf("connect: %w", err)
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("unexpected status: %d", resp.StatusCode)
	}

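	scanner := bufio.NewScanner(resp.Body)
	// Raise the per-line limit above bufio.Scanner's 64 KiB default so that
	// a large event does not abort the stream with bufio.ErrTooLong.
	scanner.Buffer(make([]byte, 0, 64*1024), 1<<20)
	for scanner.Scan() {
		line := scanner.Text()

		// Skip heartbeat comment lines and blank lines.
		if line == "" || strings.HasPrefix(line, ":") {
			continue
		}

		// SSE data line format: `data: <json>`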
		data, ok := strings.CutPrefix(line, "data: ")
		if !ok {
			continue
		}

		var event WatchEvent
		if err := json.Unmarshal([]byte(data), &event); err != nil {
			slog.Warn("parse event", "err", err)
			continue
		}

		fmt.Printf("[%s] source=%s id=%s\n", event.Time, event.Source, event.ID)
		fmt.Printf("  data: %s\n", event.Data)
	}

	return scanner.Err()
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Minute)
	defer cancel()

	err := watchEvents(ctx, "http://192.168.1.10:19704/v1/events/watch", WatchFilters{
		TracerName: "oom|hungtask|softlockup",
	})
	if err != nil {
		slog.Error("watch events", "err", err)
		os.Exit(1)
	}
}

6.1 Using the official pkg/types package (recommended)

If your project lives in the same Go module as HUATUO, you can reference the official types directly:

import pkgtypes "huatuo-bamai/pkg/types"

var event pkgtypes.WatchEvent
if err := json.Unmarshal([]byte(data), &event); err != nil { ... }

// event.Data has to be decoded a second time to reach the typed fields.
// Re-marshalling first works whether Data is a json.RawMessage or an
// already-decoded value; for a json.RawMessage it yields the same JSON.
dataBytes, err := json.Marshal(event.Data)
if err != nil {
    slog.Warn("marshal event data", "err", err)
    return
}
var payload pkgtypes.WatchEventData
if err := json.Unmarshal(dataBytes, &payload); err != nil {
    slog.Warn("unmarshal event data", "err", err)
    return
}
fmt.Println("tracer:", payload.TracerName)
fmt.Println("observed_timestamp:", payload.ObservedTimestamp)

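6.2 Reconnection

In production, network jitter or a service restart will drop the connection, so adding reconnection with exponential backoff is recommended:

func watchWithRetry(ctx context.Context, endpoint string, filters WatchFilters) {
	backoff := time.Second
	for {
		err := watchEvents(ctx, endpoint, filters)
		if ctx.Err() != nil {
			return
		}
		// err is nil when the server closed the stream cleanly. Back off in
		// that case too; otherwise the loop would reconnect in a tight spin.
		slog.Warn("disconnected, retrying", "err", err, "backoff", backoff)
		// time.NewTimer + Stop releases the timer as soon as the context is cancelled.
		timer := time.NewTimer(backoff)
		select {
		case <-ctx.Done():
			timer.Stop()
			return
		case <-timer.C:
		}
		if backoff < 30*time.Second {
			backoff *= 2
		}
	}
}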

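⚙️ How it works

System architecture

The HUATUO Agent runs on every node and hooks key kernel paths through eBPF, kprobes, tracepoints, and similar mechanisms. Collected kernel anomaly events are filtered, wrapped, and pushed over long-lived SSE connections to multiple concurrent subscribers.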

graph TB
    subgraph kernel["Linux 内核"]
        K1[OOM Killer]
        K2[Hung Task 检测]
        K3[Soft Lockup 检测]
        K4[RAS 硬件错误]
        K5[网络子系统]
        K6[AutoTracing]
    end

    subgraph huatuo["HUATUO Agent（节点级）"]
        T["Tracer 采集层\neBPF / Kprobe / Tracepoint"]
        F["过滤器\nhostname / tracer / namespace / region"]
        CE["CloudEvents 1.0 封装\nid / source / time / data"]
        EW["EventsWatch 分发层\nSSE 长连接管理"]
    end

    subgraph clients["订阅客户端"]
        C1[故障自愈系统]
        C2[可观测性平台]
        C3[AIOps 系统]
        C4[安全审计系统]
    end

    kernel --> T
    T --> F
    F --> CE
    CE --> EW
    EW -->|SSE 推送| C1
    EW -->|SSE 推送| C2
    EW -->|SSE 推送| C3
    EW -->|SSE 推送| C4

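Event collection and push

After the client sends the POST request, the connection stays open. Each time the kernel raises an anomaly event, the HUATUO Agent filters and wraps it and immediately writes it to every matching SSE stream; the client never polls.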

sequenceDiagram
    participant C as 客户端
    participant EW as EventsWatch
    participant T as Tracer 采集层
    participant K as Linux 内核

    C->>EW: POST /v1/events/watch {"filters": {...}}
    EW-->>C: 200 OK (Content-Type: text/event-stream)

    loop SSE 长连接持续推送
        K->>T: 内核事件触发（oom / hungtask / softlockup ...）
        T->>EW: 上报原始事件
        EW->>EW: 过滤器匹配
        alt 匹配成功
            EW-->>C: data: {CloudEvents JSON}\n\n
        else 不匹配
            note over EW: 丢弃，不推送
        end
        EW-->>C: : ping（心跳保活，间隔 KeepAliveInterval 秒）
    end

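Event processing pipeline

From the kernel raising an event to the client receiving it, the event passes through three stages (collection, filtering, wrapping), with an end-to-end latency under 1 second.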

flowchart LR
    A([内核异常触发]) --> B["Tracer 采集\neBPF / Kprobe"]
    B --> C{过滤器匹配?}
    C -- 否 --> D([丢弃])
    C -- 是 --> E["封装 CloudEvents 1.0\nid / source / time / data"]
    E --> F[写入 SSE 流]
    F --> G([推送至订阅客户端])

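6.3 - Performance Profiling

Flame graph formats

In performance profiling, collapsed and flamegraph are the two most common flame graph formats. They correspond to two layers: the raw data and the visual view.

Collapsed format

Syntax and format

The collapsed format (also called folded stacks) was defined by Brendan Gregg and is the raw text input for flame graphs. Each line is one unique call stack together with its sample count.

Basic rule:

frame1;frame2;frame3;...;frameN COUNT

Part	Description
`frame1`	Bottom of the stack (entry/root frame), such as `main` or `start_thread`
`;`	Frame separator (semicolon)
`frameN`	Top of the stack (the frame executing when the sample was taken)
`COUNT`	Sample count (integer), separated from the stack by a space

Key points:

One independent call stack per line; samples with the same stack path are merged and counted together
Frames are ordered left to right from root to leaf (the direction of the call chain)
Blank lines and lines starting with # are usually treated as comments and ignored by parsers
The meaning of COUNT depends on the profiling mode and tool options: the number of samples for CPU profiling, allocated bytes for allocation profiling, and contention time for lock profiling (the time unit is tool-dependent; async-profiler reports nanoseconds)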
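
Because the format is line-oriented, a parser is short. The sketch below parses one collapsed line; `parseCollapsedLine` is a hypothetical helper. It splits on the last space rather than the first, since frame names (for example C++ templates) may themselves contain spaces.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseCollapsedLine splits "frame1;frame2;...;frameN COUNT" into frames
// (root first) and the count. ok is false for blank lines, comment lines,
// and lines without a valid trailing integer.
func parseCollapsedLine(line string) (frames []string, count int64, ok bool) {
	line = strings.TrimSpace(line)
	if line == "" || strings.HasPrefix(line, "#") {
		return nil, 0, false
	}
	i := strings.LastIndexByte(line, ' ')
	if i < 0 {
		return nil, 0, false
	}
	n, err := strconv.ParseInt(line[i+1:], 10, 64)
	if err != nil {
		return nil, 0, false
	}
	return strings.Split(line[:i], ";"), n, true
}

func main() {
	frames, count, ok := parseCollapsedLine("main;handleRequest;parseJSON 12")
	fmt.Println(frames, count, ok)
}
```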

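Extensions:

Some profilers (such as async-profiler) add frame type annotations on top of the standard format to mark the runtime category of a frame:

frameName_[type] COUNT

Annotation	Meaning	Description
`_[j]`	JIT compiled Java	Java method compiled by the JIT
`_[i]`	Inlined Java	Java method inlined into its JIT-compiled caller
`_[k]`	Kernel	Kernel-mode frame
`_[n]`	Native C/C++	Native C/C++ frame
`_[t]`	Thread	Thread frame

In addition, some tools support a weighted collapsed format, used for differential flame graphs:

frame1;frame2;frameN WEIGHT

Here WEIGHT is a floating-point number that gives the weight of the stack rather than a plain count.

Examples

CPU profiling example (data taken from the async-profiler documentation):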

FileConverter.main;FileConverter.convertFile;FileConverter.saveResult 21
FileConverter.main;FileConverter.convertFile;FileConverter.saveResult;java/io/DataOutputStream.writeInt 1
FileConverter.main;FileConverter.convertFile;FileConverter.saveResult;java/io/DataOutputStream.writeInt;java/io/ByteArrayOutputStream.write 5
FileConverter.main;FileConverter.convertFile;FileConverter.saveResult;java/io/DataOutputStream.writeUTF;java/io/DataOutputStream.writeUTF 12
FileConverter.main;FileConverter.convertFile;FileConverter.saveResult;java/io/DataOutputStream.writeUTF;java/io/DataOutputStream.writeUTF;java/lang/String.length 3
FileConverter.main;FileConverter.convertFile;FileConverter.saveResult;java/io/DataOutputStream.writeUTF;java/io/DataOutputStream.writeUTF;java/io/DataOutputStream.write 6
start_thread;thread_native_entry;Thread::call_run;VMThread::run;VMThread::inner_execute;VMThread::evaluate_operation;VM_Operation::evaluate;VM_GenCollectForAllocation::doit;GenCollectedHeap::satisfy_failed_allocation;GenCollectedHeap::do_collection;GenCollectedHeap::collect_generation;DefNewGeneration::collect;DefNewGeneration::FastEvacuateFollowersClosure::do_void 12

Example with frame type annotations (async-profiler extension):

Main.run_[j];Service.process_[j];DAO.query_[j];mysql_real_query_[n] 45
Main.run_[j];Service.process_[j];DAO.query_[j];recv_[k] 18

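Main uses

Use	Description
Flame graph generation	Standard input format for visualization tools such as `flamegraph.pl` and `inferno`
Differential analysis	Compare two collapsed files to produce a red/blue differential flame graph and locate performance regressions
Programmatic processing	Plain text, so custom aggregation and filtering with `awk`, `sed`, Python, and similar tools is easy
Cross-tool interoperability	A common format defined by Brendan Gregg and supported by almost every flame graph toolchain
Long-term storage	Small text files that are well suited to archiving and version comparison
CI/CD integration	Pipelines can collect, diff, and check performance regression thresholds automatically

Example command:

# Using async-profiler as an example
asprof -d 30 -f profile.collapsed -o collapsed <PID>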

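Flamegraph format

Syntax and format

The flamegraph format is a self-contained HTML file with an embedded SVG visualization and JavaScript interaction logic. It opens directly in a browser.

Structure:

flamegraph.html
├── HTML skeleton + CSS styles
├── SVG flame graph body
│   ├── <g> one rectangle per frame
│   │   ├── <title> frame name + sample count/percentage
│   │   └── <rect> position, width, height, color
│   └── ...
├── JavaScript interaction logic
│   ├── click to zoom (zoom into subtree)
│   ├── search and highlight
│   ├── hover tooltip
│   └── reset zoom
└── metadata (title, total samples, and so on)

Visual encoding:

Dimension	Meaning
X axis	Frames are sorted alphabetically (it is not a timeline); width is proportional to sample count
Y axis	Call stack depth, with the root frame at the bottom and leaf frames at the top
Frame width	Share of samples in which the frame appears on the stack; wider means more resource consumption
Frame color	Identifies the frame type (see the table below)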
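
The width rule can be made concrete: the width of a box equals the samples of every stack passing through that position divided by the total samples. The sketch below computes this from collapsed lines, keying each box by its path from the root; `frameWidths` is a hypothetical helper and the input lines are made-up illustration data.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// frameWidths returns, for every stack prefix ("a", "a;b", ...), the
// fraction of all samples whose stack starts with that prefix. This is
// the relative width of the corresponding box in a flame graph.
func frameWidths(lines []string) map[string]float64 {
	totals := map[string]int64{}
	var all int64
	for _, line := range lines {
		i := strings.LastIndexByte(line, ' ')
		if i < 0 {
			continue // no count column
		}
		n, err := strconv.ParseInt(line[i+1:], 10, 64)
		if err != nil {
			continue // malformed count
		}
		all += n
		frames := strings.Split(line[:i], ";")
		for depth := range frames {
			totals[strings.Join(frames[:depth+1], ";")] += n
		}
	}
	widths := make(map[string]float64, len(totals))
	if all == 0 {
		return widths
	}
	for path, n := range totals {
		widths[path] = float64(n) / float64(all)
	}
	return widths
}

func main() {
	w := frameWidths([]string{"main;parse 3", "main;render 1"})
	fmt.Printf("main=%.2f parse=%.2f render=%.2f\n",
		w["main"], w["main;parse"], w["main;render"])
}
```

Keying by path rather than by frame name matters: the same function reached through two different call chains is drawn as two separate boxes.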

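Frame colors (using async-profiler as the reference):

Note: flame graph color schemes are not standardized across tools. Brendan Gregg's original flamegraph.pl uses random warm colors that carry no meaning; perf/bpftrace output is usually colored by DSO or randomly; async-profiler colors frames by type. The async-profiler scheme is:

Color	Frame type	Description
🟢 Green	Java (JIT compiled)	Java method compiled by the JIT; interpreted methods use a lighter green
🔵 Aqua	Java (inlined)	Java method inlined into its caller
🟡 Yellow	C++	C++ code, such as the JVM itself
🔴 Red	Native	Native user-space code
🟠 Orange	Kernel	Kernel-mode code

Extended features (using async-profiler as the reference):

Icicle graph: shows the call chain top-down (root at the top), which matches top-to-bottom reading habits; switch with the --reverse option or the Reverse button in the browser
Multi-thread view: call stacks of different threads are shown side by side at the root level
Search highlight: after a keyword is entered, matching frames are highlighted in purple and non-matching frames are dimmed
Sample tooltip: hovering shows the frame name, sample count, and share of total samples
Cutoff frames: frames labeled [...] mark a truncated stack (for example, because of a stack depth limit)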

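Examples

Example command:

# Using async-profiler as an example
asprof -d 30 -f flamegraph.html <PID>

Interactions:

Click a frame: zoom so that the frame spans the full width and only its subtree is shown
Search box: enter a keyword to highlight matching frames
Hover: show the frame name, sample count, and percentage
Reset Zoom: return to the full view

Main uses

Use	Description
Hotspot location	Spot the widest boxes at a glance and find the code paths that consume the most CPU or memory
Root cause analysis	Trace upward from leaf frames to understand the calling context of the resource consumption
Team collaboration	The HTML file can be shared as is; no extra tools are needed beyond a browser
Optimization verification	Generate a flame graph before and after a change and compare frame widths to verify the effect
Accessible to non-specialists	The visual form is easier for people outside performance engineering to understand, which helps cross-team communication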

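Comparison of the two formats

Dimension	Collapsed	Flamegraph
Format type	Plain text	HTML + SVG
Human readability	Medium (requires understanding the stack syntax)	High (visual, intuitive)
Machine readability	High (easy to parse and diff)	Low (HTML/SVG must be parsed)
Interactivity	None	Zoom, search, hover tooltips
File size	Very small (KB range)	Larger (hundreds of KB to MB)
Toolchain dependency	None (plain text)	Browser
Differential analysis	Native (diff two files)	Must be converted to collapsed first
Typical use	Programmatic processing, CI comparison, archiving	Manual analysis, team sharing, presentations

Typical workflow:

collect ──► collapsed ──► flamegraph.html (manual analysis)
                  │
                  ├──► differential flame graph (regression detection)
                  ├──► custom aggregation scripts
                  └──► archive storage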

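6.4 - Network Packet Drops

🎯 About HUATUO

HUATUO (华佗) is an operating-system deep-observability project open-sourced by DiDi and incubated under the CCF (China Computer Federation). It is widely used in AI computing, AI sandboxes, cloud-native general-purpose computing, cloud services, and infrastructure services.

📖 Overview

dropwatch is HUATUO's tool for observing packet drops in the kernel network stack. It attaches to the kernel tracepoint tracepoint/skb/kfree_skb to collect drop events in real time and outputs the full drop context: protocol type, IP 5-tuple, process name, PID, network device, MAC addresses, and the complete kernel call stack that led to the drop.

dropwatch supports kernel-side filtering with tcpdump-style expressions. The built-in pure-Go pcap compiler internal/pcapfilter compiles the expression to eBPF bytecode at load time, so filtering runs entirely in the kernel and only matching packets are reported to user space, which reduces the performance impact on the host.

dropwatch also supports device allowlist/denylist filtering and a global report rate limit, and it integrates with huatuo-bamai to store drop events in Elasticsearch for long-term analysis.

🎯 Scenarios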

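1. Packet drop diagnosis in Kubernetes cloud-native networks

When containers migrate, Pods restart frequently, or Service ports conflict, dropwatch captures kfree_skb events in real time and attributes them to specific containers so that the root cause of the drops can be found quickly. Combined with --filter "tcp and port <service-port>" to isolate a particular service's traffic, it brings mean time to locate a fault down from hours to minutes.

2. Network performance glitch analysis

For intermittent latency spikes, throughput dips, and similar problems, dropwatch collects drop events and uses the kernel call stack to identify the exact kernel function where the drop happened (such as tcp_v4_rcv or ip_output), which helps distinguish firewall drops, routing failures, and buffer overflows.

3. Network isolation troubleshooting in multi-tenant environments

In container environments that share network namespaces or veth devices, use --device to select specific network devices and --filter to select specific protocols. This captures the drop events of the target container precisely, without interference from other tenants' traffic.

4. Integration with observability platforms

With --output-storage, drop events are sent to huatuo-bamai and stored in Elasticsearch, where they can be correlated with metrics and logs. Overlaying drop events on Grafana timelines, aligned with application error-rate and latency curves, ties kernel packet drops to application anomalies precisely.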

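🚀 Usage

1. Filter expressions

Filter expressions use tcpdump syntax. The built-in pure-Go pcap compiler internal/pcapfilter compiles them to eBPF bytecode at load time, so filtering runs entirely in the kernel, the impact on the host stays low, and only matching packets are reported to user space.

1.1 Supported expressions

internal/pcapfilter supports a subset of standard tcpdump syntax. The following primitives can be used reliably:

Protocols

ip   ip6   tcp   udp   icmp   icmp6   igmp   pim   esp   ah   vrrp   arp   rarp
ip proto tcp      ip6 proto udp        (protocol names only; numeric protocol numbers are not supported)

Host addresses

host 10.0.0.1
src host 10.0.0.1
dst host 10.0.0.1

Ports

port 80
src port 443
dst port 8080

Networks (CIDR)

net 10.0.0.0/8
src net 192.168.1.0/24
dst net 172.16.0.0/12

Multicast and Ethernet addresses

ip multicast    ip6 multicast    multicast    ether multicast
ether host 00:11:22:33:44:55

Boolean operators and grouping

tcp and port 80
tcp or udp
not arp
tcp and (port 80 or port 443)
ip and src net 192.168.1.0/24 and tcp dst port 3306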

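1.2 Unsupported expressions

The following expressions are not supported. Using them causes a compilation failure or incorrect matches:

Expression	Reason
`tcp[tcpflags] & tcp-syn != 0`, `ip[8]`, `tcp[0:4]`	Byte-offset expressions (`proto[offset:size]`) are not implemented
`ip proto 6`, `ip6 proto 17`	Numeric protocol numbers are not supported; use a protocol name (such as `ip proto tcp`)
`ether proto 0x0800`	Hexadecimal EtherTypes are not supported; use a name (such as `ether proto ip`)
`sctp`	Keyword not recognized
`portrange 80-90`, `tcp portrange 1-100`	Port ranges are not supported
`less N`, `greater N`	Filtering by packet length is not supported
`ip broadcast`, `ether broadcast`	Broadcast matching is not supported
`vlan`, `mpls`, `pppoes`	Tunnel/encapsulation keywords are not supported
`gateway`	Not supported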
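
Since an unsupported expression may silently produce wrong matches instead of failing, a wrapper script may want to reject such expressions before launching dropwatch. The following is a rough, hypothetical pre-check derived only from the table above; it is not part of dropwatch or internal/pcapfilter, it works on whitespace-separated tokens rather than a real grammar, and `checkFilter` is an invented name.

```go
package main

import (
	"fmt"
	"strings"
)

// unsupportedKeywords lists tokens from the "unsupported" table.
var unsupportedKeywords = map[string]string{
	"sctp":      "keyword not recognized",
	"portrange": "port ranges are not supported",
	"less":      "filtering by packet length is not supported",
	"greater":   "filtering by packet length is not supported",
	"broadcast": "broadcast matching is not supported",
	"vlan":      "tunnel/encapsulation keywords are not supported",
	"mpls":      "tunnel/encapsulation keywords are not supported",
	"pppoes":    "tunnel/encapsulation keywords are not supported",
	"gateway":   "not supported",
}

// checkFilter returns "" if expr uses none of the known-unsupported
// constructs, otherwise a short explanation of the first problem found.
func checkFilter(expr string) string {
	if strings.ContainsAny(expr, "[]") {
		return "byte-offset expressions (proto[offset:size]) are not implemented"
	}
	// Treat parentheses as separators so "(port" and "443)" tokenize cleanly.
	tokens := strings.Fields(strings.NewReplacer("(", " ", ")", " ").Replace(expr))
	for i, tok := range tokens {
		if reason, bad := unsupportedKeywords[tok]; bad {
			return tok + ": " + reason
		}
		if tok == "proto" && i+1 < len(tokens) {
			next := tokens[i+1]
			if next[0] >= '0' && next[0] <= '9' {
				return "proto " + next + ": numeric values are not supported, use a name"
			}
		}
	}
	return ""
}

func main() {
	for _, e := range []string{"tcp and (port 80 or port 443)", "ip proto 6", "tcp portrange 1-100"} {
		fmt.Printf("%q -> %q\n", e, checkFilter(e))
	}
}
```

Passing this check does not guarantee that an expression compiles; it only catches the constructs the table names.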

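1.3 Recommended patterns

# Watch all TCP drops (the default; reliable in both L2 and L3 contexts)
--filter "tcp"

# TCP and UDP
--filter "tcp or udp"

# A specific destination host (applies to both TCP and UDP)
--filter "dst host 10.0.0.1"

# A specific port
--filter "tcp and port 443"

# Exclude a noisy host
--filter "tcp and not host 169.254.169.254"

# A specific subnet plus a specific port
--filter "src net 192.168.1.0/24 and tcp dst port 3306"

# Watch non-TCP drops (UDP and ICMP only; do not use "not tcp", which also captures unknown L3 events)
--filter "udp or icmp"

# Watch ARP drops only (valid in L2 context only; never matches at L3)
--filter "arp"

--filter "ip" and --filter "ip6" now match the corresponding IP family correctly (by EtherType at L2, by the version nibble at L3). If you only care about a particular transport protocol or host, the more precise tcp, udp, host, or ip proto <name> is still recommended.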

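2. Running dropwatch

dropwatch [flags]

Flag	Default	Description
`--bpf-path <path>`	required	Path to the `dropwatch` eBPF object file
`--filter <expr>`	(none)	tcpdump-style filter expression
`--device <names>`	(none)	Device allowlist: collect drops only on these devices; separate multiple devices with commas (such as `eth0,eth1`)
`--device-excluded <names>`	(none)	Device denylist: exclude drops on these devices; mutually exclusive with `--device`
`--duration <n>`	0	Exit after N seconds (0 means run until Ctrl-C)
`--output <json\|text>`	`text`	Output format; ignored when `--output-storage` is set
`--output-storage <path>`	(none)	Send events to huatuo-bamai over a Unix socket
`--task-id <id>`	(none)	Task ID associated with this session; usually used together with `--output-storage`
`--max-events-per-second <n>`	0	Global report rate limit, 0 means unlimited; applied after `--device` / `--filter`

--filter and device filtering are orthogonal; when both are given, both apply (AND semantics). Without --device or --device-excluded, all devices are collected. --device and --device-excluded cannot be used together. In allowlist mode, SKBs without a net_device are dropped from the output; in denylist mode, SKBs without a net_device pass through.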
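
The allowlist/denylist rules, including the different treatment of SKBs that have no net_device, fit in one small decision function. This is a user-space sketch of the semantics described above, not dropwatch's kernel-side implementation; `acceptDevice` is a hypothetical name and an empty device string stands for "no net_device".

```go
package main

import "fmt"

// contains reports whether list includes s.
func contains(list []string, s string) bool {
	for _, v := range list {
		if v == s {
			return true
		}
	}
	return false
}

// acceptDevice decides whether a drop event on device dev is reported.
// dev == "" models an SKB without a net_device. allow and deny are
// mutually exclusive; if both are empty, every event is accepted.
func acceptDevice(dev string, allow, deny []string) bool {
	switch {
	case len(allow) > 0:
		// Allowlist mode: unknown devices and device-less SKBs are filtered out.
		return dev != "" && contains(allow, dev)
	case len(deny) > 0:
		// Denylist mode: only listed devices are filtered out.
		return dev == "" || !contains(deny, dev)
	default:
		return true
	}
}

func main() {
	fmt.Println(acceptDevice("eth0", []string{"eth0", "eth1"}, nil)) // allowlisted device
	fmt.Println(acceptDevice("", []string{"eth0"}, nil))             // no device, allowlist mode
	fmt.Println(acceptDevice("", nil, []string{"lo"}))               // no device, denylist mode
}
```

The asymmetry is worth remembering when comparing event counts: switching from `--device eth0` to `--device-excluded lo` also starts reporting device-less drops.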

Common commands

# Text output; watch TCP drops on all devices
sudo dropwatch --bpf-path bpf/dropwatch.o --filter "tcp"

# Watch drops on eth0 only
sudo dropwatch --bpf-path bpf/dropwatch.o --device eth0 --output json

# Exclude loopback
sudo dropwatch --bpf-path bpf/dropwatch.o --device-excluded lo --output json

# Combine device and protocol filtering
sudo dropwatch --bpf-path bpf/dropwatch.o --device eth0 --filter "tcp and port 443" --output json

# Capture for 60 seconds, then exit
sudo dropwatch --bpf-path bpf/dropwatch.o --filter "tcp and port 443" --duration 60 --output json

# Forward events to a running huatuo-bamai instance
sudo dropwatch --bpf-path bpf/dropwatch.o --filter "tcp" --output-storage /var/run/huatuo/events.sock

# Use jq to show RST packets only
sudo dropwatch --bpf-path bpf/dropwatch.o --output json 2>/dev/null | jq 'select(.layers.tcp.flags == "RST")'

# Collect 10 seconds of JSON output and exclude events whose call stack contains ip_finish_output
sudo dropwatch --output json --duration 10 --bpf-path bpf/dropwatch.o | jq -c 'select(.stack | test("ip_finish_output") | not)'

# Collect 10 seconds of JSON output and print every field except stack
sudo dropwatch --output json --duration 10 --bpf-path bpf/dropwatch.o | jq -c 'del(.stack)'

jq -c prints each matching event as single-line JSON, which is convenient for saving as NDJSON or piping onward. test("ip_finish_output") checks whether stack matches that regular expression, and not inverts the result, so the command above excludes call stacks containing ip_finish_output; removing | not keeps only the events that contain it. del(.stack) removes the stack field from jq's output only, which suits viewing just the time, device, process, packet_* metadata, and layers protocol fields. To filter by call stack on the kernel side, configure EventTracing.IssuesList in huatuo-bamai (see section 4).

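3. Event data structure

Each packet-drop event is represented as a JSON object (types.DropWatchTracing).

Field	Type	Description
`observed_timestamp`	string	UTC timestamp at which the event was collected (RFC3339Nano)
`type`	string	Reserved field for the event type; currently an empty string
`drop_reason`	string	Reserved field for the drop reason; currently an empty string
`source`	string	Event source; when present, it is either `events` or `tools` (omitempty)
`comm`	string	Name of the process at the time of the drop
`pid`	uint64	Process TGID
`container_id`	string	Container ID (resolved and filled in by huatuo-bamai, omitempty)
`memory_cgroup_css_addr`	string	Memory cgroup CSS address, used to resolve the owning container
`net_namespace_cookie`	uint64	Network namespace cookie, used to resolve the owning container
`net_namespace_inode`	uint32	Network namespace inode, used to resolve the owning container
`netdev_name`	string	Network device name (e.g. `eth0`)
`netdev_ifindex`	uint32	Network interface index
`netdev_queue_mapping`	uint32	TX queue mapping
`netdev_linkstatus`	[]string	Link flags of the network device
`packet_skb_addr`	string	SKB address (hexadecimal, omitempty)
`packet_eth_proto`	string	Raw EtherType (hexadecimal, e.g. `0x0800`)
`packet_len`	uint32	Packet length in bytes
`layers`	object	Per-layer protocol decoding result; missing layers are omitted
`stack`	string	Kernel call stack (newline-separated)

layers describes the protocol stack with a fixed set of fields and no longer relies on a separate protocol enum:

Field	Description
`layers.label`	Protocol combination label, e.g. `IPv4/TCP`, `IPv6/UDP`, `ARP`, `unknown`
`layers.ether`	Layer-2 fields: `src`, `dst`, `type`, `len` (present only for 802.3 frames)
`layers.ipv4`	IPv4 fields: `version`, `ihl`, `tos`, `len`, `id`, `flags`, `frag_offset`, `ttl`, `protocol`, `checksum`, `src`, `dst`
`layers.ipv6`	IPv6 fields: `version`, `traffic_class`, `flow_label`, `len`, `next_header`, `hop_limit`, `src`, `dst`
`layers.tcp`	TCP fields: `sport`, `dport`, `seq`, `ack`, `data_offset`, `flags`, `window`, `checksum`, `urgent`, `sk_state`
`layers.udp`	UDP fields: `sport`, `dport`, `len`, `checksum`
`layers.icmp`	ICMP/ICMPv6 fields: `type`, `code`, `checksum`, `id`, `seq`
`layers.arp`	ARP fields: `addr_type`, `protocol`, `hw_address_size`, `prot_address_size`, `operation`, `sender_mac`, `sender_ip`, `target_mac`, `target_ip`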
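A consumer of the JSON output can decode events into a typed struct. The Go sketch below mirrors only a subset of the fields from the tables above. It is a hypothetical stand-in, not the project's types.DropWatchTracing. The JSON types of most layers sub-fields are not specified in this document, so the sketch limits itself to `layers.label` and `layers.tcp.flags`, both of which appear as strings in the examples here. The sample event is made up.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// tcpLayer holds the only TCP sub-field whose JSON type is evident from this
// document (flags is compared against the string "RST" in the jq example).
type tcpLayer struct {
	Flags string `json:"flags"`
}

// layers mirrors a subset of the "layers" object; absent layers stay nil.
type layers struct {
	Label string    `json:"label"`
	TCP   *tcpLayer `json:"tcp,omitempty"`
}

// dropEvent is a hypothetical mirror of a subset of the event fields.
type dropEvent struct {
	ObservedTimestamp time.Time `json:"observed_timestamp"` // RFC3339Nano parses into time.Time
	Comm              string    `json:"comm"`
	Pid               uint64    `json:"pid"`
	NetdevName        string    `json:"netdev_name"`
	PacketLen         uint32    `json:"packet_len"`
	Layers            layers    `json:"layers"`
	Stack             string    `json:"stack"`
}

// decodeEvent parses one JSON event; unknown fields are ignored.
func decodeEvent(raw []byte) (dropEvent, error) {
	var ev dropEvent
	err := json.Unmarshal(raw, &ev)
	return ev, err
}

// isTCPReset reports whether the event carries a TCP layer flagged "RST".
func isTCPReset(ev dropEvent) bool {
	return ev.Layers.TCP != nil && ev.Layers.TCP.Flags == "RST"
}

func main() {
	// Made-up sample event, for illustration only.
	raw := []byte(`{"observed_timestamp":"2025-01-02T03:04:05.000000006Z","comm":"curl","pid":1234,"netdev_name":"eth0","packet_len":60,"layers":{"label":"IPv4/TCP","tcp":{"flags":"RST"}},"stack":"kfree_skb\ntcp_v4_rcv"}`)
	ev, err := decodeEvent(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(ev.Comm, ev.NetdevName, ev.Layers.Label, isTCPReset(ev)) // curl eth0 IPv4/TCP true
}
```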

4. Integration with huatuo-bamai

huatuo-bamai launches dropwatch as a child process. Events are sent through --output-storage into the built-in processing pipeline and are ultimately stored in Elasticsearch. Typical arguments:

dropwatch \
  --bpf-path <CoreBpfDir>/dropwatch.o \
  --output-storage /var/run/huatuo/events.sock \
  --filter "tcp"

4.1 Configuration reference (`huatuo-bamai.conf`)

[EventTracing]
    # Filter for known noisy call stacks. dropwatch discards events whose stack matches these regular expressions.
    # The default example covers neighbor-table cleanup and SKB release on bnxt TX completion.
    IssuesList = [["neigh_invalidate", "neigh_invalidate"], ["bnxt_tx_int", "bnxt_tx_int"]]

[EventTracing.Dropwatch]
    # tcpdump filter expression, passed to dropwatch --filter.
    # Default: "tcp"
    Filter = "tcp"

    # Passed to dropwatch --max-events-per-second.
    # Default: 100
    MaxEventsPerSecond = 100

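4.2 Noise filtering

The following three classes of kfree_skb events are filtered by default because they are not real data-plane packet drops:

Pattern	Stack frame prefix	Reason
TCP `CLOSE_WAIT` + `skb_rbtree_purge`	`skb_rbtree_purge/`	Normal socket close path: when closing a socket in the `CLOSE_WAIT` state, the kernel frees in-flight SKBs.
ARP/neighbor entry expiry	`neigh_invalidate/`	Cleanup of expired neighbor-table entries; it does not affect any active flow. Remove the corresponding rule from `EventTracing.IssuesList` to turn this filter off.
bnxt NIC TX completion	`bnxt_tx_int/` or `__bnxt_tx_int/`	The Broadcom bnxt NIC driver calls `kfree_skb` to free the SKB after a DMA transmit completes; this is normal behavior, not a drop.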
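The idea behind IssuesList-style filtering, matching a call stack against a list of regular expressions, can be sketched in Go as follows. This is an illustration rather than the project's code: reading each two-element entry as [name, pattern] is an assumption based on the default value shown in 4.1, and the function names are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// issueRule pairs a rule name with its compiled stack pattern.
type issueRule struct {
	name string
	re   *regexp.Regexp
}

// compileIssues compiles entries of the assumed form [name, pattern].
func compileIssues(list [][]string) ([]issueRule, error) {
	rules := make([]issueRule, 0, len(list))
	for _, item := range list {
		if len(item) != 2 {
			return nil, fmt.Errorf("issue entry %v: want [name, pattern]", item)
		}
		re, err := regexp.Compile(item[1])
		if err != nil {
			return nil, fmt.Errorf("issue %q: %w", item[0], err)
		}
		rules = append(rules, issueRule{name: item[0], re: re})
	}
	return rules, nil
}

// matchIssue returns the first rule whose pattern matches the call stack.
func matchIssue(rules []issueRule, stack string) (string, bool) {
	for _, r := range rules {
		if r.re.MatchString(stack) {
			return r.name, true
		}
	}
	return "", false
}

func main() {
	rules, err := compileIssues([][]string{
		{"neigh_invalidate", "neigh_invalidate"},
		{"bnxt_tx_int", "bnxt_tx_int"},
	})
	if err != nil {
		panic(err)
	}
	name, noisy := matchIssue(rules, "kfree_skb\n__bnxt_tx_int\nbnxt_poll")
	fmt.Println(name, noisy) // bnxt_tx_int true
}
```

Because the patterns are unanchored, `bnxt_tx_int` also matches the `__bnxt_tx_int` frame listed in the table.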

🌟 Closing

🌟 Stars are welcome: https://github.com/ccfos/huatuo

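7 - Developer Guide

7.1 - Collection Modes

To give users full, in-depth insight into the running state of the system, HUATUO provides three kinds of data collection: metrics, event, and autotracing. Users can implement their own observability data collection to fit their scenarios and needs.

Modes

Mode	Type	Trigger	Data storage	Use case
Metrics	Metric data	Pull-based collection	Prometheus	System performance metrics
Event	Abnormal events	Triggered by kernel events	ES + local storage, Prometheus (optional)	Always running; triggered by events to capture kernel runtime context
Autotracing	System anomalies	Triggered by system anomalies	ES + local storage, Prometheus (optional)	Triggered by system anomalies to capture data such as flame graphs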

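Metrics

Type: metric collection.
Function: collects metrics from each kernel subsystem.
Characteristics:
- Collected through procfs or eBPF.
- Exposed in Prometheus format and ultimately integrated into Prometheus/Grafana.
- Mainly covers basic system metrics such as CPU utilization, memory utilization, and networking.
- Suited to monitoring the running state of the system, with support for real-time analysis and long-term trend observation.
Integrated:
- CPU sys, usr, util, load, nr_running …
- Memory vmstat, memory_stat, directreclaim, asyncreclaim …
- IO d2c, q2c, freeze, flush …
- Networking arp, socket mem, qdisc, netstat, netdev, socketstat …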

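Events

Type: Linux kernel event collection.
Function: runs continuously and captures kernel runtime context when an event fires and the preset threshold is reached.
Characteristics:
- Runs continuously, is triggered by abnormal events, and supports configurable thresholds.
- Data is stored in real time to Elasticsearch and to local files on the host.
- Suited to always-on monitoring and real-time analysis, capturing observability data for a wider range of abnormal system behavior.
Integrated:
- Softirq anomalies: softirq
- Abnormal memory allocation: oom
- Soft lockups: softlockup
- D-state processes: hungtask
- Memory reclaim: memreclaim
- Abnormal packet drops: dropwatch
- Inbound network latency: net_rx_latency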

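Autotracing

Type: system anomaly tracing.
Function: automatically tracks abnormal system states and triggers tools to capture on-site information when an anomaly occurs.
Characteristics:
- Triggered and captured automatically when the system becomes abnormal.
- Data is stored in real time to Elasticsearch and to local files on the host.
- Suited to cases where capturing on-site data carries a high performance cost, and to sudden metric spikes.
Integrated:
- CPU anomaly tracing
- D-state process tracing
- Contention inside and outside containers
- Memory burst allocation
- Disk anomaly tracing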

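7.2 - Custom Metrics

Overview

The Metrics type collects metric data such as system performance. It can be exposed in Prometheus format and serves as the data provider for /metrics (curl localhost:<port>/metrics).

Type: metric collection
Function: collects performance metrics from each subsystem
Characteristics:
- Metrics mainly cover system performance data such as CPU utilization, memory usage, and network statistics. They suit system performance monitoring and support real-time analysis and long-term trend observation.
- A metric can come from regular procfs/sysfs collection or be generated by a tracing type (autotracing, event).
- Output is in Prometheus format and integrates seamlessly with the Prometheus observability ecosystem.
Integrated:
- CPU (sys, usr, util, load, nr_running…)
- Memory (vmstat, memory_stat, directreclaim, asyncreclaim…)
- IO (d2c, q2c, freeze, flush…)
- Network (arp, socket mem, qdisc, netstat, netdev, socketstat…)

How to add a metric

To add a metric to the system, implement the Collector interface and register it.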

type Collector interface {
    // Get new metrics and expose them via prometheus registry.
    Update() ([]*Data, error)
}

1. Create the struct

In the core/metrics directory, create a struct that implements the Collector interface:

type exampleMetric struct{}

2. Register the callback

func init() {
    tracing.RegisterEventTracing("example", newExample)
}

func newExample() (*tracing.EventTracingAttr, error) {
    return &tracing.EventTracingAttr{
        TracingData: &exampleMetric{},
        Flag:        tracing.FlagMetric, // mark as Metric type
    }, nil
}

3. Implement the `Update` method

func (c *exampleMetric) Update() ([]*metric.Data, error) {
    // do something to compute value
    ...
    return []*metric.Data{
        metric.NewGaugeData("example", value, "description of example", nil),
    }, nil
}

The core/metrics directory already contains many practical Metrics examples, and the framework provides a rich set of low-level interfaces, including interaction with BPF programs and maps and access to container information. See the corresponding code for details.
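To show the shape of a complete collector without depending on the framework, here is a self-contained Go sketch. The `Data` type and `Collector` interface are local stand-ins, not the framework's `metric.Data` or its interface, and the metric name `example_load1` is hypothetical. The collector reads the 1-minute load average from /proc/loadavg.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// Data is a local stand-in for the framework's metric.Data.
type Data struct {
	Name  string
	Value float64
	Help  string
}

// Collector is a local stand-in with the same method shape as the interface above.
type Collector interface {
	Update() ([]*Data, error)
}

// loadavgMetric reads the 1-minute load average from a procfs-style file.
type loadavgMetric struct {
	path string
}

// parseLoad1 extracts the first field of /proc/loadavg content.
func parseLoad1(content string) (float64, error) {
	fields := strings.Fields(content)
	if len(fields) < 1 {
		return 0, fmt.Errorf("empty loadavg content")
	}
	return strconv.ParseFloat(fields[0], 64)
}

// Update samples the file once and returns a single gauge-like value.
func (c *loadavgMetric) Update() ([]*Data, error) {
	raw, err := os.ReadFile(c.path)
	if err != nil {
		return nil, err
	}
	v, err := parseLoad1(string(raw))
	if err != nil {
		return nil, err
	}
	return []*Data{{Name: "example_load1", Value: v, Help: "1-minute load average"}}, nil
}

func main() {
	var c Collector = &loadavgMetric{path: "/proc/loadavg"}
	data, err := c.Update()
	if err != nil {
		fmt.Println("update failed:", err)
		return
	}
	for _, d := range data {
		fmt.Printf("%s %g\n", d.Name, d.Value)
	}
}
```

Keeping the parsing in a separate pure function (parseLoad1) makes the collector easy to unit test against fixture content instead of the live /proc.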

7.3 - Custom Events

To add an event, implement the ITracingEvent interface and register it.

type ITracingEvent interface {
    Start(ctx context.Context) error
}

Create

type exampleTracing struct{}

Register

func init() {
    tracing.RegisterEventTracing("example", newExample)
}

func newExample() (*tracing.EventTracingAttr, error) {
    return &tracing.EventTracingAttr{
        TracingData: &exampleTracing{},
        Internal:    10, // interval before tracing is enabled again, in seconds
        Flag:        tracing.FlagTracing, // mark as tracing type; | tracing.FlagMetric (optional)
    }, nil
}

Implement `Start`

func (t *exampleTracing) Start(ctx context.Context) error {
    // do something
    ...

    // Store the data in ES and locally.
    storage.Save("example", containerID, time.Now(), tracerData)
    return nil
}

In addition, the type can implement the Collector interface to expose data in Prometheus format (optional):

func (c *exampleTracing) Update() ([]*metric.Data, error) {
    // from tracerData to prometheus.Metric
    ...

    return data, nil
}

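7.4 - Custom Tracing

Overview

Type: anomaly-event driven (tracing/autotracing)
Function: automatically tracks abnormal system states and captures context information when an anomaly occurs
Characteristics:
- When the system becomes abnormal, autotracing is triggered automatically and captures the relevant context
- Event data is stored locally in real time and also sent to a remote ES; Prometheus metrics can be generated for observation as well
- Suited to captures with a high performance cost, for example triggering a capture when a metric exceeds a threshold or rises too quickly
Integrated: CPU idle anomaly tracing (cpu idle), D-state tracing (dload), contention inside and outside containers (waitrate), memory burst allocation (memburst), disk anomaly tracing (iotracer)

How to add an Autotracing

To add an AutoTracing to the system, implement the ITracingEvent interface and register it.

At the framework level AutoTracing and Event are implemented identically; they are distinguished only by their intended use case.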

// ITracingEvent represents a autotracing or event
type ITracingEvent interface {
    Start(ctx context.Context) error
}

1. Create the struct

type exampleTracing struct{}

2. Register the callback

func init() {
    tracing.RegisterEventTracing("example", newExample)
}

func newExample() (*tracing.EventTracingAttr, error) {
    return &tracing.EventTracingAttr{
        TracingData: &exampleTracing{},
        Internal:    10, // interval before tracing can be triggered again, in seconds
        Flag:        tracing.FlagTracing, // mark as tracing type; | tracing.FlagMetric (optional)
    }, nil
}

3. Implement ITracingEvent

func (t *exampleTracing) Start(ctx context.Context) error {
    // detect the condition you care about
    ...

    // Store the data in ES and locally.
    storage.Save("example", containerID, time.Now(), tracerData)
    return nil
}

Optionally, implement the Collector interface as well to expose data in Prometheus format:

func (c *exampleTracing) Update() ([]*metric.Data, error) {
    // convert tracerData to prometheus.Metric
    ...

    return data, nil
}

The core/autotracing directory already contains many practical autotracing examples, and the framework provides a rich set of low-level interfaces, including interaction with BPF programs and maps and access to container information. See the corresponding code for details.
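The trigger condition described in the overview, capturing when a metric exceeds a threshold or rises too quickly, could look like the following self-contained Go sketch. It does not use the framework: the sampling and capture callbacks, the threshold of 90, and the maximum rise of 30 are made-up placeholders. Returning nil on context cancellation is also an assumption about what a caller would expect.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// shouldTrigger reports whether the current sample is at or above the
// absolute threshold, or rose by at least maxDelta since the previous sample.
func shouldTrigger(prev, cur, threshold, maxDelta float64) bool {
	return cur >= threshold || cur-prev >= maxDelta
}

// exampleTracing polls a metric and runs an expensive capture only on anomalies.
type exampleTracing struct {
	sample   func() float64      // hypothetical metric source
	capture  func(value float64) // hypothetical expensive capture
	interval time.Duration
}

// Start samples periodically until ctx is cancelled.
func (t *exampleTracing) Start(ctx context.Context) error {
	ticker := time.NewTicker(t.interval)
	defer ticker.Stop()

	prev := t.sample()
	for {
		select {
		case <-ctx.Done():
			return nil // assumption: cancellation counts as a clean stop
		case <-ticker.C:
			cur := t.sample()
			if shouldTrigger(prev, cur, 90, 30) {
				t.capture(cur)
			}
			prev = cur
		}
	}
}

func main() {
	values := []float64{10, 20, 95} // made-up samples, replayed in a loop
	i := 0
	t := &exampleTracing{
		sample:   func() float64 { v := values[i%len(values)]; i++; return v },
		capture:  func(v float64) { fmt.Println("capture triggered at", v) },
		interval: 10 * time.Millisecond,
	}
	ctx, cancel := context.WithTimeout(context.Background(), 100*time.Millisecond)
	defer cancel()
	_ = t.Start(ctx)
}
```

Separating the cheap check (shouldTrigger) from the costly capture keeps steady-state overhead low, which is the point of the autotracing mode.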

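7.5 - Integration Tests

The integration tests verify that huatuo-bamai starts correctly against mocked /proc and /sys filesystems and exposes the expected Prometheus metrics.

The tests run the real executable and validate the output of the /metrics endpoint, so the metric collection and exposition logic is checked without depending on the host's kernel or hardware.

Script workflow

The integration test script performs the following steps:

- Generate a temporary bamai.conf configuration file
- Start the huatuo-bamai service against the mocked procfs and sysfs
- Wait until the /metrics endpoint is reachable
- Fetch all metrics from the /metrics endpoint
- Verify that every expected metric exists and its content matches
- Stop the service and clean up related resources

If any expected metric is missing or does not match, the test fails immediately.

How to run

Run the integration tests from the project root:

bash integration/run.sh

Or through the Makefile:

make integration

Behavior on failure

- The huatuo-bamai service metrics and logs are printed directly to standard output to help locate the problem
- The temporary working directory is kept for later debugging

Behavior on success

- The list of successfully verified metrics is displayed

How to add a metric test

Step 1: Add or update mock data

If the new metric depends on the contents of /proc or /sys files, add or modify mock data under:

integration/fixtures/

The directory layout must mirror the real kernel filesystem.

Step 2: Add expected metrics

Create a new file under:

integration/fixtures/expected_metrics/
├── cpu.txt
├── memory.txt
└── ...

Each non-empty, non-comment line represents one expected Prometheus metric, and its content must match the /metrics output exactly. Newly added *.txt files are loaded by the test script automatically and take part in validation.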

Step 3: Run the tests

bash integration/run.sh

The test fails when any expected metric is missing or does not match.

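8 - FAQ

Metrics

Why do the memory_others_* metrics (such as directstall_time) have no data?

The memory_others collector reads memory cgroup extension interfaces provided by DiDi Cloud's customized kernel (memory.directstall_stat, memory.asynreclaim_stat, memory.local_direct_reclaim_time). Mainline kernels and common distribution kernels do not provide these interfaces, and no loadable kernel module provides them either, so these metrics are not emitted on a standard kernel. This is expected behavior.

To observe container direct reclaim on a standard kernel, use the eBPF-based memory_reclaim_container_directstall metric. See the memory subsystem chapter of the "Core Features / Full-Kernel Observability" (核心特性 / 内核全景观测) document for details.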

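9 - Contributing

9.1 - Source Code Contributions

HUATUO Contribution Guide

Thank you for your interest in HUATUO! This guide helps you get started quickly.

Ways to contribute

You can take part in HUATUO in several ways:

- Code: fix bugs, add features, improve performance
- Documentation: improve the docs, translate content, write tutorials
- Testing: write unit tests and integration tests, report bugs
- eBPF: add new kernel probes, improve kernel compatibility
- Review: review other contributors' Pull Requests

Development environment

Prerequisites

Tool	Requirement	Notes
Go	1.24+	The project is written mainly in Go
Linux	Kernel 4.18+	eBPF programs require a Linux kernel
Clang/LLVM	Any recent version	Needed to compile the eBPF C programs
Kernel headers	linux-headers	Needed for BPF compilation
Docker	(optional)	Containerized development environment
Git	Any recent version	Version control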

Clone the repository

# Fork the repository on GitHub first, then:
git clone https://github.com/YOUR_USERNAME/huatuo.git
cd huatuo
git remote add upstream https://github.com/ccfos/huatuo.git

Build and test

Build

make all          # build everything (BPF + Go)
make bpf-build    # build the BPF programs only
make build        # build the Go binaries only
make docker-build # build the Docker image

Test

make test  # run all tests
make unit  # run unit tests only
make check # run code style and formatting checks

Note: make test needs /etc/kubernetes/pki to run the E2E tests. If you do not have a K8s cluster, use make unit.

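Contribution workflow

1. Find or create an issue

- Look for bugs and feature requests in the open issues
- If you find an unassigned issue, leave a comment asking to take it
- If you have a new idea, create an issue to discuss it first

2. Create a branch

git checkout -b fix/short-description
# or: git checkout -b feat/short-description
# or: git checkout -b docs/short-description

Branch name prefixes:

Prefix	Purpose
`fix/`	Bug fixes
`feat/`	New features
`docs/`	Documentation
`refactor/`	Code refactoring
`test/`	Adding tests

3. Write code

- Address one problem at a time
- Write or update tests that correspond to the change
- Run make check to verify code style
- Run make unit to verify that the tests pass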

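4. Commit

Follow the Conventional Commits specification:

git commit -s -m "fix(scope): short description

Detailed explanation (optional)

Closes #issue-number

Signed-off-by: Your Name <your.email@example.com>"

The -s flag automatically adds the Signed-off-by line required by the DCO.

5. Push and open a PR

git push origin your-branch-name

Then go to ccfos/huatuo and create a Draft Pull Request. When it is ready, click Ready for review to request a review.

6. Code review

- Maintainers review your PR
- Make changes based on the review comments and push new commits
- You can set the PR back to Draft while making changes, then click Ready for review again when done
- Once approved, a maintainer merges your PR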

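Commit message convention

HUATUO follows the Conventional Commits specification:

<type>(<scope>): <description>

[optional body]

[optional footer]

Types

Type	Description
`fix`	Bug fix
`feat`	New feature
`docs`	Documentation changes
`test`	Test-related changes
`refactor`	Code refactoring that does not change behavior
`chore`	Build, dependencies, and similar
`perf`	Performance improvement

Examples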

fix(pod): preserve response body read errors in httpDoRequest
feat(bpf): add probe for kernel scheduling latency
docs(contributing): add development setup guide
test(request): verify response body is readable after doRequest
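A commit message can be checked against this convention before pushing. The Go sketch below is a hypothetical helper, not part of the project's tooling. It validates the header against the types listed above and checks that the message ends with a Signed-off-by trailer. Treating the scope as optional follows the Conventional Commits specification and is an assumption here, since the template above always shows a scope.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"strings"
)

// headerRe matches "<type>(<scope>): <description>" with the types from the
// table above; the "(<scope>)" part is treated as optional.
var headerRe = regexp.MustCompile(`^(fix|feat|docs|test|refactor|chore|perf)(\([^()\s]+\))?: \S.*$`)

// checkCommit validates the first line of the message and requires the last
// non-empty line to be a Signed-off-by trailer.
func checkCommit(msg string) error {
	lines := strings.Split(strings.TrimSpace(msg), "\n")
	if !headerRe.MatchString(lines[0]) {
		return fmt.Errorf("header %q does not match <type>(<scope>): <description>", lines[0])
	}
	last := strings.TrimSpace(lines[len(lines)-1])
	if !strings.HasPrefix(last, "Signed-off-by: ") {
		return errors.New("missing Signed-off-by trailer")
	}
	return nil
}

func main() {
	msg := "docs(contributing): add development setup guide\n\n" +
		"Signed-off-by: Your Name <your.email@example.com>"
	fmt.Println(checkCommit(msg))                    // <nil>
	fmt.Println(checkCommit("update stuff") != nil)  // true
}
```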

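Code style

Language	Tool
Go	`gofumpt` + `goimports`
C (eBPF)	`clang-format` (configured in `.clang-format`)
Shell	`shfmt`
YAML/JSON	2-space indentation

Run make check before every commit to make sure the code meets the conventions.

DCO sign-off

All contributions must include a Developer Certificate of Origin (DCO) sign-off.

Every commit must end with:

Signed-off-by: Your Name <your.email@example.com>

Use git commit -s to add it automatically.

The sign-off states that you are the author of the code, or that you have the right to contribute it under this project's Apache 2.0 license.

Community

- GitHub Issues: report bugs and request features
- GitHub Discussions: ask questions and share ideas
- WeChat: scan the QR code in the README to join the WeChat group

Thank you for contributing to HUATUO!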