gitdataai/admin/metrics.md
ZhenYi 27cd4ea83c
feat(admin/metrics): add Prometheus-compatible metrics endpoint and ops documentation
- Add /api/metrics/prometheus endpoint using prom-client (unauthenticated for scraping)
- Update middleware to allow unauthenticated access to prometheus endpoint
- Add /api/metrics permission routing (platform:read for GET)
- Install prom-client dependency
- Add metrics.md with Grafana dashboard JSON, Prometheus config, alerting rules
2026-04-26 14:49:25 +08:00


Admin Platform Metrics: Grafana / Prometheus Configuration Guide

Overview

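The Admin service exposes two metrics endpoints:

| Endpoint | Format | Purpose |
| --- | --- | --- |
| GET /api/metrics | JSON | Frontend pages / manual inspection / API consumers |
| GET /api/metrics/prometheus | Prometheus text | Prometheus scraping |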

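The Prometheus endpoint requires no authentication and can be scraped directly.

Collected Metrics

All metrics are exposed through a single platform_entity_count Gauge with two labels, entity and window: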

# HELP platform_entity_count Platform entity counts by time window
# TYPE platform_entity_count gauge
platform_entity_count{entity="users",window="total"} 1000
platform_entity_count{entity="users",window="27h"} 5
platform_entity_count{entity="users",window="7d"} 32
platform_entity_count{entity="users",window="30d"} 150
platform_entity_count{entity="workspaces",window="total"} 50
platform_entity_count{entity="workspaces",window="27h"} 1
...
platform_entity_count{entity="skills",window="30d"} 45

Entity list: users, workspaces, projects, repos, rooms, skills

Window list: total (cumulative), 27h (last 27 hours), 7d (last 7 days), 30d (last 30 days)
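To make the label scheme concrete, the sketch below parses the exposition text into a dictionary keyed by (entity, window) and reports which of the 6 × 4 expected series are absent. It is a minimal illustration, not the service's own code: it only looks at platform_entity_count samples, and the helper names parse_entity_counts and missing_series are made up for the example.

```python
import re

ENTITIES = ["users", "workspaces", "projects", "repos", "rooms", "skills"]
WINDOWS = ["total", "27h", "7d", "30d"]

SAMPLE_RE = re.compile(r'^platform_entity_count\{([^}]*)\}\s+(\S+)')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_entity_counts(text):
    """Return {(entity, window): value} for every platform_entity_count sample."""
    counts = {}
    for line in text.splitlines():
        match = SAMPLE_RE.match(line.strip())
        if not match:  # skips "# HELP" / "# TYPE" comments and other metrics
            continue
        labels = dict(LABEL_RE.findall(match.group(1)))
        counts[(labels["entity"], labels["window"])] = float(match.group(2))
    return counts


def missing_series(counts):
    """List the documented (entity, window) pairs that are absent from counts."""
    return [(e, w) for e in ENTITIES for w in WINDOWS if (e, w) not in counts]


sample = '''\
# HELP platform_entity_count Platform entity counts by time window
# TYPE platform_entity_count gauge
platform_entity_count{entity="users",window="total"} 1000
platform_entity_count{entity="users",window="27h"} 5
'''
counts = parse_entity_counts(sample)
print(counts[("users", "27h")])     # 5.0
print(len(missing_series(counts)))  # 22 of the 24 series are not in the sample
```

Parsing labels by name rather than by position keeps the sketch independent of the order in which the exporter happens to print them.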

Prometheus Configuration

prometheus.yml

scrape_configs:
  - job_name: 'admin-metrics'
    scrape_interval: 60s
    metrics_path: '/api/metrics/prometheus'
    static_configs:
      - targets: ['<admin-host>:<port>']
        labels:
          env: 'production'
          service: 'admin'

K8s ServiceMonitor (if using prometheus-operator)

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: admin-metrics
  namespace: monitoring
  labels:
    release: prometheus
spec:
  selector:
    matchLabels:
      app: admin
  endpoints:
    - port: http
      path: /api/metrics/prometheus
      interval: 60s

Grafana Dashboard

Recommended Panel Configuration

Panel 1: Entity totals (Stat Panel)

Query:
  platform_entity_count{window="total"}

Visualization: Stat
  - Show: Value
  - Color mode: Background
  - Thresholds: set according to actual business needs

Panel 2: 27-hour growth trend (Time Series / Bar Gauge)

Query:
  platform_entity_count{window="27h"}

Visualization: Bar Gauge
  - Display: Basic
  - Show: Value

Panel 3: 7-day / 30-day comparison (Bar Chart)

Query:
  platform_entity_count{window=~"7d|30d"}

Visualization: Bar Chart
  - Group by: entity
  - Bar mode: grouped

Panel 4: Totals summary table (Table Panel)

Query:
  platform_entity_count

Transform:
  1. Labels to fields
  2. Pivot by entity
  3. Organize fields

Visualization: Table
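
The effect of transform steps 1 and 2 may be easier to see outside Grafana. The following sketch mimics them in plain Python on a few hand-written rows: each sample's labels are already fields, and the rows are then grouped into one row per entity with one column per window. It only illustrates the intended table shape; the real transformations are configured in the Grafana panel editor, and pivot_by_entity is a hypothetical helper.

```python
def pivot_by_entity(rows):
    """Group flat (entity, window, value) rows into one row per entity."""
    table = {}
    for row in rows:
        table.setdefault(row["entity"], {})[row["window"]] = row["value"]
    return table


# Rows as they look after "Labels to fields": one field per label plus the value.
rows = [
    {"entity": "users", "window": "total", "value": 1000},
    {"entity": "users", "window": "7d", "value": 32},
    {"entity": "workspaces", "window": "total", "value": 50},
]

for entity, windows in pivot_by_entity(rows).items():
    print(entity, windows)
```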

Dashboard JSON Template

Import the following JSON into Grafana (Dashboard → Import → Paste JSON):

Note: uid and datasource must be changed to match your actual Prometheus data source.

{
  "dashboard": {
    "title": "Admin Platform Metrics",
    "tags": ["admin", "platform"],
    "timezone": "browser",
    "panels": [
      {
        "id": 1,
        "title": "Entity Totals",
        "type": "stat",
        "gridPos": { "h": 4, "w": 24, "x": 0, "y": 0 },
        "targets": [
          {
            "expr": "platform_entity_count{window=\"total\"}",
            "legendFormat": "{{entity}}"
          }
        ],
        "options": {
          "colorMode": "background",
          "graphMode": "none",
          "justifyMode": "auto"
        },
        "fieldConfig": {
          "defaults": {
            "thresholds": {
              "mode": "absolute",
              "steps": [
                { "color": "green", "value": null },
                { "color": "yellow", "value": 100 },
                { "color": "red", "value": 1000 }
              ]
            }
          }
        }
      },
      {
        "id": 2,
        "title": "New in the Last 27 Hours",
        "type": "bargauge",
        "gridPos": { "h": 6, "w": 12, "x": 0, "y": 4 },
        "targets": [
          {
            "expr": "platform_entity_count{window=\"27h\"}",
            "legendFormat": "{{entity}}"
          }
        ],
        "options": {
          "displayMode": "gradient",
          "orientation": "horizontal"
        }
      },
      {
        "id": 3,
        "title": "Last 7 Days / 30 Days Comparison",
        "type": "barchart",
        "gridPos": { "h": 6, "w": 12, "x": 12, "y": 4 },
        "targets": [
          {
            "expr": "platform_entity_count{window=~\"7d|30d\"}",
            "legendFormat": "{{entity}} ({{window}})"
          }
        ],
        "options": {
          "barRadius": 0.05,
          "groupWidth": 0.7,
          "orientation": "auto"
        }
      },
      {
        "id": 4,
        "title": "Metrics Summary Table",
        "type": "table",
        "gridPos": { "h": 8, "w": 24, "x": 0, "y": 10 },
        "targets": [
          {
            "expr": "platform_entity_count",
            "format": "table",
            "instant": true
          }
        ],
        "transformations": [
          { "id": "labelsToFields", "options": {} },
          {
            "id": "organize",
            "options": {
              "excludeByName": { "Time": true, "__name__": true },
              "indexByName": { "entity": 0, "window": 1, "Value": 2 }
            }
          }
        ],
        "options": {
          "showHeader": true,
          "sortBy": [{ "desc": false, "displayName": "entity" }]
        }
      }
    ],
    "time": { "from": "now-24h", "to": "now" },
    "refresh": "1m"
  },
  "overwrite": true
}
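
The template above carries no datasource fields, so one way to act on the note about uid and datasource is to stamp a data source reference onto every panel and target before importing. The sketch below does that in Python. It assumes the {"type": "prometheus", "uid": ...} reference shape used in recent Grafana panel JSON; the uid value my-prometheus-uid is a placeholder, and with_datasource is a hypothetical helper, not part of any Grafana tooling.

```python
import copy
import json


def with_datasource(payload, uid):
    """Return a copy of the payload with a Prometheus datasource on every panel."""
    result = copy.deepcopy(payload)
    ref = {"type": "prometheus", "uid": uid}
    for panel in result["dashboard"]["panels"]:
        panel["datasource"] = dict(ref)
        for target in panel.get("targets", []):
            target["datasource"] = dict(ref)
    return result


# Trimmed-down stand-in for the template; load the full JSON from a file in practice.
template = {
    "dashboard": {
        "panels": [
            {
                "id": 1,
                "type": "stat",
                "targets": [{"expr": 'platform_entity_count{window="total"}'}],
            }
        ],
    },
    "overwrite": True,
}

patched = with_datasource(template, "my-prometheus-uid")  # placeholder uid
print(json.dumps(patched["dashboard"]["panels"][0]["datasource"]))
```

Working on a deep copy leaves the checked-in template untouched, so the same file can be patched for several environments with different data source uids.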

Alerting Rules (Optional)

Prometheus rules

groups:
  - name: admin-entity-growth
    rules:
      # Alert when user growth within 27 hours exceeds 100
      - alert: HighUserGrowth27h
        expr: platform_entity_count{entity="users", window="27h"} > 100
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "New users in the last 27 hours ({{ $value }}) exceed the threshold"

      # Alert when no repositories were added in 7 days
      - alert: NoRepoGrowth7d
        expr: platform_entity_count{entity="repos", window="7d"} == 0
          and on() platform_entity_count{entity="repos", window="total"} > 0
        for: 1h
        labels:
          severity: info
        annotations:
          summary: "No new repositories in the last 7 days"
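
For readers less familiar with PromQL, the two expressions can be read as ordinary boolean conditions. The sketch below restates them in Python over a dictionary of current gauge values. It ignores the for: durations and is not how Prometheus evaluates rules; it is only a way to check the intended logic. firing_alerts and its input shape are assumptions made for the example.

```python
def firing_alerts(counts):
    """Return the names of the alerts whose condition holds.

    counts maps (entity, window) to the current platform_entity_count value.
    """
    alerts = []
    # HighUserGrowth27h: more than 100 new users in the 27h window.
    if counts.get(("users", "27h"), 0) > 100:
        alerts.append("HighUserGrowth27h")
    # NoRepoGrowth7d: zero new repos in 7d, but only if any repos exist at all.
    # The "and on() ... > 0" clause keeps an empty platform from alerting.
    if counts.get(("repos", "7d")) == 0 and counts.get(("repos", "total"), 0) > 0:
        alerts.append("NoRepoGrowth7d")
    return alerts


busy = {("users", "27h"): 150, ("repos", "7d"): 0, ("repos", "total"): 12}
empty = {("users", "27h"): 5, ("repos", "7d"): 0, ("repos", "total"): 0}
print(firing_alerts(busy))   # ['HighUserGrowth27h', 'NoRepoGrowth7d']
print(firing_alerts(empty))  # []
```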

Verification

# 1. JSON format
curl http://localhost:3000/api/metrics | jq .

# 2. Prometheus format
curl http://localhost:3000/api/metrics/prometheus

# Expected output:
# HELP platform_entity_count Platform entity counts by time window
# TYPE platform_entity_count gauge
platform_entity_count{entity="users",window="27h"} 0
platform_entity_count{entity="users",window="30d"} 0
platform_entity_count{entity="users",window="7d"} 0
platform_entity_count{entity="users",window="total"} 5
...
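
The same check can be scripted. The sketch below fetches the Prometheus endpoint with the standard library and flags any entity whose window counts are not nested (27h ≤ 7d ≤ 30d ≤ total). That ordering is an assumption: it should hold if each window counts entities created within it, which the window descriptions suggest but do not state. The URL mirrors the curl example; fetch_metrics and non_nested_entities are hypothetical names.

```python
import re
import urllib.request

WINDOW_ORDER = ["27h", "7d", "30d", "total"]  # narrowest to widest
# Assumes labels appear as entity, then window, as in the expected output above.
SAMPLE_RE = re.compile(
    r'^platform_entity_count\{entity="([^"]+)",window="([^"]+)"\}\s+(\S+)',
    re.MULTILINE,
)


def fetch_metrics(url="http://localhost:3000/api/metrics/prometheus"):
    """Download the exposition text from the unauthenticated endpoint."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return response.read().decode("utf-8")


def non_nested_entities(text):
    """Return entities whose counts shrink when moving to a wider window."""
    by_entity = {}
    for entity, window, value in SAMPLE_RE.findall(text):
        by_entity.setdefault(entity, {})[window] = float(value)
    bad = []
    for entity, windows in by_entity.items():
        values = [windows[w] for w in WINDOW_ORDER if w in windows]
        if values != sorted(values):
            bad.append(entity)
    return bad


# Offline demonstration on the expected output; against a running service,
# call non_nested_entities(fetch_metrics()) instead.
expected = """\
platform_entity_count{entity="users",window="27h"} 0
platform_entity_count{entity="users",window="30d"} 0
platform_entity_count{entity="users",window="7d"} 0
platform_entity_count{entity="users",window="total"} 5
"""
print(non_nested_entities(expected))  # []
```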