Commit Graph

4 Commits

Author SHA1 Message Date
ZhenYi
962bf0312d feat(observability): Phase 6 OTLP tracing + Prometheus metrics endpoint
OTLP tracing:
- libs/observability/otlp.rs: SdkTracerProvider via HTTP/proto OTLP exporter
- libs/observability/tracing_middleware.rs: Actix-web span with trace_id propagation
- libs/observability/tracing_fmt.rs: JSON fmt + registry.try_init for layered init
- libs/rpc: gRPC method spans via info_span
- libs/agent, libs/room, libs/service, libs/api: structured tracing throughout

Prometheus metrics:
- libs/observability/prometheus_exporter.rs: /metrics HTTP handler + metrics crate
- libs/observability/metrics_middleware.rs: HttpMetrics middleware + AtomicU64
- libs/observability/redis_metrics.rs: Redis counter poller via RedisMetrics
- libs/room/metrics.rs: RoomMetrics (connections, messages, presence counters)

Config env vars: APP_OTEL_ENABLED, APP_OTEL_ENDPOINT, APP_OTEL_SERVICE_NAME
2026-04-22 10:27:54 +08:00
ZhenYi
57779822dc refactor(room): migrate from slog to tracing + upgrade metrics to 0.22
- Remove all use slog::* imports and log: slog::Logger fields
- Replace slog macros with tracing::{info!, warn!, error!, debug!}
- metrics.rs: upgrade metrics 0.21→0.22, remove register_*! macros,
  use functional API: metrics::gauge!(), metrics::counter!(),
  metrics::histogram!(), metrics::describe_gauge!() etc.
- RoomMetrics: all fields now use functional metrics API, dynamic
  room_id labels passed as owned String to avoid lifetime issues
- RoomService: remove pub log: slog::Logger field
- connection.rs: remove log from subscribe_room_events,
  subscribe_project_room_events, subscribe_task_events_fn
2026-04-21 22:28:52 +08:00
ZhenYi
a09ff66191 refactor(room): remove NATS, use Redis pub/sub for message queue
- Remove async-nats from Cargo.toml dependencies
- Rename nats_publish_failed metric → redis_publish_failed
- Update queue lib doc comment: Redis Streams + Redis Pub/Sub
- Add Paused/Cancelled task statuses to agent_task model
- Add issue_id and retry_count fields to agent_task
- Switch tool executor Mutex from std::sync → tokio::sync (async context)
- Add timeout/rate-limited/retryable/tool-not-found error variants
2026-04-16 17:24:04 +08:00
ZhenYi
93cfff9738 init 2026-04-15 09:08:09 +08:00