1. WS disconnect now unsubscribes from user_notification_inner.
Previously, every WebSocket connection created a broadcast channel
for user notifications that was never removed on disconnect, causing
unbounded growth proportional to unique connected users over time.
2. Room worker tasks now use the manager's room_shutdown_txs channel
instead of a local broadcast channel. shutdown_room() sends on this
channel, so when a room is deleted the worker task receives the signal
and terminates, releasing its DashMap (capacity 10,000) and all
captured closures. Previously the worker ran forever.
When stdout is connected to a TTY, use tracing_subscriber's pretty
format with colors instead of single-line JSON. Non-TTY (container
logs, pipes) continue to output JSON for log aggregation.
Override auto-detection via APP_LOG_FORMAT=json|pretty.
Also adds APP_LOG_PRETTY=true to use serde_json::to_string_pretty
for human-readable JSON output (useful for development/debugging).
The upstream AI endpoint already returns complete model metadata:
- name, owned_by, context_length, max_output_tokens
- capabilities (vision, tool_call, reasoning)
- pricing (input, output, cache_read, cache_write, currency)
Remove the OpenRouter fallback entirely and parse the upstream
response directly for all sync operations. Both sync_upstream_models
(API) and sync_once (background task) now use a single unified path.
Changes:
- Remove OpenRouter types and fetch_openrouter_models()
- Add UpstreamModel / UpstreamCapabilities / UpstreamPricing types
- Parse capabilities from upstream instead of inferring from name
- Use real pricing from upstream instead of defaulting to 0.00
- Simplify sync flow: list → parse → upsert (no filtering/matching)
- Add provider normalizations for moonshot, zai, minimax, qwen
The upstream AI endpoint returns an OpenAI-compatible format, but the
response body parsing was fragile. Make it resilient:
1. Try standard OpenAI format: { "data": [{id}, ...] }
2. Try raw array: [{id}, ...]
3. Try alternate format: { "models": [{id}, ...] }
4. Log actual response body (first 500 chars) when all formats fail
Also adds a warning log with the raw response on parse failure so
future debugging is straightforward.
The check_compatibility(false) method was added in the previous commit
but does not exist in sqlx 0.8.x used by sea-orm 2.0. The warning
"Failed to obtain server version" is cosmetic and does not affect
functionality.
When OpenRouter's public /api/v1/models endpoint fails (network error,
timeout, parse failure), the entire sync was aborted — meaning models
accessible from the user's AI endpoint were never synced.
Now: if OpenRouter fetch fails, fall back to sync_models_direct for all
available models instead of returning an error. Both sync_upstream_models
(API) and sync_once (background task) have this fix.
Cloud-managed PostgreSQL variants (PolarDB, CockroachDB, etc.) may
not return a standard version string, causing:
"Failed to obtain server version. Unable to check client-server
compatibility."
Setting check_compatibility(false) on both writer and reader
connections silences this harmless warning.
When upstream /v1/models returns models not yet in OpenRouter's catalog
(e.g. brand-new models like DeepSeek-V4), also upsert them through the
same pipeline (provider → model → version → pricing → capabilities →
parameter_profile) with inferred defaults, instead of silently dropping
them. Previously the direct-sync fallback only triggered when *zero*
OpenRouter matches existed.
Remove daily report system (page, API routes, cron scheduler) as it is
no longer needed. Add /api/metrics endpoint exposing total and time-
windowed counts (27h, 7d, 30d) for users, workspaces, projects, repos,
rooms, and skills.
Also clean up dead code:
- Remove OpenRouter sync and alerts check routes
- Remove syncModels/checkAlerts from adminrpc client
- Remove unused adminRpcAvailable state from platform sessions page
- Fix handleEdit displayName comparison bug in platform users page
- Simplify pricing sync to create 0-price defaults
Add MsgJsonFormat custom event formatter that outputs JSON with _msg as
the first field, required by VictoriaLogs for full-text search. HTTP
middleware stores interpolated "METHOD /path" in thread-local buffer
for the formatter to read on span-close events.
- Save thinking_content as {"__chunks__": [{type, content}]} for replay
- Tool call sanitization — don't expose raw results to frontend
- Billing record_ai_usage integration
- Room service module refactoring into service/ directory
- StreamChunk/StreamChunkType types for preserving arrival order
- Chunk collection in call_stream_once and process_stream
- Add "error sending request" and "Http client error" to retryable errors
- StreamResult includes chunks vector for ordered replay
CNPG's cluster-ro service already handles load balancing and failover,
so the application-level Vec + random_range is redundant.
- db_read: Vec<DatabaseConnection> → Option<DatabaseConnection>
- database_read_replicas returns Option<String> instead of Vec<String>
- health checks now explicitly ping both writer() and reader()
- remove unused rand dependency from libs/db
Hook tasks and email metrics were missing from /metrics because
describe_counter! was never called before install_recorder(), so
unincremented counters were not exported. Room metrics appeared
because RoomMetrics::new() already described them.
- apps/git-hook: describe 8 hook_tasks_*/hook_sync_* counters
- apps/email: describe 8 email_* counters
- both: add metrics = "0.22" as direct dependency
- Main sitemap index at /sitemap.xml referencing 4 sub-sitemaps
- /sidemap/static: fixed routes (homepage, auth, marketing pages)
- /sidemap/users: public user profiles sorted alphabetically
- /sidemap/projects: public projects sorted alphabetically
- /sidemap/repos: public repos sorted alphabetically
- Redis cache with 8h TTL (no refresh on access), key: sidemap:{type}
- robots.txt Sitemap URL uses main_domain() with https:// forced
- All sitemap loc entries use https:// base URL
Health monitoring:
- gitserver: /health endpoint on port 8021 (DB + Redis ping)
- git-hook: hyper health server on port 8083 with /health
- email-worker: hyper health server on port 8084 with /health
- K8s probes updated to httpGet for all three services
Metrics (via /metrics endpoint):
- git-hook: hook_tasks_total/success/failed/locked/retried/exhausted,
hook_sync_branches/tags_changed_total
- email: email_queued/consumed/sent/failed_total,
email_validation_skipped/build_errors/send_attempts_total
Skip terminal access logs for noisy K8s probe and monitoring
endpoints: /health, /metrics, and /ws path prefix.
Applied to both the main app and static file server.
- TypingUsers state split by sender_type: AI vs human typing
- AI typing shows "{Name} is thinking..." with accent color
- Human typing shows "{Name} is typing..." with muted style
- AI typing relies on backend 60s TTL stop (no client-side 4s fallback)
- Add Page Visibility API to reconnect WS on tab become visible
- Add debug logs for typing flow tracing
- Pass sender_type through WS room.typing event routing
- Add sender_type field to TypingEvent (user/ai)
- Change Redis TTL from 10s to 60s for AI typing persistence
- Broadcast typing.start/stop with sender_type=ai when AI stream starts/ends
- Replay active AI typing events from Redis on new WS subscribe
- Fix ai.stream_chunk WS payload missing display_name and chunk_type
- Add initial thinking chunk on AI stream start for immediate indicator
Actix-web's Data<T> extractor expects T to implement Clone,
so PrometheusHandle (which is Clone) does not need to be
wrapped in Arc. The Arc type mismatch caused /metrics to
return 500.
- service now delegates model/provider/pricing logic to agent crate
- ChatService built at startup with EmbedService (graceful degradation)
- RoomService wired with EmbedService for Qdrant embedding
- Add error types for embedding service
- message.rs: after creating a text message, spawn async task to
embed and upsert into Qdrant collection "room:{project}:{room_id}"
- service.rs: RoomService now takes optional EmbedService, embed on
every message creation
- build.rs: generate .gz and .br compressed variants at build time,
embed both into the FRONTEND constant
- dist.rs: content-type for SPA fallback uses index.html path explicitly,
explicit extension mapping for Vite content-hashed filenames
- frontend lib: get_frontend_asset_compressed returns (data, encoding, etag)
- MessageInput: ignore empty text in handleEditorUpdate to avoid
TipTap's onUpdate("") on init clearing the typing state
- DiscordChatPanel: show typing indicator when other users are typing
- room-context: wire onTypingStart/Stop into ws callbacks
CommitReflogEntry now includes offset_minutes field, populated from
sig.when().offset_minutes() — matches git's internal timezone offset
representation. Used by git_tools for accurate reflog timestamps.
CommandPalette: replace workspaceProjects with getCurrentUserProjects
(no workspace dependency so it works outside WorkspaceProvider).
Repos fetched per-project to preserve correct /repository/ns/repo
routes. Keyboard shortcut correctly matches Ctrl+Alt+F / Cmd+Ctrl+F.
sidebar-user: fix notification button layout — bell icon and label
now on the same row instead of separate stacked elements.
- Remove duplicate smooth scroll effect from DiscordChatPanel; handle
all scroll logic in MessageList instead
- MessageList: track isInitialLoadRef to instant-jump to bottom on
first load (no animation), and only auto-scroll for new messages
when user is already near the bottom
- sender.ts: getSenderDisplayName rejects UUID values and falls back
to 'AI' for AI messages; getSenderModelId uses display_name
RoomMessageEvent was losing the AI model name because the
From<RoomMessageEnvelope> impl hardcoded display_name: None.
Add display_name to RoomMessageEnvelope and propagate it through
all AI streaming code paths (chat, ReAct, non-streaming).
Member messages keep display_name: None.