Commit Graph

79 Commits

Author SHA1 Message Date
ZhenYi
329b526bfb fix(git): add storage_path existence pre-check to run_fsck and run_gc
Some checks are pending
CI / Rust Lint & Check (push) Waiting to run
CI / Rust Tests (push) Waiting to run
CI / Frontend Lint & Type Check (push) Waiting to run
CI / Frontend Build (push) Blocked by required conditions
Consistent with run_sync: fail fast before blocking the thread pool
if the repo storage path does not exist on disk.
2026-04-16 22:24:14 +08:00
ZhenYi
d26e947f8e fix(git): add storage_path existence pre-check in run_sync
Fail fast if repo storage_path does not exist on disk, before
blocking the thread pool with spawn_blocking. This prevents
invalid tasks from consuming blocking threads while failing.
2026-04-16 22:23:17 +08:00
ZhenYi
8a0d2885f7 fix(git): correct pool panic detection and add failure diagnostic logs
- Fix match pattern: `Ok(Ok(Err(_)))` must be treated as NOT a panic,
  since execute_task_body returns Err(()) on task failure (not a panic).
  Previously the Err path incorrectly set panicked=true, which could
  cause pool to appear unhealthy.
- Add "task failed" log before retry/discard decision so real error
  is visible in logs (previously only last_error on exhausted-retries
  path was logged, which required hitting 5 retries to see the cause).
- Convert 3 remaining pool/mod.rs shorthand logs to format!() pattern.
2026-04-16 22:21:30 +08:00
ZhenYi
bbf2d75fba fix(git): harden hook pool retry, standardize slog log format
Some checks are pending
CI / Rust Lint & Check (push) Waiting to run
CI / Rust Tests (push) Waiting to run
CI / Frontend Lint & Type Check (push) Waiting to run
CI / Frontend Build (push) Blocked by required conditions
- Add retry_count to HookTask with serde(default) for backwards compat
- Limit hook task retries to MAX_RETRIES=5, discard after limit to prevent
  infinite requeue loops that caused 'task nack'd and requeued' log spam
- Add nak_with_retry() in RedisConsumer to requeue with incremented count
- Standardize all slog logs: replace "info!(l, "msg"; "k" => v)" shorthand
  with "info!(l, "{}", format!("msg k={}", v))" across ssh/authz.rs,
  ssh/handle.rs, ssh/server.rs, hook/webhook_dispatch.rs, hook/pool/mod.rs
2026-04-16 21:41:35 +08:00
ZhenYi
759ce7444e build(docker): 在多个 Docker 镜像中添加 Git 安装
Some checks are pending
CI / Rust Lint & Check (push) Waiting to run
CI / Rust Tests (push) Waiting to run
CI / Frontend Lint & Type Check (push) Waiting to run
CI / Frontend Build (push) Blocked by required conditions
2026-04-16 21:25:55 +08:00
ZhenYi
d0fc590491 fix(frontend): remove unused Zap import in landing-sections.tsx
Some checks are pending
CI / Frontend Build (push) Blocked by required conditions
CI / Rust Lint & Check (push) Waiting to run
CI / Rust Tests (push) Waiting to run
CI / Frontend Lint & Type Check (push) Waiting to run
2026-04-16 21:10:39 +08:00
ZhenYi
752b91d329 chore: update Cargo.lock for russh downgrade to 0.50.4
Some checks are pending
CI / Rust Lint & Check (push) Waiting to run
CI / Rust Tests (push) Waiting to run
CI / Frontend Lint & Type Check (push) Waiting to run
CI / Frontend Build (push) Blocked by required conditions
2026-04-16 20:58:12 +08:00
ZhenYi
02847ef1db fix(git): downgrade russh to 0.50.4, remove flate2 feature, fix log format
- Downgrade russh from 0.55.0 to 0.50.4
- Remove unused flate2 feature from russh dependency
- Use info!(logger, "{}", format!(...)) for channel lifecycle log messages
2026-04-16 20:58:01 +08:00
ZhenYi
1090359951 fix(git): add SSH channel lifecycle logging and fix password auth username check
- Remove user=="git" restriction from auth_password: the actual user is
  determined by the token, not the SSH username, matching Gitea's approach
- Add channel_open_session logging with explicit flush to verify
  CHANNEL_OPEN_CONFIRMATION reaches the client
- Add pty_request handler (reject with log) so git clients that request
  a PTY are handled gracefully instead of falling through to default
- Add subsystem_request handler (log + accept) so git subsystems are
  visible in logs
- Prefix unused variables with _ to eliminate warnings
2026-04-16 20:40:17 +08:00
ZhenYi
f5ab554d6b fix(git): add LFS upload size limits and fix HTTP rate limiter read/write counter
- Add LFS_MAX_OBJECT_SIZE (50 GiB) and validate object sizes in both the
  batch advisory check and the upload_object streaming loop to prevent
  unbounded disk usage from malicious clients
- Fix HTTP rate limiter: track read_count and write_count separately so
  a burst of writes cannot exhaust the read budget (previously all
  operations incremented read_count regardless of type)
2026-04-16 20:14:13 +08:00
ZhenYi
cef4ff1289 fix(git): harden HTTP and SSH git transports for robustness
HTTP:
- Return Err(...) instead of Ok(HttpResponse::...) for error cases so
  actix returns correct HTTP status codes instead of 200
- Add 30s timeout on info_refs and handle_git_rpc git subprocess calls
- Add 1MB pre-PACK limit to prevent memory exhaustion on receive-pack
- Enforce branch protection rules (forbid push/force-push/deletion/tag)
- Simplify graceful shutdown (remove manual signal handling)

SSH:
- Fix build_git_command: use block match arms so chained .arg() calls
  are on the Command, not the match expression's () result
- Add MAX_RETRIES=5 to forward() data-pump loop to prevent infinite
  spin on persistent network failures
- Fall back to raw path if canonicalize() fails instead of panicking
- Add platform-specific git config paths (/dev/null on unix, NUL on win)
- Start rate limiter cleanup background task so HashMap doesn't grow
  unbounded over time
- Derive Clone on RateLimiter so SshRateLimiter::start_cleanup works
2026-04-16 20:11:18 +08:00
ZhenYi
5a59f56319 fix(room): revert stale edited_at if messageGet fetch fails
onMessageEdited optimistically set edited_at, then fetched the full
message. If the fetch failed the "Edited" indicator persisted even though
the content was stale. Fix by capturing the original edited_at and
reverting it in the catch block — consistent with editMessage rollback.
2026-04-16 19:34:59 +08:00
ZhenYi
beea8854ce fix(room): fix stale room subscribe after async connect
connect() is async/fire-and-forget — if the user switches rooms while
WS is still connecting, the subscribeRoom() call captures the stale
(activeRoomId) closure value and subscribes to the wrong room. Fix by
re-reading activeRoomIdRef.current after the await so we always subscribe
to the room that is active when the connection actually opens.
2026-04-16 19:34:07 +08:00
ZhenYi
7989f7ba4b fix(room): fix StrictMode reconnect loop, add revokeMessage rollback
- useEffect([wsClient]): remove wsClient from deps to prevent
  React StrictMode double-mount from disconnecting the real client.
  First mount connects client-1; StrictMode cleanup disconnects it.
  Second mount connects client-2; first mount's second cleanup would
  then disconnect client-2, leaving WS permanently unconnected.
  Changing to useEffect([]) + optional chaining fixes this.
- revokeMessage: add optimistic removal + rollback on server rejection,
  consistent with editMessage pattern. Previously a failed delete left the
  message visible with no feedback.
2026-04-16 19:33:14 +08:00
ZhenYi
7416f37cec fix(room): prevent double-send, log resubscribe errors, dim pending messages
- sendMessage: guard with sendingRef to prevent concurrent in-flight
  sends (was missing — rapid clicks could create duplicate messages)
- resubscribeAll: log at warn level instead of silently swallowing,
  so operators can observe auth expiry or persistent failure patterns
- RoomMessageBubble: apply opacity-60 when isPending or isFailed,
  and hide action toolbar for pending messages (can't react/act on
  unconfirmed messages)
2026-04-16 19:29:34 +08:00
ZhenYi
677e88980b fix(room): add edit rollback, clean stream channels on room shutdown/idle 2026-04-16 19:28:23 +08:00
ZhenYi
c89f01b718 feat(room): improve robustness — optimistic send, atomic seq, jitter reconnect
Backend:
- Atomic seq assignment via Redis Lua script: INCR + GET run atomically
  inside a Lua script, preventing duplicate seqs under concurrent requests.
  DB reconciliation only triggers on cross-server handoff (rare path).
- Broadcast channel capacity: 10,000 → 100,000 to prevent message drops
  under high-throughput rooms.

Frontend:
- Optimistic sendMessage: adds message to UI immediately (marked
  isOptimistic=true) so user sees it instantly. Replaces with
  server-confirmed message on success, marks as isOptimisticError on
  failure. Fire-and-forget to IndexedDB for persistence.
- Seq-based dedup in onRoomMessage: replaces optimistic message by
  matching seq, preventing duplicates when WS arrives before REST confirm.
- Reconnect jitter: replaced deterministic backoff with full jitter
  (random within backoff window), preventing thundering herd on server
  restart.
- Visual WS status dot in room header: green=connected, amber
  (pulsing)=connecting, red=error/disconnected.
- isPending check extended to cover both old 'temp-' prefix and new
  isOptimistic flag, showing 'Sending...' / 'Failed' badges.
2026-04-16 19:23:06 +08:00
ZhenYi
5482283727 feat(seo): add useHead to all landing pages with Command as Service titles and descriptions
Some checks are pending
CI / Frontend Build (push) Blocked by required conditions
CI / Frontend Lint & Type Check (push) Waiting to run
CI / Rust Lint & Check (push) Waiting to run
CI / Rust Tests (push) Waiting to run
2026-04-16 19:12:06 +08:00
ZhenYi
022b45d44f docs(landing): replace 'Real-time' with 'Command-first' in nav, align Enterprise nav desc
Some checks are pending
CI / Rust Lint & Check (push) Waiting to run
CI / Rust Tests (push) Waiting to run
CI / Frontend Lint & Type Check (push) Waiting to run
CI / Frontend Build (push) Blocked by required conditions
2026-04-16 19:08:20 +08:00
ZhenYi
e59edf6c1a docs(landing): align docs page hero with Command as Service platform
Some checks are pending
CI / Rust Lint & Check (push) Waiting to run
CI / Rust Tests (push) Waiting to run
CI / Frontend Lint & Type Check (push) Waiting to run
CI / Frontend Build (push) Blocked by required conditions
2026-04-16 19:05:57 +08:00
ZhenYi
757a7044bf docs(landing): align Enterprise pricing and Public Rooms nav with Command as Service
Some checks are pending
CI / Frontend Lint & Type Check (push) Waiting to run
CI / Frontend Build (push) Blocked by required conditions
CI / Rust Lint & Check (push) Waiting to run
CI / Rust Tests (push) Waiting to run
2026-04-16 19:04:34 +08:00
ZhenYi
260d154db2 docs(landing): final alignment — rooms features, API endpoint, and Pro pricing tagline
Some checks are pending
CI / Rust Lint & Check (push) Waiting to run
CI / Rust Tests (push) Waiting to run
CI / Frontend Lint & Type Check (push) Waiting to run
CI / Frontend Build (push) Blocked by required conditions
2026-04-16 19:03:27 +08:00
ZhenYi
c9f4e2dbe7 docs(landing): replace 'chat rooms' with 'command-first rooms' in feature descriptions
Some checks are pending
CI / Rust Lint & Check (push) Waiting to run
CI / Rust Tests (push) Waiting to run
CI / Frontend Lint & Type Check (push) Waiting to run
CI / Frontend Build (push) Blocked by required conditions
2026-04-16 19:02:26 +08:00
ZhenYi
84f2604106 docs(landing): align remaining landing pages with Command as Service concept
Some checks are pending
CI / Rust Lint & Check (push) Waiting to run
CI / Rust Tests (push) Waiting to run
CI / Frontend Lint & Type Check (push) Waiting to run
CI / Frontend Build (push) Blocked by required conditions
2026-04-16 19:01:27 +08:00
ZhenYi
4c77426965 docs(landing): align all non-business pages with Command as Service concept
Some checks are pending
CI / Rust Lint & Check (push) Waiting to run
CI / Rust Tests (push) Waiting to run
CI / Frontend Lint & Type Check (push) Waiting to run
CI / Frontend Build (push) Blocked by required conditions
2026-04-16 18:59:37 +08:00
ZhenYi
4eb93da28b docs(landing): update README and SEO defaults to Command as Service tagline
Some checks are pending
CI / Rust Lint & Check (push) Waiting to run
CI / Rust Tests (push) Waiting to run
CI / Frontend Lint & Type Check (push) Waiting to run
CI / Frontend Build (push) Blocked by required conditions
2026-04-16 18:58:28 +08:00
ZhenYi
a4cb18580b docs(landing): rewrite tagline and features around Command as Service
Some checks are pending
CI / Rust Lint & Check (push) Waiting to run
CI / Rust Tests (push) Waiting to run
CI / Frontend Lint & Type Check (push) Waiting to run
CI / Frontend Build (push) Blocked by required conditions
- Badge: 'Command as Service · Human + Agent Engineering'
- Subtitle: 'Every action is a command. Every command is versioned,
  auditable, and composable.'
- Terminal demo: show service deploy, skill run, and replay commands
- Features: lead with Command as Service, rewrite all descriptions
  around the command stream paradigm
- Section header: 'Build. Review. Automate. All via Commands.'
- Highlight: 'Your CLI is also your API'
- Nav/Footer: surface Command as Service in Platform menu
2026-04-16 18:54:50 +08:00
ZhenYi
431f40063f fix(ws): allow APP_DOMAIN_URL and APP_STATIC_DOMAIN origins
Some checks are pending
CI / Rust Lint & Check (push) Waiting to run
CI / Rust Tests (push) Waiting to run
CI / Frontend Lint & Type Check (push) Waiting to run
CI / Frontend Build (push) Blocked by required conditions
validate_origin() only allowed localhost origins by default, causing
production WebSocket connections to be rejected. Now it reads
APP_DOMAIN_URL and APP_STATIC_DOMAIN from env and automatically
adds their http/https/ws/wss variants to the allowed origins list.

Also add APP_DOMAIN_URL to the production configmap.
2026-04-16 18:51:52 +08:00
ZhenYi
89deebced6 fix(api): add clone url
Some checks are pending
CI / Rust Lint & Check (push) Waiting to run
CI / Rust Tests (push) Waiting to run
CI / Frontend Lint & Type Check (push) Waiting to run
CI / Frontend Build (push) Blocked by required conditions
2026-04-16 18:31:05 +08:00
ZhenYi
df87a65cbb fix(room): accept room_public JSON key for HTTP fallback
Some checks are pending
CI / Rust Lint & Check (push) Waiting to run
CI / Rust Tests (push) Waiting to run
CI / Frontend Lint & Type Check (push) Waiting to run
CI / Frontend Build (push) Blocked by required conditions
The frontend WebSocket client sends room_public, but the HTTP fallback
sends it as a JSON body parsed directly into RoomCreateRequest/RoomUpdateRequest
which expects public. Add #[serde(rename = "room_public")] so both
ws params and HTTP JSON body work consistently.
2026-04-16 18:27:53 +08:00
ZhenYi
8b433c9169 feat(repo): return ssh/https clone URLs from backend
Some checks are pending
CI / Frontend Lint & Type Check (push) Waiting to run
CI / Frontend Build (push) Blocked by required conditions
CI / Rust Lint & Check (push) Waiting to run
CI / Rust Tests (push) Waiting to run
Add ssh_clone_url and https_clone_url to ProjectRepositoryItem,
constructed from config.ssh_domain() and config.git_http_domain().
Update frontend RepoInfo interface and header.tsx to use the
server-provided URLs instead of hardcoded values.
2026-04-16 18:20:48 +08:00
ZhenYi
b1e93a7cfc fix(init): use resp.error to detect project_not_found instead of HTTP status
Some checks are pending
CI / Frontend Lint & Type Check (push) Waiting to run
CI / Frontend Build (push) Blocked by required conditions
CI / Rust Lint & Check (push) Waiting to run
CI / Rust Tests (push) Waiting to run
The generated client returns AxiosError as resp when throwOnError=false.
Check resp.error (set by the client to err.response?.data) for the
business-level code/error fields rather than relying on HTTP status.
2026-04-16 18:05:44 +08:00
ZhenYi
b4af88e730 chore(scripts): use git short SHA as default image tag
Some checks are pending
CI / Frontend Build (push) Blocked by required conditions
CI / Rust Lint & Check (push) Waiting to run
CI / Rust Tests (push) Waiting to run
CI / Frontend Lint & Type Check (push) Waiting to run
Replace timestamp-based and 'latest' defaults with git rev-parse --short
HEAD across build.js, deploy.js, and push.js for consistent, traceable
image tags tied to source commits.
2026-04-16 17:34:52 +08:00
ZhenYi
fa091b9d22 deploy(k8s): add CRD definitions for operator 2026-04-16 17:24:25 +08:00
ZhenYi
756af4fb73 docs: add branch protection, commit convention, and gitflow guides 2026-04-16 17:24:14 +08:00
ZhenYi
51e4531e95 docs: update README to reflect Redis pub/sub instead of NATS 2026-04-16 17:24:11 +08:00
ZhenYi
a09ff66191 refactor(room): remove NATS, use Redis pub/sub for message queue
- Remove async-nats from Cargo.toml dependencies
- Rename nats_publish_failed metric → redis_publish_failed
- Update queue lib doc comment: Redis Streams + Redis Pub/Sub
- Add Paused/Cancelled task statuses to agent_task model
- Add issue_id and retry_count fields to agent_task
- Switch tool executor Mutex from std::sync → tokio::sync (async context)
- Add timeout/rate-limited/retryable/tool-not-found error variants
2026-04-16 17:24:04 +08:00
ZhenYi
41f96af064 fix(init): use isAxiosError to correctly detect 404 on project name availability check 2026-04-16 17:22:25 +08:00
ZhenYi
f5084974b3 fix(k8s): add APP_SESSION_SECRET to ConfigMap to fix captcha errors with multi-pod
Without a shared cookie signing key, each pod generates a random key on
startup. Requests that hit different pods fail session validation, causing
CaptchaError when the captcha and login requests route to different pods.
2026-04-15 23:31:11 +08:00
ZhenYi
b6022e824d feat(k8s): enforce minimum 2 replicas for all services except email-worker 2026-04-15 23:08:25 +08:00
ZhenYi
e7cf0c544f fix(k8s): remove all health probes from gitserver 2026-04-15 23:07:17 +08:00
ZhenYi
dd4bbf3bb5 fix(k8s): add startupProbe to gitserver deployment template 2026-04-15 23:04:17 +08:00
ZhenYi
451e55596a fix(k8s): protect PVCs from deletion on helm uninstall 2026-04-15 23:00:55 +08:00
ZhenYi
2afad11cb6 fix(frontend): use workspaceInfo API to get full workspace data for sidebar 2026-04-15 22:55:01 +08:00
ZhenYi
a1ebe47564 fix(frontend): remove unused imports and fix workspace type in homepage 2026-04-15 22:45:07 +08:00
ZhenYi
c033cc3ff8 fix(k8s): add procps to worker images and fix probe commands
- Add procps to git-hook and email-worker Dockerfiles (provides pgrep)
- Change all exec probes from pgrep to kill -0 1 (more reliable, bash built-in)
- Add startupProbe to gitserver with 30 failure threshold (5min max startup time)
- Increase gitserver liveness initialDelay to 30s for slower SSH init
2026-04-15 22:13:16 +08:00
ZhenYi
afbc58d9bf feat(frontend): landing pages with Command as Service concept
- Add landing subpages: pricing, skills, solutions, network, about, docs
- Nav pop cards link to all subpages with nested routes
- Homepage: full landing content with top nav (no sidebar) for logged-in users
- Rewrite copy based on real backend: Git repos, Issues/PRs, Rooms, AI Agents
- Introduce "Command as Service" as core product concept
- Terminal demo shows realistic gitdata CLI commands
- Footer links updated to real routes
- Fix workspace redirect slug guard (undefined route)
2026-04-15 21:45:30 +08:00
ZhenYi
0ce70eca7f fix(deploy): bind app to 0.0.0.0 for K8s Service connectivity 2026-04-15 14:26:42 +08:00
ZhenYi
d307c13878 fix(deploy): route /api and /ws to app, frontend as default on gitdata.ai 2026-04-15 14:19:25 +08:00
ZhenYi
6c3f5b49f8 feat(deploy): single unified Ingress with per-host routing
Replace multiple conflicting Ingress resources with one that routes:
- gitdata.ai         → frontend (port 80)
- api.gitdata.ai     → app (port 8080)
- git.gitdata.ai     → gitserver-http (port 8022)
- static.gitdata.ai  → static (port 8081)

Disable service-level ingress configs in values.yaml (they would
conflict on the same host/path). Single TLS secret covers all hosts.
2026-04-15 14:17:03 +08:00