Delete deprecated chat components:
- agent_profile, context, message_builder
- nonstreaming_execution, session_recording
These are superseded by the consolidated chat module.
Extend delegation system with 5 new specialized roles alongside
researcher/analyst/reviewer. Each role has curated tool access.
Refactor profile lookup to use profile_for_role_name and update
compact/summarizer and tool context accordingly.
- Extract push_unique_skill_context method to MessageBuilder
- Merge built-in skills with DB skills in passive injection
- Simplify code structure for both streaming/nonstreaming execution
Extract agent, compact, embed, task, and modes modules from single
service.rs files into focused sub-modules. Add orao module for
O1-like reasoning loop. Move RigAgentService to rig_tool.rs.
- Add gitignore and prettier configuration files for project scaffolding
- Implement room access control service with project member verification
- Create user access key management with CRUD operations and activity logging
- Add accordion UI component for frontend expandable sections
- Implement room AI configuration with list, upsert, and delete operations
- Add AI event types for agent join/leave/status change tracking
- Create streaming AI processing services for mode and react patterns
- Build room AI service with model detection and idempotency handling
- Integrate chat service orchestration for AI message processing
- Add typing indicators and stream cancellation for AI interactions
- Implement mention parsing and context extraction for AI agents
Add persistent chat session state (ChatState, sequence tracking, tool
calls). Introduce basic billing record in agent crate. Refine chat
service to route messages through state machine with tool support.
Add RoomAiService as the central dispatcher that selects the execution
path based on mode (react/chat/cot/reflexion/rewoo) and streaming
vs. non-streaming preference. Replace monolithic ai_streaming with
mode-aware dispatch and a dedicated streaming implementation.
Move tool execution to a spawned task so synchronous git2 operations
don't block the tokio worker thread, allowing heartbeat chunks to be
sent every 10s during long tool execution.
Also add analysis-first reasoning prompt to system messages.
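The heartbeat behaviour described above can be sketched without an async runtime. The change itself uses a spawned tokio task; the version below swaps in a plain `std::thread` and `recv_timeout` so it stays dependency-free. `Chunk`, `run_with_heartbeat`, and the `interval` parameter are hypothetical names for illustration only.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical chunk type: either a keep-alive or the finished tool output.
#[derive(Debug, PartialEq)]
enum Chunk {
    Heartbeat,
    ToolResult(String),
}

/// Run a blocking tool on its own thread and record a heartbeat each time
/// `interval` elapses without a result (the commit uses 10s).
fn run_with_heartbeat<F>(tool: F, interval: Duration) -> Vec<Chunk>
where
    F: FnOnce() -> String + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    // The blocking work (e.g. a synchronous git2 call) runs off the caller's thread.
    thread::spawn(move || {
        let _ = tx.send(tool());
    });

    let mut out = Vec::new();
    loop {
        match rx.recv_timeout(interval) {
            Ok(result) => {
                out.push(Chunk::ToolResult(result));
                return out;
            }
            // No result yet: emit a heartbeat and keep waiting.
            Err(mpsc::RecvTimeoutError::Timeout) => out.push(Chunk::Heartbeat),
            // The worker ended without sending (for example, it panicked).
            Err(mpsc::RecvTimeoutError::Disconnected) => return out,
        }
    }
}
```

In an async service the same shape would more likely be built on the runtime's blocking-task facility plus a timer, with each chunk sent to the client instead of collected into a vector.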
Backend:
- New GET /api/agents/models/catalog endpoint with page/per_page/search
params, excludes deprecated models, returns pricing data via
model→version→pricing join
- ModelWithPricingResponse includes input_price, output_price, currency
- ModelListResponse with pagination metadata (total, page, per_page)
- Batch-fetches default versions + latest pricing to avoid N+1
Frontend:
- RoomSettingsPanel: replace Dialog with inline two-step panel
- Step 1: paginated model browser with search, shows context length,
max output tokens, pricing per 1K tokens, capability/modality badges
- Step 2: selected model info card + AI configuration form
- Removed Dialog import and related unused dependencies
- Implement SSHandle struct with comprehensive Git service handling capabilities
- Add support for multiple authentication methods including password, public key and certificate
- Integrate Git command parsing and execution with proper channel management
- Implement branch protection rules enforcement during Git operations
- Add robust error handling and logging for SSH connections and Git processes
- Create secure Git command execution with environment isolation
- Implement proper resource cleanup for channels and subprocesses
- Add support for receive-pack, upload-pack and upload-archive services
- Integrate with existing authz and database services for permission checks
- Implement proper data forwarding between SSH channels and Git processes
fix(config): improve environment loading with error reporting
- Replace silent dotenv loading failures with informative error messages
- Handle global config race conditions safely during application startup
- Improve config loading reliability and debugging capabilities
fix(link-unfurl): handle server-side rendering compatibility
- Add undefined window object check for SSR environments
- Prevent client-side only code from breaking server-side rendering
refactor(agent): improve tool registry error handling
- Replace panics with graceful error logging for duplicate tool registrations
- Add proper error type definitions for tool registry operations
- Implement safe merging of registries with duplicate detection
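A minimal sketch of what error-returning registration and duplicate-aware merging could look like. The `ToolRegistry`, `RegistryError`, `register`, and `merge` names are assumptions for illustration, and a tool is reduced to a description string; the real types in the agent crate are not shown in this log.

```rust
use std::collections::HashMap;
use std::fmt;

/// Hypothetical error type for registry operations.
#[derive(Debug, PartialEq)]
enum RegistryError {
    Duplicate(String),
}

impl fmt::Display for RegistryError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            RegistryError::Duplicate(name) => write!(f, "tool already registered: {}", name),
        }
    }
}

impl std::error::Error for RegistryError {}

/// Stand-in registry: tool name mapped to a description.
#[derive(Default)]
struct ToolRegistry {
    tools: HashMap<String, String>,
}

impl ToolRegistry {
    /// Return an error instead of panicking when the name is taken.
    fn register(&mut self, name: &str, description: &str) -> Result<(), RegistryError> {
        if self.tools.contains_key(name) {
            return Err(RegistryError::Duplicate(name.to_string()));
        }
        self.tools.insert(name.to_string(), description.to_string());
        Ok(())
    }

    /// Merge `other` into `self`, keeping existing entries and returning
    /// the sorted names that were skipped as duplicates.
    fn merge(&mut self, other: ToolRegistry) -> Vec<String> {
        let mut duplicates = Vec::new();
        for (name, description) in other.tools {
            if self.tools.contains_key(&name) {
                duplicates.push(name);
            } else {
                self.tools.insert(name, description);
            }
        }
        duplicates.sort();
        duplicates
    }
}
```

The caller can then log each returned duplicate rather than aborting startup.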
fix(room-context): enhance WebSocket connection reliability
- Add proper error handling for room subscription operations
- Improve connection management with better error suppression
- Add console warnings for debugging connection issues
feat(ws-client): add comprehensive WebSocket client implementation
- Create RoomWsClient class with complete WebSocket communication layer
- Implement request-response pattern with timeout handling
- Add support for various room-related events and actions
- Include proper connection status tracking and management
- Implement callback system for different event types
- Add automatic reconnection and error recovery mechanisms
Add with_embed_service() builder and embed_service() accessor to ToolContext,
wired through ChatService so function-calling tools can access Qdrant vector search.
- chunk_text(): char-boundary-safe text chunking at paragraph/sentence breaks (7000 char limit)
- embed_memories_batch(): groups messages by room, batch-embeds all texts to reduce Qdrant calls
- embed_issue_chunked(): auto-chunks long issue bodies
- embed_skill(): upgraded with auto-chunking via chunk_text
- TagEmbedInput struct for batch tag embedding
- embed_tags_batch() / search_tags() with project isolation
- ensure_collections() now creates embed_repo_tag collection
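For illustration, char-boundary-safe chunking of the kind chunk_text() is described as doing might look roughly like this. The sketch treats the limit as a byte length, prefers the last paragraph break and then the last sentence end inside each window, and falls back to a hard cut; the actual break heuristics and the meaning of the 7000 limit in the real function are assumptions here.

```rust
/// Split `text` into chunks of at most `max_len` bytes without ever
/// cutting inside a UTF-8 character. Sketch only; not the real chunk_text().
fn chunk_text(text: &str, max_len: usize) -> Vec<&str> {
    let mut chunks = Vec::new();
    let mut rest = text;

    while rest.len() > max_len {
        // Walk back to the nearest char boundary at or below max_len.
        let mut limit = max_len;
        while !rest.is_char_boundary(limit) {
            limit -= 1;
        }
        if limit == 0 {
            // max_len is smaller than the first character; emit it whole.
            limit = rest.chars().next().map_or(rest.len(), char::len_utf8);
        }

        let window = &rest[..limit];
        // Prefer a paragraph break, then a sentence end, else a hard cut.
        let cut = window
            .rfind("\n\n")
            .map(|i| i + 2)
            .or_else(|| window.rfind(". ").map(|i| i + 2))
            .unwrap_or(limit);

        chunks.push(&rest[..cut]);
        rest = &rest[cut..];
    }

    if !rest.is_empty() {
        chunks.push(rest);
    }
    chunks
}
```

Because the separators stay attached to the preceding chunk, concatenating the chunks reproduces the input exactly.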
Critical fixes:
- Wrap balance updates in database transactions with SELECT FOR UPDATE
- Move history insert after balance validation to prevent orphaned records
- Use Decimal throughout to avoid silent conversion failures
- Prevent concurrent requests from causing negative balances
Tasks resolved:
- Task #4: Silent Decimal conversion failures
- Task #5: Missing transaction isolation (race conditions)
- Task #6: History inserted before validation
- process_react now returns (String, i64, i64) tuple with token counts
- Extract token stats from rig Agent FinalResponse usage field
- Both streaming and non-streaming ReAct modes now bill correctly
- Add record_ai_session() helper calling billing::record_ai_usage()
- Replace all Set(None) cost/currency with actual calculated values
- Cost computed from model_pricing via Decimal precision
- Use AgentBuilder for native tool-calling with stream_prompt()
- Add RecordingTool wrapper preserving retry + DB recording
- Fix tool_choice bug in do_completion (same as call_stream_once)
- Add seq field to RoomMessageStreamChunkEvent for strict ordering
- Map streaming events: Text→Answer, Reasoning→Thought, ToolCall→Action
- Only the final event has done=true; remove premature stream ending
- Store __chunks__ JSON in thinking_content for ordered replay
- Start the SSH rate limiter cleanup task, which was never started (prevents a memory leak)
- Create a single ToolContext outside the tool execution loop so the max_tool_calls
and max_depth guards actually fire across batch tool calls (previously a
fresh context was created per call, bypassing all limits)
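The effect of hoisting the context out of the loop can be sketched as follows. The `ToolContext` below is a stripped-down stand-in with only a call counter, and `begin_call` and `run_batch` are hypothetical names; the real struct also carries max_depth and other state.

```rust
/// Stand-in for the real ToolContext: only the call-count guard is modelled.
struct ToolContext {
    max_tool_calls: usize,
    calls_made: usize,
}

impl ToolContext {
    fn new(max_tool_calls: usize) -> Self {
        Self { max_tool_calls, calls_made: 0 }
    }

    /// Count one call, or refuse once the limit has been reached.
    fn begin_call(&mut self) -> Result<(), String> {
        if self.calls_made >= self.max_tool_calls {
            return Err(format!("tool call limit of {} reached", self.max_tool_calls));
        }
        self.calls_made += 1;
        Ok(())
    }
}

/// Run a batch of tool calls against ONE shared context.
fn run_batch(tool_names: &[&str], max_tool_calls: usize) -> Vec<Result<String, String>> {
    // Created once, outside the loop, so the counter accumulates.
    // Building it inside the loop would reset calls_made on every call.
    let mut ctx = ToolContext::new(max_tool_calls);
    let mut results = Vec::with_capacity(tool_names.len());
    for name in tool_names {
        results.push(ctx.begin_call().map(|()| format!("ran {}", name)));
    }
    results
}
```

With a limit of 2, a batch of three calls yields two successes followed by a limit error, which is the behaviour the per-call context could never produce.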
When a user mentions a repository in room chat, extract the repo name
from @[repo:name:label] brackets, look up the full repo model from the
database, and inject its details (name, description, default branch,
visibility) into the AI message context. Works independently of
embed_service availability.
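A rough sketch of the extraction step only (the database lookup and context injection are omitted). It assumes the repo name is the segment between `@[repo:` and the next `:` and that names contain no colon; `extract_repo_mentions` is a hypothetical function name.

```rust
/// Collect distinct repo names from `@[repo:name:label]` mentions,
/// in order of first appearance.
fn extract_repo_mentions(message: &str) -> Vec<&str> {
    const PREFIX: &str = "@[repo:";
    let mut names: Vec<&str> = Vec::new();
    let mut rest = message;

    while let Some(start) = rest.find(PREFIX) {
        let body = &rest[start + PREFIX.len()..];
        // An unterminated mention ends the scan.
        let end = match body.find(']') {
            Some(end) => end,
            None => break,
        };
        // `inside` is "name:label"; the name is everything before the first ':'.
        let inside = &body[..end];
        let name = inside.split(':').next().unwrap_or(inside);
        if !name.is_empty() && !names.contains(&name) {
            names.push(name);
        }
        rest = &body[end + 1..];
    }
    names
}
```

Each returned name would then be resolved to the full repo model before its details are added to the AI message context.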
When APP_AI_BASIC_URL already ends with /v1 (e.g. openrouter.ai/api/v1),
appending /v1/models produces /v1/v1/models. Detect trailing /v1 and
only append /models in that case.
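The fix amounts to a small suffix check. A minimal sketch, assuming a helper named `models_url` (hypothetical) that receives the value of APP_AI_BASIC_URL; trimming trailing slashes is an extra assumption not stated in the change.

```rust
/// Build the models-list URL from the configured base URL.
fn models_url(base: &str) -> String {
    // Treat ".../v1/" and ".../v1" the same.
    let trimmed = base.trim_end_matches('/');
    if trimmed.ends_with("/v1") {
        // Base already carries the version segment: only add /models.
        format!("{}/models", trimmed)
    } else {
        format!("{}/v1/models", trimmed)
    }
}
```

For example, `models_url("https://openrouter.ai/api/v1")` produces `https://openrouter.ai/api/v1/models` rather than a doubled `/v1/v1/models` path.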
- StreamChunk/StreamChunkType types for preserving arrival order
- Chunk collection in call_stream_once and process_stream
- Add "error sending request" and "Http client error" to retryable errors
- StreamResult includes chunks vector for ordered replay
- Streaming path: on tool_call execution error, emit an [Observation]
chunk so the model sees the failure and can retry/adapt
- Non-streaming path: inject error as a user message so the loop
continues with error context instead of silently stopping
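The retryable-error check mentioned above could be as simple as a substring match. This sketch lists only the two messages named in this change; the real list presumably contains other entries that are not shown here, and `is_retryable` is a hypothetical name.

```rust
/// Decide whether an error message should trigger a retry.
/// Only the two patterns added by this change are listed.
fn is_retryable(message: &str) -> bool {
    const RETRYABLE: [&str; 2] = ["error sending request", "Http client error"];
    RETRYABLE.iter().any(|needle| message.contains(needle))
}
```

Matching on message text is brittle compared with matching on error variants, so a typed check would be preferable wherever the underlying client exposes one.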
The ReAct loop terminated early when the model returned prose before the JSON:
[Agent ran through N steps...]
{"thought": "...", "action": {...}}
The extract_json function only checked the start of the string or code fences.
It now scans for { or [ at non-word positions and uses depth counting
to strip trailing text, so JSON buried anywhere in the response is found.
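A sketch of the scan-and-count approach follows. It is not the actual extract_json: in particular, the "plausibility" check on the character after the opening bracket is an assumption added here so that a leading note such as `[Agent ran through N steps...]` is skipped without a JSON parser, and the depth counter does not verify that bracket types match.

```rust
/// Return the first balanced JSON-looking object or array inside `text`.
fn extract_json(text: &str) -> Option<&str> {
    let bytes = text.as_bytes();
    for (start, &b) in bytes.iter().enumerate() {
        if b != b'{' && b != b'[' {
            continue;
        }
        // Skip brackets glued to a word, such as `items[0]`.
        if start > 0 && (bytes[start - 1].is_ascii_alphanumeric() || bytes[start - 1] == b'_') {
            continue;
        }
        // Assumed heuristic: the next non-space byte must be able to start JSON content.
        let next = bytes[start + 1..].iter().copied().find(|c| !c.is_ascii_whitespace());
        let plausible = match (b, next) {
            (b'{', Some(b'"')) | (b'{', Some(b'}')) => true,
            (b'[', Some(c)) => matches!(c, b'"' | b'{' | b'[' | b']' | b'-' | b'0'..=b'9'),
            _ => false,
        };
        if !plausible {
            continue;
        }
        if let Some(end) = balanced_end(bytes, start) {
            // Both ends are ASCII bytes, so the slice is on char boundaries.
            return Some(&text[start..=end]);
        }
    }
    None
}

/// Index of the bracket that closes the one at `start`, ignoring
/// brackets inside string literals.
fn balanced_end(bytes: &[u8], start: usize) -> Option<usize> {
    let mut depth = 0usize;
    let mut in_string = false;
    let mut escaped = false;
    for (i, &b) in bytes.iter().enumerate().skip(start) {
        if in_string {
            if escaped {
                escaped = false;
            } else if b == b'\\' {
                escaped = true;
            } else if b == b'"' {
                in_string = false;
            }
            continue;
        }
        match b {
            b'"' => in_string = true,
            b'{' | b'[' => depth += 1,
            b'}' | b']' => {
                depth -= 1;
                if depth == 0 {
                    return Some(i);
                }
            }
            _ => {}
        }
    }
    None
}
```

Passing the returned slice to a real JSON parser, and moving on to the next candidate when parsing fails, would make the extraction more robust than the byte-level heuristic used here.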