gitdataai/libs/git/HOOK_QUEUE_NATS_MIGRATION.md
ZhenYi 14f6e1e500 feat(core): initialize project with access control and AI integration
2026-05-03 06:04:31 +08:00

Hook Queue NATS JetStream Migration Guide

Overview

The git hook queue now supports both Redis Lists and NATS JetStream as backend message queues. This allows gradual migration from Redis to NATS without downtime.

Architecture

Producer (ReceiveSyncService)

The producer tries NATS first (if configured), then falls back to Redis:

pub struct ReceiveSyncService {
    pool: deadpool_redis::cluster::Pool,
    redis_prefix: String,
    nats_publish: Option<Arc<dyn Fn(String, Vec<u8>) -> Pin<Box<dyn Future<Output = Result<u64>> + Send>> + Send + Sync>>,
}
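
The NATS-first, Redis-fallback behaviour described above can be sketched as follows. This is a minimal illustration, not the actual ReceiveSyncService code: the `Backend` enum, the `PublishFn` alias and the `enqueue` function are hypothetical names, and the async closures are replaced by synchronous ones so the sketch needs only the standard library.

```rust
/// Which backend accepted the task (hypothetical type for this sketch).
#[derive(Debug, PartialEq, Eq)]
pub enum Backend {
    Nats,
    Redis,
}

/// Hypothetical synchronous stand-in for the async publish closure.
/// Returns a sequence number or queue length on success.
pub type PublishFn<'a> = &'a dyn Fn(&str, &[u8]) -> Result<u64, String>;

/// Try NATS first when a publish function is configured, then fall back
/// to Redis if NATS is absent or returns an error.
pub fn enqueue(
    nats: Option<PublishFn<'_>>,
    redis: PublishFn<'_>,
    subject: &str,
    payload: &[u8],
) -> Result<Backend, String> {
    if let Some(publish) = nats {
        match publish(subject, payload) {
            Ok(_sequence) => return Ok(Backend::Nats),
            // A real service would log the error here before falling back.
            Err(_err) => {}
        }
    }
    // Reached when NATS is not configured or the NATS publish failed.
    redis(subject, payload).map(|_| Backend::Redis)
}
```

Returning the backend that accepted the task makes it easy to emit the matching log line or metric label at the call site.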

Consumer (RedisConsumer)

The consumer uses NATS if configured, otherwise falls back to Redis:

pub struct RedisConsumer {
    pool: deadpool_redis::cluster::Pool,
    prefix: String,
    block_timeout_secs: u64,
    nats_consume: Option<NatsHookConsumeFn>,
}

Integration with AppTransport

Producer Integration

use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;

use git::ssh::ReceiveSyncService;
use transport::AppTransport;

let transport = Arc::new(AppTransport::new(/* ... */));

// Create NATS publish function
let nats_publish = {
    let transport = transport.clone();
    Arc::new(move |subject: String, payload: Vec<u8>| {
        let transport = transport.clone();
        Box::pin(async move {
            let ack = transport.publish(&subject, payload).await?;
            Ok(ack.sequence)
        }) as Pin<Box<dyn Future<Output = anyhow::Result<u64>> + Send>>
    })
};

// Create service with NATS support
let sync_service = ReceiveSyncService::with_nats(redis_pool, nats_publish);

// Or use Redis-only mode
let sync_service = ReceiveSyncService::new(redis_pool);

Consumer Integration

use git::hook::pool::redis::{RedisConsumer, NatsHookConsumeFn};

// Create NATS consume function
let nats_consume: NatsHookConsumeFn = {
    let transport = transport.clone();
    Arc::new(move |subject: String, batch_size: usize| {
        let transport = transport.clone();
        Box::pin(async move {
            let mut results = Vec::new();
            
            // Pull messages from JetStream consumer
            for _ in 0..batch_size {
                match transport.pull_one(&subject).await {
                    Ok(Some(msg)) => {
                        let data = msg.payload.to_vec();
                        let msg_clone = msg.clone();
                        let ack_fn = Box::new(move || {
                            let msg = msg_clone.clone();
                            Box::pin(async move {
                                msg.ack().await?;
                                Ok(())
                            }) as Pin<Box<dyn Future<Output = anyhow::Result<()>> + Send>>
                        });
                        results.push((data, ack_fn));
                    }
                    Ok(None) => break,
                    Err(e) => return Err(e),
                }
            }
            
            Ok(results)
        }) as Pin<Box<dyn Future<Output = anyhow::Result<Vec<(Vec<u8>, Box<dyn Fn() -> Pin<Box<dyn Future<Output = anyhow::Result<()>> + Send>> + Send>)>>> + Send>>
    })
};

// Create consumer with NATS support
let consumer = RedisConsumer::with_nats(
    redis_pool,
    "{hook}".to_string(),
    5, // block_timeout_secs
    nats_consume,
);

// Or use Redis-only mode
let consumer = RedisConsumer::new(redis_pool, "{hook}".to_string(), 5);

Queue Subjects

The hook queue uses the following NATS subjects:

  • queue.hook.sync - Repository sync tasks (git push/pull operations)

Additional task types can be added by extending the subject pattern:

  • queue.hook.{task_type} - Generic pattern for any hook task type
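
As a rough illustration of the subject pattern above, a small helper could build subjects from a task type. The `hook_subject` function and the `HOOK_SUBJECT_PREFIX` constant are hypothetical names, and the validation rule (ASCII letters, digits, `_` and `-` only) is an assumption chosen to keep dots, spaces and wildcards out of the generated subject. It is stricter than NATS itself requires.

```rust
/// Prefix shared by all hook queue subjects listed above.
const HOOK_SUBJECT_PREFIX: &str = "queue.hook";

/// Build the subject for a task type, e.g. "sync" -> "queue.hook.sync".
/// Returns None for names that would add extra tokens or wildcards.
/// The allowed character set is an assumption of this sketch.
pub fn hook_subject(task_type: &str) -> Option<String> {
    let valid = !task_type.is_empty()
        && task_type
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '-');
    if valid {
        Some(format!("{}.{}", HOOK_SUBJECT_PREFIX, task_type))
    } else {
        None
    }
}
```

Every subject produced this way is a single token below queue.hook, so it is covered by the queue.hook.> stream filter.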

Migration Strategy

Phase 1: Dual Write (Current)

  • Producer writes to both NATS and Redis
  • Consumer reads from Redis only
  • Zero risk, full rollback capability

Phase 2: Dual Read

  • Producer writes to both NATS and Redis
  • Consumer reads from NATS, falls back to Redis on error
  • Validates NATS consumer stability

Phase 3: NATS Primary

  • Producer writes to NATS only (Redis disabled)
  • Consumer reads from NATS only
  • Redis queue deprecated

Phase 4: Redis Removal

  • Remove Redis Lists code
  • Remove pool parameter
  • Simplify to NATS-only implementation
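
One way to make the four phases explicit in code is to map each phase to the backends it writes to and reads from. The sketch below is only an illustration of the list above: `MigrationPhase`, `QueuePlan` and `plan_for` are hypothetical names that do not exist in the codebase described here.

```rust
/// The four migration phases described above (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MigrationPhase {
    DualWrite,
    DualRead,
    NatsPrimary,
    RedisRemoval,
}

/// Which backends the producer writes to and the consumer reads from.
#[derive(Debug, PartialEq, Eq)]
pub struct QueuePlan {
    pub write_nats: bool,
    pub write_redis: bool,
    pub read_nats: bool,
    /// In DualRead this means "fallback on NATS error", not a primary source.
    pub read_redis: bool,
}

pub fn plan_for(phase: MigrationPhase) -> QueuePlan {
    match phase {
        // Phase 1: write both, read Redis only.
        MigrationPhase::DualWrite => QueuePlan {
            write_nats: true,
            write_redis: true,
            read_nats: false,
            read_redis: true,
        },
        // Phase 2: write both, read NATS with Redis as fallback.
        MigrationPhase::DualRead => QueuePlan {
            write_nats: true,
            write_redis: true,
            read_nats: true,
            read_redis: true,
        },
        // Phases 3 and 4 behave the same at runtime; phase 4 only removes code.
        MigrationPhase::NatsPrimary | MigrationPhase::RedisRemoval => QueuePlan {
            write_nats: true,
            write_redis: false,
            read_nats: true,
            read_redis: false,
        },
    }
}
```

Driving both producer and consumer from one phase value avoids combinations the plan never intends, such as writing to Redis only while reading from NATS only.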

NATS JetStream Setup

Stream Configuration

nats stream add HOOK_QUEUE \
  --subjects "queue.hook.>" \
  --retention limits \
  --max-msgs=-1 \
  --max-age=7d \
  --storage file \
  --replicas 3

Consumer Configuration

nats consumer add HOOK_QUEUE hook-sync-worker \
  --filter "queue.hook.sync" \
  --ack explicit \
  --pull \
  --deliver all \
  --max-deliver 3 \
  --max-pending 100

Differences from Email Queue

Redis Backend

  • Email Queue: Uses Redis Streams (XADD/XREADGROUP)
  • Hook Queue: Uses Redis Lists (LPUSH/BLMOVE)

Atomicity

  • Email Queue: Consumer group provides at-least-once delivery
  • Hook Queue: BLMOVE provides atomic move-to-work-queue pattern

Work Queue Pattern

  • Email Queue: No work queue, relies on consumer group
  • Hook Queue: Uses separate work queue ({hook}:sync:work) for in-flight tracking

Acknowledgment

  • Email Queue: XACK removes from pending entries list
  • Hook Queue: LREM removes from work queue

Retry Logic

  • Email Queue: Automatic via consumer group pending entries
  • Hook Queue: Manual via Lua script (LREM + LPUSH)
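
The hook queue's list-based pattern can be modelled in memory to make the moving parts easier to follow. This is a sketch only: `ListQueue` is a hypothetical type, the two `VecDeque`s stand in for the pending list and the work list, and the push/pop directions (push on the left, move from the right) are an assumption about how LPUSH and BLMOVE are combined to get FIFO order.

```rust
use std::collections::VecDeque;

/// In-memory model of the Redis Lists work-queue pattern (hypothetical type).
#[derive(Default)]
pub struct ListQueue {
    pending: VecDeque<String>, // models the main list
    work: VecDeque<String>,    // models the work list for in-flight tasks
}

impl ListQueue {
    /// Models LPUSH onto the pending list.
    pub fn enqueue(&mut self, task: &str) {
        self.pending.push_front(task.to_string());
    }

    /// Models BLMOVE: take the oldest pending task and record it as in flight.
    pub fn dequeue(&mut self) -> Option<String> {
        let task = self.pending.pop_back()?;
        self.work.push_front(task.clone());
        Some(task)
    }

    /// Models LREM on the work list: acknowledge a finished task.
    pub fn ack(&mut self, task: &str) -> bool {
        match self.work.iter().position(|t| t.as_str() == task) {
            Some(index) => {
                self.work.remove(index);
                true
            }
            None => false,
        }
    }

    /// Models the retry script: LREM from the work list, then LPUSH back.
    pub fn retry(&mut self, task: &str) -> bool {
        if self.ack(task) {
            self.pending.push_front(task.to_string());
            true
        } else {
            false
        }
    }

    pub fn in_flight(&self) -> usize {
        self.work.len()
    }

    pub fn pending_len(&self) -> usize {
        self.pending.len()
    }
}
```

In Redis the retry step runs inside a Lua script so that the remove and the re-push happen atomically; the model above has no concurrency and so does not need that guarantee.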

Monitoring

Logs

  • NATS publish: "hook task queued to NATS"
  • Redis publish: "hook task queued to Redis"
  • NATS consume: "task dequeued from NATS"
  • Redis consume: "task dequeued"

Metrics

Add these metrics to track hook queue performance:

counter!("hook_task_queued_total", "backend" => "nats").increment(1);
counter!("hook_task_queued_total", "backend" => "redis").increment(1);
counter!("hook_task_consumed_total", "backend" => "nats").increment(1);
counter!("hook_task_consumed_total", "backend" => "redis").increment(1);

Rollback

To disable NATS and return to Redis-only:

// Producer
let sync_service = ReceiveSyncService::new(redis_pool);

// Consumer
let consumer = RedisConsumer::new(redis_pool, "{hook}".to_string(), 5);

No other code changes are required: construct the service and consumer with new() instead of with_nats().

Benefits

  1. Zero Downtime: Gradual migration with fallback
  2. No Circular Dependency: Uses injected closures (Arc<dyn Fn>) instead of a direct crate dependency
  3. Backward Compatible: Existing code works without changes
  4. Type Safe: Compile-time guarantees for integration
  5. Observable: Consistent logging for both backends

Known Limitations

NATS Acknowledgment Timing

The current implementation acks NATS messages immediately after deserialization, not after successful processing, so a task that fails during processing is not redelivered. The two backends therefore behave differently:

  • Redis: Task moves to work queue → processes → acks (removes from work queue)
  • NATS: Task received → acks immediately → processes

Future Enhancement: Store ack functions in a map keyed by task ID, then call them after successful processing. This requires refactoring the worker loop to track pending acks.
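
The enhancement described above could look roughly like the following. This is a sketch under stated assumptions, not the planned implementation: `PendingAcks` and `AckFn` are hypothetical names, and the ack closure is synchronous here, whereas the real one returns a future. On failure the ack function is dropped without being called, which leaves the message unacknowledged so that JetStream can redeliver it, up to the consumer's max-deliver limit.

```rust
use std::collections::HashMap;

/// Hypothetical synchronous stand-in for the async ack closure.
pub type AckFn = Box<dyn FnOnce() -> Result<(), String> + Send>;

/// Ack functions for tasks that were received but not yet processed.
#[derive(Default)]
pub struct PendingAcks {
    acks: HashMap<String, AckFn>,
}

impl PendingAcks {
    /// Remember the ack function when a task is received.
    pub fn track(&mut self, task_id: &str, ack: AckFn) {
        self.acks.insert(task_id.to_string(), ack);
    }

    /// Call once processing has finished.
    /// Returns Ok(true) if the message was acked, Ok(false) if it was not
    /// (processing failed or the task ID is unknown), Err if the ack failed.
    pub fn complete(&mut self, task_id: &str, succeeded: bool) -> Result<bool, String> {
        match self.acks.remove(task_id) {
            Some(ack) if succeeded => ack().map(|_| true),
            // Failed or unknown task: drop the ack function without calling it.
            _ => Ok(false),
        }
    }

    /// Number of tasks still waiting for an ack decision.
    pub fn len(&self) -> usize {
        self.acks.len()
    }
}
```

A worker loop would call `track` right after deserialization and `complete` after the handler returns, replacing the immediate ack.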

Work Queue Pattern

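NATS JetStream doesn't have a direct equivalent to Redis's work queue pattern. The current implementation relies on JetStream's built-in redelivery mechanism instead of a separate work queue. Because messages are currently acked before processing, redelivery only covers failures that happen before the ack.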

Next Steps

  1. Add NATS integration to apps/git-hook/src/main.rs
  2. Add configuration flags for queue backend selection
  3. Test dual-write mode in staging
  4. Monitor NATS consumer stability
  5. Implement proper ack-after-processing pattern
  6. Add metrics for queue depth and processing latency