# RabbitMQ Queue Ownership Architecture **Last Updated:** 2026-02-05 **Status:** Implemented ## Overview Attune uses a **service-specific infrastructure setup** pattern where each service is responsible for declaring only the queues it consumes. This provides clear ownership, reduces redundancy, and makes the system architecture more maintainable. ## Principle **Each service declares the queues it consumes.** This follows the principle that the consumer owns the queue declaration, ensuring that: - Queue configuration is co-located with the service that uses it - Services can start in any order (all operations are idempotent) - Ownership is clear from the codebase structure - Changes to queue configuration are localized to the consuming service ## Infrastructure Layers ### Common Infrastructure (Shared by All Services) **Declared by:** Any service on startup (first-to-start wins, idempotent) **Responsibility:** Ensures basic messaging infrastructure exists **Components:** - **Exchanges:** - `attune.events` (topic) - Event routing - `attune.executions` (topic) - Execution lifecycle routing - `attune.notifications` (fanout) - Real-time notifications - **Dead Letter Exchange (DLX):** - `attune.dlx` (direct) - Failed message handling - `attune.dlx.queue` - Dead letter queue bound to DLX **Setup Method:** `Connection::setup_common_infrastructure()` ### Service-Specific Infrastructure Each service declares only the queues it consumes: ## Service Responsibilities ### Executor Service **Role:** Orchestrates execution lifecycle, enforces rules, manages inquiries **Queues Owned:** - `attune.enforcements.queue` - Routing: `enforcement.#` - Purpose: Rule enforcement requests - `attune.execution.requests.queue` - Routing: `execution.requested` - Purpose: New execution requests - `attune.execution.status.queue` - Routing: `execution.status.changed` - Purpose: Execution status updates from workers - `attune.execution.completed.queue` - Routing: `execution.completed` - Purpose: Completed execution results - `attune.inquiry.responses.queue` - Routing: `inquiry.responded` - Purpose: Human-in-the-loop responses **Setup Method:** `Connection::setup_executor_infrastructure()` **Code Location:** `crates/executor/src/service.rs` ### Worker Service **Role:** Execute actions in various runtimes (shell, Python, Node.js, containers) **Queues Owned:** - `worker.{id}.executions` (per worker instance) - Routing: `execution.dispatch.worker.{id}` - Purpose: Execution tasks dispatched to this specific worker - Properties: Durable, auto-delete on disconnect **Setup Method:** `Connection::setup_worker_infrastructure(worker_id, config)` **Code Location:** `crates/worker/src/service.rs` **Notes:** - Each worker instance gets its own queue - Worker ID is assigned during registration - Queue is created after successful registration - Multiple workers can exist for load distribution ### Sensor Service **Role:** Monitor for events and generate trigger instances **Queues Owned:** - `attune.events.queue` - Routing: `#` (all events) - Purpose: Events generated by sensors and triggers **Setup Method:** `Connection::setup_sensor_infrastructure()` **Code Location:** `crates/sensor/src/service.rs` ### Notifier Service **Role:** Real-time notifications via WebSockets **Queues Owned:** - `attune.notifications.queue` - Routing: `` (fanout, no routing key) - Purpose: System notifications for WebSocket broadcasting **Setup Method:** `Connection::setup_notifier_infrastructure()` **Code Location:** `crates/notifier/src/service.rs` **Notes:** - Uses fanout exchange (broadcasts to all consumers) - Also uses PostgreSQL LISTEN/NOTIFY for database events ### API Service **Role:** HTTP gateway for client interactions **Queues Owned:** None (API only publishes, doesn't consume) **Setup Method:** `Connection::setup_common_infrastructure()` only **Code Location:** `crates/api/src/main.rs` **Notes:** - Only needs exchanges to publish messages - Does not consume from any queues - Publishes to various exchanges (events, executions, notifications) ## Queue Configuration All queues are configured with: - **Durable:** `true` (survives broker restarts) - **Exclusive:** `false` (accessible by multiple connections) - **Auto-delete:** `false` (persist even without consumers) - **Dead Letter Exchange:** `attune.dlx` (enabled by default) Exception: - Worker-specific queues may have different settings based on worker lifecycle ## Message Flow Examples ### Rule Enforcement Flow ``` Event Created → `attune.events` exchange → `attune.events.queue` (consumed by Executor) → Rule evaluation → `enforcement.created` published to `attune.executions` → `attune.enforcements.queue` (consumed by Executor) ``` ### Execution Flow ``` Execution Requested (from API) → `attune.executions` exchange (routing: execution.requested) → `attune.execution.requests.queue` (consumed by Executor/Scheduler) → Executor dispatches to worker → `execution.dispatch.worker.{id}` to `attune.executions` → `worker.{id}.executions` (consumed by Worker) → Worker executes action → `execution.completed` to `attune.executions` → `attune.execution.completed.queue` (consumed by Executor) ``` ### Notification Flow ``` System Event Occurs → `attune.notifications` exchange (fanout) → `attune.notifications.queue` (consumed by Notifier) → WebSocket broadcast to connected clients ``` ## Implementation Details ### Setup Call Order Each service follows this pattern: ```rust // 1. Connect to RabbitMQ let mq_connection = Connection::connect(mq_url).await?; // 2. Setup common infrastructure (exchanges, DLX) mq_connection.setup_common_infrastructure(&config).await?; // 3. Setup service-specific queues mq_connection.setup_SERVICE_infrastructure(&config).await?; // (where SERVICE is executor, worker, sensor, or notifier) ``` ### Idempotency All setup operations are idempotent: - Declaring an existing exchange/queue with the same settings succeeds - Multiple services can call `setup_common_infrastructure()` safely - Services can start in any order ### Error Handling Setup failures are logged but not fatal: - Queues may already exist from previous runs - Another service may have created the infrastructure - Only actual consumption failures should stop the service ## Startup Sequence **Typical Docker Compose Startup:** 1. **PostgreSQL** - Starts first (dependency) 2. **RabbitMQ** - Starts first (dependency) 3. **Migrations** - Runs database migrations 4. **Services start in parallel:** - **API** - Creates common infrastructure - **Executor** - Creates common + executor infrastructure - **Workers** - Each creates common + worker-specific queue - **Sensor** - Creates common + sensor infrastructure - **Notifier** - Creates common + notifier infrastructure The first service to start creates the common infrastructure. All subsequent services find it already exists and proceed. ## Benefits ✅ **Clear Ownership** - Code inspection shows which service owns which queue ✅ **Reduced Redundancy** - Each queue declared exactly once (per service type) ✅ **Better Debugging** - Queue issues isolated to specific services ✅ **Improved Maintainability** - Changes to queue config localized ✅ **Self-Documenting** - Code structure reflects system architecture ✅ **Order Independence** - Services can start in any order ✅ **Monitoring** - Can track which service created infrastructure ## Monitoring and Verification ### RabbitMQ Management UI Access at `http://localhost:15672` (credentials: `guest`/`guest`) **Expected Queues:** - `attune.dlx.queue` - Dead letter queue - `attune.events.queue` - Events (Sensor) - `attune.enforcements.queue` - Enforcements (Executor) - `attune.execution.requests.queue` - Execution requests (Executor) - `attune.execution.status.queue` - Status updates (Executor) - `attune.execution.completed.queue` - Completions (Executor) - `attune.inquiry.responses.queue` - Inquiry responses (Executor) - `attune.notifications.queue` - Notifications (Notifier) - `worker.{id}.executions` - Worker queues (one per worker) ### Verification Commands ```bash # Check which queues exist docker compose exec rabbitmq rabbitmqctl list_queues name messages # Check queue bindings docker compose exec rabbitmq rabbitmqctl list_bindings # Check who's consuming from queues docker compose exec rabbitmq rabbitmqctl list_consumers ``` ### Log Verification Each service logs its infrastructure setup: ```bash # API (common only) docker compose logs api | grep "infrastructure setup" # Executor (common + executor) docker compose logs executor | grep "infrastructure setup" # Workers (common + worker-specific) docker compose logs worker-shell | grep "infrastructure setup" # Sensor (common + sensor) docker compose logs sensor | grep "infrastructure setup" ``` ## Troubleshooting ### Queue Already Exists with Different Settings **Error:** `PRECONDITION_FAILED - inequivalent arg 'durable' for queue...` **Cause:** Queue exists with different configuration than code expects **Solution:** ```bash # Stop services docker compose down # Remove RabbitMQ volume to clear all queues docker volume rm attune_rabbitmq_data # Restart services docker compose up -d ``` ### Service Can't Connect to RabbitMQ **Check:** Is RabbitMQ healthy? ```bash docker compose ps rabbitmq ``` **Check:** RabbitMQ logs for errors ```bash docker compose logs rabbitmq ``` ### Messages Not Being Consumed 1. **Check queue has consumers:** ```bash docker compose exec rabbitmq rabbitmqctl list_consumers ``` 2. **Check service is running:** ```bash docker compose ps ``` 3. **Check service logs for consumer startup:** ```bash docker compose logs | grep "Consumer started" ``` ## Migration from Old Architecture **Previous Behavior:** All services called `setup_infrastructure()` which created ALL queues **New Behavior:** Each service calls its specific setup method **Migration Steps:** 1. Update to latest code 2. Stop all services: `docker compose down` 3. Clear RabbitMQ volume: `docker volume rm attune_rabbitmq_data` 4. Start services: `docker compose up -d` No data loss occurs as message queues are transient infrastructure. ## Related Documentation - [Queue Architecture](queue-architecture.md) - Overall queue design - [RabbitMQ Queues Quick Reference](../../QUICKREF-rabbitmq-queues.md) - [Executor Service](executor-service.md) - [Worker Service](worker-service.md) - [Sensor Service](sensor-service.md) ## Change History | Date | Change | Author | |------|--------|--------| | 2026-02-05 | Initial implementation of service-specific queue ownership | AI Assistant |