Work Summary: Inquiry Queue Separation Fix
Date: 2026-02-03
Issues:
- Executor deserialization error: "missing field inquiry_id"
- Executor deserialization error: "missing field action_id"
Status: ✅ Both Fixed
Visual Overview
Before Fix ❌
attune.execution.status.queue
├─ Consumer: CompletionListener (expects ExecutionCompletedPayload)
├─ Consumer: ExecutionManager (expects ExecutionStatusPayload)
└─ Consumer: InquiryHandler (expects InquiryRespondedPayload)
Incoming Messages:
- execution.completed → ExecutionCompletedPayload
- execution.status.changed → ExecutionStatusChangedPayload
- inquiry.responded → InquiryRespondedPayload
Problem: Round-robin distribution delivers messages to consumers that expect a different payload type!
After Fix ✅
attune.execution.completed.queue
└─ Consumer: CompletionListener (expects ExecutionCompletedPayload)
   └─ Message: execution.completed → ExecutionCompletedPayload ✓
attune.execution.status.queue
└─ Consumer: ExecutionManager (expects ExecutionStatusPayload)
   └─ Message: execution.status.changed → ExecutionStatusChangedPayload ✓
attune.inquiry.responses.queue
└─ Consumer: InquiryHandler (expects InquiryRespondedPayload)
   └─ Message: inquiry.responded → InquiryRespondedPayload ✓
Result: Each queue has ONE consumer expecting ONE message type!
Problem Description
The executor service was logging deserialization errors when processing messages from the execution_status queue:
ERROR ThreadId(13) crates/common/src/mq/consumer.rs:112: Failed to deserialize message: missing field `inquiry_id` at line 1 column 318. Rejecting message.
Root Cause Analysis
The issue was caused by two different consumers listening to the same RabbitMQ queue but expecting different message payload types:
Queue Configuration Issue
The execution_status queue (attune.execution.status.queue) was bound to the attune.executions exchange with routing key "execution.status.changed", but it was receiving messages with two different routing keys:
- execution.completed → ExecutionCompletedPayload (published by Worker service)
- inquiry.responded → InquiryRespondedPayload (published by API service)
Competing Consumers
Two consumers were configured to read from the same execution_status queue:
- CompletionListener (executor.completion tag)
  - Expected: ExecutionCompletedPayload
  - Fields: execution_id, action_id, action_ref, status, result, completed_at
- InquiryHandler (executor.inquiry tag)
  - Expected: InquiryRespondedPayload
  - Fields: inquiry_id, execution_id, response, responded_by, responded_at
Message Routing Behavior
RabbitMQ distributes messages to consumers on the same queue using round-robin load balancing. This meant:
- When an InquiryRespondedPayload was delivered to CompletionListener → deserialization failed (missing action_id, which CompletionListener requires and the inquiry message lacks)
- When an ExecutionCompletedPayload was delivered to InquiryHandler → deserialization failed (missing inquiry_id, which InquiryHandler requires and the completion message lacks)
The error message specifically mentioned inquiry_id because InquiryHandler tried to deserialize an execution completion message, which carries no inquiry_id field.
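The round-robin behavior described above can be illustrated with a small simulation. This is a minimal sketch, not the executor's code: PayloadKind, Consumer, and dispatch_round_robin are hypothetical names, and real RabbitMQ dispatch also depends on prefetch and acknowledgement settings, which are ignored here.

```rust
/// The two payload kinds that ended up on the shared queue.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PayloadKind {
    ExecutionCompleted,
    InquiryResponded,
}

/// A consumer that can only deserialize one payload kind (hypothetical model).
struct Consumer {
    name: &'static str,
    expects: PayloadKind,
}

/// Hand messages to consumers in strict rotation and collect every delivery
/// where the payload kind does not match what the consumer expects.
/// Assumes `consumers` is non-empty.
fn dispatch_round_robin(consumers: &[Consumer], messages: &[PayloadKind]) -> Vec<String> {
    messages
        .iter()
        .enumerate()
        .filter_map(|(i, msg)| {
            let consumer = &consumers[i % consumers.len()];
            if consumer.expects == *msg {
                None
            } else {
                Some(format!("{} rejected {:?}", consumer.name, msg))
            }
        })
        .collect()
}

fn main() {
    let consumers = [
        Consumer { name: "CompletionListener", expects: PayloadKind::ExecutionCompleted },
        Consumer { name: "InquiryHandler", expects: PayloadKind::InquiryResponded },
    ];
    let messages = [
        PayloadKind::ExecutionCompleted,
        PayloadKind::ExecutionCompleted,
        PayloadKind::InquiryResponded,
        PayloadKind::InquiryResponded,
    ];
    for line in dispatch_round_robin(&consumers, &messages) {
        println!("{line}");
    }
}
```

With this message order, two of the four deliveries land on the wrong consumer. A different order could produce zero or four mismatches, which is consistent with the errors appearing only intermittently.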
Solution Implemented
1. Created Separate Queue for Inquiry Responses
File: attune/crates/common/src/mq/config.rs
Added a new queue configuration:
pub struct QueuesConfig {
    // ... existing queues ...
    /// Inquiry responses queue configuration
    pub inquiry_responses: QueueConfig,
}
Default configurations:
execution_completed: QueueConfig {
    name: "attune.execution.completed.queue".to_string(),
    durable: true,
    exclusive: false,
    auto_delete: false,
},
inquiry_responses: QueueConfig {
    name: "attune.inquiry.responses.queue".to_string(),
    durable: true,
    exclusive: false,
    auto_delete: false,
}
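To show how the two fragments above might fit together, here is a self-contained sketch. It assumes QueueConfig has exactly the four fields listed in the defaults, models only the two new queues (the real QueuesConfig holds more), and uses a hypothetical helper named durable_shared that is not part of the codebase.

```rust
/// Settings for a single queue; the four fields mirror the defaults listed above.
#[derive(Debug, Clone, PartialEq)]
pub struct QueueConfig {
    pub name: String,
    pub durable: bool,
    pub exclusive: bool,
    pub auto_delete: bool,
}

impl QueueConfig {
    /// Hypothetical helper: a durable, non-exclusive queue that is not auto-deleted.
    fn durable_shared(name: &str) -> Self {
        Self {
            name: name.to_string(),
            durable: true,
            exclusive: false,
            auto_delete: false,
        }
    }
}

/// Only the two new queues are modelled here; existing queues are omitted.
#[derive(Debug, Clone, PartialEq)]
pub struct QueuesConfig {
    pub execution_completed: QueueConfig,
    pub inquiry_responses: QueueConfig,
}

impl Default for QueuesConfig {
    fn default() -> Self {
        Self {
            execution_completed: QueueConfig::durable_shared("attune.execution.completed.queue"),
            inquiry_responses: QueueConfig::durable_shared("attune.inquiry.responses.queue"),
        }
    }
}

fn main() {
    let queues = QueuesConfig::default();
    println!("{}", queues.execution_completed.name);
    println!("{}", queues.inquiry_responses.name);
}
```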
2. Updated Infrastructure Setup
File: attune/crates/common/src/mq/connection.rs
Added queue declarations and bindings in setup_infrastructure():
// Declare the new queues with DLX support
self.declare_queue_with_dlx(&config.rabbitmq.queues.execution_completed, dlx).await?;
self.declare_queue_with_dlx(&config.rabbitmq.queues.inquiry_responses, dlx).await?;

// Bind execution_status queue to status changed messages for ExecutionManager
self.bind_queue(
    &config.rabbitmq.queues.execution_status.name,
    &config.rabbitmq.exchanges.executions.name,
    "execution.status.changed",
)
.await?;

// Bind execution_completed queue to completed messages for CompletionListener
self.bind_queue(
    &config.rabbitmq.queues.execution_completed.name,
    &config.rabbitmq.exchanges.executions.name,
    "execution.completed",
)
.await?;

// Bind inquiry_responses queue to inquiry responded messages for InquiryHandler
self.bind_queue(
    &config.rabbitmq.queues.inquiry_responses.name,
    &config.rabbitmq.exchanges.executions.name,
    "inquiry.responded",
)
.await?;
3. Updated Executor Service Configuration
File: attune/crates/executor/src/service.rs
Changed InquiryHandler and CompletionListener to consume from dedicated queues:
// InquiryHandler - Before:
let inquiry_response_queue = self.inner.mq_config.rabbitmq.queues.execution_status.name.clone();
// InquiryHandler - After:
let inquiry_response_queue = self.inner.mq_config.rabbitmq.queues.inquiry_responses.name.clone();
// CompletionListener - Before:
let execution_completed_queue = self.inner.mq_config.rabbitmq.queues.execution_status.name.clone();
// CompletionListener - After:
let execution_completed_queue = self.inner.mq_config.rabbitmq.queues.execution_completed.name.clone();
Message Flow After Fix
Execution Completion Flow
Worker → publishes ExecutionCompletedPayload
→ routing key: "execution.completed"
→ exchange: "attune.executions"
→ queue: "attune.execution.completed.queue"
→ consumer: CompletionListener
✅ Correct payload type received
Execution Status Change Flow
Worker → publishes ExecutionStatusChangedPayload
→ routing key: "execution.status.changed"
→ exchange: "attune.executions"
→ queue: "attune.execution.status.queue"
→ consumer: ExecutionManager
✅ Correct payload type received
Inquiry Response Flow
API → publishes InquiryRespondedPayload
→ routing key: "inquiry.responded"
→ exchange: "attune.executions"
→ queue: "attune.inquiry.responses.queue"
→ consumer: InquiryHandler
✅ Correct payload type received
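The three flows above amount to a small routing table. The sketch below models it with exact-match lookups. It is an illustration only: BINDINGS and queues_for are hypothetical names, and wildcard patterns that a topic exchange would also accept are deliberately not handled because none of these bindings use them.

```rust
/// Routing key → queue bindings on the attune.executions exchange after the fix.
const BINDINGS: [(&str, &str); 3] = [
    ("execution.status.changed", "attune.execution.status.queue"),
    ("execution.completed", "attune.execution.completed.queue"),
    ("inquiry.responded", "attune.inquiry.responses.queue"),
];

/// Return every queue bound to `routing_key` (exact match only, no wildcards).
fn queues_for(routing_key: &str) -> Vec<&'static str> {
    BINDINGS
        .iter()
        .filter(|(key, _)| *key == routing_key)
        .map(|(_, queue)| *queue)
        .collect()
}

fn main() {
    for key in ["execution.completed", "execution.status.changed", "inquiry.responded"] {
        println!("{key} -> {:?}", queues_for(key));
    }
}
```

Each routing key resolves to exactly one queue, so no consumer sees a payload type it was not written for.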
Benefits
- Type Safety: Each queue receives only one message type, eliminating deserialization errors
- Scalability: Can scale CompletionListener, ExecutionManager, and InquiryHandler independently
- Maintainability: Clear separation of concerns - each queue has a single purpose
- Reliability: No message rejection due to type mismatches
- Performance: No wasted processing from consumers receiving wrong message types
Queue Separation Summary
After both fixes, we now have three dedicated queues for execution-related messages:
| Queue | Routing Key | Message Type | Consumer |
|---|---|---|---|
| attune.execution.status.queue | execution.status.changed | ExecutionStatusChangedPayload | ExecutionManager |
| attune.execution.completed.queue | execution.completed | ExecutionCompletedPayload | CompletionListener |
| attune.inquiry.responses.queue | inquiry.responded | InquiryRespondedPayload | InquiryHandler |
Result: Each queue now has exactly one consumer expecting exactly one message type. ✅
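One way to guard this invariant against regressions is a small check over the consumer-to-queue assignments. The following is a sketch under assumptions, not something present in the codebase: shared_queues is a hypothetical function, and the assignments are passed in as plain (consumer, queue) pairs rather than read from the real service configuration.

```rust
use std::collections::BTreeMap;

/// Return the queues that have more than one consumer attached.
/// `assignments` is a list of (consumer name, queue name) pairs.
fn shared_queues<'a>(assignments: &[(&'a str, &'a str)]) -> Vec<&'a str> {
    let mut consumers_by_queue: BTreeMap<&'a str, Vec<&'a str>> = BTreeMap::new();
    for (consumer, queue) in assignments {
        consumers_by_queue.entry(*queue).or_default().push(*consumer);
    }
    consumers_by_queue
        .into_iter()
        .filter(|(_, consumers)| consumers.len() > 1)
        .map(|(queue, _)| queue)
        .collect()
}

fn main() {
    // Topology before the fix: all three consumers on one queue.
    let before = [
        ("CompletionListener", "attune.execution.status.queue"),
        ("ExecutionManager", "attune.execution.status.queue"),
        ("InquiryHandler", "attune.execution.status.queue"),
    ];
    // Topology after the fix: one consumer per queue.
    let after = [
        ("CompletionListener", "attune.execution.completed.queue"),
        ("ExecutionManager", "attune.execution.status.queue"),
        ("InquiryHandler", "attune.inquiry.responses.queue"),
    ];
    println!("before: {:?}", shared_queues(&before));
    println!("after:  {:?}", shared_queues(&after));
}
```

A check like this treats any shared queue as a conflict. If several instances of the same consumer type are run for scaling, the pairs would need to be deduplicated by consumer type first.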
Testing Recommendations
- Restart all services to recreate the queue infrastructure with new bindings
- Verify queue creation in RabbitMQ management UI:
  - Check that attune.inquiry.responses.queue exists
  - Check that attune.execution.completed.queue exists
  - Verify bindings on attune.executions exchange:
    - inquiry.responded → attune.inquiry.responses.queue
    - execution.completed → attune.execution.completed.queue
    - execution.status.changed → attune.execution.status.queue
- Monitor executor logs for absence of deserialization errors (inquiry_id and action_id)
- Test inquiry workflow:
  - Create an action that requests inquiry (__inquiry in result)
  - Respond to inquiry via API
  - Verify execution resumes correctly
- Test execution completion:
  - Execute a simple action
  - Verify completion notification processed without errors
Files Modified
- attune/crates/common/src/mq/config.rs - Added inquiry_responses and execution_completed queues
- attune/crates/common/src/mq/connection.rs - Added queue declarations and bindings
- attune/crates/executor/src/service.rs - Updated InquiryHandler and CompletionListener to use new queues
Migration Notes
This is a breaking change for existing deployments:
- Two new queues will be created automatically on service startup:
  - attune.inquiry.responses.queue
  - attune.execution.completed.queue
- The execution_status queue now has only one binding (execution.status.changed)
- Existing messages in queues are unaffected
- No database migrations required
- Action Required: Restart executor service to apply changes
Related Issues
- Original implementation assumed a single queue could handle multiple message types
- RabbitMQ round-robin distribution caused non-deterministic deserialization failures
- Errors were intermittent because they depended on which consumer received which message
- ExecutionManager uses a local payload struct instead of the canonical ExecutionStatusChangedPayload (not critical, but should be unified in future)
Lessons Learned
- One queue, one message type: RabbitMQ queues should have a single message schema
- One queue, one consumer type: Multiple consumers with different payload expectations on the same queue create competition, not cooperation
- Use routing keys effectively: Topic exchanges with specific routing keys provide better message segregation
- Consumer tag awareness: Consumer tags don't prevent round-robin distribution within the same queue
- Type-safe patterns: Rust's strong typing revealed the issue quickly through deserialization errors
- Canonical message types: Use shared message structs from attune_common::mq::messages, not local definitions
- Incremental fixes: Sometimes you discover deeper issues while fixing surface-level problems - fix them all at once
- Test thoroughly: Restart services and monitor logs to catch related issues before they reach production