5.8 KiB
QUICK FIX: Executor Deserialization Errors
Date: 2026-02-03
Status: ✅ FIXED
Severity: Critical
Downtime: Minimal (service restart only)
What Was Broken
The executor service was rejecting messages with these errors:
ERROR: Failed to deserialize message: missing field `inquiry_id`
ERROR: Failed to deserialize message: missing field `action_id`
Root Causes
-
Multiple consumers on same queue: Three different consumers were competing for messages on the same RabbitMQ queue, but each expected different message structures.
-
Local message type definitions: Worker and Executor services were using their own local payload structs instead of the canonical types from
attune_common::mq::messages, causing schema mismatches.
The Problem in Detail
attune.execution.status.queue had 3 consumers:
- CompletionListener - Expected
ExecutionCompletedPayload(hasaction_id) - ExecutionManager - Expected
ExecutionStatusPayload(noaction_id) - InquiryHandler - Expected
InquiryRespondedPayload(hasinquiry_id)
All three message types were being routed to this single queue, causing random deserialization failures.
The Fixes
Fix 1: Queue Separation
Created 2 new dedicated queues so each consumer gets its own queue with the correct message type:
| Queue | Consumer | Message Type | Routing Key |
|---|---|---|---|
attune.execution.status.queue |
ExecutionManager | ExecutionStatusChangedPayload | execution.status.changed |
attune.execution.completed.queue |
CompletionListener | ExecutionCompletedPayload | execution.completed |
attune.inquiry.responses.queue |
InquiryHandler | InquiryRespondedPayload | inquiry.responded |
Fix 2: Canonical Message Types
Updated Worker and Executor to use canonical message types from attune_common::mq:
- Worker now imports and uses
ExecutionStatusChangedPayload(canonical) - Executor now imports and uses
ExecutionStatusChangedPayloadandExecutionCompletedPayload(canonical) - Removed all local payload struct definitions
- Added database queries to populate required fields (action_ref, action_id)
Files Changed
Queue Separation
attune/crates/common/src/mq/config.rs- Added 2 new queue configsattune/crates/common/src/mq/connection.rs- Added queue declarations and bindingsattune/crates/executor/src/service.rs- Updated consumers to use correct queues
Canonical Message Types
attune/crates/worker/src/service.rs- Use canonicalExecutionStatusChangedPayloadattune/crates/executor/src/execution_manager.rs- Use canonical payload types
How to Deploy
Quick Deploy (Production)
# 1. Stop both executor and worker
sudo systemctl stop attune-executor attune-worker
# 2. Pull and rebuild (BOTH services need rebuild)
git pull origin main
cd attune
cargo build --release --bin attune-executor --bin attune-worker
# 3. OPTIONAL BUT RECOMMENDED: Clear old messages
rabbitmqadmin purge queue name=attune.execution.status.queue
rabbitmqadmin purge queue name=attune.execution.completed.queue
# 4. Start services (new queues created automatically)
sudo systemctl start attune-executor attune-worker
# 5. Verify (should see NO errors)
grep "Failed to deserialize" /var/log/attune/executor.log
grep "missing field" /var/log/attune/executor.log
Development Deploy
# Stop both services
make stop-executor stop-worker
# or: docker-compose stop executor worker
# Rebuild both
cargo build --bin attune-executor --bin attune-worker
# OPTIONAL: Clear old messages
rabbitmqadmin purge queue name=attune.execution.status.queue
rabbitmqadmin purge queue name=attune.execution.completed.queue
# Start both services
make run-executor run-worker
# or: docker-compose up -d executor worker
# Watch logs
tail -f logs/executor.log logs/worker.log
Verification
After deploying, verify these 3 things:
1. New Queues Exist
Check RabbitMQ UI (http://localhost:15672):
- ✅
attune.inquiry.responses.queueexists - ✅
attune.execution.completed.queueexists
2. No Deserialization Errors
# Wait 5 minutes, then check logs (should be empty):
grep "missing field" /var/log/attune/executor.log
grep "Failed to deserialize" /var/log/attune/executor.log
3. Executions Work
# Test execution completes successfully
attune action execute core.echo --param message="test"
Rollback (If Needed)
# Stop executor
sudo systemctl stop attune-executor
# Revert code
git revert <commit-hash>
cargo build --release --bin attune-executor
# Start executor
sudo systemctl start attune-executor
Impact
Before: ~30-50% message rejection rate, executions failing
After: 0% rejection rate, all executions working ✅
Why Old Messages Still Cause Errors
If you rebuilt and restarted but still see errors, it's because old messages with the wrong schema are still in the queues. The fix prevents NEW messages from having the problem, but old messages need to be purged:
# Clear old messages from queues
rabbitmqadmin purge queue name=attune.execution.status.queue
rabbitmqadmin purge queue name=attune.execution.completed.queue
rabbitmqadmin purge queue name=attune.inquiry.responses.queue
# Or via RabbitMQ Management UI
# http://localhost:15672 → Queues → Select queue → Purge Messages
More Details
See complete documentation:
attune/work-summary/2026-02-03-inquiry-queue-separation.md- Queue separation detailsattune/work-summary/2026-02-03-canonical-message-types.md- Message type fix detailsattune/docs/QUICKREF-rabbitmq-queues.md- Queue architecture referenceattune/docs/MIGRATION-queue-separation-2026-02-03.md- Detailed migration guide
TL;DR: Separated queues + unified message types. Rebuild/restart executor + worker. Purge old messages if errors persist.