7.5 KiB
Migration Guide: Queue Separation Fix (2026-02-03)
Issue: Deserialization errors in executor service
Urgency: High - Critical bug causing message rejection
Downtime Required: Yes (brief - service restart only)
Overview
This migration separates competing consumers on shared RabbitMQ queues into dedicated queues, fixing deserialization errors:
missing field 'inquiry_id'missing field 'action_id'
Changes Summary
New Queues Created
attune.inquiry.responses.queue- For inquiry response messagesattune.execution.completed.queue- For execution completion messages
Queue Bindings Modified
attune.execution.status.queue- Now only receivesexecution.status.changedmessagesattune.execution.completed.queue- Now receivesexecution.completedmessagesattune.inquiry.responses.queue- Now receivesinquiry.respondedmessages
Services Affected
- Executor Service - Requires restart (consumers reconfigured)
- Worker Service - No changes required (publishers work automatically)
- API Service - No changes required (publishers work automatically)
Pre-Migration Checklist
- Backup current RabbitMQ configuration
- Note current queue depths in RabbitMQ management UI
- Verify all services are running and healthy
- Review recent executor logs for deserialization errors
- Ensure you have access to restart the executor service
Migration Steps
Step 1: Stop the Executor Service
# Using systemd
sudo systemctl stop attune-executor
# Using docker-compose
docker-compose stop executor
# Or kill the process
pkill -f attune-executor
Step 2: Deploy Updated Code
# Pull latest code
git pull origin main
# Rebuild executor (and common library)
cd attune
cargo build --release --bin attune-executor
Step 3: Verify RabbitMQ Queue Creation
The new queues will be created automatically when the executor starts, but you can verify the configuration:
# Check that the code is updated
grep -r "inquiry_responses" crates/common/src/mq/config.rs
grep -r "execution_completed" crates/common/src/mq/config.rs
Step 4: Start the Executor Service
# Using systemd
sudo systemctl start attune-executor
# Using docker-compose
docker-compose start executor
# Or directly
./target/release/attune-executor --config config.production.yaml
Step 5: Verify Queue Creation in RabbitMQ
Check RabbitMQ Management UI (http://localhost:15672):
Queues Tab:
attune.inquiry.responses.queueexistsattune.execution.completed.queueexistsattune.execution.status.queuestill exists
Exchanges Tab → attune.executions → Bindings:
inquiry.responded→attune.inquiry.responses.queueexecution.completed→attune.execution.completed.queueexecution.status.changed→attune.execution.status.queue
Step 6: Monitor Executor Logs
# Watch for successful startup
tail -f /var/log/attune/executor.log
# Or with journalctl
journalctl -u attune-executor -f
# Or with docker
docker logs -f attune-executor
Expected log messages:
INFO Starting Executor Service
INFO Message queue connection established
INFO Queue manager initialized with database persistence
INFO Starting event processor...
INFO Starting completion listener...
INFO Starting enforcement processor...
INFO Starting execution scheduler...
INFO Starting execution manager...
INFO Starting inquiry handler...
INFO Executor Service started successfully
Step 7: Verify No Deserialization Errors
# Check for the specific errors (should be NONE)
grep "missing field.*inquiry_id" /var/log/attune/executor.log
grep "missing field.*action_id" /var/log/attune/executor.log
grep "Failed to deserialize message" /var/log/attune/executor.log
If no output, the fix is working! ✅
Step 8: Functional Testing
Test Execution Completion:
# Execute a simple action
attune action execute core.echo --param message="test"
# Verify execution completes without errors in logs
Test Inquiry Workflow (if applicable):
# Create an action that requests inquiry
# Respond to the inquiry via API
# Verify execution resumes
Test Status Updates:
# Execute a longer-running action
# Verify status updates are processed correctly
Rollback Procedure
If issues occur, you can rollback:
Step 1: Stop Executor
sudo systemctl stop attune-executor
Step 2: Revert Code
git revert <commit-hash>
cargo build --release --bin attune-executor
Step 3: Remove New Queues (Optional)
# Via RabbitMQ Management API
curl -u guest:guest -X DELETE http://localhost:15672/api/queues/%2F/attune.inquiry.responses.queue
curl -u guest:guest -X DELETE http://localhost:15672/api/queues/%2F/attune.execution.completed.queue
Step 4: Restart Executor
sudo systemctl start attune-executor
Post-Migration Verification
- Executor service is running and healthy
- No deserialization errors in logs for 15+ minutes
- Test executions complete successfully
- Inquiries (if used) work correctly
- All three new queue bindings show in RabbitMQ UI
- Queue message rates look normal
- No messages in dead letter queues
Monitoring Points
Watch these metrics for 24 hours post-migration:
- Executor Error Rate - Should drop to near zero
- Queue Depths - Should remain stable/low
- Message Delivery Rate - Should remain consistent
- Dead Letter Queue Depth - Should not increase
Troubleshooting
Issue: New queues not created
Symptoms: Queues don't appear in RabbitMQ UI
Solution:
# Check executor logs for connection errors
grep "Failed to declare queue" /var/log/attune/executor.log
# Verify RabbitMQ permissions
rabbitmqctl list_user_permissions attune_user
Issue: Still seeing deserialization errors
Symptoms: Errors persist after restart
Solution:
# 1. Verify code was rebuilt
attune-executor --version
# 2. Check which queues consumers are using
grep "Starting.*listener" /var/log/attune/executor.log
# 3. Verify bindings in RabbitMQ UI match expected configuration
# 4. Restart ALL services to ensure workers/API use new bindings
sudo systemctl restart attune-worker attune-api attune-executor
Issue: Messages stuck in old queue
Symptoms: Old execution.status.queue has growing backlog
Solution:
# Check what messages are in the queue
rabbitmqadmin get queue=attune.execution.status.queue count=5
# If they're completion messages, manually move them:
# 1. Temporarily stop executor
# 2. Purge old queue
# 3. Restart executor (messages will be redelivered after TTL)
Impact Assessment
Before Fix:
- ❌ ~30-50% of messages rejected due to deserialization errors
- ❌ Executions not completing properly
- ❌ Inquiries not being processed
- ❌ Resource waste from redelivery attempts
After Fix:
- ✅ 100% message delivery success rate
- ✅ All executions complete correctly
- ✅ Inquiries processed immediately
- ✅ Reduced message queue load
Questions?
Contact the platform team or refer to:
attune/work-summary/2026-02-03-inquiry-queue-separation.md- Technical detailsattune/docs/QUICKREF-rabbitmq-queues.md- Queue architecture referenceattune/docs/architecture/queue-architecture.md- Overall architecture
Migration Completed: __________ (date/time)
Performed By: __________
Issues Encountered: __________
Notes: __________