Quick Reference: Execution State Ownership
Last Updated: 2026-02-09
Ownership Model at a Glance
┌──────────────────────────────────────────────────────────┐
│ EXECUTOR OWNS                 │ WORKER OWNS              │
│ Requested                     │ Running                  │
│ Scheduling                    │ Completed                │
│ Scheduled                     │ Failed                   │
│ (+ pre-handoff Cancelled)     │ (+ post-handoff          │
│                               │    Cancelled/Timeout/    │
│                               │    Abandoned)            │
└───────────────────────────────┴──────────────────────────┘
                │                          │
                └─────── HANDOFF ──────────┘
                execution.scheduled PUBLISHED
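The ownership split in the diagram can be expressed as a small lookup. The following is a minimal sketch, not code from the repository: the `Owner` type and `owner_for_write` function are hypothetical, and the `ExecutionStatus` variants are assumed from the names shown in the diagram.

```rust
/// Statuses named in the diagram (variant list is an assumption).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExecutionStatus {
    Requested,
    Scheduling,
    Scheduled,
    Running,
    Completed,
    Failed,
    Cancelled,
    Timeout,
    Abandoned,
}

/// Hypothetical marker for the service that owns the execution record.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Owner {
    Executor,
    Worker,
}

/// Returns the service allowed to write `status` to the database.
/// `handoff_published` is true once `execution.scheduled` has been published.
pub fn owner_for_write(status: ExecutionStatus, handoff_published: bool) -> Owner {
    use ExecutionStatus::*;
    match status {
        // Pre-handoff states always belong to the executor.
        Requested | Scheduling | Scheduled => Owner::Executor,
        // Post-handoff states always belong to the worker.
        Running | Completed | Failed | Timeout | Abandoned => Owner::Worker,
        // Cancelled is the only status whose owner depends on the handoff.
        Cancelled => {
            if handoff_published {
                Owner::Worker
            } else {
                Owner::Executor
            }
        }
    }
}
```

Only `Cancelled` needs the handoff flag; every other status has a fixed owner.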
Who Updates the Database?
Executor Updates (Pre-Handoff Only)
- ✅ Creates execution record
- ✅ Updates status: Requested → Scheduling → Scheduled
- ✅ Publishes execution.scheduled message ← HANDOFF POINT
- ✅ Handles cancellations/failures BEFORE handoff (worker never notified)
- ❌ NEVER updates after execution.scheduled is published
Worker Updates (Post-Handoff Only)
- ✅ Receives execution.scheduled message (takes ownership)
- ✅ Updates status: Scheduled → Running
- ✅ Updates status: Running → Completed/Failed/Cancelled/etc.
- ✅ Handles cancellations/failures AFTER handoff
- ✅ Updates result data
- ✅ Writes for every status change after receiving handoff
Who Publishes Messages?
Executor Publishes
- enforcement.created (from rules)
- execution.requested (to scheduler)
- execution.scheduled (to worker) ← HANDOFF MESSAGE - OWNERSHIP TRANSFER
Worker Publishes
- execution.status_changed (for each status change after handoff)
- execution.completed (when done)
Executor Receives (But Doesn't Update DB Post-Handoff)
- execution.status_changed → triggers orchestration logic (read-only)
- execution.completed → releases queue slots
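The publisher assignments above can be captured in one table-like type. This is a sketch under assumptions: the `Message` and `Service` enums and their methods are hypothetical names for illustration; only the message names and who publishes them come from the lists above.

```rust
/// Hypothetical identifier for the two services.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Service {
    Executor,
    Worker,
}

/// The five message types listed in this section.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Message {
    EnforcementCreated,
    ExecutionRequested,
    ExecutionScheduled,
    ExecutionStatusChanged,
    ExecutionCompleted,
}

impl Message {
    /// The message name as written in this document.
    pub fn name(self) -> &'static str {
        match self {
            Message::EnforcementCreated => "enforcement.created",
            Message::ExecutionRequested => "execution.requested",
            Message::ExecutionScheduled => "execution.scheduled",
            Message::ExecutionStatusChanged => "execution.status_changed",
            Message::ExecutionCompleted => "execution.completed",
        }
    }

    /// The only service that should publish this message.
    pub fn publisher(self) -> Service {
        match self {
            Message::EnforcementCreated
            | Message::ExecutionRequested
            | Message::ExecutionScheduled => Service::Executor,
            Message::ExecutionStatusChanged | Message::ExecutionCompleted => Service::Worker,
        }
    }

    /// True only for the handoff message.
    pub fn transfers_ownership(self) -> bool {
        matches!(self, Message::ExecutionScheduled)
    }
}
```

A check such as `msg.publisher() == Service::Worker` before publishing execution.completed is one way a service could guard against the duplicate-completion case described under Troubleshooting.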
Code Locations
Executor Updates DB
// crates/executor/src/scheduler.rs
execution.status = ExecutionStatus::Scheduled;
ExecutionRepository::update(pool, execution.id, execution.into()).await?;
Worker Updates DB
// crates/worker/src/executor.rs
self.update_execution_status(execution_id, ExecutionStatus::Running).await?;
// ...
ExecutionRepository::update(&self.pool, execution_id, input).await?;
Executor Orchestrates (Read-Only)
// crates/executor/src/execution_manager.rs
async fn process_status_change(...) -> Result<()> {
    // Read the current state; the worker owns all writes after handoff
    let execution = ExecutionRepository::find_by_id(pool, execution_id).await?;
    // NO UPDATE - just orchestration logic
    Self::handle_completion(pool, publisher, &execution).await?;
    Ok(())
}
Decision Tree: Should I Update the DB?
Are you in the Executor?
├─ Have you published execution.scheduled for this execution?
│ ├─ NO → Update DB (you own it)
│ │ └─ Includes: Requested/Scheduling/Scheduled/pre-handoff Cancelled
│ └─ YES → Don't update DB (worker owns it now)
│ └─ Just orchestrate (trigger workflows, etc)
│
Are you in the Worker?
├─ Have you received execution.scheduled for this execution?
│ ├─ YES → Update DB for ALL status changes (you own it)
│ │ └─ Includes: Running/Completed/Failed/post-handoff Cancelled/etc.
│ └─ NO → Don't touch this execution (doesn't exist for you yet)
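The decision tree maps directly onto a single `match`. The sketch below is illustrative only: `Caller`, `Action`, and `decide` are hypothetical names, not part of the codebase.

```rust
/// Who is asking, and what it knows about the handoff (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Caller {
    Executor { published_scheduled: bool },
    Worker { received_scheduled: bool },
}

/// What the caller is allowed to do with the execution record.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    /// Caller owns the record and may write to the DB.
    UpdateDb,
    /// Caller may read and trigger workflows, but must not write.
    OrchestrateOnly,
    /// The execution does not exist for this caller yet.
    Ignore,
}

/// Encodes the four leaves of the decision tree.
pub fn decide(caller: Caller) -> Action {
    match caller {
        Caller::Executor { published_scheduled: false } => Action::UpdateDb,
        Caller::Executor { published_scheduled: true } => Action::OrchestrateOnly,
        Caller::Worker { received_scheduled: true } => Action::UpdateDb,
        Caller::Worker { received_scheduled: false } => Action::Ignore,
    }
}
```

Because the match is exhaustive, adding a new caller state would fail to compile until its rule is written down.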
Common Patterns
✅ DO: Worker Updates After Handoff
// Worker receives execution.scheduled
self.update_execution_status(execution_id, ExecutionStatus::Running).await?;
self.publish_status_update(execution_id, ExecutionStatus::Running).await?;
✅ DO: Executor Orchestrates Without DB Write
// Executor receives execution.status_changed
let execution = ExecutionRepository::find_by_id(pool, execution_id).await?;
if status == ExecutionStatus::Completed {
    Self::trigger_child_executions(pool, publisher, &execution).await?;
}
❌ DON'T: Executor Updates After Handoff
// Executor receives execution.status_changed
execution.status = status;
ExecutionRepository::update(pool, execution.id, execution).await?; // ❌ WRONG!
❌ DON'T: Worker Updates Before Handoff
// Worker updates execution it hasn't received via execution.scheduled
ExecutionRepository::update(&self.pool, execution_id, input).await?; // ❌ WRONG!
✅ DO: Executor Handles Pre-Handoff Cancellation
// User cancels execution before it's scheduled to worker
// Execution is still in Requested/Scheduling state
execution.status = ExecutionStatus::Cancelled;
ExecutionRepository::update(pool, execution_id, execution).await?; // ✅ CORRECT!
// Worker never receives execution.scheduled, never knows execution existed
✅ DO: Worker Handles Post-Handoff Cancellation
// Worker received execution.scheduled, now owns execution
// User cancels execution while it's running
execution.status = ExecutionStatus::Cancelled;
ExecutionRepository::update(&self.pool, execution_id, execution).await?; // ✅ CORRECT!
self.publish_status_update(execution_id, ExecutionStatus::Cancelled).await?;
Handoff Checklist
When an execution is scheduled:
Executor Must:
- Update status to Scheduled
- Write to database
- Publish execution.scheduled message ← HANDOFF OCCURS HERE
- Stop updating this execution (ownership transferred)
- Continue to handle orchestration (read-only)
Worker Must:
- Receive execution.scheduled message ← OWNERSHIP RECEIVED
- Take ownership of execution state
- Update DB for all future status changes
- Handle any cancellations/failures after this point
- Publish status notifications
Important: If an execution is cancelled BEFORE the executor publishes execution.scheduled, the executor updates the status to Cancelled and the worker never learns about it.
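The executor side of the checklist can be simulated with in-memory stand-ins. This is a sketch under stated assumptions: the `Executor` struct, its `HashMap` "database", and its `Vec` "message queue" are hypothetical substitutes for the real repository and publisher, used only to show the ordering and the write guard.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Requested,
    Scheduling,
    Scheduled,
    Cancelled,
}

/// In-memory stand-in for the executor (all fields are illustrative).
#[derive(Default)]
pub struct Executor {
    pub db: HashMap<u64, Status>,            // stand-in for the executions table
    pub published: Vec<(u64, &'static str)>, // stand-in for the message queue
    pub handed_off: HashSet<u64>,            // executions now owned by a worker
}

impl Executor {
    pub fn create(&mut self, id: u64) {
        self.db.insert(id, Status::Requested);
    }

    /// Checklist order: write Scheduled, publish, then stop writing.
    pub fn schedule(&mut self, id: u64) -> Result<(), String> {
        if self.db.get(&id) == Some(&Status::Cancelled) {
            return Err(format!("execution {} was cancelled before handoff", id));
        }
        self.write(id, Status::Scheduled)?;
        self.published.push((id, "execution.scheduled")); // handoff occurs here
        self.handed_off.insert(id);
        Ok(())
    }

    /// Pre-handoff cancellation; rejected once the worker owns the execution.
    pub fn cancel(&mut self, id: u64) -> Result<(), String> {
        self.write(id, Status::Cancelled)
    }

    /// Single write path, so the ownership guard cannot be bypassed.
    fn write(&mut self, id: u64, status: Status) -> Result<(), String> {
        if self.handed_off.contains(&id) {
            return Err(format!("execution {} is owned by the worker", id));
        }
        match self.db.get_mut(&id) {
            Some(slot) => {
                *slot = status;
                Ok(())
            }
            None => Err(format!("execution {} not found", id)),
        }
    }
}
```

Routing every write through one guarded function is a design choice of this sketch; the real services may enforce the rule differently.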
Benefits Summary
| Aspect | Benefit |
|---|---|
| Race Conditions | Eliminated - only one owner per stage |
| DB Writes | Reduced by ~50% - no duplicates |
| Code Clarity | Clear boundaries - easy to reason about |
| Message Traffic | Reduced - no duplicate completions |
| Idempotency | Safe to receive duplicate messages |
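The idempotency row can be illustrated with the queue-slot release that execution.completed triggers. The source does not say how duplicates are tolerated, so the following is one possible approach, labeled as an assumption: `QueueSlots` and its methods are hypothetical, and tracking held slots in a `HashSet` is a technique chosen for this sketch.

```rust
use std::collections::HashSet;

/// Hypothetical slot tracker for the executor's queue.
pub struct QueueSlots {
    pub free: usize,
    holding: HashSet<u64>, // executions that currently hold a slot
}

impl QueueSlots {
    pub fn new(capacity: usize) -> Self {
        QueueSlots { free: capacity, holding: HashSet::new() }
    }

    /// Takes a slot for `id`; returns false if none is free or `id` already holds one.
    pub fn acquire(&mut self, id: u64) -> bool {
        if self.free == 0 || self.holding.contains(&id) {
            return false;
        }
        self.holding.insert(id);
        self.free -= 1;
        true
    }

    /// Handles execution.completed. Returns true only the first time a slot
    /// is released for `id`; a duplicate message is a no-op.
    pub fn on_completed(&mut self, id: u64) -> bool {
        if self.holding.remove(&id) {
            self.free += 1;
            true
        } else {
            false
        }
    }
}
```

With this shape, a redelivered execution.completed cannot inflate the free-slot count.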
Troubleshooting
Execution Stuck in "Scheduled"
Problem: Worker not updating status to Running
Check: Was execution.scheduled published? Did the worker receive it? Is the worker healthy?
Workflow Children Not Triggering
Problem: Orchestration not running
Check: Did the worker publish execution.status_changed? Is the message queue healthy?
Duplicate Status Updates
Problem: Both services updating DB
Check: Executor should NOT update after publishing execution.scheduled
Execution Cancelled But Status Not Updated
Problem: Cancellation not reflected in database
Check: Was it cancelled before or after handoff?
Fix: If before handoff → executor updates; if after handoff → worker updates
Queue Warnings
Problem: Duplicate completion notifications
Check: Only worker should publish execution.completed
See Also
- Full Architecture Doc: docs/ARCHITECTURE-execution-state-ownership.md
- Bug Fix Visualization: docs/BUGFIX-duplicate-completion-2026-02-09.md
- Work Summary: work-summary/2026-02-09-execution-state-ownership.md