Execution Hierarchy and Parent Relationships
Overview
The execution table supports two types of parent-child relationships:
- General execution hierarchies (via the parent field)
- Workflow task executions (via workflow_task metadata)
This document explains why both are needed, how they differ, and when to use each.
Field Purposes
execution.parent (General Hierarchy)
Type: Option<Id> - Foreign key to execution.id
Purpose: Generic execution tree traversal for ANY type of parent-child relationship.
Used for:
- Workflow tasks: Parent is the workflow's main execution record
- Child actions: Parent is the action that spawned them
- Nested workflows: Parent is the outer workflow's execution
- Any future parent-child patterns
Example SQL:
-- Find all child executions (any type)
SELECT * FROM attune.execution WHERE parent = 100;
execution.workflow_task.workflow_execution (Workflow-Specific)
Type: Id within WorkflowTaskMetadata JSONB - References workflow_execution.id
Purpose: Direct link to workflow orchestration state.
Provides access to:
- Task graph structure
- Workflow variables
- Current/completed/failed task lists
- Workflow-specific metadata
Example SQL:
-- Find all tasks in a specific workflow
SELECT * FROM attune.execution
WHERE workflow_task->>'workflow_execution' = '50';
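The two lookups above can be mirrored on plain in-memory records. This is a minimal sketch rather than the project's actual model: the struct layouts, the i64 alias for Id, and the helper names children_of, tasks_of_workflow, exec and sample_executions are assumptions made for illustration.

```rust
/// Assumed alias for illustration; the real `Id` type is defined in the project's models.
type Id = i64;

/// Illustrative subset of the JSONB workflow task metadata.
#[allow(dead_code)]
#[derive(Debug, Clone)]
struct WorkflowTaskMetadata {
    workflow_execution: Id, // references workflow_execution.id
    task_name: String,
}

/// Illustrative subset of an execution row.
#[allow(dead_code)]
#[derive(Debug, Clone)]
struct Execution {
    id: Id,
    action_ref: String,
    parent: Option<Id>,                          // generic hierarchy
    workflow_task: Option<WorkflowTaskMetadata>, // workflow-specific link
}

/// Mirrors `WHERE parent = $1`: every child, regardless of its type.
fn children_of(execs: &[Execution], parent_id: Id) -> Vec<&Execution> {
    execs.iter().filter(|e| e.parent == Some(parent_id)).collect()
}

/// Mirrors the JSONB filter on `workflow_task->>'workflow_execution'`.
fn tasks_of_workflow(execs: &[Execution], workflow_execution_id: Id) -> Vec<&Execution> {
    execs
        .iter()
        .filter(|e| {
            e.workflow_task
                .as_ref()
                .map_or(false, |wt| wt.workflow_execution == workflow_execution_id)
        })
        .collect()
}

/// Hypothetical constructor used only to keep the sample data short.
fn exec(id: Id, action_ref: &str, parent: Option<Id>, workflow_execution: Option<Id>) -> Execution {
    Execution {
        id,
        action_ref: action_ref.to_string(),
        parent,
        workflow_task: workflow_execution.map(|we| WorkflowTaskMetadata {
            workflow_execution: we,
            task_name: action_ref.to_string(),
        }),
    }
}

/// Sample data: a workflow (100), one workflow task (101), one plain child action (102).
fn sample_executions() -> Vec<Execution> {
    vec![
        exec(100, "my_pack.my_workflow", None, None),
        exec(101, "my_pack.send_email", Some(100), Some(50)),
        exec(102, "my_pack.child_action_1", Some(100), None),
    ]
}
```

With this sample data, children_of(&execs, 100) returns both 101 and 102, while tasks_of_workflow(&execs, 50) returns only 101, which is the difference between the generic and the workflow-specific lookup.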
Workflow Task Execution Structure
When a workflow executes, three types of records are created:
┌─────────────────────────────────────────────────────────────┐
│ 1. Parent Execution (the workflow itself as an execution)   │
├─────────────────────────────────────────────────────────────┤
│ id: 100                                                     │
│ action_ref: "my_pack.my_workflow"                           │
│ parent: None (or outer workflow ID if nested)               │
│ workflow_task: None                                         │
│ status: running                                             │
└─────────────────────────────────────────────────────────────┘
                              ▲
                              │
                              │ references (execution field)
                              │
┌─────────────────────────────────────────────────────────────┐
│ 2. Workflow Execution Record (orchestration state)          │
├─────────────────────────────────────────────────────────────┤
│ id: 50                                                      │
│ execution: 100 ← points to parent execution                 │
│ workflow_def: 10                                            │
│ task_graph: {...}                                           │
│ variables: {...}                                            │
│ current_tasks: ["send_email", "process_data"]               │
│ completed_tasks: []                                         │
│ failed_tasks: []                                            │
└─────────────────────────────────────────────────────────────┘
                              ▲
                              │
                              │ references (workflow_execution)
                              │
┌─────────────────────────────────────────────────────────────┐
│ 3. Task Execution (one per workflow task)                   │
├─────────────────────────────────────────────────────────────┤
│ id: 101                                                     │
│ action_ref: "my_pack.send_email"                            │
│ parent: 100 ← points to parent execution                    │
│ workflow_task: {                                            │
│   workflow_execution: 50 ← points to workflow_execution     │
│   task_name: "send_email",                                  │
│   task_index: null,                                         │
│   retry_count: 0,                                           │
│   max_retries: 3,                                           │
│   ...                                                       │
│ }                                                           │
│ status: running                                             │
└─────────────────────────────────────────────────────────────┘
Relationship Diagram
┌─────────────────────┐
│ Task Execution      │
│ (id: 101)           │
│                     │
│ parent: 100         │──────┐
│                     │      │
│ workflow_task: {    │      │
│   workflow_exec: 50 │──┐   │
│ }                   │  │   │
└─────────────────────┘  │   │
                         │   │
                         │   ▼
                         │  ┌─────────────────────┐
                         │  │ Parent Execution    │
                         │  │ (id: 100)           │
                         │  │ [The Workflow]      │
                         │  └─────────────────────┘
                         │             ▲
                         │             │
                         │             │ execution: 100
                         │             │
                         │  ┌─────────────────────┐
                         └─▶│ Workflow Execution  │
                            │ (id: 50)            │
                            │ [Orchestration]     │
                            └─────────────────────┘
Key: Both parent and workflow_task.workflow_execution ultimately reference the same workflow, but serve different query patterns.
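The two arrows in the diagram can be followed in code. The sketch below is illustrative only: the trimmed-down structs and the function names via_parent and via_workflow are assumptions, and the records live in memory instead of the database.

```rust
type Id = i64; // assumed alias for illustration

/// Illustrative subset of a workflow_execution row.
struct WorkflowExecution {
    id: Id,
    execution: Id, // the workflow's own execution record
}

/// Illustrative subset of a task execution row.
struct TaskExecution {
    parent: Option<Id>,
    workflow_execution: Option<Id>, // taken from the workflow_task metadata
}

/// Path 1: follow `parent` directly.
fn via_parent(task: &TaskExecution) -> Option<Id> {
    task.parent
}

/// Path 2: follow workflow_task.workflow_execution, then workflow_execution.execution.
fn via_workflow(task: &TaskExecution, workflow_execs: &[WorkflowExecution]) -> Option<Id> {
    let we_id = task.workflow_execution?;
    workflow_execs
        .iter()
        .find(|we| we.id == we_id)
        .map(|we| we.execution)
}
```

For the records in the diagram (task 101, workflow execution 50, parent execution 100), both functions return Some(100). For a child that is not a workflow task, via_workflow returns None while via_parent still works.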
Why Both Fields Are Needed
✅ Reason 1: parent is Generic
The parent field is used for all types of execution hierarchies, not just workflows:
Example 1: Action spawning child actions
// Parent action execution
let parent_exec = create_execution("my_pack.parent_action").await?;
// Child action executions (NOT workflow tasks)
let child1 = CreateExecutionInput {
    action_ref: "my_pack.child_action_1".to_string(),
    parent: Some(parent_exec.id),
    workflow_task: None, // Not a workflow task!
    ...
};
Example 2: Nested workflows
// Outer workflow execution
let outer_workflow = create_workflow("outer_workflow").await?;
// Inner workflow execution (nested)
let inner_workflow = CreateExecutionInput {
    action_ref: "inner_workflow".to_string(),
    parent: Some(outer_workflow.id),
    workflow_task: None, // This is a workflow, not a task
    ...
};
✅ Reason 2: Workflow-Specific State is Separate
The workflow_execution table contains orchestration state that doesn't belong in the main execution record:
- Task graph: Directed acyclic graph of task dependencies
- Workflow variables: Scoped variable context
- Task tracking: current_tasks, completed_tasks, failed_tasks arrays
- Workflow metadata: pause_reason, error_message, etc.
Direct access via workflow_task.workflow_execution avoids JOINs.
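To make the task-tracking part of that state concrete, here is a minimal sketch of how the three lists might be updated when a task finishes. It is an assumption for illustration, not the coordinator's actual logic: the WorkflowState struct and the finish_task method are hypothetical names.

```rust
/// Hypothetical, trimmed-down view of the task-tracking fields.
#[derive(Debug, Default)]
struct WorkflowState {
    current_tasks: Vec<String>,
    completed_tasks: Vec<String>,
    failed_tasks: Vec<String>,
}

impl WorkflowState {
    /// Moves `task` out of `current_tasks` into `completed_tasks` or `failed_tasks`.
    /// Returns false (and changes nothing) if the task was not currently running.
    fn finish_task(&mut self, task: &str, succeeded: bool) -> bool {
        let Some(pos) = self.current_tasks.iter().position(|t| t.as_str() == task) else {
            return false;
        };
        let name = self.current_tasks.remove(pos);
        if succeeded {
            self.completed_tasks.push(name);
        } else {
            self.failed_tasks.push(name);
        }
        true
    }
}
```

State like this changes on every task transition, which is one reason to keep it out of the main execution record.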
✅ Reason 3: Query Efficiency
Without direct workflow_execution reference, finding workflow state requires:
-- BAD: Two JOINs required
SELECT we.*
FROM attune.execution task
JOIN attune.execution parent ON task.parent = parent.id
JOIN attune.workflow_execution we ON we.execution = parent.id
WHERE task.id = 101;
With direct reference:
-- GOOD: Single lookup via JSONB
SELECT we.*
FROM attune.workflow_execution we
WHERE we.id = (
SELECT (workflow_task->>'workflow_execution')::bigint
FROM attune.execution
WHERE id = 101
);
✅ Reason 4: Clear Semantics
- parent = "What execution spawned me?"
- workflow_task.workflow_execution = "What workflow orchestration state do I belong to?"
These are related but semantically different questions.
Use Cases and Query Patterns
Use Case 1: Generic Execution Tree Traversal
// Get ALL child executions (workflow tasks, child actions, anything)
async fn get_children(pool: &PgPool, parent_id: Id) -> Result<Vec<Execution>> {
    sqlx::query_as::<_, Execution>(
        "SELECT * FROM attune.execution WHERE parent = $1"
    )
    .bind(parent_id)
    .fetch_all(pool)
    .await
    .map_err(Into::into)
}
// Works for workflows, actions, any execution type
let all_children = get_children(&pool, parent_exec_id).await?;
Use Case 2: Workflow Task Queries
// Get all tasks for a workflow execution
let tasks = ExecutionRepository::find_by_workflow_execution(
    &pool,
    workflow_execution_id
).await?;
// Implementation uses direct JSONB query:
// WHERE workflow_task->>'workflow_execution' = $1
Use Case 3: Workflow State Access
// From a task execution, get the workflow state
async fn get_workflow_state(
    pool: &PgPool,
    task_exec: &Execution
) -> Result<Option<WorkflowExecution>> {
    if let Some(wt) = &task_exec.workflow_task {
        let workflow_exec = WorkflowExecutionRepository::find_by_id(
            pool,
            wt.workflow_execution
        ).await?;
        Ok(Some(workflow_exec))
    } else {
        Ok(None)
    }
}
// Without direct link, would need to:
// 1. Get parent execution via task_exec.parent
// 2. Find workflow_execution WHERE execution = parent
Use Case 4: Hierarchical Display
// Display execution tree with proper indentation
async fn display_execution_tree(pool: &PgPool, root_id: Id, indent: usize) {
    let exec = ExecutionRepository::find_by_id(pool, root_id).await.unwrap();
    println!("{:indent$}├─ {} ({})", "", exec.action_ref, exec.status, indent = indent);
    // Get children using generic parent relationship
    let children = sqlx::query_as::<_, Execution>(
        "SELECT * FROM attune.execution WHERE parent = $1"
    )
    .bind(root_id)
    .fetch_all(pool)
    .await
    .unwrap();
    for child in children {
        // A recursive call inside an async fn must be boxed so the future
        // has a known size (accepted on a plain async fn since Rust 1.77)
        Box::pin(display_execution_tree(pool, child.id, indent + 2)).await;
    }
}
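When the executions are already loaded, the same tree can be rendered without recursion or database access. The following is a sketch under stated assumptions: the trimmed Execution struct and the render_tree function are hypothetical, and the parent links are assumed to be acyclic.

```rust
type Id = i64; // assumed alias for illustration

/// Illustrative subset of an execution row.
struct Execution {
    id: Id,
    parent: Option<Id>,
    action_ref: String,
}

/// Renders the tree rooted at `root_id` as indented lines, depth-first,
/// using an explicit stack instead of recursion.
/// Assumes the parent links contain no cycles.
fn render_tree(execs: &[Execution], root_id: Id) -> Vec<String> {
    let mut lines = Vec::new();
    let mut stack = vec![(root_id, 0usize)];
    while let Some((id, depth)) = stack.pop() {
        if let Some(exec) = execs.iter().find(|e| e.id == id) {
            lines.push(format!("{}{}", "  ".repeat(depth), exec.action_ref));
            // Push children in reverse so they are popped in their original order.
            for child in execs.iter().rev().filter(|e| e.parent == Some(id)) {
                stack.push((child.id, depth + 1));
            }
        }
    }
    lines
}
```

Only the generic parent field is consulted, so workflow tasks and plain child actions appear in the same tree.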
The Redundancy Trade-off
For Workflow Tasks: Yes, There's Redundancy
task.parent
    → parent_execution (id: 100)
    ← workflow_execution.execution

task.workflow_task.workflow_execution
    → workflow_execution (id: 50)
    → parent_execution (id: 100)
Both ultimately point to the same workflow, just through different paths.
Why This Is Acceptable
- Performance: Direct link avoids JOINs (PostgreSQL JSONB is fast)
- Clarity: Explicit workflow relationship vs generic parent relationship
- Flexibility: parent can be used for non-workflow patterns
- Consistency: All executions use parent the same way
Alternatives Considered
❌ Alternative 1: Remove workflow_execution from metadata
Problem: Forces 2-JOIN queries to access workflow state
-- Every workflow task query becomes complex
SELECT we.*
FROM attune.execution task
JOIN attune.execution parent ON task.parent = parent.id
JOIN attune.workflow_execution we ON we.execution = parent.id
WHERE task.workflow_task IS NOT NULL;
❌ Alternative 2: Remove parent for workflow tasks
Problem: Breaks generic execution tree queries
-- Would need an extra OR branch on the JSONB metadata
SELECT * FROM attune.execution
WHERE parent = $1
OR (workflow_task IS NOT NULL
AND (workflow_task->>'parent_execution')::bigint = $1);
✅ Current Approach: Keep Both
Small redundancy in exchange for:
- Simple generic queries via parent
- Efficient workflow queries via workflow_task.workflow_execution
- Clear separation of concerns
Validation and Best Practices
Validation Logic (Optional)
For data integrity, you could validate consistency:
async fn validate_workflow_task_consistency(
    pool: &PgPool,
    task_exec: &Execution
) -> Result<()> {
    if let Some(wt) = &task_exec.workflow_task {
        // Get workflow_execution record
        let workflow_exec = WorkflowExecutionRepository::find_by_id(
            pool,
            wt.workflow_execution
        ).await?;
        // Ensure parent matches workflow_execution.execution
        if task_exec.parent != Some(workflow_exec.execution) {
            return Err(Error::validation(format!(
                "Inconsistent parent: task.parent={:?}, workflow_exec.execution={}",
                task_exec.parent, workflow_exec.execution
            )));
        }
    }
    Ok(())
}
Helper Methods (Recommended)
Add convenience methods to the Execution model:
impl Execution {
    /// Check if this execution is a workflow task
    pub fn is_workflow_task(&self) -> bool {
        self.workflow_task.is_some()
    }

    /// Get the workflow_execution record if this is a workflow task
    pub async fn get_workflow_execution(
        &self,
        pool: &PgPool
    ) -> Result<Option<WorkflowExecution>> {
        if let Some(wt) = &self.workflow_task {
            let we = WorkflowExecutionRepository::find_by_id(pool, wt.workflow_execution).await?;
            Ok(Some(we))
        } else {
            Ok(None)
        }
    }

    /// Get the parent execution
    pub async fn get_parent(&self, pool: &PgPool) -> Result<Option<Execution>> {
        if let Some(parent_id) = self.parent {
            // find_by_id is treated as returning the record itself,
            // consistent with its other uses in this document
            let parent = ExecutionRepository::find_by_id(pool, parent_id).await?;
            Ok(Some(parent))
        } else {
            Ok(None)
        }
    }

    /// Get all child executions (generic, works for any execution type)
    pub async fn get_children(&self, pool: &PgPool) -> Result<Vec<Execution>> {
        sqlx::query_as::<_, Execution>(
            "SELECT * FROM attune.execution WHERE parent = $1 ORDER BY created"
        )
        .bind(self.id)
        .fetch_all(pool)
        .await
        .map_err(Into::into)
    }
}
Summary
Key Takeaways
- parent is a generic field for ALL execution hierarchies (workflows, child actions, nested workflows)
- workflow_task.workflow_execution is a workflow-specific optimization for direct access to orchestration state
- Both are needed because:
  - parent must remain generic for non-workflow use cases
  - Direct workflow_execution link avoids expensive JOINs
  - Different query patterns benefit from each approach
- The redundancy is acceptable because:
  - It's limited to workflow tasks only (not all executions)
  - Performance gain from avoiding JOINs
  - Clearer semantics for different use cases
When to Use Which
| Scenario | Use parent | Use workflow_task.workflow_execution |
|---|---|---|
| Get child executions (any type) | ✅ | ❌ |
| Build execution tree | ✅ | ❌ |
| Find all workflow tasks | ❌ | ✅ |
| Access workflow state | ❌ | ✅ |
| Non-workflow parent-child | ✅ | N/A |
Design Principle
Separation of concerns:
- parent: Structural relationship (execution hierarchy)
- workflow_task.workflow_execution: Semantic relationship (workflow orchestration)
This follows the principle that a workflow task has TWO relationships:
- As a child in the execution tree (parent)
- As a task in a workflow (workflow_task.workflow_execution)
Both are valid, serve different purposes, and should coexist.
References
- Migration: migrations/20260127212500_consolidate_workflow_task_execution.sql
- Models: crates/common/src/models.rs (Execution, WorkflowTaskMetadata)
- Repositories: crates/common/src/repositories/execution.rs
- Workflow Coordinator: crates/executor/src/workflow/coordinator.rs