13 KiB
Workflow Loader Implementation Summary
Date: 2025-01-13
Phase: 1.4 - Workflow Loading & Registration
Status: Partially Complete (Loader: ✅ | Registrar: ⏸️)
Executive Summary
Implemented the workflow loading subsystem that scans pack directories for YAML workflow files, parses them, and validates them. The loader module is complete, tested, and production-ready. The registrar module is implemented but requires schema alignment with the actual database structure before it can be used.
What Was Built
1. WorkflowLoader Module (executor/src/workflow/loader.rs)
Purpose: Scan pack directories, load workflow YAML files, parse and validate them.
Components:
WorkflowLoader
Main service for loading workflows from the filesystem.
pub struct WorkflowLoader {
config: LoaderConfig,
validator: WorkflowValidator,
}
Key Methods:
load_all_workflows()- Scans all pack directories and loads all workflowsload_pack_workflows(pack_name, pack_dir)- Loads workflows from a specific packload_workflow_file(file)- Loads and validates a single workflow filereload_workflow(ref_name)- Reloads a specific workflow by reference
LoaderConfig
Configuration for workflow loading behavior.
pub struct LoaderConfig {
pub packs_base_dir: PathBuf, // Base directory (default: /opt/attune/packs)
pub skip_validation: bool, // Skip validation errors
pub max_file_size: usize, // Max file size (default: 1MB)
}
LoadedWorkflow
Represents a successfully loaded workflow with metadata.
pub struct LoadedWorkflow {
pub file: WorkflowFile, // File metadata
pub workflow: WorkflowDefinition, // Parsed workflow
pub validation_errors: Vec<String>, // Any validation errors
}
WorkflowFile
Metadata about a workflow file.
pub struct WorkflowFile {
pub path: PathBuf, // Full path to YAML file
pub pack: String, // Pack name
pub name: String, // Workflow name
pub ref_name: String, // Full reference (pack.name)
}
Features:
- ✅ Async file I/O using Tokio
- ✅ Supports both
.yamland.ymlextensions - ✅ File size validation (prevents loading huge files)
- ✅ Integrated with Phase 1.3 parser and validator
- ✅ Comprehensive error handling
- ✅ Idiomatic Rust error types (
Error::validation(),Error::not_found())
Directory Structure Expected:
/opt/attune/packs/
├── core/
│ └── workflows/
│ ├── example.yaml
│ └── another.yaml
├── monitoring/
│ └── workflows/
│ └── healthcheck.yaml
└── deployment/
└── workflows/
├── deploy_app.yaml
└── rollback.yaml
Test Coverage:
- ✅ Scan pack directories
- ✅ Scan workflow files (both .yaml and .yml)
- ✅ Load single workflow file
- ✅ Load all workflows from all packs
- ✅ Reload specific workflow by reference
- ✅ File size limit enforcement
- ✅ Error handling for missing files/directories
2. WorkflowRegistrar Module (executor/src/workflow/registrar.rs)
Purpose: Register loaded workflows as actions in the database.
Status: ⏸️ Implemented but needs schema alignment
Components:
WorkflowRegistrar
Service for registering workflows in the database.
pub struct WorkflowRegistrar {
pool: PgPool,
action_repo: ActionRepository,
workflow_repo: WorkflowDefinitionRepository,
pack_repo: PackRepository,
options: RegistrationOptions,
}
Intended Methods:
register_workflow(loaded)- Register a single workflowregister_workflows(workflows)- Register multiple workflowsunregister_workflow(ref_name)- Remove a workflow from database
RegistrationOptions
Configuration for workflow registration.
pub struct RegistrationOptions {
pub update_existing: bool, // Update if workflow exists
pub skip_invalid: bool, // Skip workflows with validation errors
pub default_runner: String, // Default runner type
pub default_timeout: i32, // Default timeout in seconds
}
RegistrationResult
Result of a workflow registration operation.
pub struct RegistrationResult {
pub ref_name: String, // Workflow reference
pub created: bool, // true = created, false = updated
pub action_id: i64, // Action ID
pub workflow_def_id: i64, // Workflow definition ID
pub warnings: Vec<String>, // Any warnings
}
Intended Flow:
- Verify pack exists in database
- Check if workflow action already exists
- Create or update action with
is_workflow=true - Create or update workflow_definition record
- Link action to workflow_definition
- Return result with IDs
Current Blockers:
- Schema field name mismatches (see Issues section)
- Repository usage pattern differences
- Missing conventions for workflow-specific fields
3. Module Integration
Updated Files:
executor/src/workflow/mod.rs- Added loader and registrar exportsexecutor/src/workflow/parser.rs- AddedFrom<ParseError>for Error conversionexecutor/Cargo.toml- Addedtempfiledev-dependency
Exports:
pub use loader::{LoadedWorkflow, LoaderConfig, WorkflowFile, WorkflowLoader};
pub use registrar::{RegistrationOptions, RegistrationResult, WorkflowRegistrar};
Issues Discovered
Schema Incompatibility
The workflow design (Phases 1.2/1.3) assumed Action model fields that don't match the actual database schema:
Expected vs Actual:
| Field (Expected) | Field (Actual) | Type Difference |
|---|---|---|
pack_id |
pack |
Field name |
ref_name |
ref |
Field name |
name |
label |
Field name |
| N/A | pack_ref |
Missing field |
description |
description |
Option vs String |
runner_type |
runtime |
String vs ID |
entry_point |
entrypoint |
Option vs String |
parameters |
param_schema |
Field name |
output_schema |
out_schema |
Field name |
tags |
N/A | Not in schema |
metadata |
N/A | Not in schema |
enabled |
N/A | Not in schema |
timeout |
N/A | Not in schema |
Impact:
- Registrar cannot directly create Action records
- Need to use
CreateActionInputstruct - Must decide conventions for workflow-specific fields
Repository Pattern Differences
Expected: Instance methods
self.action_repo.find_by_ref(ref).await?
Actual: Trait-based static methods
ActionRepository::find_by_ref(&pool, ref).await?
Impact:
- All repository calls in registrar need updating
- Pattern is actually cleaner and more idiomatic
Design Decisions Needed
1. Workflow Entrypoint
Question: What should action.entrypoint be for workflows?
Options:
- A)
"workflow"- Simple constant - B)
"internal://workflow"- URL-like scheme - C) Workflow definition ID reference
Recommendation: "internal://workflow" - Clear distinction from regular actions
2. Workflow Runtime
Question: How to handle action.runtime for workflows?
Options:
- A) NULL - Workflows don't use runtimes
- B) Create special "workflow" runtime in database
Recommendation: NULL - Workflows are orchestrated, not executed in runtimes
3. Required vs Optional Fields
Question: How to handle fields required in DB but optional in YAML?
Affected Fields:
description- Required in DB, optional in workflow YAMLentrypoint- Required in DB, N/A for workflows
Recommendation:
- Description: Default to empty string or derive from label
- Entrypoint: Use
"internal://workflow"convention
What Works
Loader Module (Production Ready)
- ✅ Scans pack directories recursively
- ✅ Finds all workflow YAML files
- ✅ Parses workflow definitions
- ✅ Validates workflows using Phase 1.3 validator
- ✅ Handles errors gracefully
- ✅ Async/concurrent file operations
- ✅ Well-tested (6 test cases, all passing)
- ✅ Proper error types and messages
Usage Example:
use attune_executor::workflow::{WorkflowLoader, LoaderConfig};
let config = LoaderConfig {
packs_base_dir: PathBuf::from("/opt/attune/packs"),
skip_validation: false,
max_file_size: 1024 * 1024,
};
let loader = WorkflowLoader::new(config);
let workflows = loader.load_all_workflows().await?;
for (ref_name, loaded) in workflows {
println!("Loaded workflow: {}", ref_name);
if !loaded.validation_errors.is_empty() {
println!(" Warnings: {:?}", loaded.validation_errors);
}
}
What Needs Work
Registrar Module (Blocked on Schema)
To Complete:
- Update to use
CreateActionInputstruct - Map WorkflowDefinition fields to actual Action schema
- Convert repository calls to trait static methods
- Implement workflow field conventions
- Add database integration tests
- Verify workflow_definition table schema
Estimated Effort: 2-3 hours
API Integration (Blocked on Registrar)
Needed:
- Workflow CRUD endpoints in API service
- Pack integration for auto-loading workflows
- Workflow catalog/search functionality
Estimated Effort: 5-6 hours
Files Created
-
crates/executor/src/workflow/loader.rs(483 lines)- WorkflowLoader implementation
- Configuration types
- 6 unit tests with tempfile-based fixtures
-
crates/executor/src/workflow/registrar.rs(462 lines)- WorkflowRegistrar implementation (needs schema fix)
- Registration types and options
- Transaction-based database operations
-
work-summary/phase-1.4-loader-registration-progress.md- Detailed progress tracking
- Schema compatibility analysis
- Next steps and decisions
-
work-summary/workflow-loader-summary.md(this file)- Implementation summary
- What works and what doesn't
Files Modified
-
crates/executor/src/workflow/mod.rs- Added loader and registrar module declarations
- Added public exports
-
crates/executor/src/workflow/parser.rs- Added
From<ParseError>forattune_common::error::Error
- Added
-
crates/executor/Cargo.toml- Added
tempfile = "3.8"dev-dependency
- Added
-
work-summary/PROBLEM.md- Added schema alignment issue tracking
Testing
Unit Tests (Loader)
All tests passing ✅
test_scan_pack_directories- Verifies pack directory scanningtest_scan_workflow_files- Verifies workflow file discoverytest_load_workflow_file- Verifies single file loadingtest_load_all_workflows- Verifies batch loadingtest_reload_workflow- Verifies reload by referencetest_file_size_limit- Verifies size limit enforcement
Integration Tests (Registrar)
Not yet implemented ⏸️
Needed:
- Database fixture setup
- Pack creation for testing
- Workflow registration flow
- Update workflow flow
- Unregister workflow flow
- Transaction rollback on error
Performance Considerations
Current Implementation
- Uses async I/O for concurrent file operations
- No caching of loaded workflows
- Re-parses YAML on every load
Future Optimizations
- Caching: Cache parsed workflows in memory
- Lazy Loading: Load workflows on-demand rather than at startup
- File Watching: Use inotify/fsnotify for hot-reloading
- Parallel Loading: Use
join_allfor concurrent pack scanning - Incremental Updates: Only reload changed workflows
Scalability Estimates
- Small deployment: 10 packs × 5 workflows = 50 workflows (~1-2 seconds load time)
- Medium deployment: 50 packs × 10 workflows = 500 workflows (~5-10 seconds)
- Large deployment: 200 packs × 20 workflows = 4000 workflows (~30-60 seconds)
Recommendation: Implement caching and lazy loading for deployments > 100 workflows
Next Actions
Immediate (P0)
- Fix registrar schema alignment
- Test workflow registration with database
- Verify workflow_definition table compatibility
Short Term (P1)
- Add API endpoints for workflow CRUD
- Integrate with pack management
- Implement workflow catalog/search
Medium Term (P2)
- Add workflow caching
- Implement hot-reloading
- Add metrics and monitoring
- Performance optimization for large deployments
Dependencies
- ✅ Phase 1.2: Models and repositories (complete)
- ✅ Phase 1.3: YAML parsing and validation (complete)
- ⏸️ Database schema review (workflow_definition table)
- ⏸️ Pack management integration
- ⏸️ API service endpoints
Conclusion
The workflow loader is complete and works well. It successfully:
- Scans pack directories
- Loads and parses workflow YAML files
- Validates workflows
- Handles errors gracefully
The registrar is logically complete but needs adaptation to the actual database schema. Once the schema alignment is fixed (estimated 2-3 hours), Phase 1.4 can be completed and workflows can be registered and executed.
Overall Progress: ~60% complete Blocker: Schema field mapping Risk Level: Low (well-understood issue with clear solution path)