working on workflows
@@ -102,6 +102,7 @@ docker compose logs -f <svc> # View logs
- **BuildKit cache mounts**: Persist cargo registry and compilation artifacts between builds
- **Cache strategy**: `sharing=shared` for registry/git (concurrent-safe), service-specific IDs for target caches
- **Parallel builds**: 4x faster than the old `sharing=locked` strategy - no serialization overhead
- **Rustc stack size**: All Rust Dockerfiles set `ENV RUST_MIN_STACK=16777216` (16 MiB) in the build stage to prevent `rustc` SIGSEGV crashes during release compilation. The `Makefile` also exports this variable for local builds.
- **Documentation**: See `docs/docker-layer-optimization.md`, `docs/QUICKREF-docker-optimization.md`, `docs/QUICKREF-buildkit-cache-strategy.md`

### Docker Runtime Standardization
@@ -242,14 +243,14 @@ Completion listener advances workflow → Schedules successor tasks → Complete
**Migration Count**: 10 migrations (`000001` through `000010`) — see `migrations/` directory
- **Artifact System**: The `artifact` table stores metadata + structured data (progress entries via JSONB `data` column). The `artifact_version` table stores immutable content snapshots — either on disk (via `file_path` column) or in DB (via `content` BYTEA / `content_json` JSONB). Version numbering is auto-assigned via `next_artifact_version()` SQL function. A DB trigger (`enforce_artifact_retention`) auto-deletes oldest versions when count exceeds the artifact's `retention_limit`. `artifact.execution` is a plain BIGINT (no FK — execution is a hypertable). Progress-type artifacts use `artifact.data` (atomic JSON array append); file-type artifacts use `artifact_version` rows with `file_path` set. Binary content is excluded from default queries for performance (`SELECT_COLUMNS` vs `SELECT_COLUMNS_WITH_CONTENT`). **Visibility**: Each artifact has a `visibility` column (`artifact_visibility_enum`: `public` or `private`, DB default `private`). The `CreateArtifactRequest` DTO accepts `visibility` as `Option<ArtifactVisibility>` — when omitted the API route handler applies a **type-aware default**: `public` for Progress artifacts (informational status indicators), `private` for all other types. Callers can always override explicitly. Public artifacts are viewable by all authenticated users; private artifacts are restricted based on the artifact's `scope` (Identity, Pack, Action, Sensor) and `owner` fields. The visibility field is filterable via the search/list API (`?visibility=public`). Full RBAC enforcement is deferred — the column and basic query filtering are in place for future permission checks. **Notifications**: `artifact_created` and `artifact_updated` DB triggers (in migration `000008`) fire PostgreSQL NOTIFY with entity_type `artifact` and include `visibility` in the payload. 
The `artifact_updated` trigger extracts a progress summary (`progress_percent`, `progress_message`, `progress_entries`) from the last entry of the `data` JSONB array for progress-type artifacts. The Web UI `ExecutionProgressBar` component (`web/src/components/executions/ExecutionProgressBar.tsx`) renders an inline progress bar in the Execution Details card using the `useArtifactStream` hook (`web/src/hooks/useArtifactStream.ts`) for real-time WebSocket updates, with polling fallback via `useExecutionArtifacts`.
- **File-Based Artifact Storage**: File-type artifacts (FileBinary, FileDataTable, FileImage, FileText) use a shared filesystem volume instead of PostgreSQL BYTEA. The `artifact_version.file_path` column stores the relative path from the `artifacts_dir` root (e.g., `mypack/build_log/v1.txt`). Pattern: `{ref_with_dots_as_dirs}/v{version}.{ext}`. The artifact ref (globally unique) is used as the directory key — no execution ID in the path, so artifacts can outlive executions and be shared across them. **Endpoint**: `POST /api/v1/artifacts/{id}/versions/file` allocates a version number and file path without any file content; the execution process writes the file to `$ATTUNE_ARTIFACTS_DIR/{file_path}`. **Download**: `GET /api/v1/artifacts/{id}/download` and version-specific downloads check `file_path` first (read from disk), fall back to DB BYTEA/JSON. **Finalization**: After execution exits, the worker stats all file-backed versions for that execution and updates `size_bytes` on both `artifact_version` and parent `artifact` rows via direct DB access. **Cleanup**: Delete endpoints remove disk files before deleting DB rows; empty parent directories are cleaned up. **Backward compatible**: Existing DB-stored artifacts (`file_path = NULL`) continue to work unchanged.
- **Pack Component Loading Order**: Runtimes → Triggers → Actions (+ workflow definitions) → Sensors (dependency order). Both `PackComponentLoader` (Rust) and `load_core_pack.py` (Python) follow this order. When an action YAML contains a `workflow_file` field, the loader creates/updates the referenced `workflow_definition` record and links it to the action during the Actions phase.
### Workflow Execution Orchestration
- **Detection**: The `ExecutionScheduler` checks `action.workflow_def.is_some()` before dispatching to a worker. Workflow actions are orchestrated by the executor, not sent to workers.
- **Orchestration Flow**: Scheduler loads the `WorkflowDefinition`, builds a `TaskGraph`, creates a `workflow_execution` record, marks the parent execution as Running, builds an initial `WorkflowContext` from execution parameters and workflow vars, then dispatches entry-point tasks as child executions via MQ with rendered inputs.
- **Template Resolution**: Task inputs are rendered through `WorkflowContext.render_json()` before dispatching. Uses the expression engine for full operator/function support inside `{{ }}`. Canonical namespaces: `parameters`, `workflow` (mutable vars), `task` (results), `config` (pack config), `keystore` (secrets), `item`, `index`, `system`. Backward-compat aliases: `vars`/`variables` → `workflow`, `tasks` → `task`, bare names → `workflow` fallback. **Type-preserving**: pure template expressions like `"{{ item }}"` preserve the JSON type (integer `5` stays as `5`, not string `"5"`). Mixed expressions like `"Sleeping for {{ item }} seconds"` remain strings.
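  The type-preservation rule can be sketched with a minimal task input. This example reuses the `{{ item }}` and "Sleeping for …" examples from this section; the task name and the `core.sleep` ref and `seconds` field are hypothetical:

  ```yaml
  tasks:
    - name: wait
      action: core.sleep                     # hypothetical action ref
      with_items: "{{ workflow.delays }}"    # e.g. [1, 2, 5]
      input:
        # Pure template expression: preserves the item's JSON type,
        # so integer 5 stays 5, not string "5".
        seconds: "{{ item }}"
        # Mixed expression: always renders to a string.
        message: "Sleeping for {{ item }} seconds"
  ```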
- **Function Expressions**: `{{ result() }}` returns the last completed task's result. `{{ result().field.subfield }}` navigates into it. `{{ succeeded() }}`, `{{ failed() }}`, `{{ timed_out() }}` return booleans. These are evaluated by `WorkflowContext.try_evaluate_function_call()`.
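  As a sketch, using the expression forms listed above (the task name, action, and input keys are illustrative):

  ```yaml
  - name: report
    action: core.echo
    input:
      # Navigates into the last completed task's result.
      value: "{{ result().field.subfield }}"
      # Boolean function expression.
      prior_ok: "{{ succeeded() }}"
  ```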
- **Publish Directives**: Transition `publish` directives are evaluated when a transition fires. Published variables are persisted to the `workflow_execution.variables` column and available to subsequent tasks via the `workflow` namespace (e.g., `{{ workflow.number_list }}`). Values can be **any JSON-compatible type**: string templates (e.g., `number_list: "{{ result().data.items }}"`), booleans (`validation_passed: true`), numbers (`count: 42`), arrays, objects, or null. The `PublishDirective::Simple` variant stores `HashMap<String, serde_json::Value>`. String values are template-rendered with type preservation (pure `{{ }}` expressions preserve the underlying JSON type); non-string values (booleans, numbers, null) pass through `render_json` unchanged — `true` stays as boolean `true`, not string `"true"`. The `PublishVar` struct in `graph.rs` uses a `value: JsonValue` field (with `#[serde(alias = "expression")]` for backward compat with stored task graphs).
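  A minimal sketch of a typed `publish` block, reusing the example values above (the surrounding transition and `do` target are hypothetical):

  ```yaml
  next:
    - when: "{{ succeeded() }}"
      publish:
        number_list: "{{ result().data.items }}"  # pure template: array type preserved
        validation_passed: true                   # boolean stays boolean, not "true"
        count: 42                                 # number passes through unchanged
      do:
        - next_task
  ```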
- **Child Task Dispatch**: Each workflow task becomes a child execution with the task's actual action ref (e.g., `core.echo`), `workflow_task` metadata linking it to the `workflow_execution` record, and a parent reference to the workflow execution. Child executions re-enter the normal scheduling pipeline, so nested workflows work recursively.
- **with_items Expansion**: Tasks declaring `with_items: "{{ expr }}"` are expanded into child executions. The expression is resolved via the `WorkflowContext` to produce a JSON array, then each item gets its own child execution with `item`/`index` set on the context and `task_index` in `WorkflowTaskMetadata`. Completion tracking waits for ALL sibling items to finish before marking the task as completed/failed and advancing the workflow.
- **with_items Concurrency Limiting**: ALL child execution records are created in the database up front (with fully-rendered inputs), but only the first `N` are published to the message queue where `N` is the task's `concurrency` value (**default: 1**, i.e. serial execution). The remaining children stay at `Requested` status in the DB. As each item completes, `advance_workflow` counts in-flight siblings (`scheduling`/`scheduled`/`running`), calculates free slots (`concurrency - in_flight`), and calls `publish_pending_with_items_children()` which queries for `Requested`-status siblings ordered by `task_index` and publishes them. The DB `status = 'requested'` query is the authoritative source of undispatched items — no auxiliary state in workflow variables needed. The task is only marked complete when all siblings reach a terminal state. To run all items in parallel, explicitly set `concurrency` to the list length or a suitably large number.
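  A minimal sketch of a concurrency-limited `with_items` task (task name and variable are hypothetical; `concurrency` semantics as described above):

  ```yaml
  - name: process
    action: core.echo
    with_items: "{{ workflow.number_list }}"
    concurrency: 2   # at most 2 items in flight; omit for the default of 1 (serial)
  ```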
@@ -264,7 +265,13 @@ Completion listener advances workflow → Schedules successor tasks → Complete
- Development packs in `./packs.dev/` are bind-mounted directly for instant updates
- **Pack Binaries**: Native binaries (sensors) built separately with `./scripts/build-pack-binaries.sh`
- **Action Script Resolution**: Worker constructs file paths as `{packs_base_dir}/{pack_ref}/actions/{entrypoint}`
- **Workflow Action YAML (`workflow_file` field)**: An action YAML may include a `workflow_file` field (e.g., `workflow_file: workflows/timeline_demo.yaml`) pointing to a workflow definition file relative to the `actions/` directory. When present, the `PackComponentLoader` reads and parses the referenced workflow YAML, creates/updates a `workflow_definition` record, and links the action to it via `action.workflow_def`. This separates action-level metadata (ref, label, parameters, policies) from the workflow graph (tasks, transitions, variables), and allows **multiple actions to reference the same workflow file** with different parameter schemas or policy configurations. Workflow actions have no `runner_type` (runtime is `None`) — the executor orchestrates child task executions rather than sending to a worker.
- **Action-linked workflow files omit action-level metadata**: Workflow files referenced via `workflow_file` should contain **only the execution graph**: `version`, `vars`, `tasks`, `output_map`. The `ref`, `label`, `description`, `parameters`, `output`, and `tags` fields are omitted — the action YAML is the single authoritative source for those values. The `WorkflowDefinition` parser accepts empty `ref`/`label` (defaults to `""`), and the loader / registrar fall back to the action YAML (or filename-derived values) when they are missing. Standalone workflow files (in `workflows/`) still carry their own `ref`/`label` since they have no companion action YAML.
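  Putting the split together, a hedged sketch of the two files. The refs, parameter schema layout, `version` value, and task content are assumptions for illustration; the two documents below live in two separate files, shown in one block for brevity:

  ```yaml
  # actions/deploy.yaml — authoritative action metadata
  ref: mypack.deploy
  label: Deploy
  workflow_file: workflows/deploy.workflow.yaml
  parameters:
    target:
      type: string
      required: true
  ---
  # actions/workflows/deploy.workflow.yaml — graph only, no ref/label/parameters
  version: 1        # assumed value
  vars:
    status: pending
  tasks:
    - name: run
      action: core.echo
  ```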
- **Workflow File Storage**: The visual workflow builder save endpoints (`POST /api/v1/packs/{pack_ref}/workflow-files` and `PUT /api/v1/workflows/{ref}/file`) write **two files** per workflow:
  1. **Action YAML** at `{packs_base_dir}/{pack_ref}/actions/{name}.yaml` — action-level metadata (`ref`, `label`, `description`, `parameters`, `output`, `tags`, `workflow_file` reference, `enabled`). Built by `build_action_yaml()` in `crates/api/src/routes/workflows.rs`.
  2. **Workflow YAML** at `{packs_base_dir}/{pack_ref}/actions/workflows/{name}.workflow.yaml` — graph-only (`version`, `vars`, `tasks`, `output_map`). The `strip_action_level_fields()` function removes `ref`, `label`, `description`, `parameters`, `output`, and `tags` from the definition before writing.

  Pack-bundled workflows use the same directory layout and are discovered during pack registration when their companion action YAML contains `workflow_file`.
- **Workflow File Discovery (dual-directory scanning)**: The `WorkflowLoader` scans **two** directories when loading workflows for a pack: (1) `{pack_dir}/workflows/` (legacy standalone workflow files), and (2) `{pack_dir}/actions/workflows/` (visual-builder and action-linked workflow files). Files with `.workflow.yaml` suffix have the `.workflow` portion stripped when deriving the workflow name/ref (e.g., `deploy.workflow.yaml` → name `deploy`, ref `pack.deploy`). If the same ref appears in both directories, `actions/workflows/` wins. The `reload_workflow` method searches `actions/workflows/` first, trying `.workflow.yaml`, `.yaml`, `.workflow.yml`, and `.yml` extensions.
- **Task Model (Orquesta-aligned)**: Tasks are purely action invocations — there is no task `type` field or task-level `when` condition in the UI model. Parallelism is implicit (multiple `do` targets in a transition fan out into parallel branches). Conditions belong exclusively on transitions (`next[].when`). Each task has: `name`, `action`, `input`, `next` (transitions), `delay`, `retry`, `timeout`, `with_items`, `batch_size`, `concurrency`, `join`.
- The backend `Task` struct (`crates/common/src/workflow/parser.rs`) still supports `type` and task-level `when` for backward compatibility, but the UI never sets them.
- **Task Transition Model (Orquesta-style)**: Tasks use an ordered `next` array of transitions instead of flat `on_success`/`on_failure`/`on_complete`/`on_timeout` fields. Each transition has:
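  For instance, a sketch of an ordered `next` array (task names are hypothetical; multiple `do` targets fan out into parallel branches as described above):

  ```yaml
  - name: build
    action: core.echo
    next:
      - when: "{{ succeeded() }}"
        do:              # two targets fan out into parallel branches
          - deploy
          - notify
      - when: "{{ failed() }}"
        do:
          - cleanup
  ```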
@@ -315,6 +322,7 @@ Completion listener advances workflow → Schedules successor tasks → Complete
- **Web UI**: `extractProperties()` in `ParamSchemaForm.tsx` is the single extraction function for all schema types. Only handles flat format.
- **SchemaBuilder**: Visual schema editor reads and writes flat format with `required` and `secret` checkboxes per parameter.
- **Backend Validation**: `flat_to_json_schema()` in `crates/api/src/validation/params.rs` converts flat format to JSON Schema internally for `jsonschema` crate validation. This conversion is an implementation detail — external interfaces always use flat format.
- **Execution Config Format (Flat)**: The `execution.config` JSONB column always stores parameters in **flat format** — the object itself IS the parameters map (e.g., `{"url": "https://...", "method": "GET"}`). This is consistent across all execution sources: manual API calls, rule-triggered enforcements, and workflow task children. There is **no `{"parameters": {...}}` wrapper** — never nest parameters under a `"parameters"` key. The worker reads `config` as a flat object and passes each key-value pair as an action parameter. The scheduler's `extract_workflow_params()` helper treats the config object directly as the parameters map.
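  Using this section's own example values, the stored shapes look like:

  ```yaml
  # Correct — execution.config IS the flat parameters map:
  config: { "url": "https://...", "method": "GET" }

  # Wrong — never wrap parameters in a nested key:
  # config: { "parameters": { "url": "https://...", "method": "GET" } }
  ```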
- **Parameter Delivery**: Actions receive parameters via stdin as JSON (never environment variables)
- **Output Format**: Actions declare output format (text/json/yaml) - json/yaml are parsed into execution.result JSONB
- **Standard Environment Variables**: Worker provides execution context via `ATTUNE_*` environment variables:
@@ -444,12 +452,24 @@ input:
- **Styling**: Tailwind utility classes
- **Dev Server**: `npm run dev` (typically :3000 or :5173)
- **Build**: `npm run build`
- **Workflow Timeline DAG**: Prefect-style workflow run timeline visualization on the execution detail page for workflow executions
- Components in `web/src/components/executions/workflow-timeline/` (WorkflowTimelineDAG, TimelineRenderer, types, data, layout)
- Pure SVG renderer — no D3, no React Flow, no additional npm dependencies
- Renders child task executions as horizontal duration bars on a time axis with curved Bezier dependency edges
- **Data flow**: `WorkflowTimelineDAG` (orchestrator) fetches child executions via `useChildExecutions` + workflow definition via `useWorkflow(actionRef)` → `data.ts` transforms into `TimelineTask[]`/`TimelineEdge[]`/`TimelineMilestone[]` → `layout.ts` computes lane assignments + positions → `TimelineRenderer` renders SVG
- **Edge coloring from workflow metadata**: Fetches the workflow definition's `next` transition array, classifies `when` expressions (`{{ succeeded() }}` → green, `{{ failed() }}` → red dashed, `{{ timed_out() }}` → orange dash-dot, unconditional → gray), and reads `__chart_meta__` custom labels/colors
- **Task bars**: Colored by state (green=completed, blue=running with pulse animation, red=failed, gray=pending, orange=timeout). Left accent bar, text label with ellipsis clipping, timeout indicator badge.
- **Milestones**: Synthetic start/end diamond nodes + merge/fork junctions when fan-in/fan-out exceeds 3 tasks
- **Lane packing**: Greedy algorithm assigns tasks to non-overlapping y-lanes sorted by start time, with optional reordering to cluster tasks sharing upstream dependencies
- **Interactions**: Hover tooltip (name, state, times, duration, retries, upstream/downstream counts), click-to-select with BFS path highlighting, double-click to navigate to child execution, horizontal zoom (mouse wheel anchored to cursor), alt+drag pan, expand/compact toggle
- **Fallback**: When no workflow definition is available, infers dependency edges from task timing heuristics
- **Integration**: Rendered in `ExecutionDetailPage.tsx` above `WorkflowTasksPanel`, conditioned on `isWorkflow`. Shares TanStack Query cache with WorkflowTasksPanel. Accepts `ParentExecutionInfo` interface (satisfied by both `ExecutionResponse` and `ExecutionSummary`).
- **Workflow Builder**: Visual node-based workflow editor at `/actions/workflows/new` and `/actions/workflows/:ref/edit`
- Components in `web/src/components/workflows/` (ActionPalette, WorkflowCanvas, TaskNode, WorkflowEdges, TaskInspector)
- Types and conversion utilities in `web/src/types/workflow.ts`
- Hooks in `web/src/hooks/useWorkflows.ts`
- Saves workflow files to `{packs_base_dir}/{pack_ref}/actions/workflows/{name}.workflow.yaml` via dedicated API endpoints
- **Visual / Raw YAML toggle**: Toolbar has a segmented toggle to switch between the visual node-based builder and a two-panel read-only YAML preview (generated via `js-yaml`). Raw YAML mode replaces the canvas, palette, and inspector with side-by-side panels: **Action YAML** (left, blue — `actions/{name}.yaml`: ref, label, parameters, output, tags, `workflow_file` reference) and **Workflow YAML** (right, green — `actions/workflows/{name}.workflow.yaml`: version, vars, tasks, output_map — graph only). Each panel has its own copy button and a description bar explaining the file's role. The `builderStateToGraph()` function extracts the graph-only definition, and `builderStateToActionYaml()` extracts the action metadata.
- **Drag-handle connections**: TaskNode has output handles (green=succeeded, red=failed, gray=always) and an input handle (top). Drag from an output handle to another node's input handle to create a transition.
- **Transition customization**: Users can rename transitions (custom `label`) and assign custom colors (CSS color string or preset swatches) via the TaskInspector. Custom colors/labels are persisted in the workflow YAML and rendered on the canvas edges.
- **Edge waypoints & label dragging**: Transition edges support intermediate waypoints for custom routing. Click an edge to select it, then:
@@ -509,16 +529,33 @@ make db-reset # Drop & recreate DB
cargo install --path crates/cli                 # Install CLI
attune auth login                               # Login
attune pack list                                # List packs
attune pack create --ref my_pack                # Create empty pack (non-interactive)
attune pack create -i                           # Create empty pack (interactive prompts)
attune pack upload ./path/to/pack               # Upload local pack to API (works with Docker)
attune pack register /opt/attune/packs/mypack   # Register from API-visible path
attune action execute <ref> --param key=value
attune execution list                           # Monitor executions
attune workflow upload actions/deploy.yaml      # Upload workflow action to existing pack
attune workflow upload actions/deploy.yaml --force  # Update existing workflow
attune workflow list                            # List all workflows
attune workflow list --pack core                # List workflows in a pack
attune workflow show core.install_packs         # Show workflow details + task summary
attune workflow delete core.my_workflow --yes   # Delete a workflow
```
**Pack Upload vs Register**:
- `attune pack upload <local-path>` — Tarballs the local directory and POSTs it to `POST /api/v1/packs/upload`. Works regardless of whether the API is local or in Docker. This is the primary way to install packs from your local machine into a Dockerized system.
- `attune pack register <server-path>` — Sends a filesystem path string to the API (`POST /api/v1/packs/register`). Only works if the path is accessible from inside the API container (e.g. `/opt/attune/packs/...` or `/opt/attune/packs.dev/...`).
**Workflow Upload** (`attune workflow upload <action-yaml-path>`):
- Reads the local action YAML file and extracts the `workflow_file` field to find the companion workflow YAML
- Determines the pack from the action ref (e.g., `mypack.deploy` → pack `mypack`, name `deploy`)
- The `workflow_file` path is resolved relative to the action YAML's parent directory (same as how pack loaders resolve it relative to the `actions/` directory)
- Constructs a `SaveWorkflowFileRequest` JSON payload combining action metadata (label, parameters, output, tags) with the workflow definition (version, vars, tasks, output_map) and POSTs to `POST /api/v1/packs/{pack_ref}/workflow-files`
- On 409 Conflict (workflow already exists), fails unless `--force` is passed, in which case it PUTs to `PUT /api/v1/workflows/{ref}/file` to update
- Does NOT require a full pack upload — individual workflow actions can be added to existing packs independently
- **Important**: The action YAML MUST contain a `workflow_file` field; regular (non-workflow) actions should be uploaded as part of a pack
**Pack Upload API endpoint**: `POST /api/v1/packs/upload` — accepts `multipart/form-data` with:
- `pack` (required): a `.tar.gz` archive of the pack directory
- `force` (optional, text): `"true"` to overwrite an existing pack with the same ref
@@ -606,20 +643,21 @@ When reporting, ask: "Should I fix this first or continue with [original task]?"
4. **NEVER** commit secrets in config files (use env vars in production)
5. **NEVER** hardcode schema prefixes in SQL queries - rely on PostgreSQL `search_path` mechanism
6. **NEVER** copy packs into Dockerfiles - they are mounted as volumes
7. **NEVER** put workflow definition content directly in action YAML — use a separate `.workflow.yaml` file in `actions/workflows/` and reference it via `workflow_file` in the action YAML
8. **ALWAYS** use PostgreSQL enum type mappings for custom enums
9. **ALWAYS** use transactions for multi-table operations
10. **ALWAYS** start with `attune/` or the correct crate name when specifying file paths
11. **ALWAYS** convert runtime names to lowercase for comparison (database may store capitalized)
12. **ALWAYS** use optimized Dockerfiles for new services (selective crate copying)
13. **REMEMBER** IDs are `i64`, not `i32` or `uuid`
14. **REMEMBER** schema is determined by `search_path`, not hardcoded in queries (production uses `attune`, development uses `public`)
15. **REMEMBER** to regenerate SQLx metadata after schema-related changes: `cargo sqlx prepare`
16. **REMEMBER** packs are volumes - update with restart, not rebuild
17. **REMEMBER** to build pack binaries separately: `./scripts/build-pack-binaries.sh`
18. **REMEMBER** when adding mutable columns to `execution` or `worker`, add a corresponding `IS DISTINCT FROM` check to the entity's history trigger function in the TimescaleDB migration. Events and enforcements are hypertables without history tables — do NOT add frequently-mutated columns to them. Execution is both a hypertable AND has an `execution_history` table (because it is mutable, with ~4 updates per row).
19. **REMEMBER** for large JSONB columns in history triggers (like `execution.result`), use `_jsonb_digest_summary()` instead of storing the raw value — see migration `000009_timescaledb_history`
20. **NEVER** use `SELECT *` on tables that have DB-only columns not in the Rust `FromRow` struct (e.g., `execution.is_workflow` and `execution.workflow_def` exist in SQL but not in the `Execution` model) — such columns cause runtime deserialization failures. Define a `SELECT_COLUMNS` constant in the repository (see `execution.rs`, `pack.rs`, `runtime_version.rs` for examples) and reference it from all queries — including queries outside the repository (e.g., `timeout_monitor.rs` imports `execution::SELECT_COLUMNS`).
21. **REMEMBER** `execution`, `event`, and `enforcement` are all TimescaleDB hypertables — they **cannot be the target of FK constraints**. Any column referencing them (e.g., `inquiry.execution`, `workflow_execution.execution`, `execution.parent`) is a plain BIGINT with no FK and may become a dangling reference.
## Deployment
- **Target**: Distributed deployment with separate service instances
@@ -630,7 +668,7 @@ When reporting, ask: "Should I fix this first or continue with [original task]?"
- **Web UI**: Static files served separately or via API service
## Current Development Status
- ✅ **Complete**: Database migrations (21 tables, 10 migration files), API service (most endpoints), common library, message queue infrastructure, repository layer, JWT auth, CLI tool, Web UI (basic + workflow builder + workflow timeline DAG), Executor service (core functionality + workflow orchestration), Worker service (shell/Python execution), Runtime version data model, constraint matching, worker version selection pipeline, version verification at startup, per-version environment isolation, TimescaleDB entity history tracking (execution, worker), Event, enforcement, and execution tables as TimescaleDB hypertables (time-series with retention/compression), History API endpoints (generic + entity-specific with pagination & filtering), History UI panels on entity detail pages (execution), TimescaleDB continuous aggregates (6 hourly rollup views with auto-refresh policies), Analytics API endpoints (7 endpoints under `/api/v1/analytics/` — dashboard, execution status/throughput/failure-rate, event volume, worker status, enforcement volume), Analytics dashboard widgets (bar charts, stacked status charts, failure rate ring gauge, time range selector), Workflow execution orchestration (scheduler detects workflow actions, creates child task executions, completion listener advances workflow via transitions), Workflow template resolution (type-preserving `{{ }}` rendering in task inputs), Workflow `with_items` expansion (parallel child executions per item), Workflow `with_items` concurrency limiting (sliding-window dispatch driven by `Requested`-status DB rows), Workflow `publish` directive processing (variable propagation between tasks), Workflow function expressions (`result()`, `succeeded()`, `failed()`, `timed_out()`), Workflow expression engine (full arithmetic/comparison/boolean/membership operators, 30+ built-in functions, recursive-descent parser), Canonical workflow namespaces (`parameters`, `workflow`, `task`, `config`, `keystore`, `item`, `index`, `system`), Artifact content system (versioned file/JSON storage, progress-append semantics, binary upload/download, retention enforcement, execution-linked artifacts, 18 API endpoints under `/api/v1/artifacts/`, file-backed disk storage via shared volume for file-type artifacts), CLI `--wait` flag (WebSocket-first with polling fallback — connects to notifier on port 8081, subscribes to execution, returns immediately on terminal status; falls back to exponential-backoff REST polling if WS unavailable; polling always gets at least a 10s budget regardless of how long the WS path ran), Workflow Timeline DAG visualization (Prefect-style time-aligned Gantt+DAG on execution detail page, pure SVG, transition-aware edge coloring from workflow definition metadata, hover tooltips, click-to-highlight path, zoom/pan)
- 🔄 **In Progress**: Sensor service, advanced workflow features (nested workflow context propagation), Python runtime dependency management, API/UI endpoints for runtime version management, Artifact UI (web UI for browsing/downloading artifacts), Notifier service WebSocket (functional but lacks auth — the WS connection is unauthenticated; the subscribe filter controls visibility)
- 📋 **Planned**: Execution policies, monitoring, pack registry system, configurable retention periods via admin settings, export/archival to external storage