proper sql filtering

This commit is contained in:
2026-03-01 20:43:48 -06:00
parent 6b9d7d6cf2
commit bbe94d75f8
54 changed files with 6692 additions and 928 deletions

View File

@@ -0,0 +1,100 @@
# SQL-Side Filtering & Pagination Audit
**Date**: 2026-02-05
**Scope**: All list/search API endpoints across every table resource
## Problem
Following the discovery that the execution search endpoint was performing filtering in memory rather than in the database, an audit of all other list/search endpoints revealed the same pattern was pervasive across the codebase. Two categories of issues were found:
### 1. In-Memory Filtering (fetch all rows, then `.retain()`)
- **Events** (`list_events`): `source` and `rule_ref` filters applied in memory
- **Enforcements** (`list_enforcements`): `trigger_ref` filter applied in memory; filters were mutually exclusive (if/else-if) rather than combinable
- **Keys** (`list_keys`): `owner` filter applied in memory
- **Inquiries** (`list_inquiries`): `assigned_to` filter applied in memory; `status`/`execution` were mutually exclusive
### 2. In-Memory Pagination (fetch ALL rows, then slice)
- **Actions**: `list_actions`, `list_actions_by_pack`
- **Triggers**: `list_triggers`, `list_enabled_triggers`, `list_triggers_by_pack`
- **Sensors**: `list_sensors`, `list_enabled_sensors`, `list_sensors_by_pack`, `list_sensors_by_trigger`
- **Rules**: `list_rules`, `list_enabled_rules`, `list_rules_by_pack`, `list_rules_by_action`, `list_rules_by_trigger`
- **Events**: `list_events`
- **Enforcements**: `list_enforcements`
- **Keys**: `list_keys`
- **Inquiries**: `list_inquiries`, `list_inquiries_by_status`, `list_inquiries_by_execution`
- **Executions**: `list_executions_by_status`, `list_executions_by_enforcement` (older path-based endpoints)
- **Workflows**: `list_workflows`, `list_workflows_by_pack`
- **Execution Stats**: `get_execution_stats` fetched all rows to count by status
Only `list_packs` was already correct (using `list_paginated` + `count`).
## Solution
Added `search()`/`list_search()` methods to every entity repository, following the pattern established by `ExecutionRepository::search()`:
- Uses `sqlx::QueryBuilder` to dynamically build WHERE clauses
- All filters are combinable (AND) rather than mutually exclusive
- Pagination (LIMIT/OFFSET) is applied in SQL
- A parallel COUNT query shares the same WHERE clause for accurate total counts
- Returns a `SearchResult { rows, total }` struct
### Repository Changes
| Repository | New Types | New Method |
|---|---|---|
| `EventRepository` | `EventSearchFilters`, `EventSearchResult` | `search()` |
| `EnforcementRepository` | `EnforcementSearchFilters`, `EnforcementSearchResult` | `search()` |
| `KeyRepository` | `KeySearchFilters`, `KeySearchResult` | `search()` |
| `InquiryRepository` | `InquirySearchFilters`, `InquirySearchResult` | `search()` |
| `ActionRepository` | `ActionSearchFilters`, `ActionSearchResult` | `list_search()` |
| `TriggerRepository` | `TriggerSearchFilters`, `TriggerSearchResult` | `list_search()` |
| `SensorRepository` | `SensorSearchFilters`, `SensorSearchResult` | `list_search()` |
| `RuleRepository` | `RuleSearchFilters`, `RuleSearchResult` | `list_search()` |
| `WorkflowDefinitionRepository` | `WorkflowSearchFilters`, `WorkflowSearchResult` | `list_search()` |
### Route Handler Changes
Every list endpoint was updated to:
1. Build a filters struct from query/path parameters
2. Call the repository's `search()`/`list_search()` method
3. Map the result rows to DTOs
4. Return `PaginatedResponse` using the SQL-provided total count
### Additional Fixes
- **`get_execution_stats`**: Replaced fetch-all-and-count-in-Rust with `SELECT status::text, COUNT(*) FROM execution GROUP BY status` — a single SQL query.
- **`EnforcementQueryParams`**: Added `rule_ref` filter field (was missing from the DTO).
- **Enforcement filters**: Changed from mutually exclusive (if/else-if for status/rule/event) to combinable (AND).
- **Inquiry filters**: Changed from mutually exclusive (if/else-if for status/execution) to combinable (AND).
- **Workflow tag filtering**: Replaced multi-query-then-dedup approach with PostgreSQL array overlap operator (`tags && ARRAY[...]`) in a single query.
## Files Changed
### Repositories (`crates/common/src/repositories/`)
- `event.rs` — Added `EventSearchFilters`, `EventSearchResult`, `EnforcementSearchFilters`, `EnforcementSearchResult`, `EventRepository::search()`, `EnforcementRepository::search()`
- `key.rs` — Added `KeySearchFilters`, `KeySearchResult`, `KeyRepository::search()`
- `inquiry.rs` — Added `InquirySearchFilters`, `InquirySearchResult`, `InquiryRepository::search()`
- `action.rs` — Added `ActionSearchFilters`, `ActionSearchResult`, `ActionRepository::list_search()`
- `trigger.rs` — Added `TriggerSearchFilters`, `TriggerSearchResult`, `SensorSearchFilters`, `SensorSearchResult`, `TriggerRepository::list_search()`, `SensorRepository::list_search()`
- `rule.rs` — Added `RuleSearchFilters`, `RuleSearchResult`, `RuleRepository::list_search()`
- `workflow.rs` — Added `WorkflowSearchFilters`, `WorkflowSearchResult`, `WorkflowDefinitionRepository::list_search()`
### Routes (`crates/api/src/routes/`)
- `actions.rs``list_actions`, `list_actions_by_pack`
- `triggers.rs` — All 9 list endpoints (triggers + sensors)
- `rules.rs` — All 5 list endpoints
- `events.rs``list_events`, `list_enforcements`
- `keys.rs``list_keys`
- `inquiries.rs``list_inquiries`, `list_inquiries_by_status`, `list_inquiries_by_execution`
- `executions.rs``list_executions_by_status`, `list_executions_by_enforcement`, `get_execution_stats`
- `workflows.rs``list_workflows`, `list_workflows_by_pack`
### DTOs (`crates/api/src/dto/`)
- `event.rs` — Added `rule_ref` field to `EnforcementQueryParams`
## Impact
- **Correctness**: Filters that were silently ignored or mutually exclusive now work correctly and are combinable
- **Performance**: Endpoints no longer fetch entire tables into memory — pagination and filtering happen in PostgreSQL
- **Scalability**: Total counts are accurate (not capped at 1000 by the `List` trait's hardcoded LIMIT)
- **Consistency**: Every list endpoint now follows the same pattern as the execution search

View File

@@ -0,0 +1,106 @@
# Expression Engine for Workflow Templates
**Date**: 2026-02-28
**Scope**: `crates/common/src/workflow/expression/`, `crates/executor/src/workflow/context.rs`
## Summary
Implemented a complete expression evaluation engine for workflow templates, replacing the previous simple dot-path variable lookup and string-based condition evaluation with a full recursive-descent parser and evaluator operating over JSON values.
## Motivation
Workflows previously had no way to perform calculations, comparisons, or transformations within template expressions (`{{ }}`). The `evaluate_condition()` method could only check if a rendered string was truthy — it couldn't evaluate expressions like `result().code == 200 and succeeded()` or `length(items) > 3`. Publish directives and task inputs had no way to compute derived values.
## Architecture
The engine follows a classic three-phase interpreter design, located in `crates/common/src/workflow/expression/`:
1. **Lexer** (`tokenizer.rs`) — converts expression strings into tokens
2. **Parser** (`parser.rs`) — recursive-descent parser producing an AST
3. **Evaluator** (`evaluator.rs`) — walks the AST against an `EvalContext` trait to produce `JsonValue` results
The `EvalContext` trait decouples the expression engine from the `WorkflowContext`, allowing it to be used in other contexts (e.g., rule condition evaluation, template resolver) in the future.
## Operator Precedence (lowest to highest)
1. `or`
2. `and`
3. `not` (unary)
4. `==`, `!=`, `<`, `>`, `<=`, `>=`, `in`
5. `+`, `-`
6. `*`, `/`, `%`
7. Unary `-`
8. Postfix: `.field`, `[index]`, `(args)`
## Supported Operators
### Arithmetic
- `+` — number addition, string concatenation, array concatenation
- `-`, `*`, `/`, `%` — standard numeric operations
- Unary `-` — negation
- Integer division returns float when not evenly divisible (e.g., `10 / 4``2.5`, `10 / 5``2`)
### Comparison
- `==`, `!=` — deep equality (recursive for objects/arrays, cross-type for int/float)
- `>`, `<`, `>=`, `<=` — ordering for numbers, strings, and lists
- No implicit type coercion: `"3" == 3``false`
### Boolean
- `and`, `or` — short-circuit evaluation
- `not` — logical negation
### Membership & Access
- `.` — object property access
- `[n]` — array index (supports negative indexing), object bracket access, string character access
- `in` — membership test (item in array, key in object, substring in string)
## Built-in Functions
### Type Conversion
`string(v)`, `number(v)`, `int(v)`, `bool(v)`
### Introspection
`type_of(v)`, `length(v)`, `keys(obj)`, `values(obj)`
### Math
`abs(n)`, `floor(n)`, `ceil(n)`, `round(n)`, `min(a,b)`, `max(a,b)`, `sum(arr)`
### String
`lower(s)`, `upper(s)`, `trim(s)`, `split(s, sep)`, `join(arr, sep)`, `replace(s, old, new)`, `starts_with(s, prefix)`, `ends_with(s, suffix)`, `match(pattern, s)` (regex)
### Collection
`contains(haystack, needle)`, `reversed(v)` (arrays and strings), `sort(arr)`, `unique(arr)`, `flat(arr)`, `zip(a, b)`, `range(n)` / `range(start, end)`, `slice(v, start, end)`, `index_of(haystack, needle)`, `count(haystack, needle)`, `merge(obj_a, obj_b)`, `chunks(arr, size)`
### Workflow-specific (via `EvalContext`)
`result()`, `succeeded()`, `failed()`, `timed_out()`
## Design Decisions
- **No implicit type coercion**: Arithmetic between strings and numbers is an error. Only int/float cross-comparison is allowed (e.g., `3 == 3.0``true`).
- **Python-like truthiness**: `null`, `false`, `0`, `""`, `[]`, `{}` are falsy; everything else is truthy.
- **Integer preservation**: Operations on two integers produce integers; operations involving any float produce floats. Integer division that isn't evenly divisible promotes to float.
- **Array literals**: Supported in expressions: `[1, 2, 3]` (with optional trailing comma).
- **Negative array indexing**: `arr[-1]` returns the last element.
## Integration with WorkflowContext
- `WorkflowContext` implements the `EvalContext` trait, bridging variable resolution (`parameters`, `item`, `index`, `system`, `task`, `variables`, and direct variable names) and workflow functions (`result()`, `succeeded()`, `failed()`, `timed_out()`)
- `evaluate_expression()` now delegates entirely to the expression engine
- `evaluate_condition()` tries the expression engine first (for bare expressions like `x > 5`), with fallback to template rendering for backward compatibility with `{{ }}` wrapper syntax
- All existing tests pass without modification — backward compatibility is fully preserved
## Test Coverage
- **63 tests** in `crates/common/src/workflow/expression/` covering tokenizer, parser, and evaluator
- **28 tests** in `crates/executor/src/workflow/context.rs` (17 existing + 11 new) covering integration with WorkflowContext
- New tests cover: condition comparisons, boolean operators, `in` operator, function calls in conditions, `length()` in conditions, arithmetic in templates, string concatenation, nested function calls, bracket access, and type conversion
## Files Changed
- **Created**: `crates/common/src/workflow/expression/mod.rs` — module entry point and integration tests
- **Created**: `crates/common/src/workflow/expression/ast.rs` — AST node types
- **Created**: `crates/common/src/workflow/expression/tokenizer.rs` — lexer
- **Created**: `crates/common/src/workflow/expression/parser.rs` — recursive-descent parser
- **Created**: `crates/common/src/workflow/expression/evaluator.rs` — AST evaluator with all built-in functions
- **Modified**: `crates/common/src/workflow/mod.rs` — added `expression` module
- **Modified**: `crates/executor/src/workflow/context.rs` — integrated expression engine, replaced old evaluate_expression/evaluate_condition, implemented EvalContext trait, removed unused get_nested_value, added new tests