
End-to-End Test Plan

Status: 📋 Planning Phase
Last Updated: 2026-01-27
Purpose: Comprehensive test plan for validating complete automation flows across all Attune services


Executive Summary

This document outlines the end-to-end (E2E) test strategy for the Attune automation platform. E2E tests validate the complete event flow from trigger detection through action execution, ensuring all five microservices work together correctly.

Critical Event Flow:

Sensor → Trigger fires → Event created → Rule evaluates → 
Enforcement created → Execution scheduled → Worker executes Action → 
Results captured → Notifications sent

Test Priorities:

  • Tier 1 (Core Flows): 8 tests - Basic automation lifecycle, essential for MVP
  • Tier 2 (Orchestration): 13 tests - Workflows, data flow, error handling
  • Tier 3 (Advanced): 20 tests - Edge cases, performance, advanced features

Total Test Scenarios: 41 comprehensive tests covering all platform capabilities


Test Infrastructure Requirements

Services Required

All E2E tests require the following services to be running:

  1. PostgreSQL 14+ - Main data store
  2. RabbitMQ 3.12+ - Message queue for inter-service communication
  3. attune-api - REST API gateway (port 18080 for tests)
  4. attune-executor - Execution orchestration
  5. attune-worker - Action execution engine
  6. attune-sensor - Event monitoring and trigger detection
  7. attune-notifier - Real-time notifications (WebSocket)

Test Environment Configuration

Config File: config.e2e.yaml

  • Separate database: attune_e2e
  • Test-specific ports to avoid conflicts
  • Reduced timeouts for faster test execution
  • Verbose logging for debugging
  • Test fixtures directory: tests/fixtures/packs/

Test Fixtures

Location: tests/fixtures/

  • packs/test_pack/ - Simple test pack with echo action
  • packs/workflow_pack/ - Pack with workflow definitions
  • packs/timer_pack/ - Pack with timer triggers
  • packs/webhook_pack/ - Pack with webhook triggers
  • secrets/ - Test secrets for secure value injection
  • seed_data.sql - Baseline test data

Test Tier Breakdown

Tier 1: Core Automation Flows (MVP Essential)

These tests validate the fundamental automation lifecycle and must pass before MVP release.

T1.1: Interval Timer Automation

Priority: Critical
Duration: ~30 seconds
Description: Action executes repeatedly on interval timer

Test Steps:

  1. Register test pack via API
  2. Create interval timer trigger (every 5 seconds)
  3. Create simple echo action
  4. Create rule linking timer → action
  5. Wait for 3 trigger events (15 seconds)
  6. Verify 3 enforcements created
  7. Verify 3 executions completed successfully

Success Criteria:

  • Timer fires every 5 seconds (±500ms tolerance)
  • Each timer event creates enforcement
  • Each enforcement creates execution
  • All executions reach 'succeeded' status
  • Action output captured in execution results
  • No errors in any service logs

Dependencies: Sensor, Executor, Worker services functional
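The ±500ms spacing check can be sketched as a small helper. This is an illustrative assertion helper, not Attune code; it assumes the test harness has collected event timestamps as seconds-since-epoch floats.

```python
# Hypothetical helper for T1.1: verify recorded timer events are evenly
# spaced within the ±500 ms tolerance.

def check_interval_spacing(timestamps, interval_s=5.0, tolerance_s=0.5):
    """True if every consecutive pair differs by interval_s ± tolerance_s."""
    deltas = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return all(abs(d - interval_s) <= tolerance_s for d in deltas)
```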


T1.2: Date Timer (One-Shot Execution)

Priority: Critical
Duration: ~10 seconds
Description: Action executes once at specific future time

Test Steps:

  1. Create date timer trigger (5 seconds from now)
  2. Create action with unique marker output
  3. Create rule linking timer → action
  4. Wait 7 seconds
  5. Verify exactly 1 execution occurred
  6. Wait additional 10 seconds
  7. Verify no additional executions

Success Criteria:

  • Timer fires once at scheduled time (±1 second)
  • Exactly 1 enforcement created
  • Exactly 1 execution created
  • No duplicate executions after timer expires
  • Timer marked as expired/completed

Edge Cases Tested:

  • Date in past (should execute immediately)
  • Date timer cleanup after firing

T1.3: Cron Timer Execution

Priority: Critical
Duration: ~70 seconds
Description: Action executes on cron schedule

Test Steps:

  1. Create cron timer trigger (at 0, 3, 6, 12 seconds of each minute)
  2. Create action with timestamp output
  3. Create rule linking timer → action
  4. Wait for one minute + 15 seconds
  5. Verify executions at correct second marks

Success Criteria:

  • Executions occur at seconds: 0, 3, 6, 12 (first minute)
  • Executions occur at seconds: 0, 3, 6, 12 (second minute if test runs long)
  • No executions at other second marks
  • Cron expression correctly parsed
  • Timezone handling correct

Cron Expression Examples to Test:

  • */5 * * * * * - Every 5 seconds
  • 0,3,6,12 * * * * * - At specific seconds
  • 0 * * * * * - Top of every minute
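The "executions at correct second marks" verification can be sketched by computing the expected fire offsets for a seconds-of-minute schedule, then comparing against recorded execution timestamps. This helper is an assumption of how the harness might work, not part of Attune.

```python
# Sketch for T1.3: given the second marks a cron expression fires on
# (e.g. 0, 3, 6, 12) and a test window in seconds, compute the expected
# fire offsets from the start of the first minute.

def expected_fire_offsets(second_marks, window_s):
    """Offsets (in seconds) at which the cron should fire within window_s."""
    return [minute * 60 + s
            for minute in range(window_s // 60 + 1)
            for s in sorted(second_marks)
            if minute * 60 + s <= window_s]
```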

T1.4: Webhook Trigger with Payload

Priority: Critical
Duration: ~15 seconds
Description: Webhook POST triggers action with payload data

Test Steps:

  1. Create webhook trigger (generates unique URL)
  2. Create action that echoes webhook payload
  3. Create rule linking webhook → action
  4. POST JSON payload to webhook URL
  5. Verify event created with correct payload
  6. Verify execution receives payload as parameters
  7. Verify action output includes webhook data

Success Criteria:

  • Webhook trigger generates unique URL (/api/v1/webhooks/{trigger_id})
  • POST to webhook creates event immediately
  • Event payload matches POST body
  • Rule evaluates and creates enforcement
  • Execution receives webhook data as input
  • Action can access webhook payload fields

Test Payloads:

{
  "event_type": "user.signup",
  "user_id": 12345,
  "email": "test@example.com",
  "metadata": {
    "source": "web",
    "campaign": "spring2024"
  }
}
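The "event payload matches POST body" criterion is easiest to assert by comparing both sides as parsed JSON, so key order and whitespace differences don't cause false failures. A minimal sketch of such a helper:

```python
import json

# Hypothetical assertion helper for T1.4: does the payload stored on the
# event match the JSON body that was POSTed to the webhook URL?

def payload_matches(posted_body: str, stored_payload: dict) -> bool:
    # Compare as parsed structures, not raw strings.
    return json.loads(posted_body) == stored_payload
```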

T1.5: Workflow with Array Iteration (with-items)

Priority: Critical
Duration: ~20 seconds
Description: Workflow action spawns child executions for array items

Test Steps:

  1. Create workflow action with with-items on array parameter
  2. Create rule to trigger workflow
  3. Execute workflow with array: ["apple", "banana", "cherry"]
  4. Verify parent execution created
  5. Verify 3 child executions created (one per item)
  6. Verify each child receives single item as input
  7. Verify parent completes after all children succeed

Success Criteria:

  • Parent execution status: 'running' while children execute
  • Exactly 3 child executions created
  • Each child execution has parent_execution_id set
  • Each child receives single item: "apple", "banana", "cherry"
  • Children can run in parallel
  • Parent status becomes 'succeeded' after all children succeed
  • Child execution count matches array length

Workflow Definition:

actions:
  - name: process_items
    runner_type: python3
    entry_point: actions/process.py
    parameters:
      items:
        type: array
        required: true
    with_items: "{{ items }}"
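The fan-out behavior above can be modeled as a pure function: one child parameter set per array element, each carrying the parent's ID. Field names here (`parent_execution_id`, `item`) are illustrative, not necessarily Attune's actual schema.

```python
# Sketch of with-items expansion: each array element becomes one child
# execution that receives exactly that element as its input.

def expand_with_items(parent_execution_id, items):
    return [{"parent_execution_id": parent_execution_id, "item": item}
            for item in items]
```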

T1.6: Action Reads from Key-Value Store

Priority: Critical
Duration: ~10 seconds
Description: Action retrieves configuration value from datastore

Test Steps:

  1. Create key-value pair via API: {"key": "api_url", "value": "https://api.example.com"}
  2. Create action that reads from datastore
  3. Execute action with datastore key parameter
  4. Verify action retrieves correct value
  5. Verify action output includes retrieved value

Success Criteria:

  • Action can read from attune.datastore_item table
  • Scoped to tenant/user (multi-tenancy)
  • Non-existent keys return null (no error)
  • Action receives value in expected format
  • Encrypted values decrypted before passing to action

API Endpoints Used:

  • POST /api/v1/datastore - Create key-value
  • GET /api/v1/datastore/{key} - Retrieve value
  • Action reads via worker's datastore helper
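The datastore semantics listed above (tenant scoping, null for missing keys) can be captured in a minimal in-memory model. This is a behavioral sketch, not the `attune.datastore_item` implementation.

```python
# Minimal model of the datastore contract: values keyed by
# (tenant_id, key); missing keys return None rather than raising.

class DatastoreStub:
    def __init__(self):
        self._items = {}

    def put(self, tenant_id, key, value):
        self._items[(tenant_id, key)] = value

    def get(self, tenant_id, key):
        # Scoped to tenant; non-existent keys return None (no error).
        return self._items.get((tenant_id, key))
```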

T1.7: Multi-Tenant Isolation

Priority: Critical
Duration: ~20 seconds
Description: Users cannot access other tenant's resources

Test Steps:

  1. Create User A (tenant_id=1) and User B (tenant_id=2)
  2. User A creates pack, action, rule
  3. User B attempts to list User A's packs
  4. Verify User B sees empty list
  5. User B attempts to execute User A's action by ID
  6. Verify request returns 404 (not 403, to avoid leaking resource existence)
  7. User A can see and execute their own resources

Success Criteria:

  • All API endpoints filter by tenant_id
  • Cross-tenant resource access returns 404 (not 403 to avoid info leak)
  • Executions scoped to tenant
  • Events scoped to tenant
  • Enforcements scoped to tenant
  • Datastore scoped to tenant
  • Secrets scoped to tenant

Security Test Cases:

  • Direct ID manipulation (guessing IDs)
  • SQL injection attempts
  • JWT token manipulation

T1.8: Action Execution Failure Handling

Priority: Critical
Duration: ~15 seconds
Description: Failed action execution handled gracefully

Test Steps:

  1. Create action that always exits with error (exit code 1)
  2. Create rule to trigger action
  3. Execute action
  4. Verify execution status becomes 'failed'
  5. Verify error message captured
  6. Verify exit code recorded
  7. Verify execution doesn't retry (no retry policy)

Success Criteria:

  • Execution status: 'requested' → 'scheduled' → 'running' → 'failed'
  • Exit code captured: exit_code = 1
  • stderr captured in execution result
  • Execution result includes error details
  • Worker marks execution as failed
  • Executor updates enforcement status
  • System remains stable (no crashes)

Test Action:

#!/usr/bin/env python3
import sys
print("Starting action...", file=sys.stderr)
sys.exit(1)  # Force failure
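A worker-side test could capture the exit code and stderr of the failing action with a subprocess wrapper like the sketch below; the real worker's capture logic may differ.

```python
import subprocess
import sys

# Hedged sketch: run an action body in a child interpreter and capture
# the exit code and stderr, as the T1.8 success criteria require.

def run_and_capture(code: str):
    proc = subprocess.run([sys.executable, "-c", code],
                          capture_output=True, text=True)
    return proc.returncode, proc.stderr

# The failing test action from above, inlined:
rc, err = run_and_capture(
    'import sys; print("Starting action...", file=sys.stderr); sys.exit(1)')
```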

Tier 2: Orchestration & Data Flow

These tests validate workflow orchestration, data passing, and error recovery mechanisms.

T2.1: Nested Workflow Execution

Priority: High
Duration: ~30 seconds
Description: Parent workflow calls child workflow (multi-level)

Test Steps:

  1. Create child workflow with 2 tasks
  2. Create parent workflow that calls child workflow
  3. Execute parent workflow
  4. Verify parent creates child execution
  5. Verify child creates its own task executions
  6. Verify all executions complete in correct order

Success Criteria:

  • 3 execution levels: parent → child → grandchild tasks
  • parent_execution_id chain correct
  • Execution tree structure maintained
  • Results propagate up from grandchildren to parent
  • Parent waits for all descendants to complete

Execution Hierarchy:

Parent Workflow (execution_id=1)
└─ Child Workflow (execution_id=2, parent=1)
   ├─ Task 1 (execution_id=3, parent=2)
   └─ Task 2 (execution_id=4, parent=2)

T2.2: Workflow with Failure Handling

Priority: High
Duration: ~25 seconds
Description: Child execution fails, parent handles error

Test Steps:

  1. Create workflow with 3 child actions
  2. Configure second child to fail
  3. Configure on-failure behavior (continue vs. abort)
  4. Execute workflow
  5. Verify second child fails
  6. Verify first and third children succeed
  7. Verify parent status based on policy

Success Criteria:

  • First child completes successfully
  • Second child fails as expected
  • Policy continue: third child still executes
  • Policy abort: third child never starts
  • Parent status reflects policy: 'failed' (abort) or 'succeeded_with_errors' (continue)
  • All execution statuses correct

Failure Policies to Test:

  • on_failure: abort - Stop all subsequent tasks
  • on_failure: continue - Continue with remaining tasks
  • on_failure: retry - Retry failed task N times

T2.3: Action Writes to Key-Value Store

Priority: High
Duration: ~15 seconds
Description: Action writes value, subsequent action reads it

Test Steps:

  1. Create Action A that writes to datastore
  2. Create Action B that reads from datastore
  3. Create workflow: Action A → Action B
  4. Execute workflow with test data
  5. Verify Action A writes value
  6. Verify Action B reads same value
  7. Verify data persists in database

Success Criteria:

  • Action A can write to datastore via API or helper
  • Value persisted to attune.datastore_item table
  • Action B retrieves exact value written by Action A
  • Values scoped to tenant
  • Encryption applied if marked as secret
  • TTL honored if specified

Test Data Flow:

Action A: write("config.api_url", "https://api.production.com")
Action B: url = read("config.api_url")  # Returns "https://api.production.com"

T2.4: Parameter Templating and Context

Priority: High
Duration: ~20 seconds
Description: Action uses Jinja2 templates to access execution context

Test Steps:

  1. Create Action A that returns structured output
  2. Create Action B with templated parameters: {{ task_1.result.api_key }}
  3. Create workflow: Action A → Action B
  4. Execute workflow
  5. Verify Action B receives resolved parameter values
  6. Verify template variables replaced correctly

Success Criteria:

  • Context includes: trigger.data, execution.params, task_N.result
  • Jinja2 expressions evaluated correctly
  • Nested JSON paths resolved: {{ event.data.user.email }}
  • Missing values handled gracefully (null or error)
  • Template errors fail execution with clear message

Template Examples:

parameters:
  user_email: "{{ trigger.data.user.email }}"
  api_url: "{{ datastore.config.api_url }}"
  previous_result: "{{ task_1.result.status }}"
  iteration_item: "{{ item }}"  # In with-items context
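The nested-path resolution these templates rely on (e.g. `trigger.data.user.email`) can be sketched as a dotted-path lookup where missing segments yield null rather than raising, matching the "handled gracefully" criterion. The platform itself uses Jinja2; this stand-in only illustrates the behavior.

```python
# Illustrative resolver for dotted context paths like
# "trigger.data.user.email". Missing segments return None.

def resolve_path(context: dict, dotted_path: str):
    value = context
    for segment in dotted_path.split("."):
        if not isinstance(value, dict) or segment not in value:
            return None
        value = value[segment]
    return value
```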

T2.5: Rule Criteria Evaluation

Priority: High
Duration: ~20 seconds
Description: Rule only fires when criteria match

Test Steps:

  1. Create webhook trigger
  2. Create rule with criteria: {{ trigger.data.status == "critical" }}
  3. POST webhook with status: "info" → No execution
  4. POST webhook with status: "critical" → Execution created
  5. Verify only second webhook triggered action

Success Criteria:

  • Rule criteria evaluated as Jinja2 expression
  • Event created for both webhooks
  • Enforcement only created when criteria true
  • No execution for non-matching events
  • Complex criteria work: {{ trigger.data.value > datastore.threshold }}

Criteria Examples:

criteria: "{{ trigger.data.severity == 'high' }}"
criteria: "{{ trigger.data.count > 100 }}"
criteria: "{{ trigger.data.environment in ['prod', 'staging'] }}"
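The true/false gating behavior can be illustrated with a restricted evaluation over the event context. Attune evaluates criteria as Jinja2 expressions; the Python `eval` below (with dict-index syntax instead of Jinja dots) is only a stand-in to show that matching criteria produce an enforcement and non-matching ones do not.

```python
# Illustrative only: evaluate a boolean criteria expression against the
# trigger payload. The real platform uses Jinja2, not Python eval.

def criteria_matches(expression: str, trigger_data: dict) -> bool:
    context = {"trigger": {"data": trigger_data}}
    # Empty __builtins__ limits what the expression can reach.
    return bool(eval(expression, {"__builtins__": {}}, context))
```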

T2.6: Approval Workflow (Inquiry)

Priority: High
Duration: ~30 seconds
Description: Action creates inquiry, execution pauses until response

Test Steps:

  1. Create action that creates inquiry (approval request)
  2. Execute action
  3. Verify execution status becomes 'paused'
  4. Verify inquiry created with status 'pending'
  5. Submit inquiry response via API
  6. Verify execution resumes
  7. Verify action receives response data
  8. Verify execution completes successfully

Success Criteria:

  • Execution pauses with status 'paused'
  • Inquiry created in attune.inquiry table
  • Inquiry timeout set (TTL)
  • Response submission updates inquiry status
  • Execution resumes after response
  • Action receives response in structured format
  • Timeout causes default action if no response

Inquiry Types to Test:

  • Simple yes/no approval
  • Multi-field form input
  • Multiple choice selection
  • Inquiry timeout with default value

T2.7: Inquiry Timeout Handling

Priority: Medium
Duration: ~35 seconds
Description: Inquiry expires after TTL, execution proceeds with default

Test Steps:

  1. Create action with inquiry (TTL=5 seconds)
  2. Set default response for timeout
  3. Execute action
  4. Do NOT respond to inquiry
  5. Wait 7 seconds
  6. Verify inquiry status becomes 'expired'
  7. Verify execution resumes with default value
  8. Verify execution completes successfully

Success Criteria:

  • Inquiry expires after TTL seconds
  • Status changes: 'pending' → 'expired'
  • Execution receives default response
  • Execution proceeds without user input
  • Timeout event logged

T2.8: Retry Policy Execution

Priority: High
Duration: ~30 seconds
Description: Failed action retries with exponential backoff

Test Steps:

  1. Create action that fails first 2 times, succeeds on 3rd
  2. Configure retry policy: max_attempts=3, delay=2s, backoff=2.0
  3. Execute action
  4. Verify execution fails twice
  5. Verify delays between retries: ~2s, ~4s
  6. Verify third attempt succeeds
  7. Verify execution status becomes 'succeeded'

Success Criteria:

  • Three attempts made in total (two retries)
  • Exponential backoff applied between attempts: 2s, then 4s
  • Each retry logged separately
  • Execution succeeds on final retry
  • Retry count tracked in execution metadata
  • Max retries honored (stops after limit)

Retry Configuration:

retry:
  max_attempts: 3
  delay_seconds: 2
  backoff_multiplier: 2.0
  max_delay_seconds: 60
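The delay schedule implied by this configuration can be computed directly: each retry waits `delay_seconds` multiplied by `backoff_multiplier` per prior retry, capped at `max_delay_seconds`. A sketch, assuming the multiplier applies per attempt:

```python
# Delays before each retry (i.e. attempts after the first), derived
# from the retry configuration above.

def retry_delays(max_attempts=3, delay_seconds=2, backoff_multiplier=2.0,
                 max_delay_seconds=60):
    return [min(delay_seconds * backoff_multiplier ** i, max_delay_seconds)
            for i in range(max_attempts - 1)]
```

With the defaults above this yields a 2s wait before the second attempt and 4s before the third, matching the delays T2.8 measures.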

T2.9: Execution Timeout Policy

Priority: High
Duration: ~25 seconds
Description: Long-running action killed after timeout

Test Steps:

  1. Create action that sleeps for 60 seconds
  2. Configure timeout policy: 5 seconds
  3. Execute action
  4. Verify execution starts
  5. Wait 7 seconds
  6. Verify worker kills action process
  7. Verify execution status becomes 'failed'
  8. Verify timeout error message recorded

Success Criteria:

  • Action process killed after timeout
  • Execution status: 'running' → 'failed'
  • Error message indicates timeout
  • Exit code indicates SIGTERM/SIGKILL
  • Worker remains stable after kill
  • No zombie processes

Timeout Levels:

  • Action-level timeout (per action)
  • Workflow-level timeout (entire workflow)
  • System default timeout (fallback)
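The worker-side kill in T2.9 can be sketched with a hard subprocess timeout: the runner kills the child when the limit elapses and records the failure. This is a model of the behavior, not the worker's actual code.

```python
import subprocess
import sys

# Sketch: run a long-sleeping child with a hard timeout. subprocess.run
# kills the child before raising TimeoutExpired.

def run_with_timeout(sleep_s, timeout_s):
    try:
        subprocess.run([sys.executable, "-c",
                        f"import time; time.sleep({sleep_s})"],
                       timeout=timeout_s)
        return "succeeded"
    except subprocess.TimeoutExpired:
        return "failed: timeout"
```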

T2.10: Parallel Execution (with-items)

Priority: Medium
Duration: ~20 seconds
Description: Multiple child executions run concurrently

Test Steps:

  1. Create action with 5-second sleep
  2. Configure workflow with with-items on array of 5 items
  3. Configure concurrency: 5 (all parallel)
  4. Execute workflow
  5. Measure total execution time
  6. Verify ~5 seconds total (not 25 seconds sequential)
  7. Verify all 5 children ran concurrently

Success Criteria:

  • All 5 child executions start immediately
  • Total time ~5 seconds (parallel) not ~25 seconds (sequential)
  • Worker handles concurrent executions
  • No resource contention issues
  • All children complete successfully

Concurrency Limits to Test:

  • concurrency: 1 - Sequential execution
  • concurrency: 3 - Limited parallelism
  • concurrency: unlimited - No limit
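The timing assertion in T2.10 can be modeled at small scale with a thread pool: five brief sleeps under a concurrency limit, where full parallelism takes roughly one sleep and `concurrency: 1` takes roughly five. A sketch of the harness logic:

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Scaled-down model of the parallel-vs-sequential timing check: run one
# short sleep per item under a max_workers concurrency limit and return
# the wall-clock time taken.

def run_items(items, concurrency, sleep_s=0.2):
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        list(pool.map(lambda item: time.sleep(sleep_s), items))
    return time.monotonic() - start
```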

T2.11: Sequential Workflow with Dependencies

Priority: Medium
Duration: ~20 seconds
Description: Tasks execute in order with on-success transitions

Test Steps:

  1. Create workflow with 3 tasks:
    • Task A: outputs {"step": 1}
    • Task B: depends on A, outputs {"step": 2}
    • Task C: depends on B, outputs {"step": 3}
  2. Execute workflow
  3. Verify execution order: A → B → C
  4. Verify B waits for A to complete
  5. Verify C waits for B to complete

Success Criteria:

  • Tasks execute in correct order
  • No task starts before dependency completes
  • Each task accesses previous task results
  • Total execution time ≈ sum of individual task times
  • Workflow status reflects sequential progress

Workflow Definition:

tasks:
  - name: task_a
    action: core.echo
  - name: task_b
    action: core.echo
    depends_on: [task_a]
  - name: task_c
    action: core.echo
    depends_on: [task_b]
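The ordering guarantee above amounts to a topological sort over `depends_on` lists. A Kahn-style sketch (real scheduling must also wait for runtime completion, not just compute an order):

```python
# Sketch: derive a valid execution order from depends_on declarations.
# Raises on a dependency cycle.

def execution_order(tasks):
    """tasks: list of {'name': str, 'depends_on': [str, ...]}."""
    remaining = {t["name"]: set(t.get("depends_on", [])) for t in tasks}
    order = []
    while remaining:
        # Tasks whose dependencies are all satisfied are ready to run.
        ready = sorted(n for n, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError("dependency cycle detected")
        for name in ready:
            order.append(name)
            del remaining[name]
        for deps in remaining.values():
            deps.difference_update(ready)
    return order
```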

T2.12: Python Action with Dependencies

Priority: Medium
Duration: ~30 seconds
Description: Python action uses third-party packages

Test Steps:

  1. Create pack with requirements.txt: requests==2.28.0
  2. Create action that imports and uses requests library
  3. Worker creates isolated virtualenv for pack
  4. Execute action
  5. Verify venv created at expected path
  6. Verify action successfully imports requests
  7. Verify action executes HTTP request

Success Criteria:

  • Virtualenv created in venvs/{pack_name}/
  • Dependencies installed from requirements.txt
  • Action imports third-party packages
  • Isolation prevents conflicts with other packs
  • Venv cached for subsequent executions

Pack Structure:

test_pack/
├── pack.yaml
├── requirements.txt  # requests==2.28.0
└── actions/
    └── http_call.py  # import requests

T2.13: Node.js Action Execution

Priority: Medium
Duration: ~25 seconds
Description: JavaScript action executes with Node.js runtime

Test Steps:

  1. Create pack with package.json: {"dependencies": {"axios": "^1.0.0"}}
  2. Create Node.js action that requires axios
  3. Worker installs npm dependencies
  4. Execute action
  5. Verify node_modules created
  6. Verify action successfully requires axios
  7. Verify action completes successfully

Success Criteria:

  • npm install runs for pack dependencies
  • node_modules created in pack directory
  • Action can require packages
  • Dependencies isolated per pack
  • Worker supports Node.js runtime type

Action Example:

const axios = require('axios');

async function run(params) {
  const response = await axios.get(params.url);
  return response.data;
}

module.exports = { run };

Tier 3: Advanced Features & Edge Cases

These tests cover advanced scenarios, edge cases, and performance requirements.

T3.1: Date Timer with Past Date

Priority: Low
Duration: ~5 seconds
Description: Timer with past date executes immediately or fails gracefully

Test Steps:

  1. Create date timer trigger with date 1 hour in past
  2. Create action
  3. Create rule linking timer → action
  4. Verify behavior (execute immediately OR fail with clear error)

Success Criteria:

  • Either: execution created immediately
  • Or: rule creation fails with clear error message
  • No silent failures
  • Behavior documented and consistent

T3.2: Timer Cancellation

Priority: Low
Duration: ~15 seconds
Description: Disabled rule stops timer from executing

Test Steps:

  1. Create interval timer (every 5 seconds)
  2. Create rule (enabled=true)
  3. Wait for 2 executions
  4. Disable rule via API
  5. Wait 15 seconds
  6. Verify no additional executions occurred

Success Criteria:

  • Disabling rule stops future executions
  • In-flight executions complete normally
  • Sensor stops generating events for disabled rules
  • Re-enabling rule resumes executions

T3.3: Multiple Concurrent Timers

Priority: Low
Duration: ~30 seconds
Description: Multiple rules with different timers run independently

Test Steps:

  1. Create 3 interval timers: 3s, 5s, 7s
  2. Create 3 rules with unique actions
  3. Wait 21 seconds (several full cycles of each timer)
  4. Verify Timer A fired 7 times (every 3s)
  5. Verify Timer B fired 4-5 times (every 5s)
  6. Verify Timer C fired 3 times (every 7s)

Success Criteria:

  • Timers don't interfere with each other
  • Each timer fires on its own schedule
  • Sensor handles multiple concurrent timers
  • No timer drift over time

T3.4: Webhook with Multiple Rules

Priority: Low
Duration: ~15 seconds
Description: Single webhook trigger fires multiple rules

Test Steps:

  1. Create 1 webhook trigger
  2. Create 3 rules, all using same webhook trigger
  3. POST to webhook URL
  4. Verify 1 event created
  5. Verify 3 enforcements created (one per rule)
  6. Verify 3 executions created
  7. Verify all executions succeed

Success Criteria:

  • Single event triggers multiple rules
  • Rules evaluated independently
  • Execution count = rule count
  • All rules see same event payload

T3.5: Webhook with Rule Criteria Filtering

Priority: Medium
Duration: ~20 seconds
Description: Multiple rules with different criteria on same trigger

Test Steps:

  1. Create webhook trigger
  2. Create Rule A: criteria {{ trigger.data.level == 'info' }}
  3. Create Rule B: criteria {{ trigger.data.level == 'error' }}
  4. POST webhook with level: 'info' → only Rule A fires
  5. POST webhook with level: 'error' → only Rule B fires
  6. POST webhook with level: 'debug' → no rules fire

Success Criteria:

  • Event created for all webhooks
  • Only matching rules create enforcements
  • Non-matching rules don't execute
  • Multiple criteria evaluated correctly

T3.6: Sensor-Generated Custom Event

Priority: Low
Duration: ~30 seconds
Description: Custom sensor monitors external system and generates events

Test Steps:

  1. Create custom sensor (polls file for changes)
  2. Deploy sensor code to sensor service
  3. Create trigger for sensor event type
  4. Create rule linked to trigger
  5. Modify monitored file
  6. Verify sensor detects change
  7. Verify event generated
  8. Verify execution triggered

Success Criteria:

  • Custom sensor code loaded dynamically
  • Sensor polls on configured interval
  • Sensor generates event when condition met
  • Event payload includes sensor data
  • Rule evaluates and triggers execution

T3.7: Complex Workflow Orchestration

Priority: Medium
Duration: ~45 seconds
Description: Full automation loop with multiple stages

Test Steps:

  1. Webhook triggers initial action
  2. Action checks datastore for threshold
  3. If threshold exceeded, create inquiry (approval)
  4. After approval, execute multi-step workflow
  5. Workflow updates datastore with results
  6. Final action sends notification

Success Criteria:

  • All stages execute in correct order
  • Data flows through entire pipeline
  • Conditional logic works correctly
  • Inquiry pauses execution
  • Datastore updates persist
  • Notification delivered

Flow Diagram:

Webhook → Check Threshold → Inquiry → Multi-Step Workflow → Update Datastore → Notify

T3.8: Chained Webhook Triggers

Priority: Low
Duration: ~20 seconds
Description: Action completion triggers webhook, which triggers next action

Test Steps:

  1. Create Action A that POSTs to webhook URL on completion
  2. Create Webhook Trigger B
  3. Create Rule B: Webhook B → Action B
  4. Execute Action A
  5. Verify Action A completes
  6. Verify Action A POSTs to Webhook B
  7. Verify Webhook B creates event
  8. Verify Action B executes

Success Criteria:

  • Action can trigger webhooks programmatically
  • Webhook event created from action POST
  • Downstream rule fires correctly
  • No circular dependencies causing infinite loops

T3.9: Multi-Step Approval Workflow

Priority: Low
Duration: ~60 seconds
Description: Workflow pauses twice for different approvals

Test Steps:

  1. Create workflow with 2 inquiry steps:
    • Inquiry A: Manager approval
    • Inquiry B: Security approval
  2. Execute workflow
  3. Verify first pause at Inquiry A
  4. Respond to Inquiry A
  5. Verify workflow continues
  6. Verify second pause at Inquiry B
  7. Respond to Inquiry B
  8. Verify workflow completes

Success Criteria:

  • Workflow pauses at each inquiry
  • Workflow resumes after each response
  • Multiple inquiries handled correctly
  • Inquiry responses accessible in subsequent tasks

T3.10: RBAC Permission Checks

Priority: Medium
Duration: ~20 seconds
Description: User with viewer role cannot create/execute actions

Test Steps:

  1. Create User A with role 'admin'
  2. Create User B with role 'viewer'
  3. User A creates pack and action successfully
  4. User B attempts to create action → 403 Forbidden
  5. User B attempts to execute action → 403 Forbidden
  6. User B can view (GET) actions successfully

Success Criteria:

  • Role permissions enforced on all endpoints
  • Viewer role: GET only, no POST/PUT/DELETE
  • Admin role: Full CRUD access
  • Clear error messages for permission denials
  • Permissions checked before processing request

Roles to Test:

  • admin - Full access
  • editor - Create/update resources
  • viewer - Read-only access
  • executor - Execute actions only

T3.11: System vs User Packs

Priority: Medium
Duration: ~15 seconds
Description: System packs available to all tenants

Test Steps:

  1. Install system pack (tenant_id=NULL or special marker)
  2. Create User A (tenant_id=1)
  3. Create User B (tenant_id=2)
  4. Both users list packs
  5. Verify both see system pack
  6. Verify users only see their own user packs
  7. Both users can execute system pack actions

Success Criteria:

  • System packs visible to all tenants
  • System packs executable by all tenants
  • User packs isolated per tenant
  • System pack actions use shared venv
  • Core pack is system pack

T3.12: Worker Crash Recovery

Priority: Medium
Duration: ~30 seconds
Description: Killed worker process triggers execution rescheduling

Test Steps:

  1. Start execution of long-running action (30 seconds)
  2. After 5 seconds, kill worker process (SIGKILL)
  3. Verify execution stuck in 'running' state
  4. Executor detects timeout or heartbeat failure
  5. Verify executor marks execution as 'failed'
  6. Verify execution can be retried

Success Criteria:

  • Executor detects worker failure
  • Execution marked as failed with clear error
  • No executions lost due to crash
  • New worker can pick up work
  • System recovers automatically

Recovery Mechanisms:

  • Execution heartbeat monitoring
  • Timeout detection
  • Queue message redelivery
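The heartbeat-based detection listed above can be sketched as a scan for executions still marked 'running' whose last heartbeat is older than a threshold. Field names here are illustrative, not Attune's actual schema.

```python
# Sketch: find orphaned executions by stale heartbeat. Timestamps are
# seconds since epoch; heartbeat_timeout_s is the staleness threshold.

def find_stale_executions(executions, now, heartbeat_timeout_s=30):
    return [e["id"] for e in executions
            if e["status"] == "running"
            and now - e["last_heartbeat"] > heartbeat_timeout_s]
```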

T3.13: Invalid Action Parameters

Priority: Medium
Duration: ~5 seconds
Description: Missing required parameter fails execution immediately

Test Steps:

  1. Create action with required parameter: url
  2. Create rule with action parameters missing url
  3. Execute action
  4. Verify execution fails immediately (not sent to worker)
  5. Verify clear validation error message

Success Criteria:

  • Parameter validation before worker scheduling
  • Clear error: "Missing required parameter: url"
  • Execution status: 'requested' → 'failed' (skips worker)
  • No resources wasted on invalid execution
  • Validation uses JSON Schema from action definition

T3.14: Execution Completion Notification

Priority: Medium
Duration: ~20 seconds
Description: WebSocket client receives real-time execution updates

Test Steps:

  1. Connect WebSocket client to notifier
  2. Subscribe to execution events
  3. Create and execute action
  4. Verify WebSocket receives messages:
    • Execution created
    • Execution scheduled
    • Execution running
    • Execution succeeded
  5. Verify message format and payload

Success Criteria:

  • WebSocket connection established
  • All status transitions notified
  • Notification latency <100ms
  • Message includes full execution object
  • Notifications scoped to tenant

Notification Format:

{
  "event": "execution.status_changed",
  "entity_type": "execution",
  "entity_id": 123,
  "data": {
    "execution_id": 123,
    "status": "succeeded",
    "action_ref": "core.echo"
  },
  "timestamp": "2026-01-27T10:30:00Z"
}

T3.15: Inquiry Creation Notification

Priority: Low
Duration: ~15 seconds
Description: Real-time notification when inquiry created

Test Steps:

  1. Connect WebSocket client
  2. Subscribe to inquiry events
  3. Execute action that creates inquiry
  4. Verify WebSocket receives inquiry.created message
  5. Respond to inquiry via API
  6. Verify WebSocket receives inquiry.responded message

Success Criteria:

  • Inquiry creation notified immediately
  • Inquiry response notified immediately
  • Notification includes inquiry details
  • UI can show real-time approval requests

T3.16: Rule Trigger Notification (Optional)

Priority: Low
Duration: ~15 seconds
Description: Optional notification when specific rule fires

Test Steps:

  1. Create rule with notify_on_trigger: true
  2. Connect WebSocket client
  3. Trigger rule via webhook
  4. Verify WebSocket receives rule.triggered notification

Success Criteria:

  • Notification only sent if enabled on rule
  • Notification includes event details
  • Notification scoped to tenant
  • High-frequency rules don't flood notifications

T3.17: Container Runner Execution

Priority: Low
Duration: ~40 seconds
Description: Action executes inside Docker container

Test Steps:

  1. Create action with runner_type: 'container'
  2. Specify Docker image: 'python:3.11-slim'
  3. Execute action
  4. Worker pulls image if not cached
  5. Worker starts container with action code
  6. Verify action executes in container
  7. Verify container cleaned up after execution

Success Criteria:

  • Docker image pulled (cached for future runs)
  • Container started with correct image
  • Action code mounted into container
  • Execution succeeds in container
  • Container stopped and removed after execution
  • No container leaks

Security Considerations:

  • Container resource limits (CPU, memory)
  • Network isolation
  • No privileged mode

T3.18: HTTP Runner Execution

Priority: Medium
Duration: ~10 seconds
Description: HTTP action makes REST API call

Test Steps:

  1. Create action with runner_type: 'http'
  2. Configure action: method=POST, url, headers, body
  3. Set up mock HTTP server to receive request
  4. Execute action
  5. Verify worker makes HTTP request
  6. Verify response captured in execution result

Success Criteria:

  • Worker makes HTTP request with correct method
  • Headers passed correctly
  • Body templated with parameters
  • Response status code captured
  • Response body captured
  • HTTP errors handled gracefully

Action Configuration:

name: api_call
runner_type: http
http_config:
  method: POST
  url: "https://api.example.com/users"
  headers:
    Content-Type: "application/json"
    Authorization: "Bearer {{ secret.api_token }}"
  body: "{{ params | tojson }}"

T3.19: Dependency Conflict Isolation

Priority: Low
Duration: ~50 seconds
Description: Two packs with conflicting dependencies run successfully

Test Steps:

  1. Create Pack A: requires requests==2.25.0
  2. Create Pack B: requires requests==2.28.0
  3. Create actions in both packs that import requests
  4. Execute Action A
  5. Verify requests 2.25.0 used
  6. Execute Action B
  7. Verify requests 2.28.0 used
  8. Execute both concurrently
  9. Verify no conflicts

Success Criteria:

  • Separate virtualenvs per pack
  • Pack A uses requests 2.25.0
  • Pack B uses requests 2.28.0
  • Concurrent executions don't interfere
  • Dependencies isolated completely
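One way the "separate virtualenvs per pack" criterion can be keyed is by hashing the pack's dependency set, so that conflicting pins resolve to different environments. The function and directory layout below are illustrative assumptions, not the worker's actual scheme:

```python
import hashlib
from pathlib import Path

def pack_venv_path(base_dir: str, pack_name: str, requirements: list[str]) -> Path:
    """Derive a deterministic virtualenv directory per (pack, dependency set).

    Hashing the sorted requirements means Pack A (requests==2.25.0) and
    Pack B (requests==2.28.0) resolve to different venvs, so their
    dependencies never share a site-packages directory.
    """
    digest = hashlib.sha256("\n".join(sorted(requirements)).encode()).hexdigest()[:12]
    return Path(base_dir) / f"{pack_name}-{digest}"

venv_a = pack_venv_path("/var/lib/attune/venvs", "pack_a", ["requests==2.25.0"])
venv_b = pack_venv_path("/var/lib/attune/venvs", "pack_b", ["requests==2.28.0"])

assert venv_a != venv_b  # conflicting pins never share an environment
```

Because the path is deterministic, concurrent executions of the same pack can reuse one cached venv while different packs stay fully isolated.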

T3.20: Secret Injection Security

Priority: High
Duration: ~20 seconds
Description: Secrets passed via stdin, not environment variables

Test Steps:

  1. Create secret via API: {"key": "api_key", "value": "secret123"}
  2. Create action that uses secret
  3. Execute action
  4. Verify secret passed to action via stdin
  5. Inspect worker process environment
  6. Verify secret NOT in environment variables
  7. Verify secret NOT in execution logs

Success Criteria:

  • Secret passed via stdin (secure channel)
  • Secret NOT in env vars (/proc/{pid}/environ)
  • Secret NOT in process command line
  • Secret NOT in execution output/logs
  • Secret retrieved from database encrypted
  • Action receives secret securely

Security Rationale:

  • Environment variables visible in /proc/{pid}/environ
  • Stdin not exposed to other processes
  • Prevents secret leakage via ps/top

T3.21: Action Log Size Limits

Priority: Low
Duration: ~15 seconds
Description: Large action output truncated to prevent database bloat

Test Steps:

  1. Create action that outputs 10MB of data
  2. Configure log size limit: 1MB
  3. Execute action
  4. Verify execution captures first 1MB
  5. Verify truncation marker added
  6. Verify database record reasonable size

Success Criteria:

  • Output truncated at configured limit
  • Truncation indicator appended, matching the configured message (e.g. "... (output truncated after 1MB)")
  • Execution doesn't fail due to large output
  • Database write succeeds
  • Worker memory usage bounded

Configuration:

worker:
  max_output_size_bytes: 1048576  # 1MB
  output_truncation_message: "... (output truncated after 1MB)"
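The truncation behavior the configuration above describes amounts to a small helper like this (a sketch; the worker's actual implementation may truncate the stream incrementally rather than in one buffer):

```python
def truncate_output(output: bytes, max_bytes: int, marker: str) -> bytes:
    """Cap captured action output, appending a marker when truncated."""
    if len(output) <= max_bytes:
        return output
    return output[:max_bytes] + marker.encode()

big = b"x" * (10 * 1024 * 1024)  # 10MB of action output
capped = truncate_output(big, 1_048_576, "... (output truncated after 1MB)")

assert capped.endswith(b"... (output truncated after 1MB)")
assert len(capped) == 1_048_576 + len("... (output truncated after 1MB)")
assert truncate_output(b"small", 1_048_576, "...") == b"small"  # untouched
```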

T3.22: Execution History Pagination

Priority: Low
Duration: ~30 seconds
Description: Large execution lists paginated correctly

Test Steps:

  1. Create 100 executions rapidly
  2. Query executions with limit=20
  3. Verify first page returns 20 executions
  4. Verify pagination metadata (total, next_page)
  5. Request next page
  6. Verify next 20 executions returned
  7. Iterate through all pages

Success Criteria:

  • Pagination parameters: limit, offset
  • Total count accurate
  • No duplicate executions across pages
  • No missing executions
  • Consistent ordering (created_desc)

API Query:

GET /api/v1/executions?limit=20&offset=0
GET /api/v1/executions?limit=20&offset=20
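Step 7's page walk can be written as a generic iterator over limit/offset pages. The `fetch_page` stub stands in for the real API call and its response shape (`items`, `total`) is an assumption based on the pagination metadata described above:

```python
def iter_all_executions(fetch_page, limit=20):
    """Walk limit/offset pages until the total has been consumed."""
    offset = 0
    while True:
        page = fetch_page(limit=limit, offset=offset)
        yield from page["items"]
        if offset + limit >= page["total"]:
            break
        offset += limit

# Fake API backed by 100 executions in created_desc order.
data = [{"id": i} for i in range(100, 0, -1)]

def fetch_page(limit, offset):
    return {"items": data[offset:offset + limit], "total": len(data)}

all_items = list(iter_all_executions(fetch_page))

assert len(all_items) == 100                          # no missing executions
assert len({e["id"] for e in all_items}) == 100       # no duplicates
assert all_items[0]["id"] == 100                      # ordering preserved
```

Note that offset pagination is only duplicate-free while the underlying list is stable; the test's rapid setup phase must finish creating all 100 executions before the walk begins.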

T3.23: Execution Cancellation

Priority: Medium
Duration: ~20 seconds
Description: User cancels running execution

Test Steps:

  1. Start long-running action (60 seconds)
  2. After 5 seconds, cancel via API: POST /api/v1/executions/{id}/cancel
  3. Verify executor sends cancel message to worker
  4. Verify worker kills action process (SIGTERM)
  5. Verify execution status becomes 'canceled'
  6. Verify graceful shutdown (cleanup runs)

Success Criteria:

  • Cancel request accepted while execution running
  • Worker receives cancel message
  • Action process receives SIGTERM
  • Execution status: 'running' → 'canceled'
  • Partial results captured
  • Resources cleaned up

Graceful Shutdown:

  • SIGTERM sent first (30 second grace period)
  • SIGKILL sent if process doesn't exit
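The SIGTERM-then-SIGKILL sequence can be sketched with a subprocess standing in for the long-running action (the grace period is shortened here; the real worker waits 30 seconds):

```python
import subprocess
import sys
import time

# Stand-in for a long-running action (~60s if never cancelled).
proc = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
time.sleep(0.5)  # let it start, as in step 2 of the test

proc.terminate()            # SIGTERM: give the action a chance to clean up
try:
    proc.wait(timeout=5)    # grace period (30s in the real worker)
except subprocess.TimeoutExpired:
    proc.kill()             # SIGKILL if the process ignores SIGTERM
    proc.wait()

assert proc.returncode is not None   # process is gone, no leak
assert proc.returncode != 0          # it did not exit normally
```

`Popen.terminate()` and `Popen.kill()` map to SIGTERM and SIGKILL on POSIX systems, matching the escalation described above.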

T3.24: High-Frequency Trigger Performance

Priority: Low
Duration: ~60 seconds
Description: Timer firing every second handled efficiently

Test Steps:

  1. Create interval timer: every 1 second
  2. Create simple action (echo)
  3. Create rule
  4. Let system run for 60 seconds
  5. Verify ~60 executions created
  6. Verify no backlog buildup
  7. Verify system remains responsive

Success Criteria:

  • 60 executions in 60 seconds (±5)
  • Message queue doesn't accumulate backlog
  • Worker keeps up with execution rate
  • API remains responsive
  • No memory leaks
  • CPU usage reasonable

Performance Targets:

  • Queue latency <100ms
  • Execution scheduling latency <500ms
  • API p95 response time <100ms

T3.25: Large Workflow (100+ Tasks)

Priority: Low
Duration: ~60 seconds
Description: Workflow with many tasks executes correctly

Test Steps:

  1. Create workflow with 100 sequential tasks
  2. Each task echoes its task number
  3. Execute workflow
  4. Monitor execution tree creation
  5. Verify all 100 tasks execute in order
  6. Verify workflow completes successfully
  7. Verify reasonable memory usage

Success Criteria:

  • All 100 tasks execute
  • Correct sequential order maintained
  • Execution tree correct (parent-child relationships)
  • Memory usage scales linearly
  • Database handles large execution count
  • No stack overflow or recursion issues

T3.26: Pack Update/Reload

Priority: Low
Duration: ~30 seconds
Description: Pack update reloads actions without restart

Test Steps:

  1. Register pack version 1.0.0
  2. Execute action from pack
  3. Update pack to version 1.1.0 (modify action code)
  4. Reload pack via API: POST /api/v1/packs/{id}/reload
  5. Execute action again
  6. Verify updated code executed
  7. Verify no service restart required

Success Criteria:

  • Pack reload picks up code changes
  • Virtualenv updated with new dependencies
  • In-flight executions complete with old code
  • New executions use new code
  • No downtime during reload

T3.27: Datastore Encryption at Rest

Priority: Low
Duration: ~10 seconds
Description: Datastore values marked as encrypted are stored encrypted at rest

Test Steps:

  1. Create encrypted datastore value: {"key": "password", "value": "secret", "encrypted": true}
  2. Query database directly
  3. Verify value column contains encrypted blob (not plaintext)
  4. Read value via API
  5. Verify API returns decrypted value
  6. Verify action receives decrypted value

Success Criteria:

  • Encrypted values not visible in database
  • Encryption key not stored in database
  • API decrypts transparently
  • Actions receive plaintext values
  • Encryption algorithm documented (AES-256-GCM)

T3.28: Execution Audit Trail

Priority: Low
Duration: ~15 seconds
Description: Complete audit trail for execution lifecycle

Test Steps:

  1. Execute action
  2. Query audit log API
  3. Verify audit entries for:
    • Execution created (by user X)
    • Execution scheduled (by executor)
    • Execution started (by worker Y)
    • Execution completed (by worker Y)
  4. Verify each entry has timestamp, actor, action

Success Criteria:

  • All lifecycle events audited
  • Actor identified (user, service, worker)
  • Timestamps accurate
  • Audit log immutable
  • Audit log queryable by execution_id

T3.29: Rate Limiting

Priority: Low
Duration: ~30 seconds
Description: API rate limiting prevents abuse

Test Steps:

  1. Configure rate limit: 10 requests/second per user
  2. Make 100 requests rapidly
  3. Verify first 10 succeed
  4. Verify subsequent requests return 429 Too Many Requests
  5. Wait 1 second
  6. Verify next 10 requests succeed

Success Criteria:

  • Rate limit enforced per user
  • 429 status code returned
  • Retry-After header provided
  • Rate limit resets after window
  • Admin users exempt from rate limiting
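The behavior in steps 2-6 matches a per-user token bucket. The sketch below is one plausible shape for the limiter, not the API's actual middleware:

```python
import time

class TokenBucket:
    """Fixed-rate limiter: `rate` requests per second per user."""

    def __init__(self, rate: int):
        self.rate = rate
        self.tokens = float(rate)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at the burst size.
        self.tokens = min(self.rate, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # caller responds 429 with a Retry-After header

bucket = TokenBucket(rate=10)
results = [bucket.allow() for _ in range(100)]  # 100 rapid requests

assert results[:10] == [True] * 10  # first 10 succeed
assert results.count(True) <= 12    # the burst is throttled

time.sleep(1)                        # window passes, tokens refill
assert all(bucket.allow() for _ in range(10))
```

An admin exemption would simply short-circuit `allow()` before the bucket is consulted, based on the role claim in the JWT.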

T3.30: Graceful Service Shutdown

Priority: Low
Duration: ~30 seconds
Description: Services shutdown cleanly without data loss

Test Steps:

  1. Start execution of long-running action
  2. Send SIGTERM to worker service
  3. Verify worker finishes current execution
  4. Verify worker stops accepting new executions
  5. Verify worker exits cleanly after completion
  6. Verify execution results saved

Success Criteria:

  • SIGTERM triggers graceful shutdown
  • In-flight work completes
  • No new work accepted
  • Message queue messages requeued
  • Database connections closed cleanly
  • Exit code 0
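The "finish in-flight work, accept nothing new" pattern can be shown in-process with a SIGTERM handler (a sketch of the mechanism only; the Rust services would use their runtime's signal handling, not Python):

```python
import os
import signal
import time

shutting_down = False

def handle_sigterm(signum, frame):
    # Flip the flag: stop accepting new work, let in-flight work finish.
    global shutting_down
    shutting_down = True

signal.signal(signal.SIGTERM, handle_sigterm)

# Simulated worker loop: one "execution" in flight when SIGTERM arrives.
os.kill(os.getpid(), signal.SIGTERM)  # deliver the signal to ourselves
time.sleep(0.1)                        # current execution finishes

completed = []
if not shutting_down:
    completed.append("new-work")       # skipped: shutdown was requested
completed.append("in-flight-result-saved")

assert shutting_down is True
assert "new-work" not in completed
assert "in-flight-result-saved" in completed
```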

Test Execution Strategy

Test Ordering

Phase 1: Foundation (Run First)

  • T1.1-T1.8: Core flows - Must all pass before proceeding

Phase 2: Orchestration

  • T2.1-T2.13: Workflow and data flow tests

Phase 3: Integration

  • T3.1-T3.15: Advanced features and edge cases

Phase 4: Performance & Scale

  • T3.16-T3.30: Performance, security, and operational tests

Test Environment Setup

Prerequisites:

  1. PostgreSQL 14+ running
  2. RabbitMQ 3.12+ running
  3. Test database created: attune_e2e
  4. Migrations applied
  5. Test configuration: config.e2e.yaml
  6. Test fixtures loaded

Service Startup Order:

  1. API service (port 18080)
  2. Executor service
  3. Worker service
  4. Sensor service
  5. Notifier service
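A readiness check like the one `wait-for-services.sh` performs can be written as a small port-polling helper (a sketch; the real script may check HTTP health endpoints rather than raw TCP):

```python
import socket
import time

def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Poll until a TCP port accepts connections, or the deadline passes."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.2)
    return False

# A listening socket reports ready immediately; a closed port times out.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen(1)
ready_port = listener.getsockname()[1]

assert wait_for_port("127.0.0.1", ready_port, timeout=2.0) is True
listener.close()
assert wait_for_port("127.0.0.1", ready_port, timeout=1.0) is False
```

Run once per service in startup order, this prevents tests from issuing requests before the API on port 18080 is actually accepting connections.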

Automated Test Runner

Script: tests/run_e2e_tests.sh

#!/bin/bash
set -e

echo "=== Attune E2E Test Suite ==="

# 1. Setup
echo "[1/7] Setting up test environment..."
./tests/scripts/setup-test-env.sh

# 2. Start services
echo "[2/7] Starting services..."
./tests/scripts/start-services.sh

# 3. Wait for services
echo "[3/7] Waiting for services to be ready..."
./tests/scripts/wait-for-services.sh

# 4. Run Tier 1 tests
echo "[4/7] Running Tier 1 tests (Core Flows)..."
pytest tests/e2e/tier1/ -v

# 5. Run Tier 2 tests
echo "[5/7] Running Tier 2 tests (Orchestration)..."
pytest tests/e2e/tier2/ -v

# 6. Run Tier 3 tests
echo "[6/7] Running Tier 3 tests (Advanced)..."
pytest tests/e2e/tier3/ -v

# 7. Cleanup
echo "[7/7] Cleaning up..."
./tests/scripts/stop-services.sh

echo "=== Test Suite Complete ==="

CI/CD Integration

GitHub Actions Workflow:

name: E2E Tests

on: [push, pull_request]

jobs:
  e2e:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:14
        env:
          POSTGRES_PASSWORD: postgres
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
      
      rabbitmq:
        image: rabbitmq:3-management
        options: >-
          --health-cmd "rabbitmq-diagnostics -q ping"
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
    
    steps:
      - uses: actions/checkout@v3
      
      - name: Setup Rust
        uses: actions-rs/toolchain@v1
      
      - name: Build services
        run: cargo build --release
      
      - name: Run migrations
        run: sqlx migrate run
      
      - name: Run E2E tests
        run: ./tests/run_e2e_tests.sh
        timeout-minutes: 30

Test Reporting

Success Metrics

Per Test:

  • Pass/Fail status
  • Execution time
  • Resource usage (CPU, memory)
  • Service logs (if failed)

Overall Suite:

  • Total tests: 40
  • Passed: X
  • Failed: Y
  • Skipped: Z
  • Total time: N minutes
  • Success rate: X%

Test Report Format

=== Attune E2E Test Report ===
Date: 2026-01-27 10:30:00
Duration: 15 minutes 32 seconds

Tier 1: Core Flows (8 tests)
  ✅ T1.1 Interval Timer Automation (28.3s)
  ✅ T1.2 Date Timer Execution (9.1s)
  ✅ T1.3 Cron Timer Execution (68.5s)
  ✅ T1.4 Webhook Trigger (14.2s)
  ✅ T1.5 Workflow with Array Iteration (19.8s)
  ✅ T1.6 Key-Value Store Read (8.9s)
  ✅ T1.7 Multi-Tenant Isolation (18.3s)
  ✅ T1.8 Action Failure Handling (13.7s)
  
Tier 2: Orchestration (13 tests)
  ✅ T2.1 Nested Workflow (29.4s)
  ✅ T2.2 Failure Handling (24.1s)
  ❌ T2.3 Datastore Write (FAILED - timeout)
  ...

Tier 3: Advanced (19 tests)
  ⏭️  T3.17 Container Runner (SKIPPED - Docker not available)
  ...

Summary:
  Total: 40 tests
  Passed: 38 (95%)
  Failed: 1 (2.5%)
  Skipped: 1 (2.5%)
  Success Rate: 95%

Failed Tests:
  T2.3: Datastore Write
    Error: Execution timeout after 30 seconds
    Logs: /tmp/attune-e2e/logs/t2.3-failure.log

Maintenance and Updates

Adding New Tests

  1. Document test in this plan with:

    • Priority tier
    • Duration estimate
    • Description and steps
    • Success criteria
  2. Create test fixture if needed:

    • Add to tests/fixtures/
    • Document fixture setup
  3. Implement test in appropriate tier:

    • tests/e2e/tier1/test_*.py
    • Use test helpers from tests/helpers/
  4. Update test count in summary

Updating Existing Tests

When platform features change:

  1. Review affected tests
  2. Update test steps and criteria
  3. Update expected outcomes
  4. Re-run test to validate

Deprecating Tests

When features are removed:

  1. Mark test as deprecated
  2. Move to tests/e2e/deprecated/
  3. Update test count in summary

Troubleshooting

Common Test Failures

Symptom: Test timeout
Causes:

  • Service not running
  • Message queue not connected
  • Database migration issue

Solution: Check service logs, verify connectivity

Symptom: Execution stuck in 'scheduled' status
Causes:

  • Worker not consuming queue
  • Worker crashed
  • Queue message not delivered

Solution: Check worker logs, verify RabbitMQ queues

Symptom: Multi-tenant test fails
Causes:

  • Missing tenant_id filter in query
  • JWT token for wrong tenant

Solution: Verify repository filters, check JWT claims

Debug Mode

Run tests with verbose logging:

RUST_LOG=debug ./tests/run_e2e_tests.sh

Capture service logs:

./tests/scripts/start-services.sh --log-dir=/tmp/attune-logs

Test Data Cleanup

Reset test database between runs:

./tests/scripts/reset-test-db.sh

Appendix

Test Fixture Catalog

Packs:

  • test_pack - Simple echo action for basic tests
  • timer_pack - Timer trigger examples
  • webhook_pack - Webhook trigger examples
  • workflow_pack - Multi-task workflows
  • failing_pack - Actions that fail for error testing

Users:

  • test_admin - Admin role, tenant_id=1
  • test_viewer - Viewer role, tenant_id=1
  • test_user_2 - Admin role, tenant_id=2

Secrets:

  • test_api_key - For secret injection tests
  • test_password - Encrypted datastore value

API Endpoints Reference

All tests use these core endpoints:

Authentication:

  • POST /auth/register - Create test user
  • POST /auth/login - Get JWT token
  • POST /auth/refresh - Refresh token

Packs:

  • GET /api/v1/packs - List packs
  • POST /api/v1/packs - Register pack
  • POST /api/v1/packs/{id}/reload - Reload pack

Actions:

  • POST /api/v1/actions - Create action
  • GET /api/v1/actions - List actions

Triggers:

  • POST /api/v1/triggers - Create trigger
  • POST /api/v1/webhooks/{id} - Fire webhook

Rules:

  • POST /api/v1/rules - Create rule
  • PATCH /api/v1/rules/{id} - Update rule (enable/disable)

Executions:

  • GET /api/v1/executions - List executions
  • GET /api/v1/executions/{id} - Get execution details
  • POST /api/v1/executions/{id}/cancel - Cancel execution

Inquiries:

  • GET /api/v1/inquiries - List pending inquiries
  • POST /api/v1/inquiries/{id}/respond - Respond to inquiry

Datastore:

  • GET /api/v1/datastore/{key} - Read value
  • POST /api/v1/datastore - Write value
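Test code typically wraps these endpoints in a small client helper. The class below is a hypothetical request builder (base URL, paths, and token handling are assumptions); it only constructs requests, so the shape of an authenticated call is visible without sending anything:

```python
from urllib.request import Request

class AttuneClient:
    """Tiny request builder for the endpoints above (hypothetical helper).

    A real test helper would also send these requests and decode the JSON
    responses; here we only construct them.
    """

    def __init__(self, base_url, token):
        self.base_url = base_url.rstrip("/")
        self.token = token

    def request(self, method, path, body=None):
        return Request(
            self.base_url + path,
            data=body,
            method=method,
            headers={
                "Authorization": f"Bearer {self.token}",
                "Content-Type": "application/json",
            },
        )

client = AttuneClient("http://127.0.0.1:18080", token="test-jwt")
req = client.request("POST", "/api/v1/executions/42/cancel")

assert req.full_url == "http://127.0.0.1:18080/api/v1/executions/42/cancel"
assert req.get_method() == "POST"
assert req.get_header("Authorization") == "Bearer test-jwt"
```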

Performance Benchmarks

Target Latencies:

  • API response time (p95): <100ms
  • Webhook to event: <50ms
  • Event to enforcement: <100ms
  • Enforcement to execution: <500ms
  • Total trigger-to-execution: <1000ms

Throughput Targets:

  • Executions per second: 100+
  • Concurrent workflows: 50+
  • Timer precision: ±500ms
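Checking a p95 target in a performance test reduces to a percentile over collected samples. A minimal nearest-rank implementation, using synthetic latencies for illustration:

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile, e.g. pct=95 for the p95 latency target."""
    ranked = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ranked)))
    return ranked[rank - 1]

# 100 simulated API response times in milliseconds.
latencies = list(range(1, 101))  # 1ms .. 100ms

assert percentile(latencies, 95) == 95
assert percentile(latencies, 50) == 50
assert percentile(latencies, 95) < 100  # meets the <100ms p95 target
```

Asserting on p95 rather than the mean keeps the benchmark tests robust to a handful of outlier samples.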

Document Version: 1.0
Last Review: 2026-01-27
Next Review: After Tier 1 tests implemented
Owner: Attune Development Team