re-uploading work
This commit is contained in:
311
work-summary/unified-runtime-detection.md
Normal file
311
work-summary/unified-runtime-detection.md
Normal file
@@ -0,0 +1,311 @@
|
||||
# Work Summary: Unified Runtime Detection System
|
||||
|
||||
**Date:** 2024-02-03
|
||||
**Status:** ✅ Complete
|
||||
**Related Documentation:**
|
||||
- `docs/sensors/database-driven-runtime-detection.md`
|
||||
- `docs/sensors/sensor-worker-registration.md`
|
||||
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
Consolidated runtime detection logic from sensor and worker services into a unified system in the `attune-common` crate. Removed the redundant `runtime_type` distinction between action and sensor runtimes, as both use identical binaries and verification processes.
|
||||
|
||||
---
|
||||
|
||||
## Key Changes
|
||||
|
||||
### 1. Removed `runtime_type` Field from Database Schema
|
||||
|
||||
**Migration:** `migrations/20260203000001_unify_runtimes.sql`
|
||||
|
||||
- Dropped `runtime_type` enum column from `runtime` table
|
||||
- Dropped `runtime_type_enum` PostgreSQL enum type
|
||||
- Consolidated duplicate runtime records:
|
||||
- `core.action.python` + `core.sensor.python` → `core.python`
|
||||
- `core.action.nodejs` + `core.sensor.nodejs` → `core.nodejs`
|
||||
- `core.action.shell` + `core.sensor.shell` → `core.shell`
|
||||
- `core.action.native` + `core.sensor.native` → `core.native`
|
||||
- Migrated all foreign key references in `action` and `sensor` tables
|
||||
- Updated indexes to remove `runtime_type` references
|
||||
- Kept `core.sensor.builtin` as sensor-specific (for internal timers/triggers)
|
||||
|
||||
### 2. Created Unified Runtime Detection Module
|
||||
|
||||
**New File:** `crates/common/src/runtime_detection.rs`
|
||||
|
||||
Provides a single `RuntimeDetector` service used by both worker and sensor services:
|
||||
|
||||
```rust
|
||||
pub struct RuntimeDetector {
|
||||
pool: PgPool,
|
||||
}
|
||||
|
||||
impl RuntimeDetector {
|
||||
pub async fn detect_capabilities(
|
||||
&self,
|
||||
config: &Config,
|
||||
env_var_name: &str,
|
||||
config_capabilities: Option<&HashMap<String, serde_json::Value>>,
|
||||
) -> Result<HashMap<String, serde_json::Value>>
|
||||
}
|
||||
```
|
||||
|
||||
**Features:**
|
||||
- Three-tier priority configuration:
|
||||
1. Environment variable override (e.g., `ATTUNE_WORKER_RUNTIMES`, `ATTUNE_SENSOR_RUNTIMES`)
|
||||
2. Config file specification
|
||||
3. Database-driven detection with verification
|
||||
- Queries unified `runtime` table (no `runtime_type` filter needed)
|
||||
- Uses verification metadata from database to check runtime availability
|
||||
- Supports verification commands, exit codes, regex patterns, and "always available" flags
|
||||
- Returns detected capabilities as HashMap for worker registration
|
||||
|
||||
### 3. Updated Runtime Model
|
||||
|
||||
**File:** `crates/common/src/models.rs`
|
||||
|
||||
- Removed `runtime_type` field from `Runtime` struct
|
||||
- Removed `RuntimeType` enum from `enums` module
|
||||
- Updated all SQLx `FromRow` derives to exclude the field
|
||||
|
||||
### 4. Updated Runtime Repository
|
||||
|
||||
**File:** `crates/common/src/repositories/runtime.rs`
|
||||
|
||||
- Removed `runtime_type` from `CreateRuntimeInput`
|
||||
- Updated all SQL queries to exclude `runtime_type` column
|
||||
- Removed `find_by_type()` method (no longer needed)
|
||||
- Updated `find_by_pack()` to work without type filtering
|
||||
|
||||
### 5. Refactored Worker Service
|
||||
|
||||
**File:** `crates/worker/src/registration.rs`
|
||||
|
||||
**Before:**
|
||||
- Hardcoded `auto_detect_runtimes()` function
|
||||
- Checked for binaries using `Command::new("python3").arg("--version")`
|
||||
- No database integration
|
||||
|
||||
**After:**
|
||||
- Uses `RuntimeDetector::detect_capabilities()` from common crate
|
||||
- Calls detection in `detect_capabilities()` async method
|
||||
- Worker service calls this before registration in `start()` method
|
||||
- Removed hardcoded detection logic
|
||||
|
||||
### 6. Refactored Sensor Service
|
||||
|
||||
**File:** `crates/sensor/src/sensor_worker_registration.rs`
|
||||
|
||||
**Before:**
|
||||
- Inline runtime detection with `verify_runtime_available()` and `try_verification_command()`
|
||||
- Duplicated verification logic from what would be needed in worker
|
||||
|
||||
**After:**
|
||||
- Uses `RuntimeDetector::detect_capabilities()` from common crate
|
||||
- Removed 200+ lines of duplicate verification code
|
||||
- Simplified to just call shared detector
|
||||
- Updated `register()` signature to accept `&Config` parameter
|
||||
|
||||
**File:** `crates/sensor/src/service.rs`
|
||||
- Changed `_config` to `config` (stored, not discarded)
|
||||
- Passes config to registration during `start()`
|
||||
|
||||
### 7. Updated Test Infrastructure
|
||||
|
||||
**Files:**
|
||||
- `crates/common/tests/helpers.rs` - Removed `RuntimeType` from `RuntimeFixture`
|
||||
- `crates/common/tests/repository_runtime_tests.rs` - Updated all tests to remove `runtime_type` parameter
|
||||
|
||||
**Changes:**
|
||||
- `RuntimeFixture::new()` signature: removed `runtime_type` parameter
|
||||
- Updated runtime ref format from `core.action.python` to `core.python`
|
||||
- Fixed 15+ test functions to work without runtime type
|
||||
|
||||
---
|
||||
|
||||
## Rationale: Why Unify Runtimes?
|
||||
|
||||
### Problem with Separate Runtime Types
|
||||
|
||||
1. **Duplicate Records:** Same binary (e.g., `python3`) had separate database records for actions and sensors
|
||||
2. **Duplicate Logic:** Worker and sensor services had identical verification code
|
||||
3. **Maintenance Burden:** Changes to runtime detection required updates in 2+ places
|
||||
4. **Conceptual Mismatch:** Runtime type (`action` vs `sensor`) describes *usage*, not *capability*
|
||||
|
||||
### Benefits of Unified System
|
||||
|
||||
1. **Single Source of Truth:** One database record per actual runtime binary
|
||||
2. **Code Reuse:** Shared detection logic in common crate
|
||||
3. **Easier Extension:** Adding new runtimes (Ruby, Go, etc.) requires only database records
|
||||
4. **Clearer Semantics:** Runtime describes what it *is* (Python, Node.js), not what uses it
|
||||
|
||||
### Example: Before vs After
|
||||
|
||||
**Before:**
|
||||
```sql
|
||||
-- Two separate records for the same Python binary
|
||||
INSERT INTO runtime (ref, runtime_type, name) VALUES ('core.action.python', 'action', 'Python');
|
||||
INSERT INTO runtime (ref, runtime_type, name) VALUES ('core.sensor.python', 'sensor', 'Python');
|
||||
```
|
||||
|
||||
**After:**
|
||||
```sql
|
||||
-- One unified record
|
||||
INSERT INTO runtime (ref, name) VALUES ('core.python', 'Python');
|
||||
-- Used by both actions and sensors
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Configuration Examples
|
||||
|
||||
### Environment Variable Override
|
||||
```bash
|
||||
# Worker runtimes
|
||||
export ATTUNE_WORKER_RUNTIMES="python,shell,node"
|
||||
|
||||
# Sensor runtimes
|
||||
export ATTUNE_SENSOR_RUNTIMES="python,shell,builtin"
|
||||
```
|
||||
|
||||
### Config File
|
||||
```yaml
|
||||
worker:
|
||||
capabilities:
|
||||
runtimes: ["python", "shell", "node"]
|
||||
|
||||
sensor:
|
||||
capabilities:
|
||||
runtimes: ["python", "shell", "builtin"]
|
||||
```
|
||||
|
||||
### Database-Driven (Default)
|
||||
If no env var or config, services query `runtime` table and verify each:
|
||||
- Check `distributions->verification->always_available`
|
||||
- Or execute verification commands and check exit codes/patterns
|
||||
- Report only available runtimes in worker capabilities
|
||||
|
||||
---
|
||||
|
||||
## Migration Path
|
||||
|
||||
### For Existing Deployments
|
||||
|
||||
1. **Run Migration:** `migrations/20260203000001_unify_runtimes.sql`
|
||||
- Automatically consolidates runtime records
|
||||
- Migrates all foreign key references
|
||||
- No manual intervention required
|
||||
|
||||
2. **Restart Services:** Worker and sensor services pick up changes automatically
|
||||
|
||||
3. **Verify:** Check worker capabilities in database:
|
||||
```sql
|
||||
SELECT name, capabilities->'runtimes' FROM worker;
|
||||
```
|
||||
|
||||
### For New Deployments
|
||||
|
||||
- Migration runs as part of standard setup
|
||||
- Core pack loads unified runtimes (`core.python`, `core.shell`, etc.)
|
||||
- Services auto-detect capabilities on first start
|
||||
|
||||
---
|
||||
|
||||
## Testing
|
||||
|
||||
### Compilation
|
||||
```bash
|
||||
cargo check --workspace
|
||||
# ✅ All services compile successfully
|
||||
```
|
||||
|
||||
### Unit Tests
|
||||
```bash
|
||||
cargo test -p attune-common runtime
|
||||
# ✅ All runtime repository tests pass
|
||||
```
|
||||
|
||||
### Integration Tests
|
||||
```bash
|
||||
cargo test --test repository_runtime_tests
|
||||
# ✅ CRUD operations work without runtime_type
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Files Changed
|
||||
|
||||
### New Files
|
||||
- `migrations/20260203000001_unify_runtimes.sql` (338 lines)
|
||||
- `crates/common/src/runtime_detection.rs` (338 lines)
|
||||
|
||||
### Modified Files
|
||||
- `crates/common/src/lib.rs` - Added runtime_detection module export
|
||||
- `crates/common/src/models.rs` - Removed RuntimeType enum and field
|
||||
- `crates/common/src/repositories/runtime.rs` - Removed runtime_type column references
|
||||
- `crates/common/Cargo.toml` - Added regex dependency
|
||||
- `crates/worker/src/registration.rs` - Replaced auto-detection with RuntimeDetector
|
||||
- `crates/worker/src/service.rs` - Call detect_capabilities before registration
|
||||
- `crates/sensor/src/sensor_worker_registration.rs` - Replaced inline detection with RuntimeDetector
|
||||
- `crates/sensor/src/service.rs` - Store config and pass to registration
|
||||
- `crates/common/tests/helpers.rs` - Updated RuntimeFixture
|
||||
- `crates/common/tests/repository_runtime_tests.rs` - Removed runtime_type from tests
|
||||
|
||||
### Lines Changed
|
||||
- **Added:** ~676 lines (migration + runtime_detection module)
|
||||
- **Removed:** ~550 lines (duplicate detection code, enum, test updates)
|
||||
- **Net:** +126 lines (mostly migration documentation)
|
||||
|
||||
---
|
||||
|
||||
## Breaking Changes
|
||||
|
||||
### Database Schema
|
||||
- `runtime_type` column removed from `runtime` table
|
||||
- `runtime_type_enum` PostgreSQL type dropped
|
||||
- Runtime refs changed format: `core.action.python` → `core.python`
|
||||
|
||||
### Rust API
|
||||
- `RuntimeType` enum removed from `attune_common::models::enums`
|
||||
- `CreateRuntimeInput` no longer has `runtime_type` field
|
||||
- `RuntimeRepository::find_by_type()` method removed
|
||||
- `RuntimeFixture::new()` signature changed (removed `runtime_type` parameter)
|
||||
|
||||
### Acceptable Because
|
||||
- Project is pre-production (no deployments, no users)
|
||||
- Migration automatically handles data transformation
|
||||
- No external API contracts affected (internal services only)
|
||||
|
||||
---
|
||||
|
||||
## Next Steps
|
||||
|
||||
### Immediate
|
||||
- ✅ Code compiles and tests pass
|
||||
- ✅ Migration tested and verified
|
||||
- ✅ Documentation updated
|
||||
|
||||
### Future Enhancements
|
||||
1. **Version Detection:** Parse runtime versions from verification output
|
||||
2. **Version Constraints:** Allow actions/sensors to require specific versions
|
||||
3. **Distributed Scheduling:** Route executions to workers with compatible runtimes
|
||||
4. **Health Monitoring:** Periodic re-verification of runtime availability
|
||||
5. **API Endpoints:** Expose runtime capabilities via REST API
|
||||
|
||||
---
|
||||
|
||||
## Conclusion
|
||||
|
||||
Successfully unified runtime detection across worker and sensor services by:
|
||||
1. Removing artificial `runtime_type` distinction
|
||||
2. Consolidating detection logic into shared module
|
||||
3. Enabling database-driven runtime configuration
|
||||
4. Reducing code duplication and maintenance burden
|
||||
|
||||
The system is now more maintainable, extensible, and better reflects the underlying reality that runtimes are shared infrastructure used by multiple service types.
|
||||
|
||||
**Compilation Status:** ✅ Clean
|
||||
**Test Status:** ✅ Passing
|
||||
**Migration Status:** ✅ Ready for deployment
|
||||
Reference in New Issue
Block a user