6.9 KiB
6.9 KiB
Sensor Worker Registration - Completion Checklist
Feature: Runtime capability reporting for sensor workers
Date: 2026-01-31
Status: Implementation Complete - Requires DB Migration
Implementation Status
✅ COMPLETE - Code Implementation
- Database migration created (
20260131000001_add_worker_role.sql) WorkerRoleenum added to modelsWorkermodel updated withworker_rolefieldSensorConfigstruct added to config systemSensorWorkerRegistrationmodule implemented- Service integration in
SensorService - Runtime detection with 3-tier priority system
- Heartbeat mechanism implemented
- Graceful shutdown/deregistration
- Unit tests included
- Comprehensive documentation written
⚠️ PENDING - Database & Testing
- Database migration applied
- SQLx metadata regenerated
- Integration tests run
- Manual testing with live sensor service
Required Steps to Complete
Step 1: Start Database
# Ensure PostgreSQL is running
sudo systemctl start postgresql
# OR
docker-compose up -d postgres
Step 2: Apply Migration
cd attune
# Set database URL
export DATABASE_URL="postgresql://postgres:postgres@localhost:5432/attune"
# Run migrations
sqlx migrate run
# Verify migration applied
psql $DATABASE_URL -c "\d worker"
# Should see: worker_role worker_role_enum NOT NULL
Step 3: Regenerate SQLx Metadata
# With database running
cargo sqlx prepare --workspace
# Verify no compilation errors
cargo check --workspace
Step 4: Manual Testing
# Terminal 1: Start sensor service
cargo run --bin attune-sensor
# Expected logs:
# - "Registering sensor worker..."
# - "Sensor worker registered with ID: X"
# - "Sensor worker heartbeat sent" (every 30s)
# Terminal 2: Query database
psql $DATABASE_URL -c "
SELECT id, name, worker_role, status,
capabilities->'runtimes' as runtimes,
last_heartbeat
FROM worker
WHERE worker_role = 'sensor';
"
# Expected output:
# - One row with worker_role = 'sensor'
# - status = 'active'
# - runtimes array (e.g., ["python", "shell", "node", "native"])
# - Recent last_heartbeat timestamp
# Terminal 1: Stop sensor service (Ctrl+C)
# Expected log: "Deregistering sensor worker..."
# Terminal 2: Verify status changed
psql $DATABASE_URL -c "
SELECT status FROM worker WHERE worker_role = 'sensor';
"
# Expected: status = 'inactive'
Step 5: Test Runtime Detection
# Test auto-detection
cargo run --bin attune-sensor
# Check logs for "Auto-detected runtimes: ..."
# Test environment variable override
export ATTUNE_SENSOR_RUNTIMES="shell,native"
cargo run --bin attune-sensor
# Verify capabilities only include shell and native
# Test config file
cat > config.test-sensor.yaml <<EOF
sensor:
worker_name: "test-sensor-01"
capabilities:
runtimes: ["python"]
max_concurrent_sensors: 5
EOF
ATTUNE_CONFIG=config.test-sensor.yaml cargo run --bin attune-sensor
# Verify worker_name and runtimes from config
Step 6: Test Heartbeat
# Start sensor service
cargo run --bin attune-sensor &
SENSOR_PID=$!
# Wait 2 minutes
sleep 120
# Check heartbeat updates
psql $DATABASE_URL -c "
SELECT name, last_heartbeat,
NOW() - last_heartbeat as age
FROM worker
WHERE worker_role = 'sensor';
"
# Expected: age should be < 30 seconds
# Cleanup
kill $SENSOR_PID
Step 7: Integration Tests
# Run sensor service tests
cargo test --package attune-sensor
# Run integration tests (if DB available)
cargo test --package attune-sensor -- --ignored
# Verify all tests pass
Verification Checklist
Database Schema
worker_role_enumtype exists with values: action, sensor, hybridworkertable hasworker_rolecolumn (NOT NULL)- Indexes created:
idx_worker_role,idx_worker_role_status - Existing workers have
worker_role = 'action'
Configuration
- Can parse
sensorconfig section from YAML ATTUNE_SENSOR_RUNTIMESenv var worksATTUNE__SENSOR__*env var overrides work- Auto-detection falls back correctly
Registration
- Sensor service registers on startup
- Creates worker record with
worker_role = 'sensor' - Sets
status = 'active' - Populates
capabilitieswith detected runtimes - Records hostname in
hostfield
Heartbeat
- Heartbeat loop starts after registration
last_heartbeatupdates every 30s (default)- Heartbeat interval configurable via config
- Errors logged but don't crash service
Deregistration
- Service shutdown sets
status = 'inactive' - Worker record remains in database (not deleted)
- Deregistration logged
Runtime Detection
- Auto-detects Python if
python3orpythonavailable - Auto-detects Node.js if
nodeavailable - Always includes "shell" and "native"
- Env var
ATTUNE_SENSOR_RUNTIMESoverrides all - Config file
sensor.capabilities.runtimesoverrides auto-detection - Detection priority: env var > config > auto-detect
Known Issues / Limitations
Current
- ✅ None - implementation is feature-complete
Future Work
- 🔮 Distributed sensor scheduling not yet implemented (foundation is ready)
- 🔮 No automatic cleanup of stale workers (manual SQL required)
- 🔮 No API endpoints for querying sensor workers yet
- 🔮 Hybrid workers (action + sensor) not tested
Rollback Plan
If issues arise:
# Rollback migration
sqlx migrate revert
# Remove worker_role column and enum
psql $DATABASE_URL -c "
ALTER TABLE worker DROP COLUMN worker_role;
DROP TYPE worker_role_enum;
"
# Revert code changes
git revert <commit-hash>
Documentation Review
docs/sensors/sensor-worker-registration.md- Full documentationdocs/QUICKREF-sensor-worker-registration.md- Quick referencework-summary/sensor-worker-registration.md- Implementation summary- This checklist created
Sign-off
- Database migration applied and verified
- SQLx metadata regenerated
- All compilation warnings resolved
- Manual testing completed
- Integration tests pass
- Documentation reviewed
- AGENTS.md updated (if needed)
- Ready for production use
Post-Deployment Monitoring
Once deployed, monitor:
-- Active sensor workers
SELECT COUNT(*) FROM worker
WHERE worker_role = 'sensor' AND status = 'active';
-- Workers with stale heartbeat (> 2 minutes)
SELECT name, last_heartbeat, NOW() - last_heartbeat AS lag
FROM worker
WHERE worker_role = 'sensor'
AND status = 'active'
AND last_heartbeat < NOW() - INTERVAL '2 minutes';
-- Runtime distribution
SELECT
jsonb_array_elements_text(capabilities->'runtimes') AS runtime,
COUNT(*) AS worker_count
FROM worker
WHERE worker_role = 'sensor' AND status = 'active'
GROUP BY runtime;
Next Session: Apply migration, test with live database, verify all checks pass