Sensor Worker Registration
Version: 1.0
Last Updated: 2026-01-31
Overview
The Sensor Worker Registration system enables sensor service instances to register themselves in the database, report their runtime capabilities (Python, Node.js, Shell, etc.), and maintain heartbeat status. This mirrors the action worker registration system but is tailored for sensor services.
This feature allows for:
- Runtime capability reporting: Each sensor worker reports which runtimes it has available
- Distributed sensor execution: Future support for scheduling sensors on workers with required runtimes
- Service monitoring: Track active sensor workers and their health status
- Resource management: Understand sensor worker capacity and availability
Architecture
Database Schema
Sensor workers use the unified worker table with a worker_role discriminator:
```sql
CREATE TABLE worker (
    id BIGSERIAL PRIMARY KEY,
    name TEXT NOT NULL,
    worker_type worker_type_enum NOT NULL,  -- 'local', 'remote', 'container'
    worker_role worker_role_enum NOT NULL,  -- 'action', 'sensor', 'hybrid'
    runtime BIGINT REFERENCES runtime(id),
    host TEXT,
    port INTEGER,
    status worker_status_enum DEFAULT 'inactive',
    capabilities JSONB,                     -- {"runtimes": ["python", "shell", "node"]}
    meta JSONB,
    last_heartbeat TIMESTAMPTZ,
    created TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
```
Worker Role Enum:
```sql
CREATE TYPE worker_role_enum AS ENUM ('action', 'sensor', 'hybrid');
```
- `action`: Executes actions only
- `sensor`: Monitors triggers and executes sensors only
- `hybrid`: Can execute both actions and sensors (future use)
Capabilities Structure
The capabilities JSONB field contains:
```json
{
  "runtimes": ["python", "shell", "node", "native"],
  "max_concurrent_sensors": 10,
  "sensor_version": "0.1.0"
}
```
Configuration
YAML Configuration
Add sensor configuration to your config.yaml:
```yaml
sensor:
  # Sensor worker name (defaults to "sensor-{hostname}")
  worker_name: "sensor-production-01"

  # Sensor worker host (defaults to hostname)
  host: "10.0.1.42"

  # Heartbeat interval in seconds
  heartbeat_interval: 30

  # Sensor poll interval
  poll_interval: 30

  # Sensor execution timeout
  sensor_timeout: 30

  # Maximum concurrent sensors
  max_concurrent_sensors: 10

  # Capabilities (optional - will auto-detect if not specified)
  capabilities:
    runtimes: ["python", "shell", "node"]
    custom_feature: true
```
Environment Variables
Override runtime detection with:
```bash
# Specify available runtimes (comma-separated)
export ATTUNE_SENSOR_RUNTIMES="python,shell"

# Or via config override
export ATTUNE__SENSOR__WORKER_NAME="sensor-custom"
export ATTUNE__SENSOR__HEARTBEAT_INTERVAL="60"
```
Runtime Detection
Sensor workers auto-detect available runtimes using a priority system:
Priority Order
1. Environment variable (highest priority): `ATTUNE_SENSOR_RUNTIMES="python,shell,node"`
2. Config file: `sensor.capabilities.runtimes: ["python", "shell"]`
3. Auto-detection (lowest priority):
   - Checks for a `python3` or `python` binary
   - Checks for a `node` binary
   - Always includes `shell` (bash/sh)
   - Always includes `native` (compiled Rust sensors)
Auto-Detection Logic
```rust
// Check for Python
if Command::new("python3").arg("--version").output().is_ok() {
    runtimes.push("python".to_string());
}

// Check for Node.js
if Command::new("node").arg("--version").output().is_ok() {
    runtimes.push("node".to_string());
}

// Always available
runtimes.push("shell".to_string());
runtimes.push("native".to_string());
```
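Putting the three priority levels together, the resolution logic can be sketched in std-only Rust. `resolve_runtimes` and its parameters are illustrative names, not the service's actual API:

```rust
use std::process::Command;

/// Resolve the runtime list using the documented priority order.
/// `env_runtimes` stands in for the raw ATTUNE_SENSOR_RUNTIMES value (if set);
/// `config_runtimes` for the parsed `sensor.capabilities.runtimes` list.
fn resolve_runtimes(
    env_runtimes: Option<String>,
    config_runtimes: Option<Vec<String>>,
) -> Vec<String> {
    // 1. Environment variable (highest priority): comma-separated list.
    if let Some(raw) = env_runtimes {
        return raw.split(',').map(|s| s.trim().to_string()).collect();
    }
    // 2. Config file.
    if let Some(runtimes) = config_runtimes {
        return runtimes;
    }
    // 3. Auto-detection (lowest priority): probe binaries, then add the
    //    always-available runtimes.
    let mut runtimes = Vec::new();
    if Command::new("python3").arg("--version").output().is_ok()
        || Command::new("python").arg("--version").output().is_ok()
    {
        runtimes.push("python".to_string());
    }
    if Command::new("node").arg("--version").output().is_ok() {
        runtimes.push("node".to_string());
    }
    runtimes.push("shell".to_string());
    runtimes.push("native".to_string());
    runtimes
}
```

A caller would pass `std::env::var("ATTUNE_SENSOR_RUNTIMES").ok()` as the first argument and the parsed config value as the second.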
Registration Lifecycle
1. Service Startup
When the sensor service starts:
```rust
// Create registration manager (mutable: register() takes &mut self)
let mut registration = SensorWorkerRegistration::new(db.clone(), &config);

// Register in database
let worker_id = registration.register().await?;
// Sets status to 'active', records capabilities, sets last_heartbeat
```
Database Operations:
- If a worker with the same name exists: update it to `active` status
- If it is a new worker: insert a new record with `worker_role = 'sensor'`
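That update-or-insert step maps naturally to a single upsert. The SQL below is a sketch, assuming a unique constraint on `worker.name` (not shown in the schema above); the exact statement the service issues may differ:

```sql
INSERT INTO worker (name, worker_type, worker_role, host, status, capabilities, last_heartbeat)
VALUES ($1, 'local', 'sensor', $2, 'active', $3, NOW())
ON CONFLICT (name) DO UPDATE
SET status         = 'active',
    host           = EXCLUDED.host,
    capabilities   = EXCLUDED.capabilities,
    last_heartbeat = NOW(),
    updated        = NOW()
RETURNING id;
```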
2. Heartbeat Loop
While running, sends periodic heartbeats:
```rust
// Every 30 seconds (configurable)
registration.heartbeat().await?;
// Updates last_heartbeat, ensures status is 'active'
```
3. Service Shutdown
On graceful shutdown:
```rust
// Mark as inactive
registration.deregister().await?;
// Sets status to 'inactive'
```
Usage Example
Sensor Service Integration
The SensorService automatically handles registration:
```rust
use attune_sensor::SensorService;

#[tokio::main]
async fn main() -> Result<()> {
    let config = Config::load()?;
    let service = SensorService::new(config).await?;

    // Automatically registers sensor worker on start
    service.start().await?;

    // Automatically deregisters on stop
    Ok(())
}
```
Manual Registration (Advanced)
For custom integrations:
```rust
use std::time::Duration;

use attune_sensor::SensorWorkerRegistration;
use serde_json::json;

let mut registration = SensorWorkerRegistration::new(pool, &config);

// Register
let worker_id = registration.register().await?;
println!("Registered as worker ID: {}", worker_id);

// Add custom capability
registration.add_capability("gpu_enabled".to_string(), json!(true));
registration.update_capabilities().await?;

// Send heartbeats (runs until the task is cancelled)
loop {
    tokio::time::sleep(Duration::from_secs(30)).await;
    registration.heartbeat().await?;
}

// Deregister on shutdown (reached via your shutdown path, not the loop above)
registration.deregister().await?;
```
Querying Sensor Workers
Find Active Sensor Workers
```sql
SELECT id, name, host, capabilities, last_heartbeat
FROM worker
WHERE worker_role = 'sensor' AND status = 'active';
```
Find Sensor Workers with Python Runtime
```sql
SELECT id, name, host, capabilities->'runtimes' AS runtimes
FROM worker
WHERE worker_role = 'sensor'
  AND status = 'active'
  AND capabilities->'runtimes' ? 'python';
```
Find Stale Sensor Workers (No Heartbeat in 5 Minutes)
```sql
SELECT id, name, last_heartbeat
FROM worker
WHERE worker_role = 'sensor'
  AND status = 'active'
  AND last_heartbeat < NOW() - INTERVAL '5 minutes';
```
Monitoring
Health Checks
Monitor sensor worker health by checking last_heartbeat:
```sql
-- Workers that haven't sent a heartbeat in 2x the heartbeat interval
SELECT
    name,
    host,
    status,
    last_heartbeat,
    NOW() - last_heartbeat AS time_since_heartbeat
FROM worker
WHERE worker_role = 'sensor'
  AND status = 'active'
  AND last_heartbeat < NOW() - INTERVAL '60 seconds'
ORDER BY last_heartbeat;
```
Metrics to Track
- Active sensor workers: Count of workers with `status = 'active'`
- Runtime distribution: Which runtimes are available across workers
- Heartbeat lag: Time since last heartbeat for each worker
- Worker capacity: Sum of `max_concurrent_sensors` across all active workers
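The capacity metric, for instance, can be read straight from the `capabilities` JSONB. A sketch, assuming every worker reports `max_concurrent_sensors`:

```sql
SELECT SUM((capabilities->>'max_concurrent_sensors')::int) AS total_capacity
FROM worker
WHERE worker_role = 'sensor' AND status = 'active';
```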
Future Enhancements
Distributed Sensor Scheduling
Once sensor worker registration is in place, we can implement:
- Runtime-based scheduling: Schedule sensors only on workers with required runtime
- Load balancing: Distribute sensors across multiple workers
- Failover: Automatically reassign sensors if a worker goes down
- Geographic distribution: Run sensors close to monitored resources
Example: Sensor Scheduling Logic
```rust
// Find sensor workers with required runtime
let workers = sqlx::query_as!(
    Worker,
    r#"
    SELECT * FROM worker
    WHERE worker_role IN ('sensor', 'hybrid')
      AND status = 'active'
      AND capabilities->'runtimes' ? $1
    ORDER BY last_heartbeat DESC
    "#,
    required_runtime
)
.fetch_all(&pool)
.await?;

// Schedule sensor on least-loaded worker
let target_worker = select_least_loaded_worker(workers)?;
schedule_sensor_on_worker(sensor, target_worker).await?;
```
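`select_least_loaded_worker` is referenced above but not defined in this document. One minimal std-only sketch, assuming current load is tracked externally as a sensor count per worker ID (the `Worker` struct here is a stand-in for the queried row type):

```rust
use std::collections::HashMap;

/// Minimal stand-in for the queried worker row.
struct Worker {
    id: i64,
    name: String,
}

/// Pick the worker with the fewest currently assigned sensors.
/// Workers missing from `load_by_worker` are treated as idle (load 0).
fn select_least_loaded_worker(
    workers: Vec<Worker>,
    load_by_worker: &HashMap<i64, usize>,
) -> Option<Worker> {
    workers
        .into_iter()
        .min_by_key(|w| load_by_worker.get(&w.id).copied().unwrap_or(0))
}
```

A real implementation would also respect each worker's reported `max_concurrent_sensors` cap before assigning.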
Troubleshooting
Worker Not Registering
Symptom: Sensor service starts but no worker record in database
Checks:
- Verify database connection: `DATABASE_URL` is correct
- Check logs for registration errors: `grep "Registering sensor worker" logs`
- Verify migrations applied: check for the `worker_role` column
Solution:
```bash
# Check migration status
sqlx migrate info

# Apply migrations
sqlx migrate run
```
Runtime Not Detected
Symptom: Expected runtime not in capabilities.runtimes
Checks:
- Verify the binary is in `PATH`: `which python3`, `which node`
- Check the environment variable: `echo $ATTUNE_SENSOR_RUNTIMES`
- Review sensor service logs for auto-detection output
Solution:
```bash
# Explicitly set runtimes
export ATTUNE_SENSOR_RUNTIMES="python,shell,node"
```

Or in `config.yaml`:

```yaml
sensor:
  capabilities:
    runtimes: ["python", "shell", "node"]
```
Heartbeat Not Updating
Symptom: last_heartbeat timestamp is stale
Checks:
- Verify sensor service is running
- Check for database connection issues in logs
- Verify heartbeat interval configuration
Solution:
```bash
# Check sensor service status
systemctl status attune-sensor

# Review logs
journalctl -u attune-sensor -f | grep heartbeat
```
Migration from Legacy System
If you have existing sensor services without registration:
1. Apply the migration: `20260131000001_add_worker_role.sql`
2. Restart sensor services: they will auto-register on startup
3. Verify registration: query the `worker` table for `worker_role = 'sensor'`
Existing action workers are automatically marked as `worker_role = 'action'` by the migration.
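A plausible shape for that migration, sketched here for reference (the shipped `20260131000001_add_worker_role.sql` is authoritative):

```sql
CREATE TYPE worker_role_enum AS ENUM ('action', 'sensor', 'hybrid');

-- Existing rows (all action workers) pick up the 'action' default.
ALTER TABLE worker
    ADD COLUMN worker_role worker_role_enum NOT NULL DEFAULT 'action';
```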
Security Considerations
Worker Naming
- Use hostname-based naming for automatic uniqueness
- Avoid hardcoding credentials in worker names
- Consider using UUIDs for ephemeral/containerized workers
Capabilities
- Capabilities are self-reported (trust boundary)
- In distributed setups, validate runtime availability before execution
- Consider runtime verification/attestation for high-security environments
Heartbeat Monitoring
- Stale workers (no heartbeat) should be marked inactive automatically
- Implement worker health checks before scheduling sensors
- Set appropriate heartbeat intervals (too frequent = DB load, too infrequent = slow failover)
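The first point can be implemented as a periodic cleanup query. The 5-minute threshold below is illustrative; derive it from your configured heartbeat interval:

```sql
UPDATE worker
SET status = 'inactive', updated = NOW()
WHERE worker_role = 'sensor'
  AND status = 'active'
  AND last_heartbeat < NOW() - INTERVAL '5 minutes';
```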
API Reference
SensorWorkerRegistration
```rust
impl SensorWorkerRegistration {
    /// Create new registration manager
    pub fn new(pool: PgPool, config: &Config) -> Self;

    /// Register sensor worker in database
    pub async fn register(&mut self) -> Result<i64>;

    /// Send heartbeat to update last_heartbeat
    pub async fn heartbeat(&self) -> Result<()>;

    /// Mark sensor worker as inactive
    pub async fn deregister(&self) -> Result<()>;

    /// Get registered worker ID
    pub fn worker_id(&self) -> Option<i64>;

    /// Get worker name
    pub fn worker_name(&self) -> &str;

    /// Add custom capability
    pub fn add_capability(&mut self, key: String, value: serde_json::Value);

    /// Update capabilities in database
    pub async fn update_capabilities(&self) -> Result<()>;
}
```
See Also
- Sensor Service Architecture
- Sensor Runtime Execution
- Worker Service Documentation
- Configuration Guide
Status: ✅ Implemented
Next Steps: Implement distributed sensor scheduling based on worker capabilities