[wip] universal workers

2026-03-21 07:32:11 -05:00
parent 0782675a2b
commit 8ba7e3bb84
59 changed files with 4971 additions and 34 deletions


@@ -0,0 +1,219 @@
# Quick Reference: Agent-Based Workers
> **TL;DR**: Inject the `attune-agent` binary into _any_ container image to turn it into an Attune worker. No Dockerfiles. No Rust compilation. ~12 lines of YAML.
## How It Works
1. The `init-agent` service (in `docker-compose.yaml`) builds the statically-linked `attune-agent` binary and copies it into the `agent_bin` volume
2. Your worker service mounts `agent_bin` read-only and uses the agent as its entrypoint
3. On startup, the agent auto-detects available runtimes (Python, Ruby, Node.js, Shell, etc.)
4. The worker registers with Attune and starts processing executions
## Quick Start
### Option A: Use the override file
```bash
# Start all services including the example Ruby agent worker
docker compose -f docker-compose.yaml -f docker-compose.agent.yaml up -d
```
The `docker-compose.agent.yaml` file includes a ready-to-use Ruby worker and commented-out templates for Python 3.12, GPU, and custom images.
### Option B: Add to docker-compose.override.yaml
Create a `docker-compose.override.yaml` in the project root:
```yaml
services:
  worker-my-runtime:
    image: my-org/my-custom-image:latest
    container_name: attune-worker-my-runtime
    depends_on:
      init-agent:
        condition: service_completed_successfully
      init-packs:
        condition: service_completed_successfully
      migrations:
        condition: service_completed_successfully
      postgres:
        condition: service_healthy
      rabbitmq:
        condition: service_healthy
    entrypoint: ["/opt/attune/agent/attune-agent"]
    stop_grace_period: 45s
    environment:
      RUST_LOG: info
      ATTUNE_CONFIG: /opt/attune/config/config.yaml
      ATTUNE_WORKER_NAME: worker-my-runtime-01
      ATTUNE_WORKER_TYPE: container
      ATTUNE__SECURITY__JWT_SECRET: ${JWT_SECRET:-docker-dev-secret-change-in-production}
      ATTUNE__SECURITY__ENCRYPTION_KEY: ${ENCRYPTION_KEY:-docker-dev-encryption-key-please-change-in-production-32plus}
      ATTUNE__DATABASE__URL: postgresql://attune:attune@postgres:5432/attune
      ATTUNE__MESSAGE_QUEUE__URL: amqp://attune:attune@rabbitmq:5672
      ATTUNE_API_URL: http://attune-api:8080
    volumes:
      - agent_bin:/opt/attune/agent:ro
      - ${ATTUNE_DOCKER_CONFIG_PATH:-./config.docker.yaml}:/opt/attune/config/config.yaml:ro
      - packs_data:/opt/attune/packs:ro
      - runtime_envs:/opt/attune/runtime_envs
      - artifacts_data:/opt/attune/artifacts
    healthcheck:
      test: ["CMD-SHELL", "pgrep -f attune-agent || exit 1"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 20s
    networks:
      - attune-network
    restart: unless-stopped
```
Then run:
```bash
docker compose up -d
```
Docker Compose automatically merges `docker-compose.override.yaml` with `docker-compose.yaml` when you run it without `-f` flags.
## Required Volumes
Every agent worker needs these volumes:
| Volume | Mount Path | Mode | Purpose |
|--------|-----------|------|---------|
| `agent_bin` | `/opt/attune/agent` | `ro` | The statically-linked agent binary |
| `packs_data` | `/opt/attune/packs` | `ro` | Pack files (actions, workflows, etc.) |
| `runtime_envs` | `/opt/attune/runtime_envs` | `rw` | Isolated runtime environments (venvs, node_modules) |
| `artifacts_data` | `/opt/attune/artifacts` | `rw` | File-backed artifact storage |
| Config YAML | `/opt/attune/config/config.yaml` | `ro` | Attune configuration |
## Required Environment Variables
| Variable | Description | Example |
|----------|-------------|---------|
| `ATTUNE_CONFIG` | Path to config file inside container | `/opt/attune/config/config.yaml` |
| `ATTUNE_WORKER_NAME` | Unique worker name | `worker-ruby-01` |
| `ATTUNE_WORKER_TYPE` | Worker type | `container` |
| `ATTUNE__DATABASE__URL` | PostgreSQL connection string | `postgresql://attune:attune@postgres:5432/attune` |
| `ATTUNE__MESSAGE_QUEUE__URL` | RabbitMQ connection string | `amqp://attune:attune@rabbitmq:5672` |
| `ATTUNE__SECURITY__JWT_SECRET` | JWT signing secret | (use env var) |
| `ATTUNE__SECURITY__ENCRYPTION_KEY` | Encryption key for secrets | (use env var) |
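A missing variable is easier to diagnose before the agent starts than from its logs. The following is a minimal sketch of a preflight check, assuming a POSIX shell is available wherever you run it. `check_required_env` is a hypothetical helper, not part of Attune, and it only tests that each variable from the table above is non-empty.

```bash
# Hypothetical preflight helper; not shipped with Attune.
# Prints "missing: NAME" for every required variable that is unset or empty
# and returns non-zero if at least one is missing.
check_required_env() {
  missing=0
  for name in ATTUNE_CONFIG ATTUNE_WORKER_NAME ATTUNE_WORKER_TYPE \
              ATTUNE__DATABASE__URL ATTUNE__MESSAGE_QUEUE__URL \
              ATTUNE__SECURITY__JWT_SECRET ATTUNE__SECURITY__ENCRYPTION_KEY; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name"
      missing=1
    fi
  done
  return "$missing"
}
```

Calling `check_required_env || exit 1` at the top of a wrapper script would stop early and list every missing name at once.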
### Optional Environment Variables
| Variable | Description | Default |
|----------|-------------|---------|
| `ATTUNE_WORKER_RUNTIMES` | Comma-separated list of runtimes to advertise; overrides auto-detection | Auto-detected |
| `ATTUNE_API_URL` | API URL for token generation | `http://attune-api:8080` |
| `RUST_LOG` | Log level | `info` |
## Runtime Auto-Detection
The agent probes for these runtimes automatically:
| Runtime | Probed Binaries |
|---------|----------------|
| Shell | `bash`, `sh` |
| Python | `python3`, `python` |
| Node.js | `node`, `nodejs` |
| Ruby | `ruby` |
| Go | `go` |
| Java | `java` |
| R | `Rscript` |
| Perl | `perl` |
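The probing in the table above can be pictured as a `PATH` lookup per candidate binary. The sketch below is only an illustration in POSIX shell: the real agent is a compiled binary and its detection logic may differ. The runtime identifiers `go`, `java`, `r`, and `perl` are assumptions, since only `python`, `ruby`, `node`, and `shell` appear as identifiers in these docs.

```bash
# Illustrative sketch only; not the agent's actual implementation.
# Prints one runtime name per line for every runtime that has at least one
# of its candidate binaries on PATH, in table order.
detect_runtimes() {
  while read -r runtime candidates; do
    for bin in $candidates; do
      if command -v "$bin" >/dev/null 2>&1; then
        echo "$runtime"
        break   # one hit is enough; move on to the next runtime
      fi
    done
  done <<'EOF'
shell bash sh
python python3 python
node node nodejs
ruby ruby
go go
java java
r Rscript
perl perl
EOF
}
```

Running `detect_runtimes` inside a candidate image gives a rough preview of what the agent is likely to find, although `--detect-only` remains the authoritative check.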
To override, set `ATTUNE_WORKER_RUNTIMES`:
```yaml
environment:
  ATTUNE_WORKER_RUNTIMES: python,shell # Only advertise Python and Shell
```
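The precedence described here (an explicit list wins, otherwise fall back to probing) can be sketched in a few lines of shell. `resolve_runtimes` is a hypothetical helper, and the whitespace trimming and dropping of empty entries are assumptions about how a comma-separated value might be normalized, not documented agent behavior.

```bash
# Hypothetical helper illustrating override precedence; not Attune code.
# If ATTUNE_WORKER_RUNTIMES is set, print its entries one per line.
# Otherwise run the fallback command passed as arguments.
resolve_runtimes() {
  if [ -n "${ATTUNE_WORKER_RUNTIMES:-}" ]; then
    # Split on commas, trim surrounding whitespace, drop empty entries.
    printf '%s\n' "$ATTUNE_WORKER_RUNTIMES" \
      | tr ',' '\n' \
      | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d'
  else
    "$@"
  fi
}
```

For example, with `ATTUNE_WORKER_RUNTIMES=python,shell` set, `resolve_runtimes some-probe-command` never invokes the probe and prints only the two listed names.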
## Testing Detection
Run the agent in detect-only mode to see what it finds:
```bash
# In a running container
docker exec <container> /opt/attune/agent/attune-agent --detect-only
# Or start a throwaway container
# (if Compose prefixed the volume with the project name, use the name shown by `docker volume ls`)
docker run --rm -v agent_bin:/opt/attune/agent:ro ruby:3.3-slim /opt/attune/agent/attune-agent --detect-only
```
## Examples
### Ruby Worker
```yaml
worker-ruby:
  image: ruby:3.3-slim
  entrypoint: ["/opt/attune/agent/attune-agent"]
  # ... (standard depends_on, volumes, env, networks)
```
### Node.js 22 Worker
```yaml
worker-node22:
  image: node:22-slim
  entrypoint: ["/opt/attune/agent/attune-agent"]
  # ...
```
### GPU Worker (NVIDIA CUDA)
```yaml
worker-gpu:
  image: nvidia/cuda:12.3.1-runtime-ubuntu22.04
  runtime: nvidia
  entrypoint: ["/opt/attune/agent/attune-agent"]
  environment:
    ATTUNE_WORKER_RUNTIMES: python,shell # Override: CUDA image has python
  # ...
```
### Multi-Runtime Custom Image
```yaml
worker-data-science:
  image: my-org/data-science:latest # Has Python, R, and Julia
  entrypoint: ["/opt/attune/agent/attune-agent"]
  # Agent auto-detects the runtimes it probes for (Python and R here;
  # Julia does not appear in the probe table above)
  # ...
```
## Comparison: Traditional vs Agent Workers
| Aspect | Traditional Worker | Agent Worker |
|--------|-------------------|--------------|
| Docker build | Required (5+ min) | None |
| Dockerfile | Custom per runtime | Not needed |
| Base image | `debian:bookworm-slim` | Any image |
| Runtime install | Via apt/NodeSource | Pre-installed in image |
| Configuration | Manual `ATTUNE_WORKER_RUNTIMES` | Auto-detected |
| Binary | Compiled into image | Injected via volume |
| Update cycle | Rebuild image | Restart `init-agent` |
## Troubleshooting
### Agent binary not found
```
exec /opt/attune/agent/attune-agent: no such file or directory
```
The `init-agent` service hasn't completed, so the `agent_bin` volume does not yet contain the binary. Check its logs:
```bash
docker compose logs init-agent
```
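If the logs look clean, it can help to confirm from inside the worker container whether the file is actually present and executable. The following is a minimal sketch, assuming the image provides a POSIX shell. `check_agent_binary` is a hypothetical helper, not an Attune command.

```bash
# Hypothetical diagnostic; reports why the agent binary cannot be started.
# Defaults to the mount path used throughout this guide.
check_agent_binary() {
  path=${1:-/opt/attune/agent/attune-agent}
  if [ ! -e "$path" ]; then
    echo "missing: $path"
    return 1
  elif [ ! -x "$path" ]; then
    echo "not executable: $path"
    return 1
  fi
  echo "ok: $path"
}
```

A `missing` result points at the `init-agent` step or the volume mount, while `not executable` suggests the file was copied without its mode bits.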
### "No runtimes detected"
The container image doesn't have any recognized interpreters in `$PATH`. Either:
- Use an image that includes your runtime (e.g., `ruby:3.3-slim`)
- Set `ATTUNE_WORKER_RUNTIMES` manually
### Connection refused to PostgreSQL/RabbitMQ
Ensure your `depends_on` conditions include `postgres` and `rabbitmq` health checks, and that the container is on the `attune-network`.
## See Also
- [Universal Worker Agent Plan](plans/universal-worker-agent.md) — Full architecture document
- [Docker Deployment](docker-deployment.md) — General Docker setup
- [Worker Service](architecture/worker-service.md) — Worker architecture details


@@ -0,0 +1,146 @@
# Quick Reference: Kubernetes Agent Workers
Agent-based workers let you run Attune actions inside **any container image** by injecting a statically-linked `attune-agent` binary via a Kubernetes init container. No custom Dockerfile required — just point at an image that has your runtime installed.
## How It Works
1. An **init container** (`agent-loader`) copies the `attune-agent` binary from the `attune-agent` image into an `emptyDir` volume
2. The **worker container** uses your chosen image (e.g., `ruby:3.3`) and runs the agent binary as its entrypoint
3. The agent **auto-detects** available runtimes (python, ruby, node, shell, etc.) and registers with Attune
4. Actions targeting those runtimes are routed to the agent worker via RabbitMQ
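Step 1 amounts to a copy into a shared volume. The sketch below shows that idea in shell. It is an illustration only: `load_agent` is a hypothetical name, and both the source path inside the `attune-agent` image and the exact command the chart runs are assumptions.

```bash
# Illustrative only; the chart's real init container command may differ.
# Copies the agent binary into the shared volume and makes it executable.
load_agent() {
  src=$1        # binary inside the attune-agent image (location is an assumption)
  dest_dir=$2   # mount point of the shared emptyDir volume
  mkdir -p "$dest_dir" &&
    cp "$src" "$dest_dir/attune-agent" &&
    chmod 0755 "$dest_dir/attune-agent"
}
```

Because the binary is statically linked, the copied file can run in the worker container regardless of which libraries that image ships.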
## Helm Values
Add entries to `agentWorkers` in your `values.yaml`:
```yaml
agentWorkers:
  - name: ruby
    image: ruby:3.3
    replicas: 2
  - name: python-gpu
    image: nvidia/cuda:12.3.1-runtime-ubuntu22.04
    replicas: 1
    runtimes: [python, shell]
    runtimeClassName: nvidia
    nodeSelector:
      gpu: "true"
    tolerations:
      - key: nvidia.com/gpu
        operator: Exists
        effect: NoSchedule
    resources:
      limits:
        nvidia.com/gpu: 1
  - name: custom
    image: my-org/my-custom-image:latest
    replicas: 1
    env:
      - name: MY_CUSTOM_VAR
        value: my-value
```
### Supported Fields
| Field | Required | Default | Description |
|-------|----------|---------|-------------|
| `name` | Yes | — | Unique name (used in Deployment and worker names) |
| `image` | Yes | — | Container image with your desired runtime(s) |
| `replicas` | No | `1` | Number of pod replicas |
| `runtimes` | No | `[]` (auto-detect) | List of runtimes to expose (e.g., `[python, shell]`) |
| `resources` | No | `{}` | Kubernetes resource requests/limits |
| `env` | No | — | Extra environment variables (`[{name, value}]`) |
| `imagePullPolicy` | No | — | Pull policy for the worker image |
| `logLevel` | No | `info` | `RUST_LOG` level |
| `runtimeClassName` | No | — | Kubernetes RuntimeClass (e.g., `nvidia`) |
| `nodeSelector` | No | — | Node selector for pod scheduling |
| `tolerations` | No | — | Tolerations for pod scheduling |
| `stopGracePeriod` | No | `45` | Termination grace period (seconds) |
## Install / Upgrade
```bash
helm upgrade --install attune oci://registry.example.com/namespace/helm/attune \
--version 0.3.0 \
--set global.imageRegistry=registry.example.com \
--set global.imageNamespace=namespace \
--set global.imageTag=0.3.0 \
-f my-values.yaml
```
## What Gets Created
For each `agentWorkers` entry, the chart creates a Deployment named `<release>-attune-agent-worker-<name>` with:
- **Init containers**:
  - `agent-loader`: copies the agent binary from the `attune-agent` image to an `emptyDir` volume
  - `wait-for-schema`: polls PostgreSQL until the Attune schema is ready
  - `wait-for-packs`: waits for the core pack to be available on the shared PVC
- **Worker container**: runs `attune-agent` as the entrypoint inside your chosen image
- **Volumes**: `agent-bin` (emptyDir), `config` (ConfigMap), `packs` (PVC, read-only), `runtime-envs` (PVC), `artifacts` (PVC)
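The waiting init containers follow a familiar poll-until-ready pattern. Here is a minimal sketch of that pattern in shell, in the spirit of `wait-for-packs`. `wait_for_path` is a hypothetical helper, the timeout and interval defaults are assumptions, and the chart's real check may differ.

```bash
# Hypothetical polling helper; not the chart's actual init script.
# Waits until a path exists, giving up after a timeout.
wait_for_path() {
  path=$1
  timeout=${2:-300}   # total seconds to wait (assumed default)
  interval=${3:-2}    # seconds between checks (assumed default)
  waited=0
  until [ -e "$path" ]; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "timed out after ${waited}s waiting for $path" >&2
      return 1
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
  echo "found $path"
}
```

A call such as `wait_for_path /opt/attune/packs/core 300 2` would block until the core pack directory appears on the mounted PVC, or fail after five minutes so the pod restarts instead of hanging.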
## Runtime Auto-Detection
When `runtimes` is empty (the default), the agent probes the container for interpreters:
| Runtime | Probed Binaries |
|---------|----------------|
| Shell | `bash`, `sh` |
| Python | `python3`, `python` |
| Node.js | `node`, `nodejs` |
| Ruby | `ruby` |
| Go | `go` |
| Java | `java` |
| R | `Rscript` |
| Perl | `perl` |
Set `runtimes` explicitly to skip auto-detection and only register the listed runtimes.
## Prerequisites
- The `attune-agent` image must be available in your registry (built from `docker/Dockerfile.agent`, target `agent-init`)
- Shared PVCs (`packs`, `runtime-envs`, `artifacts`) must support `ReadWriteMany` if agent workers run on different nodes than the standard worker
- The Attune database and RabbitMQ must be reachable from agent worker pods
## Differences from the Standard Worker
| Aspect | Standard Worker (`worker`) | Agent Worker (`agentWorkers`) |
|--------|---------------------------|-------------------------------|
| Image | Built from `Dockerfile.worker.optimized` | Any image (ruby, python, cuda, etc.) |
| Binary | Baked into the image | Injected via init container |
| Runtimes | Configured at build time | Auto-detected or explicitly listed |
| Use case | Known, pre-built runtime combos | Custom images, exotic runtimes, GPU |
Both worker types coexist — actions are routed to whichever worker has the matching runtime registered.
## Troubleshooting
**Agent binary not found**: Check that the `agent-loader` init container completed. View its logs:
```bash
kubectl logs <pod> -c agent-loader
```
**Runtime not detected**: Run the agent with `--detect-only` to see what it finds:
```bash
kubectl exec <pod> -c worker -- /opt/attune/agent/attune-agent --detect-only
```
**Worker not registering**: Check the worker container logs for database/MQ connectivity:
```bash
kubectl logs <pod> -c worker
```
**Packs not available**: Ensure the `init-packs` job has completed and the PVC is mounted:
```bash
kubectl get jobs | grep init-packs
kubectl exec <pod> -c worker -- ls /opt/attune/packs/core/
```
## See Also
- [Agent Workers (Docker Compose)](QUICKREF-agent-workers.md)
- [Universal Worker Agent Plan](plans/universal-worker-agent.md)
- [Gitea Registry and Helm](deployment/gitea-registry-and-helm.md)
- [Production Deployment](deployment/production-deployment.md)


@@ -19,6 +19,7 @@ The workflow publishes these images to the Gitea OCI registry:
- `attune-migrations`
- `attune-init-user`
- `attune-init-packs`
- `attune-agent`
The Helm chart is pushed as an OCI chart to:

File diff suppressed because it is too large.