[wip] universal workers
219
docs/QUICKREF-agent-workers.md
Normal file
@@ -0,0 +1,219 @@
# Quick Reference: Agent-Based Workers

> **TL;DR**: Inject the `attune-agent` binary into _any_ container image to turn it into an Attune worker. No Dockerfiles. No Rust compilation. ~12 lines of YAML.

## How It Works

1. The `init-agent` service (in `docker-compose.yaml`) builds the statically-linked `attune-agent` binary and copies it into the `agent_bin` volume
2. Your worker service mounts `agent_bin` read-only and uses the agent as its entrypoint
3. On startup, the agent auto-detects available runtimes (Python, Ruby, Node.js, Shell, etc.)
4. The worker registers with Attune and starts processing executions

## Quick Start

### Option A: Use the override file

```bash
# Start all services including the example Ruby agent worker
docker compose -f docker-compose.yaml -f docker-compose.agent.yaml up -d
```

The `docker-compose.agent.yaml` file includes a ready-to-use Ruby worker and commented-out templates for Python 3.12, GPU, and custom images.

### Option B: Add to docker-compose.override.yaml

Create a `docker-compose.override.yaml` in the project root:

```yaml
services:
  worker-my-runtime:
    image: my-org/my-custom-image:latest
    container_name: attune-worker-my-runtime
    depends_on:
      init-agent:
        condition: service_completed_successfully
      init-packs:
        condition: service_completed_successfully
      migrations:
        condition: service_completed_successfully
      postgres:
        condition: service_healthy
      rabbitmq:
        condition: service_healthy
    entrypoint: ["/opt/attune/agent/attune-agent"]
    stop_grace_period: 45s
    environment:
      RUST_LOG: info
      ATTUNE_CONFIG: /opt/attune/config/config.yaml
      ATTUNE_WORKER_NAME: worker-my-runtime-01
      ATTUNE_WORKER_TYPE: container
      ATTUNE__SECURITY__JWT_SECRET: ${JWT_SECRET:-docker-dev-secret-change-in-production}
      ATTUNE__SECURITY__ENCRYPTION_KEY: ${ENCRYPTION_KEY:-docker-dev-encryption-key-please-change-in-production-32plus}
      ATTUNE__DATABASE__URL: postgresql://attune:attune@postgres:5432/attune
      ATTUNE__MESSAGE_QUEUE__URL: amqp://attune:attune@rabbitmq:5672
      ATTUNE_API_URL: http://attune-api:8080
    volumes:
      - agent_bin:/opt/attune/agent:ro
      - ${ATTUNE_DOCKER_CONFIG_PATH:-./config.docker.yaml}:/opt/attune/config/config.yaml:ro
      - packs_data:/opt/attune/packs:ro
      - runtime_envs:/opt/attune/runtime_envs
      - artifacts_data:/opt/attune/artifacts
    healthcheck:
      # Requires pgrep (procps) in the worker image; some slim images omit it
      test: ["CMD-SHELL", "pgrep -f attune-agent || exit 1"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 20s
    networks:
      - attune-network
    restart: unless-stopped
```

Then run:

```bash
docker compose up -d
```

Docker Compose automatically merges `docker-compose.override.yaml`.

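To confirm that the override was actually merged, `docker compose config --services` prints the service names from the combined configuration. The following is a minimal sketch, assuming the service is named `worker-my-runtime` as in the example above; `has_service` is a hypothetical helper, not part of Attune.

```bash
#!/usr/bin/env bash
# has_service NAME: succeeds if NAME appears in the merged Compose configuration.
# Hypothetical helper for illustration; run it from the project root.
has_service() {
  # --services lists one service name per line; -F/-x match the whole line literally.
  docker compose config --services | grep -Fqx -- "$1"
}
```

Running `has_service worker-my-runtime && echo "override merged"` from the project root should print the message once the override file is in place.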
## Required Volumes

Every agent worker needs these volumes:

| Volume | Mount Path | Mode | Purpose |
|--------|-----------|------|---------|
| `agent_bin` | `/opt/attune/agent` | `ro` | The statically-linked agent binary |
| `packs_data` | `/opt/attune/packs` | `ro` | Pack files (actions, workflows, etc.) |
| `runtime_envs` | `/opt/attune/runtime_envs` | `rw` | Isolated runtime environments (venvs, node_modules) |
| `artifacts_data` | `/opt/attune/artifacts` | `rw` | File-backed artifact storage |
| Config YAML | `/opt/attune/config/config.yaml` | `ro` | Attune configuration |

## Required Environment Variables

| Variable | Description | Example |
|----------|-------------|---------|
| `ATTUNE_CONFIG` | Path to config file inside container | `/opt/attune/config/config.yaml` |
| `ATTUNE_WORKER_NAME` | Unique worker name | `worker-ruby-01` |
| `ATTUNE_WORKER_TYPE` | Worker type | `container` |
| `ATTUNE__DATABASE__URL` | PostgreSQL connection string | `postgresql://attune:attune@postgres:5432/attune` |
| `ATTUNE__MESSAGE_QUEUE__URL` | RabbitMQ connection string | `amqp://attune:attune@rabbitmq:5672` |
| `ATTUNE__SECURITY__JWT_SECRET` | JWT signing secret | (use env var) |
| `ATTUNE__SECURITY__ENCRYPTION_KEY` | Encryption key for secrets | (use env var) |

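A quick preflight check can catch a missing required variable before the worker starts. This is only a sketch: the variable list mirrors the table above, while `missing_required_env` is a hypothetical helper and not something the agent provides.

```bash
#!/usr/bin/env bash
# missing_required_env: prints each required variable that is unset or empty
# and returns non-zero if any are missing. Hypothetical helper for illustration.
missing_required_env() {
  local name missing=0
  for name in \
    ATTUNE_CONFIG \
    ATTUNE_WORKER_NAME \
    ATTUNE_WORKER_TYPE \
    ATTUNE__DATABASE__URL \
    ATTUNE__MESSAGE_QUEUE__URL \
    ATTUNE__SECURITY__JWT_SECRET \
    ATTUNE__SECURITY__ENCRYPTION_KEY
  do
    # ${!name} is bash indirect expansion: the value of the variable named by $name.
    if [ -z "${!name:-}" ]; then
      echo "$name"
      missing=1
    fi
  done
  return "$missing"
}
```

The function relies on bash indirect expansion, so it will not work under a plain POSIX `sh`; inside a worker container it would only be usable if the image ships bash.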
### Optional Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `ATTUNE_WORKER_RUNTIMES` | Override auto-detection | Auto-detected |
| `ATTUNE_API_URL` | API URL for token generation | `http://attune-api:8080` |
| `RUST_LOG` | Log level | `info` |

## Runtime Auto-Detection

The agent probes for these runtimes automatically:

| Runtime | Probed Binaries |
|---------|----------------|
| Shell | `bash`, `sh` |
| Python | `python3`, `python` |
| Node.js | `node`, `nodejs` |
| Ruby | `ruby` |
| Go | `go` |
| Java | `java` |
| R | `Rscript` |
| Perl | `perl` |

To override, set `ATTUNE_WORKER_RUNTIMES`:

```yaml
environment:
  ATTUNE_WORKER_RUNTIMES: python,shell  # Only advertise Python and Shell
```

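The probing described in the table can be approximated in a few lines of shell, which is handy for checking an image before adding it as a worker. This is a sketch only: the real agent's detection logic is not shown in this document, and the runtime identifiers other than `python`, `shell`, `ruby`, and `node` are assumptions.

```bash
#!/usr/bin/env bash
# detect_runtimes: prints a comma-separated list of runtimes whose probe
# binaries are found on PATH. Approximates the table above; the agent's own
# detection and its runtime identifiers may differ.
detect_runtimes() {
  local entry name bin found=()
  for entry in \
    "shell:bash sh" \
    "python:python3 python" \
    "node:node nodejs" \
    "ruby:ruby" \
    "go:go" \
    "java:java" \
    "r:Rscript" \
    "perl:perl"
  do
    name=${entry%%:*}              # text before the colon: runtime name
    for bin in ${entry#*:}; do     # text after the colon: candidate binaries
      if command -v "$bin" >/dev/null 2>&1; then
        found+=("$name")
        break                      # one matching binary is enough
      fi
    done
  done
  local IFS=,
  echo "${found[*]}"               # join the array with commas
}
```

Comparing this output with `attune-agent --detect-only` inside the same image is a reasonable sanity check; if they disagree, trust the agent.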
## Testing Detection

Run the agent in detect-only mode to see what it finds:

```bash
# In a running container
docker exec <container> /opt/attune/agent/attune-agent --detect-only

# Or start a throwaway container.
# Compose normally prefixes volume names with the project name, so check
# `docker volume ls` and substitute the real name for `agent_bin` if it differs.
docker run --rm -v agent_bin:/opt/attune/agent:ro ruby:3.3-slim /opt/attune/agent/attune-agent --detect-only
```

## Examples

### Ruby Worker

```yaml
worker-ruby:
  image: ruby:3.3-slim
  entrypoint: ["/opt/attune/agent/attune-agent"]
  # ... (standard depends_on, volumes, env, networks)
```

### Node.js 22 Worker

```yaml
worker-node22:
  image: node:22-slim
  entrypoint: ["/opt/attune/agent/attune-agent"]
  # ...
```

### GPU Worker (NVIDIA CUDA)

```yaml
worker-gpu:
  image: nvidia/cuda:12.3.1-runtime-ubuntu22.04
  runtime: nvidia
  entrypoint: ["/opt/attune/agent/attune-agent"]
  environment:
    ATTUNE_WORKER_RUNTIMES: python,shell  # Override; assumes the image provides python
  # ...
```

### Multi-Runtime Custom Image

```yaml
worker-data-science:
  image: my-org/data-science:latest  # Has Python, R, and Julia
  entrypoint: ["/opt/attune/agent/attune-agent"]
  # Agent auto-detects the runtimes it probes for (Python and R here);
  # Julia is not in the probe table above
  # ...
```

## Comparison: Traditional vs Agent Workers

| Aspect | Traditional Worker | Agent Worker |
|--------|-------------------|--------------|
| Docker build | Required (5+ min) | None |
| Dockerfile | Custom per runtime | Not needed |
| Base image | `debian:bookworm-slim` | Any image |
| Runtime install | Via apt/NodeSource | Pre-installed in image |
| Configuration | Manual `ATTUNE_WORKER_RUNTIMES` | Auto-detected |
| Binary | Compiled into image | Injected via volume |
| Update cycle | Rebuild image | Restart `init-agent` |

## Troubleshooting

### Agent binary not found

```
exec /opt/attune/agent/attune-agent: no such file or directory
```

The `init-agent` service hasn't completed. Check:

```bash
docker compose logs init-agent
```

### "No runtimes detected"

The container image doesn't have any recognized interpreters in `$PATH`. Either:

- Use an image that includes your runtime (e.g., `ruby:3.3-slim`)
- Set `ATTUNE_WORKER_RUNTIMES` manually

### Connection refused to PostgreSQL/RabbitMQ

Ensure your `depends_on` conditions include `postgres` and `rabbitmq` health checks, and that the container is on the `attune-network`.

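When chasing connection errors, `docker inspect` can report both a container's health status and the networks it is attached to. The helpers below are a hypothetical sketch; the container name `attune-worker-my-runtime` comes from the Option B example, and the names of the PostgreSQL and RabbitMQ containers depend on your Compose file.

```bash
#!/usr/bin/env bash
# container_health NAME: prints the health status Docker reports for a container
# ("starting", "healthy", or "unhealthy"). The template fails for containers
# that define no healthcheck.
container_health() {
  docker inspect --format '{{.State.Health.Status}}' "$1"
}

# container_networks NAME: prints the names of the networks the container is on.
# Compose may prefix the network name with the project name.
container_networks() {
  docker inspect --format '{{range $name, $cfg := .NetworkSettings.Networks}}{{$name}} {{end}}' "$1"
}
```

For example, `container_networks attune-worker-my-runtime` should list a network whose name ends in `attune-network`.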
## See Also

- [Universal Worker Agent Plan](plans/universal-worker-agent.md): Full architecture document
- [Docker Deployment](docker-deployment.md): General Docker setup
- [Worker Service](architecture/worker-service.md): Worker architecture details

146
docs/QUICKREF-kubernetes-agent-workers.md
Normal file
@@ -0,0 +1,146 @@
# Quick Reference: Kubernetes Agent Workers

Agent-based workers let you run Attune actions inside **any container image** by injecting a statically-linked `attune-agent` binary via a Kubernetes init container. No custom Dockerfile required; just point at an image that has your runtime installed.

## How It Works

1. An **init container** (`agent-loader`) copies the `attune-agent` binary from the `attune-agent` image into an `emptyDir` volume
2. The **worker container** uses your chosen image (e.g., `ruby:3.3`) and runs the agent binary as its entrypoint
3. The agent **auto-detects** available runtimes (python, ruby, node, shell, etc.) and registers with Attune
4. Actions targeting those runtimes are routed to the agent worker via RabbitMQ

## Helm Values

Add entries to `agentWorkers` in your `values.yaml`:

```yaml
agentWorkers:
  - name: ruby
    image: ruby:3.3
    replicas: 2

  - name: python-gpu
    image: nvidia/cuda:12.3.1-runtime-ubuntu22.04
    replicas: 1
    runtimes: [python, shell]
    runtimeClassName: nvidia
    nodeSelector:
      gpu: "true"
    tolerations:
      - key: nvidia.com/gpu
        operator: Exists
        effect: NoSchedule
    resources:
      limits:
        nvidia.com/gpu: 1

  - name: custom
    image: my-org/my-custom-image:latest
    replicas: 1
    env:
      - name: MY_CUSTOM_VAR
        value: my-value
```

### Supported Fields

| Field | Required | Default | Description |
|-------|----------|---------|-------------|
| `name` | Yes | (none) | Unique name (used in Deployment and worker names) |
| `image` | Yes | (none) | Container image with your desired runtime(s) |
| `replicas` | No | `1` | Number of pod replicas |
| `runtimes` | No | `[]` (auto-detect) | List of runtimes to expose (e.g., `[python, shell]`) |
| `resources` | No | `{}` | Kubernetes resource requests/limits |
| `env` | No | (none) | Extra environment variables (`[{name, value}]`) |
| `imagePullPolicy` | No | (none) | Pull policy for the worker image |
| `logLevel` | No | `info` | `RUST_LOG` level |
| `runtimeClassName` | No | (none) | Kubernetes RuntimeClass (e.g., `nvidia`) |
| `nodeSelector` | No | (none) | Node selector for pod scheduling |
| `tolerations` | No | (none) | Tolerations for pod scheduling |
| `stopGracePeriod` | No | `45` | Termination grace period (seconds) |

## Install / Upgrade

```bash
helm upgrade --install attune oci://registry.example.com/namespace/helm/attune \
  --version 0.3.0 \
  --set global.imageRegistry=registry.example.com \
  --set global.imageNamespace=namespace \
  --set global.imageTag=0.3.0 \
  -f my-values.yaml
```

## What Gets Created

For each `agentWorkers` entry, the chart creates a Deployment named `<release>-attune-agent-worker-<name>` with:

- **Init containers**:
  - `agent-loader`: copies the agent binary from the `attune-agent` image to an `emptyDir` volume
  - `wait-for-schema`: polls PostgreSQL until the Attune schema is ready
  - `wait-for-packs`: waits for the core pack to be available on the shared PVC
- **Worker container**: runs `attune-agent` as the entrypoint inside your chosen image
- **Volumes**: `agent-bin` (emptyDir), `config` (ConfigMap), `packs` (PVC, read-only), `runtime-envs` (PVC), `artifacts` (PVC)

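Given the naming pattern above, a rollout can be watched with `kubectl rollout status`. The helpers below are a hypothetical sketch: the name format is copied from this document, and the 120-second timeout is an arbitrary choice.

```bash
#!/usr/bin/env bash
# agent_deployment_name RELEASE NAME: prints the Deployment name for an
# agentWorkers entry, following the <release>-attune-agent-worker-<name> pattern.
agent_deployment_name() {
  printf '%s-attune-agent-worker-%s\n' "$1" "$2"
}

# wait_for_agent_worker RELEASE NAME: blocks until the Deployment has rolled out
# or the (arbitrary) timeout expires.
wait_for_agent_worker() {
  kubectl rollout status "deployment/$(agent_deployment_name "$1" "$2")" --timeout=120s
}
```

With the release name `attune` from the install command and the `ruby` entry from the values example, `wait_for_agent_worker attune ruby` would watch `attune-attune-agent-worker-ruby`.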
## Runtime Auto-Detection

When `runtimes` is empty (the default), the agent probes the container for interpreters:

| Runtime | Probed Binaries |
|---------|----------------|
| Shell | `bash`, `sh` |
| Python | `python3`, `python` |
| Node.js | `node`, `nodejs` |
| Ruby | `ruby` |
| Go | `go` |
| Java | `java` |
| R | `Rscript` |
| Perl | `perl` |

Set `runtimes` explicitly to skip auto-detection and only register the listed runtimes.

## Prerequisites

- The `attune-agent` image must be available in your registry (built from `docker/Dockerfile.agent`, target `agent-init`)
- Shared PVCs (`packs`, `runtime-envs`, `artifacts`) must support `ReadWriteMany` if agent workers run on different nodes than the standard worker
- The Attune database and RabbitMQ must be reachable from agent worker pods

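The `ReadWriteMany` requirement can be checked against a claim's access modes with `kubectl get pvc` and a JSONPath query. This is a sketch with hypothetical helper names; the actual PVC names depend on the chart and release, so look them up with `kubectl get pvc` first.

```bash
#!/usr/bin/env bash
# pvc_access_modes PVC: prints the access modes of a PersistentVolumeClaim,
# separated by spaces.
pvc_access_modes() {
  kubectl get pvc "$1" -o jsonpath='{.spec.accessModes[*]}'
}

# pvc_supports_rwx PVC: succeeds if the claim lists ReadWriteMany.
pvc_supports_rwx() {
  pvc_access_modes "$1" | grep -qw ReadWriteMany
}
```

A call such as `pvc_supports_rwx <pvc-name> || echo "not ReadWriteMany"` flags claims that would keep agent workers from mounting the shared data from other nodes.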
## Differences from the Standard Worker

| Aspect | Standard Worker (`worker`) | Agent Worker (`agentWorkers`) |
|--------|---------------------------|-------------------------------|
| Image | Built from `Dockerfile.worker.optimized` | Any image (ruby, python, cuda, etc.) |
| Binary | Baked into the image | Injected via init container |
| Runtimes | Configured at build time | Auto-detected or explicitly listed |
| Use case | Known, pre-built runtime combos | Custom images, exotic runtimes, GPU |

Both worker types coexist; actions are routed to whichever worker has the matching runtime registered.

## Troubleshooting

**Agent binary not found**: Check that the `agent-loader` init container completed. View its logs:

```bash
kubectl logs <pod> -c agent-loader
```

**Runtime not detected**: Run the agent with `--detect-only` to see what it finds:

```bash
kubectl exec <pod> -c worker -- /opt/attune/agent/attune-agent --detect-only
```

**Worker not registering**: Check the worker container logs for database/MQ connectivity:

```bash
kubectl logs <pod> -c worker
```

**Packs not available**: Ensure the `init-packs` job has completed and the PVC is mounted:

```bash
kubectl get jobs | grep init-packs
kubectl exec <pod> -c worker -- ls /opt/attune/packs/core/
```

## See Also

- [Agent Workers (Docker Compose)](QUICKREF-agent-workers.md)
- [Universal Worker Agent Plan](plans/universal-worker-agent.md)
- [Gitea Registry and Helm](deployment/gitea-registry-and-helm.md)
- [Production Deployment](deployment/production-deployment.md)

@@ -19,6 +19,7 @@ The workflow publishes these images to the Gitea OCI registry:
- `attune-migrations`
- `attune-init-user`
- `attune-init-packs`
- `attune-agent`

The Helm chart is pushed as an OCI chart to:

1070
docs/plans/universal-worker-agent.md
Normal file
File diff suppressed because it is too large