[wip] universal workers
docs/QUICKREF-kubernetes-agent-workers.md
Normal file
@@ -0,0 +1,146 @@
# Quick Reference: Kubernetes Agent Workers

Agent-based workers let you run Attune actions inside **any container image** by injecting a statically linked `attune-agent` binary via a Kubernetes init container. No custom Dockerfile is required: just point at an image that has your runtime installed.

## How It Works

1. An **init container** (`agent-loader`) copies the `attune-agent` binary from the `attune-agent` image into an `emptyDir` volume
2. The **worker container** uses your chosen image (e.g., `ruby:3.3`) and runs the agent binary as its entrypoint
3. The agent **auto-detects** available runtimes (python, ruby, node, shell, etc.) and registers with Attune
4. Actions targeting those runtimes are routed to the agent worker via RabbitMQ

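The first two steps can be approximated with a short shell sketch. It assumes the init container does nothing more than copy the binary into the shared volume and mark it executable. The function name `load_agent` and the source path used in the usage note are hypothetical; only the `/opt/attune/agent/attune-agent` destination appears elsewhere in this guide.

```bash
# Hypothetical sketch of the agent-loader step, not the chart's actual command.

# load_agent SRC DEST_DIR
# Copies SRC into DEST_DIR as "attune-agent", marks it executable, and prints
# the resulting path. In the chart, DEST_DIR would be the emptyDir mount.
load_agent() {
  local src="$1" dest_dir="$2"
  mkdir -p "$dest_dir"
  cp "$src" "$dest_dir/attune-agent"
  chmod 0755 "$dest_dir/attune-agent"
  printf '%s\n' "$dest_dir/attune-agent"
}
```

Under these assumptions, something like `load_agent /usr/local/bin/attune-agent /opt/attune/agent` would run in the init container, and the worker container would then start the printed path as its entrypoint. Because the binary is statically linked, it should not depend on libraries from the worker image.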
## Helm Values

Add entries to `agentWorkers` in your `values.yaml`:

```yaml
agentWorkers:
  - name: ruby
    image: ruby:3.3
    replicas: 2

  - name: python-gpu
    image: nvidia/cuda:12.3.1-runtime-ubuntu22.04
    replicas: 1
    runtimes: [python, shell]
    runtimeClassName: nvidia
    nodeSelector:
      gpu: "true"
    tolerations:
      - key: nvidia.com/gpu
        operator: Exists
        effect: NoSchedule
    resources:
      limits:
        nvidia.com/gpu: 1

  - name: custom
    image: my-org/my-custom-image:latest
    replicas: 1
    env:
      - name: MY_CUSTOM_VAR
        value: my-value
```

### Supported Fields

| Field | Required | Default | Description |
|-------|----------|---------|-------------|
| `name` | Yes | - | Unique name (used in Deployment and worker names) |
| `image` | Yes | - | Container image with your desired runtime(s) |
| `replicas` | No | `1` | Number of pod replicas |
| `runtimes` | No | `[]` (auto-detect) | List of runtimes to expose (e.g., `[python, shell]`) |
| `resources` | No | `{}` | Kubernetes resource requests/limits |
| `env` | No | - | Extra environment variables (`[{name, value}]`) |
| `imagePullPolicy` | No | - | Pull policy for the worker image |
| `logLevel` | No | `info` | `RUST_LOG` level |
| `runtimeClassName` | No | - | Kubernetes RuntimeClass (e.g., `nvidia`) |
| `nodeSelector` | No | - | Node selector for pod scheduling |
| `tolerations` | No | - | Tolerations for pod scheduling |
| `stopGracePeriod` | No | `45` | Termination grace period (seconds) |

## Install / Upgrade

```bash
helm upgrade --install attune oci://registry.example.com/namespace/helm/attune \
  --version 0.3.0 \
  --set global.imageRegistry=registry.example.com \
  --set global.imageNamespace=namespace \
  --set global.imageTag=0.3.0 \
  -f my-values.yaml
```

## What Gets Created

For each `agentWorkers` entry, the chart creates a Deployment named `<release>-attune-agent-worker-<name>` with:

- **Init containers**:
  - `agent-loader`: copies the agent binary from the `attune-agent` image to an `emptyDir` volume
  - `wait-for-schema`: polls PostgreSQL until the Attune schema is ready
  - `wait-for-packs`: waits for the core pack to be available on the shared PVC
- **Worker container**: runs `attune-agent` as the entrypoint inside your chosen image
- **Volumes**: `agent-bin` (emptyDir), `config` (ConfigMap), `packs` (PVC, read-only), `runtime-envs` (PVC), `artifacts` (PVC)

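To check what the chart rendered for a given entry, a small helper can build the Deployment name from the pattern above and query it with `kubectl`. This is a sketch: both function names are made up for illustration, and the JSONPath query assumes the standard Deployment layout.

```bash
# agent_worker_deployment RELEASE NAME
# Builds the Deployment name from the <release>-attune-agent-worker-<name> pattern.
agent_worker_deployment() {
  printf '%s-attune-agent-worker-%s\n' "$1" "$2"
}

# describe_agent_worker RELEASE NAME
# Prints the init container names on one line and the volume names on the next.
# Requires kubectl access to the cluster and the release's namespace.
describe_agent_worker() {
  local deploy
  deploy="$(agent_worker_deployment "$1" "$2")"
  kubectl get deployment "$deploy" \
    -o jsonpath='{.spec.template.spec.initContainers[*].name}{"\n"}{.spec.template.spec.volumes[*].name}{"\n"}'
}
```

For a release called `prod` and an entry named `ruby`, `describe_agent_worker prod ruby` would query `prod-attune-agent-worker-ruby`. If the chart behaves as described, the output should list the three init containers and the five volumes named above.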
## Runtime Auto-Detection

When `runtimes` is empty (the default), the agent probes the container for interpreters:

| Runtime | Probed Binaries |
|---------|----------------|
| Shell | `bash`, `sh` |
| Python | `python3`, `python` |
| Node.js | `node`, `nodejs` |
| Ruby | `ruby` |
| Go | `go` |
| Java | `java` |
| R | `Rscript` |
| Perl | `perl` |

Set `runtimes` explicitly to skip auto-detection and only register the listed runtimes.

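The probing can be approximated in shell to preview what an image is likely to expose. This is a rough stand-in rather than the agent's actual logic: it only checks whether each binary from the table is on `PATH`. The function names are hypothetical, and the runtime identifiers `go`, `java`, `r`, and `perl` are assumptions (only `python`, `ruby`, `node`, and `shell` are named in this guide).

```bash
# probe_runtime NAME BINARY...
# Prints NAME if any of the listed binaries is on PATH; returns 1 otherwise.
probe_runtime() {
  local name="$1" bin
  shift
  for bin in "$@"; do
    if command -v "$bin" >/dev/null 2>&1; then
      printf '%s\n' "$name"
      return 0
    fi
  done
  return 1
}

# detect_runtimes
# Probes the binaries from the table above and prints one runtime per line.
detect_runtimes() {
  probe_runtime shell bash sh || true
  probe_runtime python python3 python || true
  probe_runtime node node nodejs || true
  probe_runtime ruby ruby || true
  probe_runtime go go || true
  probe_runtime java java || true
  probe_runtime r Rscript || true
  probe_runtime perl perl || true
}
```

Running `detect_runtimes` inside a candidate image before deploying gives a quick preview; the agent's own `--detect-only` output remains the authority on what gets registered.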
## Prerequisites

- The `attune-agent` image must be available in your registry (built from `docker/Dockerfile.agent`, target `agent-init`)
- Shared PVCs (`packs`, `runtime-envs`, `artifacts`) must support `ReadWriteMany` if agent workers run on different nodes than the standard worker
- The Attune database and RabbitMQ must be reachable from agent worker pods

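The `ReadWriteMany` requirement can be checked before adding agent workers. The sketch below reads each claim's access modes with `kubectl` and looks for `ReadWriteMany`. The helper names are hypothetical, and the PVC names you pass in depend on how the chart names its claims, which this guide does not state.

```bash
# has_rwx ACCESS_MODES
# Succeeds if the space-separated access mode list contains ReadWriteMany.
has_rwx() {
  case " $1 " in
    *" ReadWriteMany "*) return 0 ;;
    *) return 1 ;;
  esac
}

# check_shared_pvcs PVC...
# Reports, for each named PVC, whether it requests ReadWriteMany.
# Requires kubectl access to the namespace that holds the claims.
check_shared_pvcs() {
  local pvc modes
  for pvc in "$@"; do
    modes="$(kubectl get pvc "$pvc" -o jsonpath='{.spec.accessModes[*]}')"
    if has_rwx "$modes"; then
      printf '%s: ReadWriteMany\n' "$pvc"
    else
      printf '%s: no ReadWriteMany (%s)\n' "$pvc" "$modes"
    fi
  done
}
```

Call `check_shared_pvcs` with the actual claim names from `kubectl get pvc`. A claim that only lists `ReadWriteOnce` can still work when all workers land on the same node, which matches the condition in the list above.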
## Differences from the Standard Worker

| Aspect | Standard Worker (`worker`) | Agent Worker (`agentWorkers`) |
|--------|---------------------------|-------------------------------|
| Image | Built from `Dockerfile.worker.optimized` | Any image (ruby, python, cuda, etc.) |
| Binary | Baked into the image | Injected via init container |
| Runtimes | Configured at build time | Auto-detected or explicitly listed |
| Use case | Known, pre-built runtime combos | Custom images, exotic runtimes, GPU |

Both worker types coexist: actions are routed to whichever worker has the matching runtime registered.

## Troubleshooting

**Agent binary not found**: Check that the `agent-loader` init container completed. View its logs:
```bash
kubectl logs <pod> -c agent-loader
```

**Runtime not detected**: Run the agent with `--detect-only` to see what it finds:
```bash
kubectl exec <pod> -c worker -- /opt/attune/agent/attune-agent --detect-only
```

**Worker not registering**: Check the worker container logs for database/MQ connectivity:
```bash
kubectl logs <pod> -c worker
```

**Packs not available**: Ensure the `init-packs` job has completed and the PVC is mounted:
```bash
kubectl get jobs | grep init-packs
kubectl exec <pod> -c worker -- ls /opt/attune/packs/core/
```

## See Also

- [Agent Workers (Docker Compose)](QUICKREF-agent-workers.md)
- [Universal Worker Agent Plan](plans/universal-worker-agent.md)
- [Gitea Registry and Helm](deployment/gitea-registry-and-helm.md)
- [Production Deployment](deployment/production-deployment.md)