146 lines
5.4 KiB
Markdown
146 lines
5.4 KiB
Markdown
# Quick Reference: Kubernetes Agent Workers
|
|
|
|
Agent-based workers let you run Attune actions inside **any container image** by injecting a statically-linked `attune-agent` binary via a Kubernetes init container. No custom Dockerfile required — just point at an image that has your runtime installed.
|
|
|
|
## How It Works
|
|
|
|
1. An **init container** (`agent-loader`) copies the `attune-agent` binary from the `attune-agent` image into an `emptyDir` volume
|
|
2. The **worker container** uses your chosen image (e.g., `ruby:3.3`) and runs the agent binary as its entrypoint
|
|
3. The agent **auto-detects** available runtimes (python, ruby, node, shell, etc.) and registers with Attune
|
|
4. Actions targeting those runtimes are routed to the agent worker via RabbitMQ
|
|
|
|
## Helm Values
|
|
|
|
Add entries to `agentWorkers` in your `values.yaml`:
|
|
|
|
```yaml
|
|
agentWorkers:
|
|
- name: ruby
|
|
image: ruby:3.3
|
|
replicas: 2
|
|
|
|
- name: python-gpu
|
|
image: nvidia/cuda:12.3.1-runtime-ubuntu22.04
|
|
replicas: 1
|
|
runtimes: [python, shell]
|
|
runtimeClassName: nvidia
|
|
nodeSelector:
|
|
gpu: "true"
|
|
tolerations:
|
|
- key: nvidia.com/gpu
|
|
operator: Exists
|
|
effect: NoSchedule
|
|
resources:
|
|
limits:
|
|
nvidia.com/gpu: 1
|
|
|
|
- name: custom
|
|
image: my-org/my-custom-image:latest
|
|
replicas: 1
|
|
env:
|
|
- name: MY_CUSTOM_VAR
|
|
value: my-value
|
|
```
|
|
|
|
### Supported Fields
|
|
|
|
| Field | Required | Default | Description |
|
|
|-------|----------|---------|-------------|
|
|
| `name` | Yes | — | Unique name (used in Deployment and worker names) |
|
|
| `image` | Yes | — | Container image with your desired runtime(s) |
|
|
| `replicas` | No | `1` | Number of pod replicas |
|
|
| `runtimes` | No | `[]` (auto-detect) | List of runtimes to expose (e.g., `[python, shell]`) |
|
|
| `resources` | No | `{}` | Kubernetes resource requests/limits |
|
|
| `env` | No | — | Extra environment variables (`[{name, value}]`) |
|
|
| `imagePullPolicy` | No | — | Pull policy for the worker image |
|
|
| `logLevel` | No | `info` | `RUST_LOG` level |
|
|
| `runtimeClassName` | No | — | Kubernetes RuntimeClass (e.g., `nvidia`) |
|
|
| `nodeSelector` | No | — | Node selector for pod scheduling |
|
|
| `tolerations` | No | — | Tolerations for pod scheduling |
|
|
| `stopGracePeriod` | No | `45` | Termination grace period (seconds) |
|
|
|
|
## Install / Upgrade
|
|
|
|
```bash
|
|
helm upgrade --install attune oci://registry.example.com/namespace/helm/attune \
|
|
--version 0.3.0 \
|
|
--set global.imageRegistry=registry.example.com \
|
|
--set global.imageNamespace=namespace \
|
|
--set global.imageTag=0.3.0 \
|
|
-f my-values.yaml
|
|
```
|
|
|
|
## What Gets Created
|
|
|
|
For each `agentWorkers` entry, the chart creates a Deployment named `<release>-attune-agent-worker-<name>` with:
|
|
|
|
- **Init containers**:
|
|
- `agent-loader` — copies the agent binary from the `attune-agent` image to an `emptyDir` volume
|
|
- `wait-for-schema` — polls PostgreSQL until the Attune schema is ready
|
|
- `wait-for-packs` — waits for the core pack to be available on the shared PVC
|
|
- **Worker container** — runs `attune-agent` as the entrypoint inside your chosen image
|
|
- **Volumes**: `agent-bin` (emptyDir), `config` (ConfigMap), `packs` (PVC, read-only), `runtime-envs` (PVC), `artifacts` (PVC)
|
|
|
|
## Runtime Auto-Detection
|
|
|
|
When `runtimes` is empty (the default), the agent probes the container for interpreters:
|
|
|
|
| Runtime | Probed Binaries |
|
|
|---------|----------------|
|
|
| Shell | `bash`, `sh` |
|
|
| Python | `python3`, `python` |
|
|
| Node.js | `node`, `nodejs` |
|
|
| Ruby | `ruby` |
|
|
| Go | `go` |
|
|
| Java | `java` |
|
|
| R | `Rscript` |
|
|
| Perl | `perl` |
|
|
|
|
Set `runtimes` explicitly to skip auto-detection and only register the listed runtimes.
|
|
|
|
## Prerequisites
|
|
|
|
- The `attune-agent` image must be available in your registry (built from `docker/Dockerfile.agent`, target `agent-init`)
|
|
- Shared PVCs (`packs`, `runtime-envs`, `artifacts`) must support `ReadWriteMany` if agent workers run on different nodes than the standard worker
|
|
- The Attune database and RabbitMQ must be reachable from agent worker pods
|
|
|
|
## Differences from the Standard Worker
|
|
|
|
| Aspect | Standard Worker (`worker`) | Agent Worker (`agentWorkers`) |
|
|
|--------|---------------------------|-------------------------------|
|
|
| Image | Built from `Dockerfile.worker.optimized` | Any image (ruby, python, cuda, etc.) |
|
|
| Binary | Baked into the image | Injected via init container |
|
|
| Runtimes | Configured at build time | Auto-detected or explicitly listed |
|
|
| Use case | Known, pre-built runtime combos | Custom images, exotic runtimes, GPU |
|
|
|
|
Both worker types coexist — actions are routed to whichever worker has the matching runtime registered.
|
|
|
|
## Troubleshooting
|
|
|
|
**Agent binary not found**: Check that the `agent-loader` init container completed. View its logs:
|
|
```bash
|
|
kubectl logs <pod> -c agent-loader
|
|
```
|
|
|
|
**Runtime not detected**: Run the agent with `--detect-only` to see what it finds:
|
|
```bash
|
|
kubectl exec <pod> -c worker -- /opt/attune/agent/attune-agent --detect-only
|
|
```
|
|
|
|
**Worker not registering**: Check the worker container logs for database/MQ connectivity:
|
|
```bash
|
|
kubectl logs <pod> -c worker
|
|
```
|
|
|
|
**Packs not available**: Ensure the `init-packs` job has completed and the PVC is mounted:
|
|
```bash
|
|
kubectl get jobs | grep init-packs
|
|
kubectl exec <pod> -c worker -- ls /opt/attune/packs/core/
|
|
```
|
|
|
|
## See Also
|
|
|
|
- [Agent Workers (Docker Compose)](QUICKREF-agent-workers.md)
|
|
- [Universal Worker Agent Plan](plans/universal-worker-agent.md)
|
|
- [Gitea Registry and Helm](deployment/gitea-registry-and-helm.md)
|
|
- [Production Deployment](deployment/production-deployment.md) |