attune/docs/QUICKREF-buildkit-cache-strategy.md

Quick Reference: BuildKit Cache Mount Strategy

TL;DR

Optimized cache sharing for parallel Docker builds:

  • Cargo registry/git: sharing=shared (concurrent-safe)
  • Target directory: Service-specific cache IDs (no conflicts)
  • Result: Safe parallel builds without serialization overhead

Cache Mount Sharing Modes

sharing=locked (Old Strategy)

RUN --mount=type=cache,target=/build/target,sharing=locked \
    cargo build
  • Only one build can access cache at a time
  • Serializes parallel builds
  • Slower when building multiple services
  • Prevents race conditions (but unnecessary with proper strategy)

sharing=shared (New Strategy)

RUN --mount=type=cache,target=/usr/local/cargo/registry,sharing=shared \
    cargo build
  • Multiple builds can access cache concurrently
  • Faster parallel builds
  • Cargo registry/git are inherently concurrent-safe
  • Can cause conflicts if used incorrectly on target directory

sharing=private (Not Used)

RUN --mount=type=cache,target=/build/target,sharing=private
  • Each build gets its own cache copy
  • No benefit for our use case

Optimized Strategy

Registry and Git Caches: sharing=shared

Cargo's package registry and git cache are designed for concurrent access:

RUN --mount=type=cache,target=/usr/local/cargo/registry,sharing=shared \
    --mount=type=cache,target=/usr/local/cargo/git,sharing=shared \
    cargo build

Why it's safe:

  • Cargo uses file locking internally
  • Multiple cargo processes can download/cache packages concurrently
  • Registry is read-only after download
  • No compilation happens in these directories
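The file-locking claim can be illustrated with a small shell sketch (illustrative only; cargo's actual locking is internal and not driven by flock):

```shell
# Illustrative only: cargo serializes registry writes with file locks;
# flock(1) shows the same pattern. Two writers append 50 lines each
# under an exclusive lock, so no line is lost to a race.
out=$(mktemp)
lock=$(mktemp)
for writer in 1 2; do
  (
    for n in $(seq 1 50); do
      flock "$lock" sh -c "echo writer-$writer-line-$n >> '$out'"
    done
  ) &
done
wait
count=$(wc -l < "$out")
echo "lines written: $count"
rm -f "$out" "$lock"
```

Concurrent access is safe exactly because every write goes through the lock; reads (the common case for a warm registry cache) never conflict at all.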

Benefits:

  • Multiple services can download dependencies simultaneously
  • No waiting for registry lock
  • Faster parallel builds

Target Directory: Service-Specific Cache IDs

Each service compiles different crates, so use separate cache volumes:

# For API service
RUN --mount=type=cache,target=/build/target,id=target-builder-api \
    cargo build --release --bin attune-api

# For worker service
RUN --mount=type=cache,target=/build/target,id=target-builder-worker \
    cargo build --release --bin attune-worker

Why service-specific IDs:

  • Each service compiles different crates (api, executor, worker, etc.)
  • No shared compilation artifacts between services
  • Prevents conflicts when building in parallel
  • Each service gets its own optimized cache

Cache ID naming:

  • target-planner-${SERVICE}: Planner stage (dummy builds)
  • target-builder-${SERVICE}: Builder stage (actual builds)
  • target-worker-planner: Worker planner (shared by all workers)
  • target-worker-builder: Worker builder (shared by all workers)
  • target-pack-binaries: Pack binaries (separate from services)
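How the SERVICE build-arg reaches these cache IDs might look like the following hypothetical compose fragment (service names and Dockerfile paths are assumptions based on this document):

```yaml
# Hypothetical compose fragment: each service passes its name as the
# SERVICE build-arg, which selects the service-specific cache IDs
# (target-planner-${SERVICE}, target-builder-${SERVICE}).
services:
  api:
    build:
      context: .
      dockerfile: docker/Dockerfile.optimized
      args:
        SERVICE: api
  executor:
    build:
      context: .
      dockerfile: docker/Dockerfile.optimized
      args:
        SERVICE: executor
```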

Architecture Benefits

With Selective Crate Copying

The optimized Dockerfiles only copy specific crates:

# Stage 1: Planner - Build dependencies with dummy source
COPY crates/common/Cargo.toml ./crates/common/Cargo.toml
COPY crates/api/Cargo.toml ./crates/api/Cargo.toml
# ... create dummy source files ...
RUN --mount=type=cache,target=/build/target,id=target-planner-api \
    cargo build --release --bin attune-api

# Stage 2: Builder - Build actual service
COPY crates/common/ ./crates/common/
COPY crates/api/ ./crates/api/
RUN --mount=type=cache,target=/build/target,id=target-builder-api \
    cargo build --release --bin attune-api

Why this enables shared registry caches:

  1. Planner stage compiles dependencies (common across services)
  2. Builder stage compiles service-specific code
  3. Different services compile different binaries
  4. No conflicting writes to same compilation artifacts
  5. Safe to share registry/git caches
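The "create dummy source files" step in the planner stage can be sketched as follows (a sketch only; the actual crate layout is assumed from the COPY lines above):

```shell
# Sketch of the "create dummy source files" planner step: each crate
# gets a placeholder entrypoint so `cargo build` resolves and compiles
# dependencies before the real sources are copied. Paths illustrative.
workdir=$(mktemp -d)
mkdir -p "$workdir/crates/common/src" "$workdir/crates/api/src"
echo 'fn main() {}' > "$workdir/crates/api/src/main.rs"     # dummy binary
echo '// placeholder' > "$workdir/crates/common/src/lib.rs" # dummy library
ls "$workdir"/crates/*/src
```

Because only Cargo.toml files and dummy sources are present, the planner layer is invalidated only when dependencies change, not on every source edit.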

Parallel Build Flow

Time →

T0: docker compose build --parallel 4
    ├─ API build starts
    ├─ Executor build starts  
    ├─ Worker build starts
    └─ Sensor build starts

T1: All builds access shared registry cache
    ├─ API: Downloads dependencies (shared cache)
    ├─ Executor: Downloads dependencies (shared cache)
    ├─ Worker: Downloads dependencies (shared cache)
    └─ Sensor: Downloads dependencies (shared cache)

T2: Each build compiles in its own target cache
    ├─ API: target-builder-api (no conflicts)
    ├─ Executor: target-builder-executor (no conflicts)
    ├─ Worker: target-builder-worker (no conflicts)
    └─ Sensor: target-builder-sensor (no conflicts)

T3: All builds complete concurrently

Old strategy (sharing=locked):

  • T1: Only API downloads (others wait)
  • T2: API compiles (others wait)
  • T3: Executor downloads (others wait)
  • T4: Executor compiles (others wait)
  • T5-T8: Worker and Sensor build sequentially
  • Total time: ~4x longer

New strategy (sharing=shared + cache IDs):

  • T1: All download concurrently
  • T2: All compile concurrently (different caches)
  • Total time: ~4x shorter
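The serialization cost can be sketched abstractly in shell, with sleeps standing in for builds (the four service names are taken from the flow above):

```shell
# Four fake "builds" run concurrently; total wall time is about one
# build, not four, because nothing serializes them.
start=$(date +%s)
for svc in api executor worker sensor; do
  ( sleep 1; echo "built $svc" ) &
done
wait
elapsed=$(( $(date +%s) - start ))
echo "elapsed: ${elapsed}s"
```

With sharing=locked this loop would degenerate to one build at a time, roughly quadrupling the wall time.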

Implementation Examples

Service Dockerfile (Dockerfile.optimized)

# Planner stage
ARG SERVICE=api
RUN --mount=type=cache,target=/usr/local/cargo/registry,sharing=shared \
    --mount=type=cache,target=/usr/local/cargo/git,sharing=shared \
    --mount=type=cache,target=/build/target,id=target-planner-${SERVICE} \
    cargo build --release --bin attune-${SERVICE} || true

# Builder stage
RUN --mount=type=cache,target=/usr/local/cargo/registry,sharing=shared \
    --mount=type=cache,target=/usr/local/cargo/git,sharing=shared \
    --mount=type=cache,target=/build/target,id=target-builder-${SERVICE} \
    cargo build --release --bin attune-${SERVICE}

Worker Dockerfile (Dockerfile.worker.optimized)

# Planner stage (shared by all worker variants)
RUN --mount=type=cache,target=/usr/local/cargo/registry,sharing=shared \
    --mount=type=cache,target=/usr/local/cargo/git,sharing=shared \
    --mount=type=cache,target=/build/target,id=target-worker-planner \
    cargo build --release --bin attune-worker || true

# Builder stage (shared by all worker variants)
RUN --mount=type=cache,target=/usr/local/cargo/registry,sharing=shared \
    --mount=type=cache,target=/usr/local/cargo/git,sharing=shared \
    --mount=type=cache,target=/build/target,id=target-worker-builder \
    cargo build --release --bin attune-worker

Note: All worker variants (shell, python, node, full) share the same caches because they build the same binary. Only the runtime stages differ.

Pack Binaries Dockerfile

RUN --mount=type=cache,target=/usr/local/cargo/registry,sharing=shared \
    --mount=type=cache,target=/usr/local/cargo/git,sharing=shared \
    --mount=type=cache,target=/build/target,id=target-pack-binaries \
    cargo build --release --bin attune-core-timer-sensor

Performance Comparison

| Scenario | Old (sharing=locked) | New (shared + cache IDs) | Improvement |
|---|---|---|---|
| Sequential builds | ~30 sec/service | ~30 sec/service | Same |
| Parallel builds (4 services) | ~120 sec total | ~30 sec total | 4x faster |
| First build (cache empty) | ~300 sec | ~300 sec | Same |
| Incremental (1 service) | ~30 sec | ~30 sec | Same |
| Incremental (all services) | ~120 sec | ~30 sec | 4x faster |

When to Use Each Strategy

Use sharing=shared

  • Cargo registry cache
  • Cargo git cache
  • Any read-only cache
  • Caches with internal locking (like cargo)

Use service-specific cache IDs

  • Build target directories
  • Compilation artifacts
  • Any cache with potential write conflicts

Use sharing=locked

  • Generally not needed with proper architecture
  • Only if you encounter unexplained race conditions
  • Legacy compatibility

Troubleshooting

Issue: "File exists" errors during parallel builds

Cause: Cache mount conflicts (shouldn't happen with new strategy)

Solution: Verify cache IDs are service-specific

# Check Dockerfile
grep "id=target-builder" docker/Dockerfile.optimized
# Should show: id=target-builder-${SERVICE}

Issue: Slower parallel builds than expected

Cause: BuildKit not enabled or old Docker version

Solution:

# Check BuildKit version
docker buildx version

# Ensure BuildKit is enabled (default since Docker Engine 23.0; force it on older versions)
export DOCKER_BUILDKIT=1

# Check Docker version (need 20.10+)
docker --version

Issue: Cache not being reused between builds

Cause: Cache ID mismatch or cache pruned

Solution:

# Check cache usage
docker buildx du

# Inspect cache records (cache mounts appear as exec.cachemount entries)
docker buildx du --verbose

# Clear and rebuild if corrupted
docker builder prune -a
docker compose build --no-cache

Best Practices

DO:

  • Use sharing=shared for registry/git caches
  • Use unique cache IDs for target directories
  • Name cache IDs descriptively (e.g., target-builder-api)
  • Share registry caches across all builds
  • Separate target caches per service

DON'T:

  • Don't use sharing=locked unless necessary
  • Don't share target caches between different services
  • Don't use sharing=private (creates duplicate caches)
  • Don't mix cache IDs (be consistent)
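A quick sanity check over the DO/DON'T rules can be scripted; the Dockerfile content below is a hypothetical fragment written only for the demo:

```shell
# Flag two anti-patterns in a Dockerfile: sharing=locked mounts
# (serialize builds) and duplicate cache IDs (two builds writing to
# the same target cache).
check_cache_mounts() {
  grep -n 'sharing=locked' "$1" || true                 # locked mounts
  grep -o 'id=[a-zA-Z0-9_${}-]*' "$1" | sort | uniq -d  # duplicated IDs
}

# Demo against a hypothetical Dockerfile fragment with a duplicated ID:
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
RUN --mount=type=cache,target=/build/target,id=target-builder-api cargo build
RUN --mount=type=cache,target=/build/target,id=target-builder-api cargo build
EOF
dups=$(check_cache_mounts "$tmp")
echo "problems found: $dups"
rm -f "$tmp"
```

Running such a check in CI keeps the cache-ID convention from drifting as new services are added.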

Monitoring Cache Performance

# View overall cache usage (build cache mounts show as exec.cachemount)
docker system df -v

# View specific cache details
docker buildx du --verbose

# Time parallel builds
time docker compose build --parallel 4

# Compare with sequential builds
time docker compose build api
time docker compose build executor
time docker compose build worker-shell
time docker compose build sensor

Summary

Old strategy:

  • sharing=locked on everything
  • Serialized builds
  • Safe but slow

New strategy:

  • sharing=shared on registry/git (concurrent-safe)
  • Service-specific cache IDs on target (no conflicts)
  • Fast parallel builds

Result:

  • 4x faster parallel builds
  • No race conditions
  • Optimal cache reuse
  • Safe concurrent builds

Key insight from selective crate copying: Each service compiles different binaries, so their target caches don't conflict. This enables safe concurrent builds without serialization overhead.