attune/docs/QUICKREF-buildkit-cache-strategy.md

Quick Reference: BuildKit Cache Mount Strategy

TL;DR

Optimized cache sharing for parallel Docker builds:

  • Cargo registry/git: sharing=shared (concurrent-safe)
  • Target directory: Service-specific cache IDs (no conflicts)
  • Result: Safe parallel builds without serialization overhead

Cache Mount Sharing Modes

sharing=locked (Old Strategy)

RUN --mount=type=cache,target=/build/target,sharing=locked \
    cargo build
  • Only one build can access cache at a time
  • Serializes parallel builds
  • Slower when building multiple services
  • Prevents race conditions (but unnecessary with proper strategy)

sharing=shared (New Strategy)

RUN --mount=type=cache,target=/usr/local/cargo/registry,sharing=shared \
    cargo build
  • Multiple builds can access cache concurrently
  • Faster parallel builds
  • Cargo registry/git are inherently concurrent-safe
  • Can cause conflicts if used incorrectly on target directory

sharing=private (Not Used)

RUN --mount=type=cache,target=/build/target,sharing=private
  • Each build gets its own cache copy
  • No benefit for our use case

Optimized Strategy

Registry and Git Caches: sharing=shared

Cargo's package registry and git cache are designed for concurrent access:

RUN --mount=type=cache,target=/usr/local/cargo/registry,sharing=shared \
    --mount=type=cache,target=/usr/local/cargo/git,sharing=shared \
    cargo build

Why it's safe:

  • Cargo uses file locking internally
  • Multiple cargo processes can download/cache packages concurrently
  • Registry is read-only after download
  • No compilation happens in these directories
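The file-locking claim can be illustrated with a small shell sketch (illustrative only; cargo's actual locking is internal and not driven by flock):

```shell
# Illustrative only: cargo serializes registry writes with file locks;
# flock(1) shows the same pattern. Two writers append 50 lines each
# under an exclusive lock, so no line is lost to a race.
out=$(mktemp)
lock=$(mktemp)
for writer in 1 2; do
  (
    for n in $(seq 1 50); do
      flock "$lock" sh -c "echo writer-$writer-line-$n >> '$out'"
    done
  ) &
done
wait
count=$(wc -l < "$out")
echo "lines written: $count"
rm -f "$out" "$lock"
```

Concurrent access is safe exactly because every write goes through the lock; reads (the common case for a warm registry cache) never conflict at all.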

Benefits:

  • Multiple services can download dependencies simultaneously
  • No waiting for registry lock
  • Faster parallel builds

Target Directory: Service-Specific Cache IDs

Each service compiles different crates, so use separate cache volumes:

# For API service
RUN --mount=type=cache,target=/build/target,id=target-builder-api \
    cargo build --release --bin attune-api

# For worker service
RUN --mount=type=cache,target=/build/target,id=target-builder-worker \
    cargo build --release --bin attune-worker

Why service-specific IDs:

  • Each service compiles different crates (api, executor, worker, etc.)
  • No shared compilation artifacts between services
  • Prevents conflicts when building in parallel
  • Each service gets its own optimized cache

Cache ID naming:

  • target-planner-${SERVICE}: Planner stage (dummy builds)
  • target-builder-${SERVICE}: Builder stage (actual builds)
  • target-worker-planner: Worker planner (shared by all workers)
  • target-worker-builder: Worker builder (shared by all workers)
  • target-pack-binaries: Pack binaries (separate from services)
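How the SERVICE build-arg reaches these cache IDs might look like the following hypothetical compose fragment (service names and Dockerfile paths are assumptions based on this document):

```yaml
# Hypothetical compose fragment: each service passes its name as the
# SERVICE build-arg, which selects the service-specific cache IDs
# (target-planner-${SERVICE}, target-builder-${SERVICE}).
services:
  api:
    build:
      context: .
      dockerfile: docker/Dockerfile.optimized
      args:
        SERVICE: api
  executor:
    build:
      context: .
      dockerfile: docker/Dockerfile.optimized
      args:
        SERVICE: executor
```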

Architecture Benefits

With Selective Crate Copying

The optimized Dockerfiles only copy specific crates:

# Stage 1: Planner - Build dependencies with dummy source
COPY crates/common/Cargo.toml ./crates/common/Cargo.toml
COPY crates/api/Cargo.toml ./crates/api/Cargo.toml
# ... create dummy source files ...
RUN --mount=type=cache,target=/build/target,id=target-planner-api \
    cargo build --release --bin attune-api

# Stage 2: Builder - Build actual service
COPY crates/common/ ./crates/common/
COPY crates/api/ ./crates/api/
RUN --mount=type=cache,target=/build/target,id=target-builder-api \
    cargo build --release --bin attune-api

Why this enables shared registry caches:

  1. Planner stage compiles dependencies (common across services)
  2. Builder stage compiles service-specific code
  3. Different services compile different binaries
  4. No conflicting writes to same compilation artifacts
  5. Safe to share registry/git caches
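The "create dummy source files" step in the planner stage can be sketched as follows (a sketch only; the actual crate layout is assumed from the COPY lines above):

```shell
# Sketch of the "create dummy source files" planner step: each crate
# gets a placeholder entrypoint so `cargo build` resolves and compiles
# dependencies before the real sources are copied. Paths illustrative.
workdir=$(mktemp -d)
mkdir -p "$workdir/crates/common/src" "$workdir/crates/api/src"
echo 'fn main() {}' > "$workdir/crates/api/src/main.rs"     # dummy binary
echo '// placeholder' > "$workdir/crates/common/src/lib.rs" # dummy library
ls "$workdir"/crates/*/src
```

Because only Cargo.toml files and dummy sources are present, the planner layer is invalidated only when dependencies change, not on every source edit.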

Parallel Build Flow

Time →

T0: docker compose build --parallel 4
    ├─ API build starts
    ├─ Executor build starts  
    ├─ Worker build starts
    └─ Sensor build starts

T1: All builds access shared registry cache
    ├─ API: Downloads dependencies (shared cache)
    ├─ Executor: Downloads dependencies (shared cache)
    ├─ Worker: Downloads dependencies (shared cache)
    └─ Sensor: Downloads dependencies (shared cache)

T2: Each build compiles in its own target cache
    ├─ API: target-builder-api (no conflicts)
    ├─ Executor: target-builder-executor (no conflicts)
    ├─ Worker: target-builder-worker (no conflicts)
    └─ Sensor: target-builder-sensor (no conflicts)

T3: All builds complete concurrently

Old strategy (sharing=locked):

  • T1: Only API downloads (others wait)
  • T2: API compiles (others wait)
  • T3: Executor downloads (others wait)
  • T4: Executor compiles (others wait)
  • T5-T8: Worker and Sensor build sequentially
  • Total time: ~4x longer

New strategy (sharing=shared + cache IDs):

  • T1: All download concurrently
  • T2: All compile concurrently (different caches)
  • Total time: ~4x shorter
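The serialization cost can be sketched abstractly in shell, with sleeps standing in for builds (the four service names are taken from the flow above):

```shell
# Four fake "builds" run concurrently; total wall time is about one
# build, not four, because nothing serializes them.
start=$(date +%s)
for svc in api executor worker sensor; do
  ( sleep 1; echo "built $svc" ) &
done
wait
elapsed=$(( $(date +%s) - start ))
echo "elapsed: ${elapsed}s"
```

With sharing=locked this loop would degenerate to one build at a time, roughly quadrupling the wall time.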

Implementation Examples

Service Dockerfile (Dockerfile.optimized)

# Planner stage
ARG SERVICE=api
RUN --mount=type=cache,target=/usr/local/cargo/registry,sharing=shared \
    --mount=type=cache,target=/usr/local/cargo/git,sharing=shared \
    --mount=type=cache,target=/build/target,id=target-planner-${SERVICE} \
    cargo build --release --bin attune-${SERVICE} || true

# Builder stage
RUN --mount=type=cache,target=/usr/local/cargo/registry,sharing=shared \
    --mount=type=cache,target=/usr/local/cargo/git,sharing=shared \
    --mount=type=cache,target=/build/target,id=target-builder-${SERVICE} \
    cargo build --release --bin attune-${SERVICE}

Worker Dockerfile (Dockerfile.worker.optimized)

# Planner stage (shared by all worker variants)
RUN --mount=type=cache,target=/usr/local/cargo/registry,sharing=shared \
    --mount=type=cache,target=/usr/local/cargo/git,sharing=shared \
    --mount=type=cache,target=/build/target,id=target-worker-planner \
    cargo build --release --bin attune-worker || true

# Builder stage (shared by all worker variants)
RUN --mount=type=cache,target=/usr/local/cargo/registry,sharing=shared \
    --mount=type=cache,target=/usr/local/cargo/git,sharing=shared \
    --mount=type=cache,target=/build/target,id=target-worker-builder \
    cargo build --release --bin attune-worker

Note: All worker variants (shell, python, node, full) share the same caches because they build the same binary. Only the runtime stages differ.

Pack Binaries Dockerfile

RUN --mount=type=cache,target=/usr/local/cargo/registry,sharing=shared \
    --mount=type=cache,target=/usr/local/cargo/git,sharing=shared \
    --mount=type=cache,target=/build/target,id=target-pack-binaries \
    cargo build --release --bin attune-core-timer-sensor

Performance Comparison

| Scenario | Old (sharing=locked) | New (shared + cache IDs) | Improvement |
|---|---|---|---|
| Sequential builds | ~30 sec/service | ~30 sec/service | Same |
| Parallel builds (4 services) | ~120 sec total | ~30 sec total | 4x faster |
| First build (cache empty) | ~300 sec | ~300 sec | Same |
| Incremental (1 service) | ~30 sec | ~30 sec | Same |
| Incremental (all services) | ~120 sec | ~30 sec | 4x faster |

When to Use Each Strategy

Use sharing=shared

  • Cargo registry cache
  • Cargo git cache
  • Any read-only cache
  • Caches with internal locking (like cargo)

Use service-specific cache IDs

  • Build target directories
  • Compilation artifacts
  • Any cache with potential write conflicts

Use sharing=locked

  • Generally not needed with proper architecture
  • Only if you encounter unexplained race conditions
  • Legacy compatibility

Troubleshooting

Issue: "File exists" errors during parallel builds

Cause: Cache mount conflicts (shouldn't happen with new strategy)

Solution: Verify cache IDs are service-specific

# Check Dockerfile
grep "id=target-builder" docker/Dockerfile.optimized
# Should show: id=target-builder-${SERVICE}

Issue: Slower parallel builds than expected

Cause: BuildKit not enabled or old Docker version

Solution:

# Check BuildKit version
docker buildx version

# Ensure BuildKit is enabled (default since Docker Engine 23.0; force it on older versions)
export DOCKER_BUILDKIT=1

# Check Docker version (need 20.10+)
docker --version

Issue: Cache not being reused between builds

Cause: Cache ID mismatch or cache pruned

Solution:

# Check cache usage
docker buildx du

# Inspect cache records (cache mounts appear as exec.cachemount entries)
docker buildx du --verbose

# Clear and rebuild if corrupted
docker builder prune -a
docker compose build --no-cache

Best Practices

DO:

  • Use sharing=shared for registry/git caches
  • Use unique cache IDs for target directories
  • Name cache IDs descriptively (e.g., target-builder-api)
  • Share registry caches across all builds
  • Separate target caches per service

DON'T:

  • Don't use sharing=locked unless necessary
  • Don't share target caches between different services
  • Don't use sharing=private (creates duplicate caches)
  • Don't mix cache IDs (be consistent)
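A quick sanity check over the DO/DON'T rules can be scripted; the Dockerfile content below is a hypothetical fragment written only for the demo:

```shell
# Flag two anti-patterns in a Dockerfile: sharing=locked mounts
# (serialize builds) and duplicate cache IDs (two builds writing to
# the same target cache).
check_cache_mounts() {
  grep -n 'sharing=locked' "$1" || true                 # locked mounts
  grep -o 'id=[a-zA-Z0-9_${}-]*' "$1" | sort | uniq -d  # duplicated IDs
}

# Demo against a hypothetical Dockerfile fragment with a duplicated ID:
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
RUN --mount=type=cache,target=/build/target,id=target-builder-api cargo build
RUN --mount=type=cache,target=/build/target,id=target-builder-api cargo build
EOF
dups=$(check_cache_mounts "$tmp")
echo "problems found: $dups"
rm -f "$tmp"
```

Running such a check in CI keeps the cache-ID convention from drifting as new services are added.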

Monitoring Cache Performance

# View overall cache usage (build cache mounts show as exec.cachemount)
docker system df -v

# View specific cache details
docker buildx du --verbose

# Time parallel builds
time docker compose build --parallel 4

# Compare with sequential builds
time docker compose build api
time docker compose build executor
time docker compose build worker-shell
time docker compose build sensor

Summary

Old strategy:

  • sharing=locked on everything
  • Serialized builds
  • Safe but slow

New strategy:

  • sharing=shared on registry/git (concurrent-safe)
  • Service-specific cache IDs on target (no conflicts)
  • Fast parallel builds

Result:

  • 4x faster parallel builds
  • No race conditions
  • Optimal cache reuse
  • Safe concurrent builds

Key insight from selective crate copying: Each service compiles different binaries, so their target caches don't conflict. This enables safe concurrent builds without serialization overhead.