attune/docs/docker-layer-optimization.md

Docker Layer Optimization Guide

Problem Statement

When building Rust workspace projects in Docker, copying the entire crates/ directory creates a single Docker layer that gets invalidated whenever any file in any crate changes. This means:

  • Before optimization: Changing one line in api/src/main.rs invalidates layers for ALL services (api, executor, worker, sensor, notifier)
  • Impact: Every service rebuild takes ~5-6 minutes instead of ~30 seconds
  • Root cause: Docker's layer caching treats COPY crates/ ./crates/ as an atomic operation

Architecture: Packs as Volumes

Important: The optimized Dockerfiles do NOT copy the packs/ directory into service images. Packs are content/configuration that should be decoupled from service binaries.

Packs Volume Strategy

# docker-compose.yaml
volumes:
  packs_data:  # Shared volume for all services

services:
  init-packs:  # Run-once service that populates packs_data
    volumes:
      - ./packs:/source/packs:ro        # Source packs from host
      - packs_data:/opt/attune/packs    # Copy to shared volume
      
  api:
    volumes:
      - packs_data:/opt/attune/packs:ro  # Mount packs as read-only
      
  worker:
    volumes:
      - packs_data:/opt/attune/packs:ro  # All services share same packs

Benefits:

  • Update packs without rebuilding service images
  • Reduce image size (packs not baked in)
  • Faster builds (no pack copying during image build)
  • Consistent packs across all services
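The run-once init-packs service boils down to a single copy into the shared volume. A minimal sketch of its entrypoint logic (the function name and cp flags here are illustrative assumptions, not the project's actual script):

```shell
# Hypothetical sketch of what the init-packs service does at startup:
# copy pack content from the read-only host mount into the shared volume,
# then exit so dependent services start with a populated packs_data volume.
populate_packs() {
  src="$1"
  dst="$2"
  mkdir -p "$dst"
  cp -a "$src/." "$dst/"   # -a preserves permissions and timestamps
}
```

In the compose file above, the service would invoke this against /source/packs and /opt/attune/packs and exit with status 0, which is what makes it safe to declare as a run-once dependency.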

The Solution: Selective Crate Copying

The optimized Dockerfiles use a multi-stage approach that separates dependency caching from source code compilation:

Stage 1: Planner (Dependency Caching)

# Copy only Cargo.toml files (not source code)
COPY Cargo.toml Cargo.lock ./
COPY crates/common/Cargo.toml ./crates/common/Cargo.toml
COPY crates/api/Cargo.toml ./crates/api/Cargo.toml
# ... all other crate manifests

# Create dummy source files
RUN mkdir -p crates/common/src && echo "fn main() {}" > crates/common/src/lib.rs
# ... create dummies for all crates

# Build with dummy source to cache dependencies
RUN cargo build --release --bin attune-${SERVICE}

Result: This layer is only invalidated when dependencies change (Cargo.toml/Cargo.lock modifications).
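Writing a dummy source file for every crate by hand gets tedious as the workspace grows, so the step can be scripted. A sketch (assuming all crates live under crates/ and that a placeholder main.rs/lib.rs per crate is acceptable, as in the snippet above):

```shell
# Generate placeholder sources for each crate so `cargo build`
# can compile and cache dependencies without the real code.
make_dummies() {
  for manifest in "$1"/*/Cargo.toml; do
    src=$(dirname "$manifest")/src
    mkdir -p "$src"
    echo "fn main() {}" > "$src/main.rs"   # satisfies bin targets
    echo "// dummy" > "$src/lib.rs"        # satisfies lib targets
  done
}
```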

Stage 2: Builder (Selective Source Compilation)

# Copy common crate (shared dependency)
COPY crates/common/ ./crates/common/

# Copy ONLY the service being built
COPY crates/${SERVICE}/ ./crates/${SERVICE}/

# Build the actual service
RUN cargo build --release --bin attune-${SERVICE}

Result: This layer is only invalidated when the specific service's code changes (or common crate changes).

Stage 3: Runtime (No Packs Copying)

# Create directories for volume mount points
RUN mkdir -p /opt/attune/packs /opt/attune/logs

# Note: Packs are NOT copied here
# They will be mounted as a volume at runtime from packs_data volume

Result: Service images contain only binaries and configs, not packs. Packs are mounted at runtime.

Performance Comparison

Before Optimization (Old Dockerfile)

Scenario: Change api/src/routes/actions.rs
- Layer invalidated: COPY crates/ ./crates/
- Rebuilds: All dependencies + all crates
- Time: ~5-6 minutes
- Size: Full dependency rebuild

After Optimization (New Dockerfile)

Scenario: Change api/src/routes/actions.rs
- Layer invalidated: COPY crates/api/ ./crates/api/
- Rebuilds: Only attune-api binary
- Time: ~30-60 seconds
- Size: Minimal incremental compilation

Dependency Change Comparison

Scenario: Add new dependency to Cargo.toml
- Before: ~5-6 minutes (full rebuild)
- After: ~3-4 minutes (dependency cached separately)

Implementation

Using Optimized Dockerfiles

The optimized Dockerfiles are available as:

  • docker/Dockerfile.optimized - For main services (api, executor, sensor, notifier)
  • docker/Dockerfile.worker.optimized - For worker services

Option 1: Update docker-compose.yaml:

services:
  api:
    build:
      context: .
      dockerfile: docker/Dockerfile.optimized  # Changed from docker/Dockerfile
      args:
        SERVICE: api

Option 2: Replace Existing Dockerfiles

# Backup current Dockerfiles
cp docker/Dockerfile docker/Dockerfile.backup
cp docker/Dockerfile.worker docker/Dockerfile.worker.backup

# Replace with optimized versions
mv docker/Dockerfile.optimized docker/Dockerfile
mv docker/Dockerfile.worker.optimized docker/Dockerfile.worker

Testing the Optimization

  1. Clean build (first time):

    docker compose build --no-cache api
    # Time: ~5-6 minutes (expected, building from scratch)
    
  2. Incremental build (change API code):

    # Edit attune/crates/api/src/routes/actions.rs
    echo "// test comment" >> crates/api/src/routes/actions.rs
    
    docker compose build api
    # Time: ~30-60 seconds (optimized, only rebuilds API)
    
  3. Verify other services not affected:

    # The worker service should still use cached layers
    docker compose build worker-shell
    # Time: ~5 seconds (uses cache, no rebuild needed)
    

How It Works: Docker Layer Caching

Docker builds images in layers, and each instruction (COPY, RUN, etc.) creates a new layer. Layers are cached and reused if:

  1. The instruction hasn't changed
  2. The context (files being copied) hasn't changed
  3. All previous layers are still valid
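For COPY instructions, "the context hasn't changed" means Docker hashes the contents of the copied files and compares the result against the cached layer. A rough shell analogy of that cache key (not Docker's actual implementation):

```shell
# Approximate a COPY layer's cache key: hash the copied files' contents.
# Same key => cache hit; any content change => new key => layer rebuilt.
layer_key() {
  find "$1" -type f | sort | xargs cat | sha256sum | cut -d' ' -f1
}
```

This is why COPY crates/ ./crates/ is invalidated by a one-line change anywhere in the tree, while COPY crates/api/ ./crates/api/ only reacts to changes under that one crate.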

Old Approach (Unoptimized)

Layer 1: COPY Cargo.toml Cargo.lock
Layer 2: COPY crates/ ./crates/           ← Invalidated on ANY crate change
Layer 3: RUN cargo build                  ← Always rebuilds everything

New Approach (Optimized)

Stage 1 (Planner):
Layer 1: COPY Cargo.toml Cargo.lock       ← Only invalidated on dependency changes
Layer 2: COPY */Cargo.toml                ← Only invalidated on dependency changes
Layer 3: RUN cargo build (dummy)          ← Caches compiled dependencies

Stage 2 (Builder):
Layer 4: COPY crates/common/              ← Invalidated on common changes
Layer 5: COPY crates/${SERVICE}/          ← Invalidated on service-specific changes
Layer 6: RUN cargo build                  ← Only recompiles changed crates

BuildKit Cache Mounts

The optimized Dockerfiles also use BuildKit cache mounts for additional speedup:

RUN --mount=type=cache,target=/usr/local/cargo/registry,sharing=shared \
    --mount=type=cache,target=/usr/local/cargo/git,sharing=shared \
    --mount=type=cache,target=/build/target,id=target-builder-${SERVICE} \
    cargo build --release

Benefits:

  • Cargo registry: Downloaded crates persist between builds
  • Cargo git: Git dependencies persist between builds
  • Target directory: Compilation artifacts persist between builds
  • Optimized sharing: Registry/git use sharing=shared for concurrent access
  • Service-specific caches: Target directory uses unique cache IDs to prevent conflicts

Cache Strategy:

  • sharing=shared: Registry and git caches (cargo handles concurrent access safely)
  • Service-specific IDs: Target caches use id=target-builder-${SERVICE} to prevent conflicts
  • Result: Safe parallel builds without serialization overhead (4x faster)
  • See: docs/QUICKREF-buildkit-cache-strategy.md for detailed explanation

Requirements:

  • Enable BuildKit: export DOCKER_BUILDKIT=1
  • Or use Docker Compose v2, which enables BuildKit by default

Advanced: Parallel Builds

With the optimized Dockerfiles, you can safely build multiple services in parallel:

# Build all services in parallel (up to 4 at a time)
docker compose --parallel 4 build

# Or build specific services
docker compose build api executor worker-shell

Optimized for Parallel Builds:

  • Registry/git caches use sharing=shared (concurrent-safe)
  • Target caches use service-specific IDs (no conflicts)
  • 4x faster than old sharing=locked strategy
  • No race conditions or "File exists" errors

Why it's safe: Each service compiles different binaries (api vs executor vs worker), so their target caches don't conflict. Cargo's registry and git caches are inherently concurrent-safe.

See docs/QUICKREF-buildkit-cache-strategy.md for detailed explanation of the cache strategy.

Tradeoffs and Considerations

Advantages

  • Faster incremental builds: 30 seconds vs 5 minutes
  • Better cache utilization: Only rebuild what changed
  • Smaller layer diffs: More efficient CI/CD pipelines
  • Reduced build costs: Less CPU time in CI environments

Disadvantages

  • More complex Dockerfiles: Additional planner stage
  • Slightly longer first build: Dummy compilation overhead (~30 seconds)
  • Manual manifest copying: Need to list all crates explicitly

When to Use

  • Active development: Frequent code changes benefit from fast rebuilds
  • CI/CD pipelines: Reduce build times and costs
  • Monorepo workspaces: Multiple services sharing common code

When NOT to Use

  • Single-crate projects: No benefit for non-workspace projects
  • Infrequent builds: Complexity not worth it for rare builds
  • Dockerfile simplicity required: Stick with basic approach

Pack Binaries

Pack binaries (like attune-core-timer-sensor) need to be built separately and placed in ./packs/ before starting docker-compose.

Building Pack Binaries

Use the provided script:

./scripts/build-pack-binaries.sh

Or manually:

# Build pack binaries in Docker with GLIBC compatibility
docker build -f docker/Dockerfile.pack-binaries -t attune-pack-builder .

# Extract binaries
docker create --name pack-tmp attune-pack-builder
docker cp pack-tmp:/pack-binaries/attune-core-timer-sensor ./packs/core/sensors/
docker rm pack-tmp

# Make executable
chmod +x ./packs/core/sensors/attune-core-timer-sensor

The init-packs service will copy these binaries (along with other pack files) into the packs_data volume when docker-compose starts.

Why Separate Pack Binaries?

  • GLIBC Compatibility: Built in Debian Bookworm for GLIBC 2.36 compatibility
  • Decoupled Updates: Update pack binaries without rebuilding service images
  • Smaller Service Images: Service images don't include pack compilation stages
  • Cleaner Architecture: Packs are content, services are runtime

Maintenance

Adding New Crates

When adding a new crate to the workspace:

  1. Update Cargo.toml workspace members:

    [workspace]
    members = [
        "crates/common",
        "crates/new-service",  # Add this
    ]
    
  2. Update optimized Dockerfiles (both planner and builder stages):

    # In planner stage
    COPY crates/new-service/Cargo.toml ./crates/new-service/Cargo.toml
    RUN mkdir -p crates/new-service/src && echo "fn main() {}" > crates/new-service/src/main.rs
    
    # In builder stage
    COPY crates/new-service/Cargo.toml ./crates/new-service/Cargo.toml
    
  3. Test the build:

    docker compose build new-service
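Forgetting step 2 is the most common cause of the "crate not found" failure described under Troubleshooting. A quick lint that every workspace crate's manifest has a matching COPY line (a sketch; the function name and paths are assumptions based on the layout in this guide):

```shell
# For each crates/*/Cargo.toml, warn if the given Dockerfile has no
# COPY line mentioning that crate's manifest. Returns non-zero if
# any crate is missing, so it can gate CI.
check_manifest_copies() {
  dockerfile="$1"
  crates_dir="$2"
  missing=0
  for manifest in "$crates_dir"/*/Cargo.toml; do
    crate=$(basename "$(dirname "$manifest")")
    grep -q "crates/$crate/Cargo.toml" "$dockerfile" || {
      echo "missing COPY for $crate"
      missing=1
    }
  done
  return $missing
}
```

Running it against docker/Dockerfile.optimized after adding a crate catches the omission before the first failed build.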
    

Updating Packs

Packs are mounted as volumes, so updating them doesn't require rebuilding service images:

  1. Update pack files in ./packs/:

    # Edit pack files
    vim packs/core/actions/my_action.yaml
    
  2. Rebuild pack binaries (if needed):

    ./scripts/build-pack-binaries.sh
    
  3. Restart services to pick up changes:

    docker compose restart
    

No image rebuild required!

Troubleshooting

Build fails with "crate not found"

Cause: Missing crate manifest in the COPY instructions.
Fix: Add the crate's Cargo.toml to both the planner and builder stages.

Changes not reflected in build

Cause: Docker using stale cached layers.
Fix: Force a rebuild with docker compose build --no-cache <service>.

"File exists" errors during parallel builds

Cause: Cache mount conflicts.
Fix: Already handled by the service-specific target cache IDs (id=target-builder-${SERVICE}) and sharing=shared registry/git caches in the optimized Dockerfiles.

Slow builds after dependency changes

Cause: Expected behavior; dependencies must be recompiled.
Fix: This is normal. The optimization speeds up code changes, not dependency changes.

Alternative Approaches

cargo-chef (Not Used)

The cargo-chef tool provides similar optimization but requires additional tooling:

  • Pros: Automatic dependency detection, no manual manifest copying
  • Cons: Extra dependency, learning curve, additional maintenance

We opted for the manual approach because:

  • Simpler to understand and maintain
  • No external dependencies
  • Full control over the build process
  • Easier to debug issues

Volume Mounts for Development

For local development, consider mounting the source as a volume:

volumes:
  - ./crates/api:/build/crates/api

  • Pros: Instant code updates without rebuilds
  • Cons: Not suitable for production images

Summary

The optimized Docker build strategy significantly reduces build times by:

  1. Separating dependency resolution from source compilation
  2. Only copying the specific crate being built (plus common dependencies)
  3. Using BuildKit cache mounts to persist compilation artifacts
  4. Mounting packs as volumes instead of copying them into images

Key Architecture Principles:

  • Service images: Contain only compiled binaries and configuration
  • Packs: Mounted as volumes, updated independently of services
  • Pack binaries: Built separately with GLIBC compatibility
  • Volume strategy: init-packs service populates shared packs_data volume

Result:

  • Incremental builds drop from 5-6 minutes to 30-60 seconds
  • Pack updates don't require image rebuilds
  • Service images are smaller and more focused
  • Docker-based development workflows are practical for Rust workspaces