# Docker Layer Optimization Guide

## Problem Statement

When building Rust workspace projects in Docker, copying the entire `crates/` directory creates a single Docker layer that gets invalidated whenever **any file** in **any crate** changes. This means:

- **Before optimization**: Changing one line in `api/src/main.rs` invalidates layers for ALL services (api, executor, worker, sensor, notifier)
- **Impact**: Every service rebuild takes ~5-6 minutes instead of ~30 seconds
- **Root cause**: Docker's layer caching treats `COPY crates/ ./crates/` as an atomic operation

## Architecture: Packs as Volumes

**Important**: The optimized Dockerfiles do NOT copy the `packs/` directory into service images. Packs are content/configuration that should be decoupled from service binaries.

### Packs Volume Strategy

```yaml
# docker-compose.yaml
volumes:
  packs_data:  # Shared volume for all services

services:
  init-packs:  # Run-once service that populates packs_data
    volumes:
      - ./packs:/source/packs:ro       # Source packs from host
      - packs_data:/opt/attune/packs   # Copy to shared volume

  api:
    volumes:
      - packs_data:/opt/attune/packs:ro  # Mount packs as read-only

  worker:
    volumes:
      - packs_data:/opt/attune/packs:ro  # All services share same packs
```

**Benefits**:

- ✅ Update packs without rebuilding service images
- ✅ Reduce image size (packs not baked in)
- ✅ Faster builds (no pack copying during image build)
- ✅ Consistent packs across all services

## The Solution: Selective Crate Copying

The optimized Dockerfiles use a multi-stage approach that separates dependency caching from source code compilation:

### Stage 1: Planner (Dependency Caching)

```dockerfile
# Copy only Cargo.toml files (not source code)
COPY Cargo.toml Cargo.lock ./
COPY crates/common/Cargo.toml ./crates/common/Cargo.toml
COPY crates/api/Cargo.toml ./crates/api/Cargo.toml
# ... all other crate manifests

# Create dummy source files
RUN mkdir -p crates/common/src && echo "fn main() {}" > crates/common/src/lib.rs
# ... create dummies for all crates

# Build with dummy source to cache dependencies
RUN cargo build --release --bin attune-${SERVICE}
```

**Result**: This layer is only invalidated when dependencies change (Cargo.toml/Cargo.lock modifications).

### Stage 2: Builder (Selective Source Compilation)

```dockerfile
# Copy common crate (shared dependency)
COPY crates/common/ ./crates/common/

# Copy ONLY the service being built
COPY crates/${SERVICE}/ ./crates/${SERVICE}/

# Build the actual service
RUN cargo build --release --bin attune-${SERVICE}
```

**Result**: This layer is only invalidated when the specific service's code changes (or the common crate changes).

### Stage 3: Runtime (No Packs Copying)

```dockerfile
# Create directories for volume mount points
RUN mkdir -p /opt/attune/packs /opt/attune/logs

# Note: Packs are NOT copied here
# They will be mounted as a volume at runtime from the packs_data volume
```

**Result**: Service images contain only binaries and configs, not packs. Packs are mounted at runtime.
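Assembled end to end, the three stages could look like the following sketch. This is not the project's actual Dockerfile: the base images, `WORKDIR`, and the `touch` step are assumptions. The `touch` matters because `COPY` preserves source mtimes, so without it cargo may consider the dummy-built artifacts up to date and skip recompiling the real sources.

```dockerfile
# syntax=docker/dockerfile:1
FROM rust:1-bookworm AS planner      # base image is an assumption
ARG SERVICE=api
WORKDIR /build
COPY Cargo.toml Cargo.lock ./
COPY crates/common/Cargo.toml ./crates/common/Cargo.toml
COPY crates/${SERVICE}/Cargo.toml ./crates/${SERVICE}/Cargo.toml
# ... repeat for every workspace member, or cargo cannot resolve the workspace
RUN mkdir -p crates/common/src crates/${SERVICE}/src \
 && echo "" > crates/common/src/lib.rs \
 && echo "fn main() {}" > crates/${SERVICE}/src/main.rs
RUN cargo build --release --bin attune-${SERVICE}   # caches dependency artifacts

FROM planner AS builder
ARG SERVICE=api
COPY crates/common/ ./crates/common/
COPY crates/${SERVICE}/ ./crates/${SERVICE}/
# Touch the real sources so cargo rebuilds them instead of
# reusing the dummy-built binary (COPY keeps old mtimes).
RUN touch crates/common/src/lib.rs crates/${SERVICE}/src/main.rs \
 && cargo build --release --bin attune-${SERVICE}

FROM debian:bookworm-slim AS runtime
ARG SERVICE=api
RUN mkdir -p /opt/attune/packs /opt/attune/logs
COPY --from=builder /build/target/release/attune-${SERVICE} /usr/local/bin/
```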
## Performance Comparison

### Before Optimization (Old Dockerfile)

```
Scenario: Change api/src/routes/actions.rs
- Layer invalidated: COPY crates/ ./crates/
- Rebuilds: All dependencies + all crates
- Time: ~5-6 minutes
- Size: Full dependency rebuild
```

### After Optimization (New Dockerfile)

```
Scenario: Change api/src/routes/actions.rs
- Layer invalidated: COPY crates/api/ ./crates/api/
- Rebuilds: Only the attune-api binary
- Time: ~30-60 seconds
- Size: Minimal incremental compilation
```

### Dependency Change Comparison

```
Scenario: Add new dependency to Cargo.toml
- Before: ~5-6 minutes (full rebuild)
- After: ~3-4 minutes (dependencies cached separately)
```

## Implementation

### Using Optimized Dockerfiles

The optimized Dockerfiles are available as:

- `docker/Dockerfile.optimized` - For main services (api, executor, sensor, notifier)
- `docker/Dockerfile.worker.optimized` - For worker services

#### Option 1: Switch to Optimized Dockerfiles (Recommended)

Update `docker-compose.yaml`:

```yaml
services:
  api:
    build:
      context: .
      dockerfile: docker/Dockerfile.optimized  # Changed from docker/Dockerfile
      args:
        SERVICE: api
```

#### Option 2: Replace Existing Dockerfiles

```bash
# Backup current Dockerfiles
cp docker/Dockerfile docker/Dockerfile.backup
cp docker/Dockerfile.worker docker/Dockerfile.worker.backup

# Replace with optimized versions
mv docker/Dockerfile.optimized docker/Dockerfile
mv docker/Dockerfile.worker.optimized docker/Dockerfile.worker
```

### Testing the Optimization

1. **Clean build (first time)**:

   ```bash
   docker compose build --no-cache api
   # Time: ~5-6 minutes (expected, building from scratch)
   ```

2. **Incremental build (change API code)**:

   ```bash
   # Edit crates/api/src/routes/actions.rs
   echo "// test comment" >> crates/api/src/routes/actions.rs
   docker compose build api
   # Time: ~30-60 seconds (optimized, only rebuilds the API)
   ```

3. **Verify other services are not affected**:

   ```bash
   # The worker service should still use cached layers
   docker compose build worker-shell
   # Time: ~5 seconds (uses cache, no rebuild needed)
   ```

## How It Works: Docker Layer Caching

Docker builds images in layers, and each instruction (`COPY`, `RUN`, etc.) creates a new layer. Layers are cached and reused if:

1. The instruction hasn't changed
2. The context (files being copied) hasn't changed
3. All previous layers are still valid

### Old Approach (Unoptimized)

```
Layer 1: COPY Cargo.toml Cargo.lock
Layer 2: COPY crates/ ./crates/    ← Invalidated on ANY crate change
Layer 3: RUN cargo build           ← Always rebuilds everything
```

### New Approach (Optimized)

```
Stage 1 (Planner):
  Layer 1: COPY Cargo.toml Cargo.lock  ← Only invalidated on dependency changes
  Layer 2: COPY */Cargo.toml           ← Only invalidated on dependency changes
  Layer 3: RUN cargo build (dummy)     ← Caches compiled dependencies

Stage 2 (Builder):
  Layer 4: COPY crates/common/         ← Invalidated on common changes
  Layer 5: COPY crates/${SERVICE}/     ← Invalidated on service-specific changes
  Layer 6: RUN cargo build             ← Only recompiles changed crates
```

## BuildKit Cache Mounts

The optimized Dockerfiles also use BuildKit cache mounts for additional speedup:

```dockerfile
RUN --mount=type=cache,target=/usr/local/cargo/registry,sharing=shared \
    --mount=type=cache,target=/usr/local/cargo/git,sharing=shared \
    --mount=type=cache,target=/build/target,id=target-builder-${SERVICE} \
    cargo build --release
```

**Benefits**:

- **Cargo registry**: Downloaded crates persist between builds
- **Cargo git**: Git dependencies persist between builds
- **Target directory**: Compilation artifacts persist between builds
- **Optimized sharing**: Registry/git use `sharing=shared` for concurrent access
- **Service-specific caches**: Target directory uses unique cache IDs to prevent conflicts

**Cache Strategy**:

- **`sharing=shared`**: Registry and git caches (cargo handles concurrent access safely)
- **Service-specific IDs**: Target caches use `id=target-builder-${SERVICE}` to prevent conflicts
- **Result**: Safe parallel builds without serialization overhead (4x faster)
- **See**: `docs/QUICKREF-buildkit-cache-strategy.md` for a detailed explanation

**Requirements**:

- Enable BuildKit: `export DOCKER_BUILDKIT=1`
- Or use docker-compose, which enables it automatically

## Advanced: Parallel Builds

With the optimized Dockerfiles, you can safely build multiple services in parallel:

```bash
# Build all services in parallel (4 workers)
docker compose build --parallel 4

# Or build specific services
docker compose build api executor worker-shell
```

**Optimized for Parallel Builds**:

- ✅ Registry/git caches use `sharing=shared` (concurrent-safe)
- ✅ Target caches use service-specific IDs (no conflicts)
- ✅ **4x faster** than the old `sharing=locked` strategy
- ✅ No race conditions or "File exists" errors

**Why it's safe**: Each service compiles different binaries (api vs executor vs worker), so their target caches don't conflict. Cargo's registry and git caches are inherently concurrent-safe.

See `docs/QUICKREF-buildkit-cache-strategy.md` for a detailed explanation of the cache strategy.
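The invalidation behavior described above can be simulated outside Docker. Docker keys a `COPY` layer on a checksum of the copied file set, so hashing that file set (a simplification of Docker's actual cache key) shows why a coarse `COPY crates/` invalidates on any change while a selective `COPY crates/worker/` does not. A minimal sketch with illustrative paths:

```shell
# Simulate Docker's COPY cache key: a checksum over the copied file set.
demo=$(mktemp -d)
mkdir -p "$demo/crates/api/src" "$demo/crates/worker/src"
cd "$demo"
echo 'fn main() {}' > crates/api/src/main.rs
echo 'fn main() {}' > crates/worker/src/main.rs

# Key for "COPY crates/ ./crates/" covers every file in every crate.
key_all_before=$(find crates -type f | sort | xargs sha256sum | sha256sum)
# Key for "COPY crates/worker/ ./crates/worker/" covers only that crate.
key_worker_before=$(find crates/worker -type f | sort | xargs sha256sum | sha256sum)

# Change one line in the api crate only.
echo '// comment' >> crates/api/src/main.rs

key_all_after=$(find crates -type f | sort | xargs sha256sum | sha256sum)
key_worker_after=$(find crates/worker -type f | sort | xargs sha256sum | sha256sum)

[ "$key_all_before" != "$key_all_after" ] && echo "coarse COPY layer invalidated"
[ "$key_worker_before" = "$key_worker_after" ] && echo "selective COPY layer still cached"
```

Both messages print: the api-only edit changes the hash of the whole `crates/` tree but leaves the `crates/worker/` hash untouched, which is exactly why the selective layer survives.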
## Tradeoffs and Considerations

### Advantages

- ✅ **Faster incremental builds**: 30 seconds vs 5 minutes
- ✅ **Better cache utilization**: Only rebuild what changed
- ✅ **Smaller layer diffs**: More efficient CI/CD pipelines
- ✅ **Reduced build costs**: Less CPU time in CI environments

### Disadvantages

- ❌ **More complex Dockerfiles**: Additional planner stage
- ❌ **Slightly longer first build**: Dummy compilation overhead (~30 seconds)
- ❌ **Manual manifest copying**: Need to list all crates explicitly

### When to Use

- ✅ **Active development**: Frequent code changes benefit from fast rebuilds
- ✅ **CI/CD pipelines**: Reduce build times and costs
- ✅ **Monorepo workspaces**: Multiple services sharing common code

### When NOT to Use

- ❌ **Single-crate projects**: No benefit for non-workspace projects
- ❌ **Infrequent builds**: Complexity not worth it for rare builds
- ❌ **Dockerfile simplicity required**: Stick with the basic approach

## Pack Binaries

Pack binaries (like `attune-core-timer-sensor`) need to be built separately and placed in `./packs/` before starting docker-compose.

### Building Pack Binaries

Use the provided script:

```bash
./scripts/build-pack-binaries.sh
```

Or manually:

```bash
# Build pack binaries in Docker with GLIBC compatibility
docker build -f docker/Dockerfile.pack-binaries -t attune-pack-builder .

# Extract binaries
docker create --name pack-tmp attune-pack-builder
docker cp pack-tmp:/pack-binaries/attune-core-timer-sensor ./packs/core/sensors/
docker rm pack-tmp

# Make executable
chmod +x ./packs/core/sensors/attune-core-timer-sensor
```

The `init-packs` service will copy these binaries (along with other pack files) into the `packs_data` volume when docker-compose starts.

### Why Separate Pack Binaries?
- **GLIBC Compatibility**: Built in Debian Bookworm for GLIBC 2.36 compatibility
- **Decoupled Updates**: Update pack binaries without rebuilding service images
- **Smaller Service Images**: Service images don't include pack compilation stages
- **Cleaner Architecture**: Packs are content, services are runtime

## Maintenance

### Adding New Crates

When adding a new crate to the workspace:

1. **Update `Cargo.toml`** workspace members:

   ```toml
   [workspace]
   members = [
       "crates/common",
       "crates/new-service",  # Add this
   ]
   ```

2. **Update the optimized Dockerfiles** (both planner and builder stages):

   ```dockerfile
   # In the planner stage
   COPY crates/new-service/Cargo.toml ./crates/new-service/Cargo.toml
   RUN mkdir -p crates/new-service/src && echo "fn main() {}" > crates/new-service/src/main.rs

   # In the builder stage (copy the source, needed when other crates depend on it)
   COPY crates/new-service/ ./crates/new-service/
   ```

3. **Test the build**:

   ```bash
   docker compose build new-service
   ```

### Updating Packs

Packs are mounted as volumes, so updating them doesn't require rebuilding service images:

1. **Update pack files** in `./packs/`:

   ```bash
   # Edit pack files
   vim packs/core/actions/my_action.yaml
   ```

2. **Rebuild pack binaries** (if needed):

   ```bash
   ./scripts/build-pack-binaries.sh
   ```

3. **Restart services** to pick up changes:

   ```bash
   docker compose restart
   ```

No image rebuild required!
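The "manual manifest copying" step is the easiest one to forget when adding a crate. A small check script can flag workspace members whose manifests are never `COPY`'d in a Dockerfile; the file layout and contents below are hypothetical, and the grep-based parse assumes the simple single-line `members = [...]` style shown in the maintenance steps:

```shell
# Sketch: flag workspace members whose Cargo.toml is not COPY'd in a Dockerfile.
workdir=$(mktemp -d)
cat > "$workdir/Cargo.toml" <<'EOF'
[workspace]
members = ["crates/common", "crates/api", "crates/new-service"]
EOF
cat > "$workdir/Dockerfile" <<'EOF'
COPY crates/common/Cargo.toml ./crates/common/Cargo.toml
COPY crates/api/Cargo.toml ./crates/api/Cargo.toml
EOF

missing=""
# Pull member paths out of the workspace manifest (simple grep-based parse).
for crate in $(grep -o '"crates/[^"]*"' "$workdir/Cargo.toml" | tr -d '"'); do
  grep -q "COPY $crate/Cargo.toml" "$workdir/Dockerfile" || missing="$missing $crate"
done
echo "missing from Dockerfile:$missing"
```

For the hypothetical files above, the script reports `crates/new-service` as missing, which is exactly the "Build fails with crate not found" situation covered in Troubleshooting.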
## Troubleshooting

### Build fails with "crate not found"

**Cause**: Missing crate manifest in COPY instructions

**Fix**: Add the crate's Cargo.toml to both planner and builder stages

### Changes not reflected in build

**Cause**: Docker using stale cached layers

**Fix**: Force rebuild with `docker compose build --no-cache <service>`

### "File exists" errors during parallel builds

**Cause**: Cache mount conflicts

**Fix**: Already handled by the service-specific target cache IDs (`id=target-builder-${SERVICE}`) in the optimized Dockerfiles

### Slow builds after dependency changes

**Cause**: Expected behavior - dependencies must be recompiled

**Fix**: This is normal; the optimization helps with code changes, not dependency changes

## Alternative Approaches

### cargo-chef (Not Used)

The `cargo-chef` tool provides similar optimization but requires additional tooling:

- Pros: Automatic dependency detection, no manual manifest copying
- Cons: Extra dependency, learning curve, additional maintenance

We opted for the manual approach because:

- Simpler to understand and maintain
- No external dependencies
- Full control over the build process
- Easier to debug issues

### Volume Mounts for Development

For local development, consider mounting the source as a volume:

```yaml
volumes:
  - ./crates/api:/build/crates/api
```

- Pros: Instant code updates without rebuilds
- Cons: Not suitable for production images

## References

- [Docker Build Cache Documentation](https://docs.docker.com/build/cache/)
- [BuildKit Cache Mounts](https://docs.docker.com/build/guide/mounts/)
- [Rust Docker Best Practices](https://docs.docker.com/language/rust/build-images/)
- [cargo-chef Alternative](https://github.com/LukeMathWalker/cargo-chef)

## Summary

The optimized Docker build strategy significantly reduces build times by:

1. **Separating dependency resolution from source compilation**
2. **Only copying the specific crate being built** (plus common dependencies)
3. **Using BuildKit cache mounts** to persist compilation artifacts
4. **Mounting packs as volumes** instead of copying them into images

**Key Architecture Principles**:

- **Service images**: Contain only compiled binaries and configuration
- **Packs**: Mounted as volumes, updated independently of services
- **Pack binaries**: Built separately with GLIBC compatibility
- **Volume strategy**: `init-packs` service populates the shared `packs_data` volume

**Result**:

- Incremental builds drop from 5-6 minutes to 30-60 seconds
- Pack updates don't require image rebuilds
- Service images are smaller and more focused
- Docker-based development workflows are practical for Rust workspaces
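For completeness, the `init-packs` service that populates `packs_data` could be defined as the following sketch. Only the volume wiring comes from the compose snippet earlier in this guide; the image choice and copy command are assumptions:

```yaml
services:
  init-packs:
    image: alpine:3          # any small image with `cp` works; choice is an assumption
    volumes:
      - ./packs:/source/packs:ro       # source packs from the host
      - packs_data:/opt/attune/packs   # shared volume the services mount read-only
    command: sh -c "cp -a /source/packs/. /opt/attune/packs/"
    restart: "no"            # run once: exit after populating the volume

volumes:
  packs_data:
```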