# Docker Layer Optimization Guide

## Problem Statement

When building Rust workspace projects in Docker, copying the entire `crates/` directory creates a single Docker layer that gets invalidated whenever **any file** in **any crate** changes. This means:

- **Before optimization**: Changing one line in `api/src/main.rs` invalidates layers for ALL services (api, executor, worker, sensor, notifier)
- **Impact**: Every service rebuild takes ~5-6 minutes instead of ~30 seconds
- **Root cause**: Docker's layer caching treats `COPY crates/ ./crates/` as an atomic operation
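For contrast, the unoptimized layering described above boils down to a single broad COPY (a minimal sketch, not the project's actual Dockerfile):

```dockerfile
# Unoptimized: one COPY covers every crate in the workspace,
# so any change under crates/ invalidates this layer and everything after it.
COPY Cargo.toml Cargo.lock ./
COPY crates/ ./crates/
RUN cargo build --release
```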
## Architecture: Packs as Volumes

**Important**: The optimized Dockerfiles do NOT copy the `packs/` directory into service images. Packs are content and configuration that should be decoupled from service binaries.

### Packs Volume Strategy
```yaml
# docker-compose.yaml
volumes:
  packs_data:  # Shared volume for all services

services:
  init-packs:  # Run-once service that populates packs_data
    volumes:
      - ./packs:/source/packs:ro      # Source packs from host
      - packs_data:/opt/attune/packs  # Copy to shared volume

  api:
    volumes:
      - packs_data:/opt/attune/packs:ro  # Mount packs as read-only

  worker:
    volumes:
      - packs_data:/opt/attune/packs:ro  # All services share the same packs
```
**Benefits**:

- ✅ Update packs without rebuilding service images
- ✅ Reduced image size (packs not baked in)
- ✅ Faster builds (no pack copying during image build)
- ✅ Consistent packs across all services
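As a sketch of the run-once populate step, `init-packs` can be as small as a copy command. The image and command here are illustrative assumptions, not the project's actual definition:

```yaml
services:
  init-packs:
    image: alpine:3.20    # assumption: any minimal image with a POSIX shell works
    command: sh -c "cp -R /source/packs/. /opt/attune/packs/"
    restart: "no"         # run once, then exit
    volumes:
      - ./packs:/source/packs:ro
      - packs_data:/opt/attune/packs
```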
## The Solution: Selective Crate Copying

The optimized Dockerfiles use a multi-stage approach that separates dependency caching from source-code compilation:

### Stage 1: Planner (Dependency Caching)
```dockerfile
# Copy only Cargo.toml files (not source code)
COPY Cargo.toml Cargo.lock ./
COPY crates/common/Cargo.toml ./crates/common/Cargo.toml
COPY crates/api/Cargo.toml ./crates/api/Cargo.toml
# ... all other crate manifests

# Create dummy source files
RUN mkdir -p crates/common/src && echo "fn main() {}" > crates/common/src/lib.rs
# ... create dummies for all crates

# Build with dummy sources to cache dependencies
RUN cargo build --release --bin attune-${SERVICE}
```
**Result**: This layer is only invalidated when dependencies change (`Cargo.toml`/`Cargo.lock` modifications).

### Stage 2: Builder (Selective Source Compilation)
```dockerfile
# Copy the common crate (shared dependency)
COPY crates/common/ ./crates/common/

# Copy ONLY the service being built
COPY crates/${SERVICE}/ ./crates/${SERVICE}/

# Build the actual service
RUN cargo build --release --bin attune-${SERVICE}
```
**Result**: This layer is only invalidated when the specific service's code (or the common crate) changes.

### Stage 3: Runtime (No Packs Copying)
```dockerfile
# Create directories for volume mount points
RUN mkdir -p /opt/attune/packs /opt/attune/logs

# Note: packs are NOT copied here.
# They are mounted at runtime from the packs_data volume.
```
**Result**: Service images contain only binaries and configs, not packs. Packs are mounted at runtime.
## Performance Comparison

### Before Optimization (Old Dockerfile)

```
Scenario: Change api/src/routes/actions.rs
- Layer invalidated: COPY crates/ ./crates/
- Rebuilds: All dependencies + all crates
- Time: ~5-6 minutes
- Size: Full dependency rebuild
```

### After Optimization (New Dockerfile)

```
Scenario: Change api/src/routes/actions.rs
- Layer invalidated: COPY crates/api/ ./crates/api/
- Rebuilds: Only the attune-api binary
- Time: ~30-60 seconds
- Size: Minimal incremental compilation
```
### Dependency Change Comparison

```
Scenario: Add a new dependency to Cargo.toml
- Before: ~5-6 minutes (full rebuild)
- After: ~3-4 minutes (dependencies cached separately)
```
## Implementation

### Using the Optimized Dockerfiles

The optimized Dockerfiles are available as:

- `docker/Dockerfile.optimized` - For the main services (api, executor, sensor, notifier)
- `docker/Dockerfile.worker.optimized` - For worker services
#### Option 1: Switch to the Optimized Dockerfiles (Recommended)

Update `docker-compose.yaml`:

```yaml
services:
  api:
    build:
      context: .
      dockerfile: docker/Dockerfile.optimized  # Changed from docker/Dockerfile
      args:
        SERVICE: api
```
#### Option 2: Replace the Existing Dockerfiles

```bash
# Back up the current Dockerfiles
cp docker/Dockerfile docker/Dockerfile.backup
cp docker/Dockerfile.worker docker/Dockerfile.worker.backup

# Replace them with the optimized versions
mv docker/Dockerfile.optimized docker/Dockerfile
mv docker/Dockerfile.worker.optimized docker/Dockerfile.worker
```
### Testing the Optimization

1. **Clean build (first time)**:

   ```bash
   docker compose build --no-cache api
   # Time: ~5-6 minutes (expected when building from scratch)
   ```

2. **Incremental build (change API code)**:

   ```bash
   # Edit crates/api/src/routes/actions.rs
   echo "// test comment" >> crates/api/src/routes/actions.rs

   docker compose build api
   # Time: ~30-60 seconds (only the api service is rebuilt)
   ```

3. **Verify that other services are not affected**:

   ```bash
   # The worker service should still use cached layers
   docker compose build worker-shell
   # Time: ~5 seconds (uses cache, no rebuild needed)
   ```
## How It Works: Docker Layer Caching

Docker builds images in layers, and each instruction (`COPY`, `RUN`, etc.) creates a new layer. Layers are cached and reused if:

1. The instruction itself hasn't changed
2. The context (the files being copied) hasn't changed
3. All previous layers are still valid
### Old Approach (Unoptimized)

```
Layer 1: COPY Cargo.toml Cargo.lock
Layer 2: COPY crates/ ./crates/   ← Invalidated on ANY crate change
Layer 3: RUN cargo build          ← Always rebuilds everything
```
### New Approach (Optimized)

```
Stage 1 (Planner):
  Layer 1: COPY Cargo.toml Cargo.lock  ← Only invalidated on dependency changes
  Layer 2: COPY */Cargo.toml           ← Only invalidated on manifest changes
  Layer 3: RUN cargo build (dummy)     ← Caches compiled dependencies

Stage 2 (Builder):
  Layer 4: COPY crates/common/         ← Invalidated on common changes
  Layer 5: COPY crates/${SERVICE}/     ← Invalidated on service-specific changes
  Layer 6: RUN cargo build             ← Only recompiles changed crates
```
## BuildKit Cache Mounts

The optimized Dockerfiles also use BuildKit cache mounts for an additional speedup:

```dockerfile
RUN --mount=type=cache,target=/usr/local/cargo/registry,sharing=shared \
    --mount=type=cache,target=/usr/local/cargo/git,sharing=shared \
    --mount=type=cache,target=/build/target,id=target-builder-${SERVICE} \
    cargo build --release
```
**Benefits**:

- **Cargo registry**: Downloaded crates persist between builds
- **Cargo git**: Git dependencies persist between builds
- **Target directory**: Compilation artifacts persist between builds
- **Optimized sharing**: Registry/git mounts use `sharing=shared` for concurrent access
- **Service-specific caches**: Target mounts use unique cache IDs to prevent conflicts

**Cache Strategy**:

- **`sharing=shared`**: Registry and git caches (cargo handles concurrent access safely)
- **Service-specific IDs**: Target caches use `id=target-builder-${SERVICE}` to prevent conflicts
- **Result**: Safe parallel builds without serialization overhead (4x faster)
- **See**: `docs/QUICKREF-buildkit-cache-strategy.md` for a detailed explanation
**Requirements**:

- Enable BuildKit: `export DOCKER_BUILDKIT=1`
- Or use Docker Compose v2, which uses BuildKit by default
## Advanced: Parallel Builds

With the optimized Dockerfiles, you can safely build multiple services in parallel:

```bash
# Build all services in parallel (limit concurrency to 4)
docker compose --parallel 4 build

# Or build specific services
docker compose build api executor worker-shell
```
**Optimized for Parallel Builds**:

- ✅ Registry/git caches use `sharing=shared` (concurrent-safe)
- ✅ Target caches use service-specific IDs (no conflicts)
- ✅ **4x faster** than the old `sharing=locked` strategy
- ✅ No race conditions or "File exists" errors

**Why it's safe**: Each service compiles different binaries (api vs executor vs worker), so their target caches don't conflict. Cargo's registry and git caches are inherently concurrent-safe.

See `docs/QUICKREF-buildkit-cache-strategy.md` for a detailed explanation of the cache strategy.
## Tradeoffs and Considerations

### Advantages

- ✅ **Faster incremental builds**: ~30 seconds vs ~5 minutes
- ✅ **Better cache utilization**: Only rebuild what changed
- ✅ **Smaller layer diffs**: More efficient CI/CD pipelines
- ✅ **Reduced build costs**: Less CPU time in CI environments

### Disadvantages

- ❌ **More complex Dockerfiles**: Additional planner stage
- ❌ **Slightly longer first build**: Dummy compilation overhead (~30 seconds)
- ❌ **Manual manifest copying**: All crates must be listed explicitly

### When to Use

- ✅ **Active development**: Frequent code changes benefit from fast rebuilds
- ✅ **CI/CD pipelines**: Reduce build times and costs
- ✅ **Monorepo workspaces**: Multiple services sharing common code

### When NOT to Use

- ❌ **Single-crate projects**: No benefit for non-workspace projects
- ❌ **Infrequent builds**: The complexity isn't worth it for rare builds
- ❌ **When Dockerfile simplicity is required**: Stick with the basic approach
## Pack Binaries

Pack binaries (like `attune-core-timer-sensor`) need to be built separately and placed in `./packs/` before starting docker-compose.

### Building Pack Binaries

Use the provided script:

```bash
./scripts/build-pack-binaries.sh
```
Or build them manually:

```bash
# Build pack binaries in Docker for GLIBC compatibility
docker build -f docker/Dockerfile.pack-binaries -t attune-pack-builder .

# Extract the binaries
docker create --name pack-tmp attune-pack-builder
docker cp pack-tmp:/pack-binaries/attune-core-timer-sensor ./packs/core/sensors/
docker rm pack-tmp

# Make them executable
chmod +x ./packs/core/sensors/attune-core-timer-sensor
```

The `init-packs` service will copy these binaries (along with the other pack files) into the `packs_data` volume when docker-compose starts.
### Why Separate Pack Binaries?

- **GLIBC compatibility**: Built on Debian Bookworm for GLIBC 2.36 compatibility
- **Decoupled updates**: Update pack binaries without rebuilding service images
- **Smaller service images**: Service images don't include pack compilation stages
- **Cleaner architecture**: Packs are content; services are runtime
## Maintenance

### Adding New Crates

When adding a new crate to the workspace:

1. **Update the `Cargo.toml` workspace members**:

   ```toml
   [workspace]
   members = [
       "crates/common",
       "crates/new-service",  # Add this
   ]
   ```
2. **Update the optimized Dockerfiles** (both planner and builder stages):

   ```dockerfile
   # In the planner stage
   COPY crates/new-service/Cargo.toml ./crates/new-service/Cargo.toml
   RUN mkdir -p crates/new-service/src && echo "fn main() {}" > crates/new-service/src/main.rs

   # In the builder stage (copy the full crate source, not just the manifest)
   COPY crates/new-service/ ./crates/new-service/
   ```

3. **Test the build**:

   ```bash
   docker compose build new-service
   ```
### Updating Packs

Packs are mounted as volumes, so updating them doesn't require rebuilding service images:

1. **Update pack files** in `./packs/`:

   ```bash
   # Edit pack files
   vim packs/core/actions/my_action.yaml
   ```

2. **Rebuild pack binaries** (if needed):

   ```bash
   ./scripts/build-pack-binaries.sh
   ```

3. **Restart services** to pick up the changes:

   ```bash
   docker compose restart
   ```

No image rebuild required!
## Troubleshooting

### Build fails with "crate not found"

**Cause**: A crate manifest is missing from the COPY instructions
**Fix**: Add the crate's Cargo.toml to both the planner and builder stages

### Changes not reflected in the build

**Cause**: Docker is using stale cached layers
**Fix**: Force a rebuild with `docker compose build --no-cache <service>`

### "File exists" errors during parallel builds

**Cause**: Cache mount conflicts
**Fix**: Already handled by the optimized Dockerfiles: registry/git mounts use `sharing=shared` and target mounts use service-specific cache IDs

### Slow builds after dependency changes

**Cause**: Expected behavior - dependencies must be recompiled
**Fix**: This is normal; the optimization helps with code changes, not dependency changes
## Alternative Approaches

### cargo-chef (Not Used)

The `cargo-chef` tool provides a similar optimization but requires additional tooling:

- Pros: Automatic dependency detection, no manual manifest copying
- Cons: Extra dependency, learning curve, additional maintenance

We opted for the manual approach because:

- Simpler to understand and maintain
- No external dependencies
- Full control over the build process
- Easier to debug issues
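For comparison, a cargo-chef build follows its documented prepare/cook workflow. The sketch below reuses this project's `SERVICE` build-arg convention; stage names and the base image are illustrative:

```dockerfile
FROM rust:1-bookworm AS chef
RUN cargo install cargo-chef
WORKDIR /build

FROM chef AS planner
COPY . .
# Produce a recipe describing only the dependency graph
RUN cargo chef prepare --recipe-path recipe.json

FROM chef AS builder
ARG SERVICE
COPY --from=planner /build/recipe.json recipe.json
# Build dependencies only; cached until the dependency graph changes
RUN cargo chef cook --release --recipe-path recipe.json
COPY . .
RUN cargo build --release --bin attune-${SERVICE}
```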

### Volume Mounts for Development

For local development, consider mounting the source as a volume:

```yaml
volumes:
  - ./crates/api:/build/crates/api
```

- Pros: Instant code updates without rebuilds
- Cons: Not suitable for production images
## References

- [Docker Build Cache Documentation](https://docs.docker.com/build/cache/)
- [BuildKit Cache Mounts](https://docs.docker.com/build/guide/mounts/)
- [Rust Docker Best Practices](https://docs.docker.com/language/rust/build-images/)
- [cargo-chef](https://github.com/LukeMathWalker/cargo-chef)
## Summary

The optimized Docker build strategy significantly reduces build times by:

1. **Separating dependency resolution from source compilation**
2. **Copying only the specific crate being built** (plus common dependencies)
3. **Using BuildKit cache mounts** to persist compilation artifacts
4. **Mounting packs as volumes** instead of copying them into images

**Key Architecture Principles**:

- **Service images**: Contain only compiled binaries and configuration
- **Packs**: Mounted as volumes, updated independently of services
- **Pack binaries**: Built separately for GLIBC compatibility
- **Volume strategy**: The `init-packs` service populates the shared `packs_data` volume

**Result**:

- Incremental builds drop from 5-6 minutes to 30-60 seconds
- Pack updates don't require image rebuilds
- Service images are smaller and more focused
- Docker-based development workflows become practical for Rust workspaces