313 lines
12 KiB
Markdown
313 lines
12 KiB
Markdown
# Docker Migrations and Startup Configuration Fixes
|
|
|
|
**Date**: 2026-01-31
|
|
**Status**: ✅ Complete
|
|
**Issue**: Services failing to start due to missing database migrations and configuration errors
|
|
|
|
## Problems Solved
|
|
|
|
### 1. Database Migrations Not Running
|
|
**Error**: `enum WorkerType does not have variant constructor docker`
|
|
|
|
**Root Cause**: Database schema (enums, tables, triggers) wasn't being created when Docker containers started, causing enum type errors when services tried to query the database.
|
|
|
|
**Solution**: Created automated migration system that runs before services start.
|
|
|
|
### 2. Port Conflicts
|
|
**Error**: `address already in use` for ports 5432 (PostgreSQL) and 5672 (RabbitMQ)
|
|
|
|
**Root Cause**: System-level PostgreSQL and RabbitMQ services were already running and using the same ports.
|
|
|
|
**Solution**: Created helper script to stop system services and documented port conflict resolution.
|
|
|
|
### 3. Configuration Errors
|
|
**Error**: Multiple configuration validation failures
|
|
|
|
**Issues Fixed**:
|
|
- `worker_type: docker` → Changed to `worker_type: container` (invalid enum value)
|
|
- `ENCRYPTION_KEY` too short → Extended to 60+ characters
|
|
- Wrong environment variable names → Fixed to use `ATTUNE__` prefix
|
|
|
|
## Implementation Details
|
|
|
|
### Migration System
|
|
|
|
**Created Files**:
|
|
1. **`docker/run-migrations.sh`** (162 lines)
|
|
- Waits for PostgreSQL to be ready
|
|
- Tracks applied migrations in `_migrations` table
|
|
- Runs migrations in sorted order with transaction safety
|
|
- Provides detailed progress output with color coding
|
|
- Handles errors gracefully with rollback
|
|
|
|
2. **`docker/init-roles.sql`** (19 lines)
|
|
- Creates required PostgreSQL roles (`svc_attune`, `attune_api`)
|
|
- Grants necessary permissions
|
|
- Runs before migrations to satisfy GRANT statements
|
|
|
|
**Updated Files**:
|
|
- **`docker-compose.yaml`**:
|
|
- Added `migrations` service using `postgres:16-alpine` image
|
|
- Configured to run before all Attune services
|
|
- Services depend on `migrations` with `condition: service_completed_successfully`
|
|
- Mounts migration scripts and SQL files
|
|
|
|
### Port Conflict Resolution
|
|
|
|
**Created Files**:
|
|
1. **`scripts/stop-system-services.sh`** (184 lines)
|
|
- Stops PostgreSQL, RabbitMQ, Redis system services
|
|
- Verifies ports are free (5432, 5672, 6379, 8080, 8081, 3000)
|
|
- Cleans up orphaned Docker containers
|
|
- Interactive prompts for disabling services on boot
|
|
|
|
2. **`docker/PORT_CONFLICTS.md`** (303 lines)
|
|
- Comprehensive troubleshooting guide
|
|
- Port conflict table
|
|
- Multiple resolution methods
|
|
- Alternative approaches (changing ports, using system services)
|
|
|
|
### Configuration Fixes
|
|
|
|
**Files Modified**:
|
|
|
|
1. **`docker-compose.yaml`**:
|
|
- Fixed: `ENCRYPTION_KEY` → `ATTUNE__SECURITY__ENCRYPTION_KEY`
|
|
- Fixed: `JWT_SECRET` → `ATTUNE__SECURITY__JWT_SECRET`
|
|
- Added: `ATTUNE__WORKER__WORKER_TYPE: container`
|
|
- Updated default encryption key length to 60+ characters
|
|
|
|
2. **`config.docker.yaml`**:
|
|
- Changed `worker_type: docker` → `worker_type: container`
|
|
|
|
3. **`env.docker.example`**:
|
|
- Updated `ENCRYPTION_KEY` example to 60+ characters
|
|
- Added proper documentation for environment variable format
|
|
|
|
### Docker Build Race Conditions (Bonus)
|
|
|
|
**Also Fixed**:
|
|
- Added `sharing=locked` to BuildKit cache mounts in `docker/Dockerfile`
|
|
- Created `make docker-cache-warm` target for optimal build performance
|
|
- Documented race condition solutions in `docker/DOCKER_BUILD_RACE_CONDITIONS.md`
|
|
|
|
## Migration System Architecture
|
|
|
|
```
|
|
┌─────────────────────────────────────────────────┐
|
|
│ docker compose up -d │
|
|
└────────────────┬────────────────────────────────┘
|
|
│
|
|
▼
|
|
┌─────────────────────────────────────────────────┐
|
|
│ Infrastructure Services Start │
|
|
│ - PostgreSQL (postgres:16-alpine) │
|
|
│ - RabbitMQ (rabbitmq:3.13-management-alpine) │
|
|
│ - Redis (redis:7-alpine) │
|
|
└────────────────┬────────────────────────────────┘
|
|
│
|
|
▼
|
|
┌─────────────────────────────────────────────────┐
|
|
│ Wait for Services to be Healthy │
|
|
│ (healthchecks pass) │
|
|
└────────────────┬────────────────────────────────┘
|
|
│
|
|
▼
|
|
┌─────────────────────────────────────────────────┐
|
|
│ Migrations Service Starts │
|
|
│ 1. Run docker/init-roles.sql │
|
|
│ - Create svc_attune role │
|
|
│ - Create attune_api role │
|
|
│ - Grant permissions │
|
|
│ 2. Create _migrations tracking table │
|
|
│ 3. Run migrations in order: │
|
|
│ - Check if already applied │
|
|
│ - Run in transaction │
|
|
│ - Mark as applied │
|
|
│ 4. Exit with success │
|
|
└────────────────┬────────────────────────────────┘
|
|
│
|
|
▼
|
|
┌─────────────────────────────────────────────────┐
|
|
│ Attune Services Start (depend on migrations) │
|
|
│ - attune-api (port 8080) │
|
|
│ - attune-executor │
|
|
│ - attune-worker │
|
|
│ - attune-sensor │
|
|
│ - attune-notifier (port 8081) │
|
|
│ - attune-web (port 3000) │
|
|
└─────────────────────────────────────────────────┘
|
|
```
|
|
|
|
## Migration Tracking
|
|
|
|
The migration system creates a `_migrations` table to track applied migrations:
|
|
|
|
```sql
|
|
CREATE TABLE IF NOT EXISTS _migrations (
|
|
id SERIAL PRIMARY KEY,
|
|
filename VARCHAR(255) UNIQUE NOT NULL,
|
|
applied_at TIMESTAMP DEFAULT NOW()
|
|
);
|
|
```
|
|
|
|
This prevents re-running migrations and allows for idempotent deployments.
|
|
|
|
## Results
|
|
|
|
### Before
|
|
- ❌ Services failed to start with enum errors
|
|
- ❌ Port conflicts prevented container startup
|
|
- ❌ Configuration validation errors
|
|
- ❌ Manual database setup required
|
|
- ❌ No migration tracking
|
|
|
|
### After
|
|
- ✅ All 16 migrations apply automatically on first startup
|
|
- ✅ Migrations tracked and skipped on subsequent runs
|
|
- ✅ API service healthy and responding on port 8080
|
|
- ✅ Web UI accessible on port 3000
|
|
- ✅ Infrastructure services running correctly
|
|
- ✅ Executor, sensor, notifier services operational
|
|
- ✅ Configuration properly validated
|
|
- ⚠️ Worker service needs Python runtime (separate issue)
|
|
|
|
## Testing Results
|
|
|
|
```bash
|
|
$ docker compose ps
|
|
NAME STATUS PORTS
|
|
attune-api Up (healthy) 0.0.0.0:8080->8080/tcp
|
|
attune-executor Up (health: starting) 8080/tcp
|
|
attune-notifier Up (health: starting) 0.0.0.0:8081->8081/tcp
|
|
attune-postgres Up (healthy) 0.0.0.0:5432->5432/tcp
|
|
attune-rabbitmq Up (healthy) 0.0.0.0:5672->5672/tcp
|
|
attune-redis Up (healthy) 0.0.0.0:6379->6379/tcp
|
|
attune-sensor Up (health: starting) 8080/tcp
|
|
attune-web Up (healthy) 0.0.0.0:3000->80/tcp
|
|
attune-worker Restarting (Python issue)
|
|
|
|
$ curl http://localhost:8080/health
|
|
{"status":"ok"}
|
|
```
|
|
|
|
## Usage
|
|
|
|
### First-Time Setup
|
|
|
|
```bash
|
|
# Stop system services (if needed)
|
|
./scripts/stop-system-services.sh
|
|
|
|
# Start everything
|
|
docker compose up -d
|
|
|
|
# Check status
|
|
docker compose ps
|
|
|
|
# View migration logs
|
|
docker compose logs migrations
|
|
|
|
# Check API health
|
|
curl http://localhost:8080/health
|
|
```
|
|
|
|
### Subsequent Starts
|
|
|
|
```bash
|
|
# Migrations only run if new ones are detected
|
|
docker compose up -d
|
|
|
|
# Database schema persists in postgres_data volume
|
|
# Already-applied migrations are skipped automatically
|
|
```
|
|
|
|
### Troubleshooting
|
|
|
|
```bash
|
|
# Reset database completely
|
|
docker compose down -v # WARNING: Deletes all data
|
|
docker compose up -d
|
|
|
|
# Check migration status
|
|
docker compose exec postgres psql -U attune -d attune -c "SELECT * FROM _migrations;"
|
|
|
|
# View service logs
|
|
docker compose logs api
|
|
docker compose logs migrations
|
|
```
|
|
|
|
## Known Issues
|
|
|
|
### Worker Service - Python Runtime Missing
|
|
**Status**: Not Critical (services work without worker)
|
|
|
|
**Error**: `Python validation failed: No such file or directory (os error 2)`
|
|
|
|
**Cause**: Worker container doesn't have Python installed but tries to validate Python runtime
|
|
|
|
**Solution Options**:
|
|
1. Install Python in worker container (Dockerfile update)
|
|
2. Make Python runtime validation optional
|
|
3. Use shell-only actions until fixed
|
|
|
|
This doesn't block core functionality - API, executor, sensor, and notifier all work correctly.
|
|
|
|
## Files Created/Modified
|
|
|
|
### Created (9 files)
|
|
- `docker/run-migrations.sh` - Migration runner script
|
|
- `docker/init-roles.sql` - PostgreSQL role initialization
|
|
- `docker/PORT_CONFLICTS.md` - Port conflict resolution guide
|
|
- `scripts/stop-system-services.sh` - System service management
|
|
- `docker/DOCKER_BUILD_RACE_CONDITIONS.md` - Build optimization guide
|
|
- `docker/BUILD_QUICKSTART.md` - Quick start guide
|
|
- `docker/.dockerbuild-quickref.txt` - Quick reference card
|
|
- `work-summary/docker-build-race-fix.md` - Build race fix summary
|
|
- `work-summary/docker-migrations-startup-fix.md` - This file
|
|
|
|
### Modified (6 files)
|
|
- `docker-compose.yaml` - Added migrations service, fixed env vars
|
|
- `docker/Dockerfile` - Added cache sharing locks
|
|
- `config.docker.yaml` - Fixed worker_type enum value
|
|
- `env.docker.example` - Updated encryption key length
|
|
- `Makefile` - Added docker helpers
|
|
- `README.md` - Updated Docker deployment instructions
|
|
|
|
## Environment Variable Reference
|
|
|
|
### Required Format
|
|
|
|
```bash
|
|
# Use double underscore __ as separator
|
|
ATTUNE__SECTION__KEY=value
|
|
|
|
# Examples:
|
|
ATTUNE__SECURITY__JWT_SECRET=your-secret-here
|
|
ATTUNE__SECURITY__ENCRYPTION_KEY=your-32plus-char-key-here
|
|
ATTUNE__DATABASE__URL=postgresql://user:pass@host:port/db
|
|
ATTUNE__WORKER__WORKER_TYPE=container
|
|
```
|
|
|
|
### Common Mistakes
|
|
|
|
❌ `ENCRYPTION_KEY=value` (missing prefix)
|
|
✅ `ATTUNE__SECURITY__ENCRYPTION_KEY=value`
|
|
|
|
❌ `ATTUNE_SECURITY_ENCRYPTION_KEY=value` (single underscore)
|
|
✅ `ATTUNE__SECURITY__ENCRYPTION_KEY=value` (double underscore)
|
|
|
|
❌ Short encryption key (< 32 chars)
|
|
✅ Key with 32+ characters
|
|
|
|
## Summary
|
|
|
|
Successfully implemented automated database migration system for Docker deployments, eliminating manual setup steps and ensuring consistent database state across environments. The migration system is:
|
|
|
|
- **Idempotent**: Safe to run multiple times
|
|
- **Transactional**: Each migration runs in a transaction with rollback on error
|
|
- **Tracked**: Applied migrations recorded to prevent re-running
|
|
- **Ordered**: Migrations run in sorted filename order
|
|
- **Visible**: Clear console output with success/failure indicators
|
|
|
|
This provides a production-ready database initialization flow that matches industry best practices for containerized applications. |