Log Size Limits
Overview
The log size limits feature prevents Out-of-Memory (OOM) issues when actions produce large amounts of output. Instead of buffering all stdout/stderr in memory, the worker service streams logs with configurable size limits and adds truncation notices when limits are exceeded.
Configuration
Log size limits are configured in the worker configuration:
worker:
  max_stdout_bytes: 10485760  # 10MB (default)
  max_stderr_bytes: 10485760  # 10MB (default)
  stream_logs: true           # Enable log streaming (default)
Or via environment variables:
ATTUNE__WORKER__MAX_STDOUT_BYTES=10485760
ATTUNE__WORKER__MAX_STDERR_BYTES=10485760
ATTUNE__WORKER__STREAM_LOGS=true
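As a sketch of the precedence these settings imply (environment variable over config-file value over default), here is a hypothetical resolver in Python; resolve_limit and its signature are illustrative, not the worker's actual loader.

```python
import os

DEFAULT_LIMIT = 10485760  # 10MB default, per the configuration above

def resolve_limit(env_var, config_value=None):
    """Environment variable overrides config value, which overrides the default."""
    raw = os.environ.get(env_var)
    if raw is not None:
        return int(raw)
    if config_value is not None:
        return config_value
    return DEFAULT_LIMIT

print(resolve_limit("ATTUNE__WORKER__MAX_STDOUT_BYTES"))
```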
How It Works
1. Streaming Architecture
Instead of using wait_with_output(), which buffers all output in memory, the worker:
- Spawns the process with piped stdout/stderr
- Creates BoundedLogWriter instances for each stream
- Reads output line-by-line concurrently
- Writes to bounded writers that enforce size limits
- Waits for process completion while streaming continues
2. Truncation Behavior
When output exceeds the configured limit:
- The writer stops accepting new data after reaching the effective limit (the configured limit minus a 128-byte reserve)
- A truncation notice is appended to the log
- Additional output is counted but discarded
- The execution result includes truncation metadata
Truncation Notices:
- stdout: [OUTPUT TRUNCATED: stdout exceeded size limit]
- stderr: [OUTPUT TRUNCATED: stderr exceeded size limit]
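The behavior above can be sketched in a few lines. This is illustrative Python, not the worker's actual (Rust) implementation; BoundedBuffer is a hypothetical name.

```python
NOTICE_RESERVE = 128  # bytes reserved for the truncation notice

class BoundedBuffer:
    """Illustrative stand-in for the worker's bounded writer."""

    def __init__(self, limit, stream_name):
        self.effective_limit = limit - NOTICE_RESERVE
        self.notice = f"\n[OUTPUT TRUNCATED: {stream_name} exceeded size limit]\n"
        self.data = bytearray()
        self.truncated = False
        self.bytes_truncated = 0

    def write_line(self, line):
        if self.truncated:
            # Additional output is counted but discarded.
            self.bytes_truncated += len(line)
            return
        if len(self.data) + len(line) > self.effective_limit:
            # Stop accepting data and append the notice exactly once;
            # the 128-byte reserve guarantees the notice always fits.
            self.truncated = True
            self.bytes_truncated += len(line)
            self.data += self.notice.encode()
        else:
            self.data += line

buf = BoundedBuffer(limit=1024, stream_name="stdout")
for i in range(100):
    buf.write_line(f"Line {i}: ".encode() + b"x" * 100 + b"\n")
print(buf.truncated, len(buf.data), buf.bytes_truncated)
```

Note that writes past the limit are counted but never stored, which is what makes the memory bound hold regardless of how much the process prints.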
3. Execution Result Metadata
The ExecutionResult struct includes truncation information:
pub struct ExecutionResult {
    pub stdout: String,
    pub stderr: String,
    // ... other fields ...

    // Truncation metadata
    pub stdout_truncated: bool,
    pub stderr_truncated: bool,
    pub stdout_bytes_truncated: usize,
    pub stderr_bytes_truncated: usize,
}
Example:
{
  "stdout": "Line 1\nLine 2\n...\nLine 100\n\n[OUTPUT TRUNCATED: stdout exceeded size limit]\n",
  "stderr": "",
  "stdout_truncated": true,
  "stderr_truncated": false,
  "stdout_bytes_truncated": 950000,
  "exit_code": 0
}
Implementation Details
BoundedLogWriter
The core component is BoundedLogWriter, which implements AsyncWrite:
- Reserve Space: Reserves 128 bytes for the truncation notice
- Line-by-Line Reading: Reads output line-by-line to ensure clean truncation boundaries
- No Backpressure: Always reports successful writes to avoid blocking the process
- Concurrent Streaming: stdout and stderr are streamed concurrently using tokio::join!
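The same pattern can be sketched with Python's asyncio standing in for the Rust/tokio implementation: spawn the process with piped streams and drain both pipes concurrently so neither fills up and blocks the child. All names here are illustrative.

```python
import asyncio
import sys

async def drain(stream, sink):
    # Read line-by-line; a real bounded writer would enforce the size limit here.
    while True:
        line = await stream.readline()
        if not line:
            break
        sink.append(line)

async def run(cmd):
    proc = await asyncio.create_subprocess_exec(
        *cmd,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    out, err = [], []
    # Analogue of tokio::join!: drain both pipes concurrently
    # while the child is still running.
    await asyncio.gather(drain(proc.stdout, out), drain(proc.stderr, err))
    code = await proc.wait()
    return code, b"".join(out), b"".join(err)

script = "import sys; print('to stdout'); print('to stderr', file=sys.stderr)"
code, out, err = asyncio.run(run([sys.executable, "-c", script]))
print(code, out, err)
```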
Runtime Integration
All runtimes (Python, Shell, Local) use the streaming approach:
- Python Runtime: execute_with_streaming() method handles both -c and file execution
- Shell Runtime: execute_with_streaming() method handles both -c and file execution
- Local Runtime: Delegates to Python/Shell, inheriting streaming behavior
Memory Safety
Without log size limits:
- Action outputting 1GB → Worker uses 1GB+ memory
- 10 concurrent large actions → 10GB+ memory usage → OOM
With log size limits (10MB default):
- Action outputting 1GB → Worker uses ~10MB per action
- 10 concurrent large actions → ~100MB memory usage
- Safe and predictable memory usage
Examples
Action with Large Output
Action:
# outputs ~100MB
for i in range(1000000):
    print(f"Line {i}: " + "x" * 100)
Result (with 10MB limit):
{
  "exit_code": 0,
  "stdout": "[first 10MB of output]\n\n[OUTPUT TRUNCATED: stdout exceeded size limit]\n",
  "stdout_truncated": true,
  "stdout_bytes_truncated": 90000000,
  "duration_ms": 1234
}
Action with Large stderr
Action:
import sys
# outputs ~50MB to stderr
for i in range(500000):
    sys.stderr.write(f"Warning {i}: " + "x" * 90 + "\n")
Result (with 10MB limit):
{
  "exit_code": 0,
  "stdout": "",
  "stderr": "[first 10MB of warnings]\n\n[OUTPUT TRUNCATED: stderr exceeded size limit]\n",
  "stderr_truncated": true,
  "stderr_bytes_truncated": 40000000,
  "duration_ms": 2345
}
No Truncation (Under Limit)
Action:
print("Hello, World!")
Result:
{
  "exit_code": 0,
  "stdout": "Hello, World!\n",
  "stderr": "",
  "stdout_truncated": false,
  "stderr_truncated": false,
  "stdout_bytes_truncated": 0,
  "stderr_bytes_truncated": 0,
  "duration_ms": 45
}
API Access
Execution Result
When retrieving execution results via the API, truncation metadata is included:
curl http://localhost:8080/api/v1/executions/123
Response:
{
  "data": {
    "id": 123,
    "status": "succeeded",
    "result": {
      "stdout": "...[OUTPUT TRUNCATED]...",
      "stderr": "",
      "exit_code": 0
    },
    "stdout_truncated": true,
    "stderr_truncated": false,
    "stdout_bytes_truncated": 1500000
  }
}
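Clients can read this metadata straight out of the response body. A minimal sketch, assuming the field names shown above:

```python
import json

def truncation_summary(payload):
    """Extract truncation metadata from an execution-result payload."""
    data = json.loads(payload)["data"]
    return {
        "stdout_truncated": data.get("stdout_truncated", False),
        "stderr_truncated": data.get("stderr_truncated", False),
        "stdout_bytes_truncated": data.get("stdout_bytes_truncated", 0),
    }

sample = '''{"data": {"id": 123, "status": "succeeded",
  "stdout_truncated": true, "stderr_truncated": false,
  "stdout_bytes_truncated": 1500000}}'''
print(truncation_summary(sample))
```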
Best Practices
1. Configure Appropriate Limits
Choose limits based on your use case:
- Small actions (< 1MB output): Use default 10MB limit
- Data processing (moderate output): Consider 50-100MB
- Log analysis (large output): Consider 100-500MB
- Never: Set to unlimited (risks OOM)
2. Design Actions for Limited Logs
Instead of printing all data:
# BAD: Prints entire dataset
for item in large_dataset:
    print(item)
Use structured output:
# GOOD: Print summary, store data elsewhere
print(f"Processed {len(large_dataset)} items")
print(f"Results saved to: {output_file}")
3. Monitor Truncation
Track truncation events:
- Alert if many executions are truncated
- May indicate actions need refactoring
- Or limits need adjustment
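A minimal sketch of such a check over recent execution results, with an illustrative alert threshold:

```python
def truncation_rate(results):
    """Fraction of executions whose stdout or stderr was truncated."""
    if not results:
        return 0.0
    truncated = sum(
        1 for r in results
        if r.get("stdout_truncated") or r.get("stderr_truncated")
    )
    return truncated / len(results)

recent = [
    {"stdout_truncated": True},
    {"stdout_truncated": False},
    {"stderr_truncated": True},
    {"stdout_truncated": False},
]
rate = truncation_rate(recent)
if rate > 0.25:  # alert threshold: tune for your workload
    print(f"ALERT: {rate:.0%} of recent executions were truncated")
```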
4. Use Artifacts for Large Data
For large outputs, use artifacts:
import json
# Write large data to artifact
with open('/tmp/results.json', 'w') as f:
    json.dump(large_results, f)
# Print only summary
print(f"Results written: {len(large_results)} items")
Performance Impact
Before (Buffered Output)
- Memory: O(output_size) per execution
- Risk: OOM on large output
- Speed: Fast (no streaming overhead)
After (Streaming with Limits)
- Memory: O(limit_size) per execution, bounded
- Risk: No OOM, predictable memory usage
- Speed: Minimal overhead (~1-2% for line-by-line reading)
- Safety: Production-ready
Testing
Test log truncation in your actions:
def test_truncation():
    # Output ~20MB (exceeds the 10MB limit)
    for i in range(200000):
        print("x" * 100)
    # This line won't appear in the output if truncated,
    # but the execution still completes successfully
    print("END")
    return {"status": "success"}
Check truncation in result:
if result.stdout_truncated:
    print(f"Output was truncated by {result.stdout_bytes_truncated} bytes")
Troubleshooting
Issue: Important output is truncated
Solution: Refactor action to:
- Print only essential information
- Store detailed data in artifacts
- Use structured logging
Issue: Need to see all output for debugging
Solution: Temporarily increase limits:
worker:
  max_stdout_bytes: 104857600  # 100MB for debugging
Issue: Memory usage still high
Check:
- Are limits configured correctly?
- Are multiple workers running with high concurrency?
- Are artifacts consuming memory?
Limitations
- Line Boundaries: Truncation happens at line boundaries, so the last line before truncation is included completely
- Binary Output: Only text output is supported; binary output may be corrupted
- Reserve Space: 128 bytes reserved for truncation notice reduces effective limit
- No Rotation: Logs don't rotate; truncation is permanent
Future Enhancements
Potential improvements:
- Log Rotation: Rotate logs to files instead of truncation
- Compressed Storage: Store truncated logs compressed
- Streaming API: Stream logs in real-time via WebSocket
- Per-Action Limits: Configure limits per action
- Smart Truncation: Preserve first N bytes and last M bytes
Related Features
- Artifacts: Store large output as artifacts instead of logs
- Timeouts: Prevent runaway processes (separate from log limits)
- Resource Limits: CPU/memory limits for actions (future)