# Log Size Limits

## Overview

The log size limits feature prevents out-of-memory (OOM) failures when actions produce large amounts of output. Instead of buffering all stdout/stderr in memory, the worker service streams logs with configurable size limits and appends a truncation notice when a limit is exceeded.

## Configuration

Log size limits are set in the worker configuration:

```yaml
worker:
  max_stdout_bytes: 10485760  # 10MB (default)
  max_stderr_bytes: 10485760  # 10MB (default)
  stream_logs: true           # Enable log streaming (default)
```

Or via environment variables:

```bash
ATTUNE__WORKER__MAX_STDOUT_BYTES=10485760
ATTUNE__WORKER__MAX_STDERR_BYTES=10485760
ATTUNE__WORKER__STREAM_LOGS=true
```
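The numeric values above are raw byte counts (10485760 = 10 × 1024 × 1024). A tiny hypothetical helper, not part of the worker, for computing them:

```python
# Hypothetical helper (not part of the worker): compute byte values for the config.
def mb(n: int) -> int:
    """Convert megabytes to bytes (1 MB here = 1024 * 1024 bytes)."""
    return n * 1024 * 1024

print(mb(10))  # prints: 10485760, the 10MB default above
```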

## How It Works

### 1. Streaming Architecture

Instead of using `wait_with_output()`, which buffers all output in memory, the worker:

1. Spawns the process with piped stdout/stderr
2. Creates a `BoundedLogWriter` instance for each stream
3. Reads output line-by-line, concurrently for both streams
4. Writes to the bounded writers, which enforce the size limits
5. Waits for process completion while streaming continues
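The five steps above can be sketched in Python with `asyncio` (illustrative only — the worker itself is written in Rust, and all names here are invented for the sketch):

```python
import asyncio
import sys

async def read_stream(stream, sink):
    # Read line-by-line so truncation can happen on clean line boundaries.
    while True:
        line = await stream.readline()
        if not line:
            break
        sink.append(line)

async def run(cmd):
    # 1. Spawn with piped stdout/stderr.
    proc = await asyncio.create_subprocess_exec(
        *cmd,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    out, err = [], []
    # 2-4. Drain both pipes concurrently (a bounded writer would sit behind `sink`).
    await asyncio.gather(
        read_stream(proc.stdout, out),
        read_stream(proc.stderr, err),
    )
    # 5. Wait for completion after the pipes are drained.
    code = await proc.wait()
    return code, b"".join(out), b"".join(err)

code, out, err = asyncio.run(run([sys.executable, "-c", "print('hi')"]))
print(code, out.decode().strip())
```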

### 2. Truncation Behavior

When output exceeds the configured limit:

1. The writer stops accepting new data once it reaches the effective limit (the configured limit minus a 128-byte reserve)
2. A truncation notice is appended to the log
3. Additional output is counted but discarded
4. The execution result includes truncation metadata

**Truncation Notices:**
- **stdout**: `[OUTPUT TRUNCATED: stdout exceeded size limit]`
- **stderr**: `[OUTPUT TRUNCATED: stderr exceeded size limit]`
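These rules can be modeled in a few lines of Python (an illustrative stand-in for `BoundedLogWriter`, not the actual implementation; the 128-byte reserve and notice text are taken from this document):

```python
TRUNCATION_NOTICE = "\n[OUTPUT TRUNCATED: stdout exceeded size limit]\n"
RESERVE = 128  # bytes held back for the truncation notice

class BoundedBuffer:
    """Illustrative sketch of the truncation rules described above."""

    def __init__(self, limit: int):
        self.effective_limit = limit - RESERVE
        self.data = ""
        self.truncated = False
        self.bytes_truncated = 0

    def write_line(self, line: str) -> None:
        if self.truncated or len(self.data) + len(line) > self.effective_limit:
            if not self.truncated:
                # Append the notice once, then stop accepting data.
                self.data += TRUNCATION_NOTICE
                self.truncated = True
            # Additional output is counted but discarded.
            self.bytes_truncated += len(line)
        else:
            self.data += line

buf = BoundedBuffer(limit=256)
for i in range(100):
    buf.write_line(f"line {i}\n")
print(buf.truncated, buf.bytes_truncated)
```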

### 3. Execution Result Metadata

The `ExecutionResult` struct includes truncation information:

```rust
pub struct ExecutionResult {
    pub stdout: String,
    pub stderr: String,
    // ... other fields ...

    // Truncation metadata
    pub stdout_truncated: bool,
    pub stderr_truncated: bool,
    pub stdout_bytes_truncated: usize,
    pub stderr_bytes_truncated: usize,
}
```

**Example:**
```json
{
  "stdout": "Line 1\nLine 2\n...\nLine 100\n\n[OUTPUT TRUNCATED: stdout exceeded size limit]\n",
  "stderr": "",
  "stdout_truncated": true,
  "stderr_truncated": false,
  "stdout_bytes_truncated": 950000,
  "exit_code": 0
}
```
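Consumers can branch on this metadata directly. A minimal sketch of reading the JSON shape above (the payload is the example from this document, abbreviated):

```python
import json

# Result payload in the shape shown above (stdout abbreviated).
raw = """{
  "stdout": "...",
  "stderr": "",
  "stdout_truncated": true,
  "stderr_truncated": false,
  "stdout_bytes_truncated": 950000,
  "exit_code": 0
}"""

result = json.loads(raw)
if result["stdout_truncated"]:
    print(f"stdout lost {result['stdout_bytes_truncated']} bytes")
```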

## Implementation Details

### BoundedLogWriter

The core component is `BoundedLogWriter`, which implements `AsyncWrite`:

- **Reserve Space**: Reserves 128 bytes for the truncation notice
- **Line-by-Line Reading**: Reads output line-by-line to ensure clean truncation boundaries
- **No Backpressure**: Always reports successful writes to avoid blocking the child process
- **Concurrent Streaming**: stdout and stderr are streamed concurrently using `tokio::join!`

### Runtime Integration

All runtimes (Python, Shell, Local) use the streaming approach:

1. **Python Runtime**: an `execute_with_streaming()` method handles both `-c` and file execution
2. **Shell Runtime**: an `execute_with_streaming()` method handles both `-c` and file execution
3. **Local Runtime**: delegates to the Python/Shell runtimes, inheriting their streaming behavior

### Memory Safety

Without log size limits:
- An action emitting 1GB of output → the worker buffers 1GB+ of memory
- 10 concurrent large actions → 10GB+ of memory → OOM

With log size limits (10MB default):
- An action emitting 1GB of output → the worker uses ~10MB per action
- 10 concurrent large actions → ~100MB of memory
- Memory usage is bounded and predictable
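The bound above is simple arithmetic; as a sanity check (counting stdout buffers only — stderr has its own, separately bounded limit):

```python
# Back-of-envelope memory bound, using the numbers from the comparison above.
limit_mb = 10            # per-stream limit (10MB default)
concurrent_actions = 10  # actions running at once

# Worst case: every action fills its stdout buffer to the limit.
bound = limit_mb * concurrent_actions
print(bound, "MB upper bound for stdout buffers")
```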

## Examples

### Action with Large Output

**Action:**
```python
# outputs roughly 100MB
for i in range(1000000):
    print(f"Line {i}: " + "x" * 100)
```

**Result (with 10MB limit):**
```json
{
  "exit_code": 0,
  "stdout": "[first 10MB of output]\n\n[OUTPUT TRUNCATED: stdout exceeded size limit]\n",
  "stdout_truncated": true,
  "stdout_bytes_truncated": 90000000,
  "duration_ms": 1234
}
```

### Action with Large stderr

**Action:**
```python
import sys

# outputs roughly 50MB to stderr
for i in range(500000):
    sys.stderr.write(f"Warning {i}\n")
```

**Result (with 10MB limit):**
```json
{
  "exit_code": 0,
  "stdout": "",
  "stderr": "[first 10MB of warnings]\n\n[OUTPUT TRUNCATED: stderr exceeded size limit]\n",
  "stderr_truncated": true,
  "stderr_bytes_truncated": 40000000,
  "duration_ms": 2345
}
```

### No Truncation (Under Limit)

**Action:**
```python
print("Hello, World!")
```

**Result:**
```json
{
  "exit_code": 0,
  "stdout": "Hello, World!\n",
  "stderr": "",
  "stdout_truncated": false,
  "stderr_truncated": false,
  "stdout_bytes_truncated": 0,
  "stderr_bytes_truncated": 0,
  "duration_ms": 45
}
```

## API Access

### Execution Result

When retrieving execution results via the API, the truncation metadata is included:

```bash
curl http://localhost:8080/api/v1/executions/123
```

**Response:**
```json
{
  "data": {
    "id": 123,
    "status": "succeeded",
    "result": {
      "stdout": "...[OUTPUT TRUNCATED]...",
      "stderr": "",
      "exit_code": 0
    },
    "stdout_truncated": true,
    "stderr_truncated": false,
    "stdout_bytes_truncated": 1500000
  }
}
```

## Best Practices

### 1. Configure Appropriate Limits

Choose limits based on your use case:

- **Small actions** (< 1MB output): use the default 10MB limit
- **Data processing** (moderate output): consider 50-100MB
- **Log analysis** (large output): consider 100-500MB
- **Never** set the limits to unlimited; doing so reintroduces the OOM risk
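For example, a data-processing workload might raise only the stdout limit (the values here are suggestions, not product defaults):

```yaml
worker:
  max_stdout_bytes: 52428800   # 50MB for data-processing actions
  max_stderr_bytes: 10485760   # keep the 10MB default for stderr
```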

### 2. Design Actions for Limited Logs

Instead of printing all data:

```python
# BAD: prints the entire dataset
for item in large_dataset:
    print(item)
```

Use structured output:

```python
# GOOD: print a summary, store the data elsewhere
print(f"Processed {len(large_dataset)} items")
print(f"Results saved to: {output_file}")
```

### 3. Monitor Truncation

Track truncation events:
- Alert if many executions are truncated
- Frequent truncation may indicate that actions need refactoring, or that limits need adjustment
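A monitoring check along these lines might look like the following (hypothetical: `executions` stands in for results fetched from the API shown earlier, and the 25% alert threshold is arbitrary):

```python
# Hypothetical monitoring check: alert when the truncation rate is high.
executions = [
    {"stdout_truncated": True},
    {"stdout_truncated": False},
    {"stdout_truncated": True},
    {"stdout_truncated": False},
]

truncated = sum(1 for e in executions if e["stdout_truncated"])
rate = truncated / len(executions)
print(f"truncation rate: {rate:.0%}")
if rate > 0.25:
    print("ALERT: consider refactoring actions or raising limits")
```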

### 4. Use Artifacts for Large Data

For large outputs, use artifacts:

```python
import json

# Write large data to an artifact
with open('/tmp/results.json', 'w') as f:
    json.dump(large_results, f)

# Print only a summary
print(f"Results written: {len(large_results)} items")
```

## Performance Impact

### Before (Buffered Output)

- **Memory**: O(output_size) per execution
- **Risk**: OOM on large output
- **Speed**: fast (no streaming overhead)

### After (Streaming with Limits)

- **Memory**: O(limit_size) per execution, bounded
- **Risk**: no OOM; predictable memory usage
- **Speed**: minimal overhead (~1-2% for line-by-line reading)
- **Safety**: production-ready

## Testing

Test log truncation in your actions:

```python
def test_truncation():
    # Output ~20MB (exceeds the 10MB limit)
    for _ in range(200000):
        print("x" * 100)

    # This line won't appear in the output if truncated
    print("END")

    # But the execution still completes successfully
    return {"status": "success"}
```

Check for truncation in the result:
```python
if result.stdout_truncated:
    print(f"Output was truncated by {result.stdout_bytes_truncated} bytes")
```

## Troubleshooting

### Issue: Important output is truncated

**Solution**: Refactor the action to:
1. Print only essential information
2. Store detailed data in artifacts
3. Use structured logging

### Issue: Need to see all output for debugging

**Solution**: Temporarily increase the limits:
```yaml
worker:
  max_stdout_bytes: 104857600  # 100MB for debugging
```

### Issue: Memory usage still high

**Check**:
1. Are the limits configured correctly?
2. Are multiple workers running with high concurrency?
3. Are artifacts consuming memory?

## Limitations

1. **Line Boundaries**: Truncation happens at line boundaries, so the last line accepted before truncation is always included in full
2. **Binary Output**: Only text output is supported; binary output may be corrupted
3. **Reserve Space**: The 128 bytes reserved for the truncation notice reduce the effective limit
4. **No Rotation**: Logs don't rotate; truncation is permanent

## Future Enhancements

Potential improvements:

1. **Log Rotation**: Rotate logs to files instead of truncating
2. **Compressed Storage**: Store truncated logs compressed
3. **Streaming API**: Stream logs in real time via WebSocket
4. **Per-Action Limits**: Configure limits per action
5. **Smart Truncation**: Preserve the first N bytes and the last M bytes

## Related Features

- **Artifacts**: Store large output as artifacts instead of logs
- **Timeouts**: Prevent runaway processes (separate from log limits)
- **Resource Limits**: CPU/memory limits for actions (future)

## See Also

- [Worker Configuration](worker-configuration.md)
- [Runtime Architecture](runtime-architecture.md)
- [Performance Tuning](performance-tuning.md)