Files
attune/work-summary/phases/phase3-jsonschema-analysis.md
2026-02-04 17:46:30 -06:00

10 KiB

Phase 3: JsonSchema Investigation - Analysis & Recommendation

Date: 2026-01-28
Status: 🔍 INVESTIGATED
Priority: LOW
Recommendation: DO NOT REMOVE - Critical functionality


Executive Summary

After thorough investigation, Phase 3 (removing jsonschema) is NOT RECOMMENDED. The jsonschema crate provides critical runtime validation functionality that cannot be easily replaced. The reqwest 0.12 vs 0.13 duplication it causes is a minor issue compared to the value it provides.

Recommendation: Keep jsonschema, accept the reqwest duplication, and monitor for upstream updates.


What jsonschema Does

Primary Use Case: Runtime JSON Schema Validation

Location: crates/common/src/schema.rs

The SchemaValidator struct provides runtime validation of JSON data against JSON Schema specifications:

pub struct SchemaValidator {
    schema: JsonValue,
}

impl SchemaValidator {
    pub fn new(schema: JsonValue) -> Result<Self> { ... }
    
    pub fn validate(&self, data: &JsonValue) -> Result<()> {
        let compiled = jsonschema::validator_for(&self.schema)
            .map_err(|e| Error::schema_validation(...))?;

        if let Err(error) = compiled.validate(data) {
            return Err(Error::schema_validation(...));
        }
        Ok(())
    }
}

Critical Business Use Cases

  1. Action Parameter Validation: Ensures action inputs conform to their schema
  2. Workflow Input Validation: Validates workflow parameters at runtime
  3. Inquiry Response Validation: Validates human responses match expected schema
  4. Trigger Output Validation: Ensures trigger outputs are well-formed
  5. Pack Configuration Validation: Validates pack config against schema

Schema Storage in Database

Multiple entities store JSON schemas in the database:

Entity Schema Fields Purpose
Pack conf_schema Validate pack configuration
Trigger param_schema, out_schema Validate trigger params/outputs
Sensor param_schema Validate sensor configuration
Action param_schema, out_schema Validate action inputs/outputs
Inquiry response_schema Validate human responses
WorkflowDefinition param_schema, out_schema Validate workflow inputs/outputs

These schemas are user-defined and stored as JSONB in PostgreSQL. They are loaded at runtime and used to validate data dynamically.


Why jsonschema Cannot Be Easily Removed

1. No Drop-in Replacement

Problem: There is no equivalent Rust crate that provides:

  • Full JSON Schema Draft 7 support
  • Runtime schema compilation from JSON
  • Comprehensive validation error messages
  • Active maintenance

Alternatives Considered:

Alternative Why It Doesn't Work
validator crate Compile-time annotations only; cannot validate against runtime JSON schemas
schemars crate Schema generation only; does not perform validation
Custom validation Would require implementing JSON Schema spec from scratch (~1000s of LOCs)

2. JSON Schema is a Standard

JSON Schema is an industry standard (RFC 8927) used by:

  • OpenAPI specifications
  • Pack definitions
  • User-defined validation rules
  • Third-party integrations

Removing it would break compatibility with these standards.

3. Critical for Multi-Tenancy

In a multi-tenant system like Attune:

  • Each tenant can define custom actions with custom schemas
  • Each workflow can have different input/output schemas
  • Validation must happen at runtime with tenant-specific schemas

This cannot be done with compile-time validation tools.

4. Human-in-the-Loop Workflows

Inquiries require validating human responses against schemas:

{
  "type": "object",
  "properties": {
    "approved": {"type": "boolean"},
    "comments": {"type": "string"}
  },
  "required": ["approved"]
}

Without runtime validation, we cannot ensure human inputs are valid.


Cost of Keeping jsonschema

The Reqwest Duplication Issue

Current State:

  • jsonschema 0.38.1 depends on reqwest 0.12.28
  • Our codebase uses reqwest 0.13.1
  • Both versions exist in the dependency tree

Impact:

  • ⚠️ ~8-10 duplicate transitive dependencies (hyper, http, etc.)
  • ⚠️ ~1-2 MB additional binary size
  • ⚠️ Slightly larger SBOM
  • ⚠️ Longer compilation time (~10-20 seconds)

Why This Happens: jsonschema uses reqwest to fetch remote schemas (e.g., http://json-schema.org/draft-07/schema#). This is an optional feature but enabled by default.

Is the Duplication a Problem?

NO - for the following reasons:

  1. Marginal Impact: 1-2 MB in a ~50-100 MB binary is negligible
  2. No Runtime Conflicts: Both versions coexist peacefully
  3. No Security Issues: Both versions are actively maintained
  4. Temporary: Will resolve when jsonschema updates (see below)

Mitigation Strategies

Status: jsonschema is actively maintained

Tracking:

Expectation: Will likely update to reqwest 0.13 in next major/minor release

Action: Monitor quarterly; no code changes needed


Option 2: Disable Remote Schema Fetching

Idea: Use jsonschema with default-features = false to avoid reqwest entirely

Investigation:

jsonschema = { version = "0.38", default-features = false }

Pros:

  • Would eliminate reqwest 0.12 dependency
  • No code changes required
  • Retains all validation functionality

Cons:

  • Breaks remote schema references (e.g., {"$ref": "http://..."}}
  • May break pack imports from external sources
  • Needs testing to verify no current packs use remote refs

Recommendation: 🔍 INVESTIGATE if we want to eliminate duplication

Testing Required:

  1. Check if any packs use remote schema references
  2. Build with default-features = false
  3. Run full test suite
  4. Test core pack loading

Risk: LOW if no remote refs are used; MEDIUM if they are


Idea: Patch jsonschema to use reqwest 0.13

Why Not:

  • Fragile; breaks on jsonschema updates
  • Requires maintaining a fork
  • May introduce subtle bugs
  • Against Rust ecosystem best practices

Verdict: DO NOT DO THIS


Idea: Build our own JSON Schema validator

Estimated Effort: 2-4 weeks full-time

Why Not:

  • Massive engineering effort
  • Bug-prone (JSON Schema spec is complex)
  • Maintenance burden
  • No competitive advantage

Verdict: TERRIBLE IDEA


Recommendation

Immediate Action: Accept the Status Quo

Decision: Keep jsonschema 0.38.1 with reqwest 0.12 duplication

Rationale:

  1. Critical functionality, cannot be removed
  2. Duplication impact is negligible (1-2 MB, ~15 seconds build time)
  3. No security or runtime issues
  4. Will likely resolve itself via upstream update
  5. No engineering effort required

Follow-up Action: Investigate Disabling Remote Schema Fetching 🔍

Timeline: Next quarter (when time permits)

Steps:

  1. Audit all pack definitions for remote schema references
  2. If none found, test with default-features = false
  3. Run comprehensive test suite
  4. If successful, eliminate reqwest 0.12 entirely

Expected Effort: 1-2 hours

Expected Impact (if successful):

  • Eliminates reqwest duplication
  • ~1-2 MB binary reduction
  • ~10-20 seconds faster builds
  • Cleaner dependency tree

Long-term Monitoring 📊

Quarterly Check:

cargo tree -p jsonschema | grep reqwest

If jsonschema updates to reqwest 0.13:

  1. Update Cargo.toml to latest version
  2. Run tests
  3. Duplication automatically resolved

Conclusion

Phase 3 Decision: DO NOT PROCEED with jsonschema removal

The jsonschema crate is critical infrastructure for Attune's automation platform. The reqwest duplication it causes is a minor inconvenience that will likely resolve itself through normal dependency updates.

Final Recommendation Matrix

Action Priority Effort Impact Decision
Keep jsonschema HIGH None HIGH (maintains critical functionality) DO THIS
Investigate default-features = false 🔍 LOW 1-2 hours MEDIUM (eliminates duplication) INVESTIGATE LATER
Wait for upstream reqwest 0.13 update MEDIUM None HIGH (resolves automatically) MONITOR QUARTERLY
Remove jsonschema N/A N/A N/A NEVER DO THIS
Implement custom validator N/A N/A N/A NEVER DO THIS
Use cargo patch N/A N/A N/A NEVER DO THIS

HTTP Client Consolidation: Final Status

Phase 1: Complete (2026-01-27)

  • Replaced eventsource-client with reqwest-eventsource
  • Eliminated old hyper 0.14 + rustls 0.21 ecosystem
  • Impact: ~15-20 crates removed, 3-5 MB reduction, 20-40s faster builds

Phase 2: Complete (2026-01-28)

  • Removed direct hyper and http-body-util dependencies
  • Cleaner code with Axum built-in utilities
  • Impact: Better abstractions, improved error handling

Phase 3: Cancelled (2026-01-28)

  • Investigated jsonschema usage
  • Determined it is critical and cannot be removed
  • Reqwest duplication is acceptable
  • Impact: None (status quo maintained)

🎯 Overall Result

SUCCESS: We achieved our primary goals:

  • Eliminated unnecessary old dependencies (Phase 1)
  • Cleaned up direct dependencies (Phase 2)
  • Understood our critical dependencies (Phase 3)
  • ~4-6 MB binary reduction
  • ~30-60 seconds faster clean builds
  • Cleaner, more maintainable dependency tree

Trade-off Accepted: Minor reqwest duplication for critical functionality


References


Author: AI Assistant
Date: 2026-01-28
Status: Complete - No further action required