🇪🇺 EU AI Act · ART15-ROBUSTNESS · Rule: EUAIA-15-002 · Severity: high

Robustness against errors and faults

Description

Article 15(3) requires that high-risk AI systems be resilient to errors, faults, and inconsistencies, and that technical redundancy and fail-safe measures are implemented.

⚠️ Risk Impact

AI systems fail in surprising ways under degraded conditions. Without redundancy and fail-safe measures, a transient infrastructure issue (a rate-limited API, a GPU out-of-memory error, a network partition) can cascade into a customer-facing AI failure.

🔍 How EchelonGraph Detects This

EUAIA-15-002 · Automated scanner rule

EchelonGraph's Tier 1 Cloud Scanner automatically checks for this condition across all connected cloud accounts. Violations are flagged as high-severity findings with remediation guidance.

🔧 Remediation

Implement: (1) decline-to-answer thresholds when confidence is low, (2) circuit breakers for downstream dependencies, (3) graceful degradation to deterministic fallbacks, (4) chaos-engineering tests covering AI-specific failure modes.
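A minimal sketch of items (2) and (3): a circuit breaker that trips after repeated failures of a primary dependency and routes requests to a deterministic fallback while open. All names here (the breaker class, the flight-quote functions) are illustrative, not part of any EchelonGraph API.

```python
import time

class CircuitBreaker:
    """Trips to 'open' after max_failures consecutive errors; while open,
    calls go straight to the deterministic fallback until reset_after elapses."""

    def __init__(self, max_failures=3, reset_after=30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None  # None means the breaker is closed

    def call(self, primary, fallback):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                return fallback()          # open: degrade gracefully
            self.opened_at = None          # half-open: try primary again
            self.failures = 0
        try:
            result = primary()
            self.failures = 0
            return result
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            return fallback()

# Illustrative use: a failing live quote vs. an explicit "unavailable" answer.
breaker = CircuitBreaker(max_failures=2)

def live_quote():
    raise TimeoutError("flight API rate-limited")

def safe_fallback():
    return "Live availability is unavailable right now; please retry shortly."

for _ in range(3):
    answer = breaker.call(live_quote, safe_fallback)
```

The key property for Article 15(3) is that the degraded path returns an honest answer rather than a confident one: the fallback states that live data is unavailable instead of quoting stale inventory.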

💀 Real-World Attack Scenario

A travel-booking AI relied on a live flight-data API. When the API rate-limited it, the AI silently fell back to cached flight data and confidently quoted unavailable flights to customers for four hours. Refunds and customer-service costs came to €380K. This was an Article 15(3) violation: the system had no documented fail-safe for upstream API failure.

💰 Cost of Non-Compliance

Article 15(3) robustness gap: fines of up to €15M or 3% of global annual turnover, whichever is higher. Customer-trust impact of confident-but-wrong AI: an 11% churn spike in measured incidents (Edelman 2024).

📋 Audit Questions

  1. What happens when your top AI system loses access to its primary data source?
  2. What is the decline-to-answer threshold? How was it set?
  3. When was the last chaos-engineering test on AI infrastructure?
  4. Show me a circuit breaker that fired in production in the last 30 days.
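Question 2 has a defensible answer only when the threshold is derived from data rather than picked by fiat. A sketch of one way to set it, assuming you have a validation set of (confidence, was_correct) pairs: choose the lowest confidence at which answered predictions still meet a target precision. The data below is illustrative.

```python
def pick_threshold(scored, target_precision=0.95):
    """scored: list of (confidence, was_correct) pairs from a validation set.
    Returns the lowest confidence cutoff whose answered subset meets the
    precision target, or None if no cutoff does (decline everything)."""
    for t in sorted({c for c, _ in scored}):
        answered = [ok for c, ok in scored if c >= t]
        if answered and sum(answered) / len(answered) >= target_precision:
            return t
    return None

# Hypothetical validation results: below 0.62, answers start being wrong.
validation = [(0.30, False), (0.55, False), (0.62, True),
              (0.71, True), (0.83, True), (0.91, True), (0.97, True)]
threshold = pick_threshold(validation, target_precision=0.95)  # -> 0.62
```

This also answers the "how was it set" half of the audit question: the threshold is reproducible from the validation set and the stated precision target, and it shifts when either changes.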

🎯 MITRE ATT&CK Mapping

T1499 — Endpoint Denial of Service

🏗️ Infrastructure as Code Fix

main.tf
# Alert when any AI-dependency circuit breaker opens. Assumes a gauge
# ai_circuit_breaker_state exported with value 1 while the breaker is open;
# the prometheus_alert_rule resource type depends on the provider in use.
resource "prometheus_alert_rule" "ai_circuit_breaker_open" {
  name = "ai_dependency_circuit_breaker_open"
  # max_over_time catches a breaker open at any point in the window;
  # rate() on a constant gauge would stay at zero and never fire.
  expr = "max_over_time(ai_circuit_breaker_state{state=\"open\"}[5m]) > 0"
  for  = "5m"
  labels = { severity = "page" }
  annotations = { summary = "AI dependency circuit breaker opened; check dependency health" }
}

⚡ Common Pitfalls

  • Caching responses for resilience without invalidating on data-source changes
  • Setting confidence-threshold decline at 0.5 universally — too aggressive for some use cases, too lenient for others
  • Not testing the fallback path — it works in dev but breaks in prod under load
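The first pitfall is exactly what caused the €380K incident above: a cache that serves indefinitely is a fail-unsafe. A sketch of a staleness-aware fallback cache, under the assumption that degraded answers must either carry an age label or be declined (all class and field names are illustrative):

```python
import time

class StalenessAwareCache:
    """Cached fallback that refuses to serve data older than max_age_s,
    so the degraded path is labeled stale or declined, never confidently wrong."""

    def __init__(self, max_age_s=300):
        self.max_age_s = max_age_s
        self.value = None
        self.stored_at = None

    def put(self, value):
        self.value = value
        self.stored_at = time.monotonic()

    def get_fallback(self):
        """Returns (value, note); value is None when we should decline."""
        if self.stored_at is None:
            return None, "no cached data; decline to answer"
        age = time.monotonic() - self.stored_at
        if age > self.max_age_s:
            return None, "cached data too stale; decline to answer"
        return self.value, f"cached data ({age:.0f}s old); flag as possibly outdated"

# Illustrative use with hypothetical flight data:
cache = StalenessAwareCache(max_age_s=300)
cache.put({"flight": "BA117", "seats": 4})
value, note = cache.get_fallback()
```

Invalidation on data-source changes still matters (the max-age bound is a backstop, not a substitute), but even this backstop alone would have turned the four-hour silent failure into a labeled degradation.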

📈 Business Value

Robust AI systems retain customer trust through infra incidents. The difference between 'confidently wrong' and 'gracefully degraded' is the difference between a viral screenshot and a customer-success conversation.

⏱️ Effort Estimate

Manual

3-4 weeks per system for circuit breakers + chaos tests

With EchelonGraph

EchelonGraph ships AI-specific chaos scenarios + circuit-breaker patterns for KServe/Ray/Seldon

🔗 Cross-Framework References

AIRMF-MEASURE-2.5 · NIST_CSF-PR.IR-03

Automate EU AI Act ART15-ROBUSTNESS compliance

EchelonGraph continuously monitors this control across all your cloud accounts.

Start Free →