🎯 MITRE ATLAS AML.T0020 · Rule: ATLAS-RD-001 · critical

Poison Training Data

Description

An attacker injects malicious samples into the training data to alter the model's behaviour. Common forms are label flipping, feature manipulation, and trigger insertion.

⚠️ Risk Impact

Training-data poisoning is a silent attack: the training run completes 'successfully', but the model learns adversary-chosen biases. Detecting it requires comparing the trained model's behaviour against expected behaviour.
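
One way to compare trained behaviour with expected behaviour is to score each candidate model on a trusted, hand-labelled "golden" set and look for slices that regress. The sketch below is illustrative only: `GoldenCase`, `slice_accuracy`, `regressed_slices`, and the 0.05 tolerance are hypothetical names and values, not part of EchelonGraph or any library.

```python
from collections import defaultdict
from typing import Callable, Iterable, NamedTuple


class GoldenCase(NamedTuple):
    """One trusted, hand-labelled example (hypothetical structure)."""
    features: str    # model input, e.g. a sender address
    expected: str    # label the model is expected to produce
    slice_name: str  # grouping key, e.g. the sender domain


def slice_accuracy(
    predict: Callable[[str], str], cases: Iterable[GoldenCase]
) -> dict[str, float]:
    """Accuracy of `predict` on the golden set, broken down per slice."""
    hits: dict[str, int] = defaultdict(int)
    totals: dict[str, int] = defaultdict(int)
    for case in cases:
        totals[case.slice_name] += 1
        hits[case.slice_name] += int(predict(case.features) == case.expected)
    return {name: hits[name] / totals[name] for name in totals}


def regressed_slices(
    baseline: dict[str, float],
    candidate: dict[str, float],
    tolerance: float = 0.05,  # assumed threshold; tune per model
) -> list[str]:
    """Slices whose accuracy dropped by more than `tolerance`."""
    return sorted(
        name
        for name, acc in candidate.items()
        if baseline.get(name, acc) - acc > tolerance
    )
```

Reporting per slice rather than one global number matters here: a targeted poisoning campaign may barely move overall accuracy while collapsing accuracy on a single slice.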

🔍 How EchelonGraph Detects This

ATLAS-RD-001 · Automated scanner rule

EchelonGraph's Tier 1 Cloud Scanner automatically checks for this condition across all connected cloud accounts. Violations are flagged as critical-severity findings with remediation guidance.

🔧 Remediation

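Cryptographically hash training data and validate the hashes against approved baselines before each training run. Restrict write access to training datasets. Implement RONI (Reject On Negative Impact): exclude candidate samples whose inclusion disproportionately shifts model behaviour, typically measured as a drop in performance on a trusted validation set.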
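
The hashing step can be sketched as a manifest of SHA-256 digests that is compared with an approved copy before training starts. This is a minimal sketch using only the Python standard library; the function names and the manifest layout (relative path mapped to hex digest) are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large datasets need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to data_dir) to its SHA-256 hex digest."""
    return {
        path.relative_to(data_dir).as_posix(): sha256_file(path)
        for path in sorted(data_dir.rglob("*"))
        if path.is_file()
    }


def diff_manifest(
    approved: dict[str, str], current: dict[str, str]
) -> dict[str, list[str]]:
    """List files added, removed, or modified relative to the approved baseline."""
    shared = approved.keys() & current.keys()
    return {
        "added": sorted(current.keys() - approved.keys()),
        "removed": sorted(approved.keys() - current.keys()),
        "modified": sorted(k for k in shared if approved[k] != current[k]),
    }
```

A training job could refuse to start unless all three lists are empty. The approved manifest should be stored where the training-data writers cannot modify it; otherwise an attacker who can poison the data can also update the baseline.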

💀 Real-World Attack Scenario

A spam classifier was retrained weekly on user 'is this spam?' feedback. An adversarial campaign generated thousands of false spam flags on legitimate emails from a competitor's domain. After 4 weekly retrains, the model had learned to classify the competitor's emails as spam: competitive sabotage carried out through training-data poisoning.
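
A RONI-style check applied to each incoming feedback batch could have caught a campaign like this before retraining. The sketch below is a simplified illustration, not the scanner's implementation: `roni_filter`, the toy per-domain majority-vote model, the `max_drop` value, and the `.example` domains are all hypothetical. Each batch is evaluated independently against the same base training set.

```python
from collections import Counter, defaultdict
from typing import Callable, Sequence, TypeVar

Sample = TypeVar("Sample")
Model = TypeVar("Model")


def roni_filter(
    base_train: Sequence[Sample],
    batches: Sequence[Sequence[Sample]],
    train: Callable[[Sequence[Sample]], Model],
    score: Callable[[Model], float],  # higher is better, on trusted data
    max_drop: float = 0.01,           # assumed tolerance; tune per model
) -> tuple[list, list]:
    """Split batches into (accepted, rejected) by their effect on the score."""
    baseline = score(train(base_train))
    accepted, rejected = [], []
    for batch in batches:
        trial = score(train(list(base_train) + list(batch)))
        # Reject the batch if adding it costs more than max_drop.
        (rejected if baseline - trial > max_drop else accepted).append(batch)
    return accepted, rejected


# Toy stand-in model: predict the majority label seen for each sender domain.
def train_majority(samples: Sequence[tuple[str, str]]) -> dict[str, str]:
    votes: dict[str, Counter] = defaultdict(Counter)
    for domain, label in samples:
        votes[domain][label] += 1
    return {domain: c.most_common(1)[0][0] for domain, c in votes.items()}


def accuracy(model: dict[str, str], trusted: Sequence[tuple[str, str]]) -> float:
    return sum(model.get(d, "ham") == y for d, y in trusted) / len(trusted)


base = [("rival.example", "ham")] * 3 + [("junk.example", "spam")] * 3
trusted = [("rival.example", "ham"), ("junk.example", "spam")]
benign_batch = [("junk.example", "spam")] * 2
poison_batch = [("rival.example", "spam")] * 5  # coordinated false flags

accepted, rejected = roni_filter(
    base, [benign_batch, poison_batch],
    train_majority, lambda m: accuracy(m, trusted),
)
```

The check is only as good as the trusted validation set: if that set contains no mail from the targeted domain, the poisoned batch would show no measurable drop and would be accepted.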

💰 Cost of Non-Compliance

Case studies of training-data poisoning are documented in the academic literature but rarely disclosed publicly by industry because of reputational sensitivity. Estimated detection lag: weeks to months.

📋 Audit Questions

  1. How is training data integrity verified before each training run?
  2. Who can write to your training datasets?
  3. Is RONI or similar contamination detection in your training pipeline?
  4. When did the last training-data audit catch a poisoning attempt?

🎯 MITRE ATLAS Mapping

MITRE_ATLAS-AML.T0020

⚡ Common Pitfalls

  • Trusting user-generated training data without contamination detection
  • No baseline behaviour test post-training — drift goes undetected
  • Treating feedback-loop training as low-risk

📈 Business Value

Training-data integrity is the bedrock of trustworthy AI. Compromise here propagates to every downstream decision the model makes.

⏱️ Effort Estimate

Manual

3-4 weeks for training-pipeline integrity + contamination detection

With EchelonGraph

EchelonGraph integrates RONI-style detection into the training pipeline and runs a baseline comparison after each training run.

🔗 Cross-Framework References

MITRE_ATLAS-AML.T0018 · OWASP_LLM-LLM04 · EUAIA-ART10-DATA-GOV
