🧠 OWASP LLM Top 10 · LLM08 · Rule: OWASP-LLM-008 · Severity: high

Vector and Embedding Weaknesses

Description

Vulnerabilities in vector databases (Milvus, Weaviate, Qdrant, Pinecone, Chroma) and in the embeddings used for retrieval-augmented generation (RAG): injection via indexed content, cross-tenant data leakage, and unauthenticated access.

⚠️ Risk Impact

Vector DBs fill the role for LLM apps that relational databases fill for traditional ones, but their security maturity lags far behind. The Shadow AI Radar finds thousands of internet-exposed Milvus and Weaviate instances, many of which require no authentication.

🔍 How EchelonGraph Detects This

OWASP-LLM-008: Automated scanner rule

EchelonGraph's Tier 1 Cloud Scanner automatically checks for this condition across all connected cloud accounts. Violations are flagged as high-severity findings with remediation guidance.

🖥️ Manual Verification

terminal
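# Check vector DB authentication: send a data request with no credentials.
# Assumes Milvus 2.4+ with the RESTful API (v2) served on port 19530.
curl -s -X POST http://your-milvus-host:19530/v2/vectordb/collections/list \
  -H 'Content-Type: application/json' -d '{}'  # Should return an auth error, not a collection list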
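
The same check can be repeated across several vector DBs. The following is a minimal sketch, not EchelonGraph's scanner: the host names are placeholders, and the paths assume the standard REST listing endpoints of Qdrant (`GET /collections`) and Weaviate (`GET /v1/schema`). It sends one request with no credentials and classifies the response.

```python
import urllib.error
import urllib.request
from typing import Optional

# Placeholder hosts (assumptions); replace with your own endpoints.
TARGETS = {
    "qdrant": "http://your-qdrant-host:6333/collections",
    "weaviate": "http://your-weaviate-host:8080/v1/schema",
}


def classify_status(status: Optional[int]) -> str:
    """Map the HTTP status of an unauthenticated request to a verdict."""
    if status is None:
        return "unreachable"
    if status in (401, 403):
        return "auth-required"
    if 200 <= status < 300:
        return "OPEN: data served without credentials"
    return f"inconclusive (HTTP {status})"


def probe(url: str, timeout: float = 5.0) -> Optional[int]:
    """Send one GET with no credentials; return the status, or None on network failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx/5xx responses still carry a status code
    except (urllib.error.URLError, OSError):
        return None
```

Calling `classify_status(probe(url))` for each entry in `TARGETS` gives a one-line verdict per database. Only run this against systems you are authorised to test.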

🔧 Remediation

Authenticate vector DBs. Segregate per-tenant indexes. Sanitise content before embedding (treat indexed content as semi-trusted). Encrypt vector data at rest. Restrict vector DB to internal networks.
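
Sanitising content before embedding could look roughly like the sketch below. It is illustrative only: the function name `sanitise_chunk`, the pattern list, and the flag names are assumptions made for this example, and pattern matching alone will not catch every injection attempt. The idea is to normalise the text, strip hidden characters, and quarantine chunks that contain instruction-like phrases instead of indexing them.

```python
import re
import unicodedata
from typing import List, Tuple

# Illustrative patterns only (assumption); tune these to your own corpus.
SUSPICIOUS_PATTERNS = {
    "override-instructions": re.compile(
        r"ignore (all |any )?(previous|prior|above) instructions", re.I
    ),
    "prompt-tampering": re.compile(r"disregard (the )?(system|previous) prompt", re.I),
}

# Zero-width and bidirectional control characters that can hide text from reviewers.
HIDDEN_CHARS = re.compile(r"[\u200b-\u200f\u202a-\u202e\u2060-\u2064\ufeff]")


def sanitise_chunk(text: str) -> Tuple[str, List[str]]:
    """Return (cleaned_text, flags). A non-empty flags list means: hold for review."""
    flags: List[str] = []
    cleaned = unicodedata.normalize("NFKC", text)
    if HIDDEN_CHARS.search(cleaned):
        flags.append("hidden-characters")
        cleaned = HIDDEN_CHARS.sub("", cleaned)
    # Drop remaining control characters, keeping newlines and tabs.
    cleaned = "".join(
        ch for ch in cleaned if ch in "\n\t" or unicodedata.category(ch) != "Cc"
    )
    for name, pattern in SUSPICIOUS_PATTERNS.items():
        if pattern.search(cleaned):
            flags.append(name)
    return cleaned, flags
```

A chunk that comes back with flags would be held out of the index until someone has reviewed it, which is what treating indexed content as semi-trusted means in practice.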

💀 Real-World Attack Scenario

In January 2025, Wiz Research found a ClickHouse database belonging to DeepSeek exposed to the internet without authentication. More than a million log lines, including chat history and API secrets, were accessible, and Wiz reported locating the database within minutes of starting its scan. The DeepSeek case is the canonical 2025 example of an unauthenticated AI data store.

💰 Cost of Non-Compliance

DeepSeek database exposure (Jan 2025): public disclosure, reputational damage, and exposure to regulatory probes. Average cost of a vector DB exposure incident in 2024: $1.8M (Wiz AI Threat Report).

📋 Audit Questions

  • 1. Are any of your vector DBs exposed to the public internet?
  • 2. What authentication is required on each vector DB?
  • 3. Is per-tenant index isolation enforced?
  • 4. How is indexed content sanitised before embedding?
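
The four questions can also be recorded per database and checked mechanically. The sketch below assumes a hand-maintained inventory; the `VectorDbRecord` fields and the finding strings are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VectorDbRecord:
    name: str
    publicly_reachable: bool
    auth_method: str  # e.g. "none", "api-key", "oidc"
    per_tenant_isolation: bool
    sanitises_before_embedding: bool


def audit(record: VectorDbRecord) -> List[str]:
    """Return one finding per audit question that the record fails."""
    findings = []
    if record.publicly_reachable:
        findings.append("Q1: exposed to the public internet")
    if record.auth_method.strip().lower() in ("", "none"):
        findings.append("Q2: no authentication required")
    if not record.per_tenant_isolation:
        findings.append("Q3: per-tenant index isolation not enforced")
    if not record.sanitises_before_embedding:
        findings.append("Q4: content not sanitised before embedding")
    return findings
```

An empty list means the record passes all four questions as answered. The answers still need to be verified against the running system.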

🎯 MITRE ATT&CK Mapping

T1530: Data from Cloud Storage
MITRE_ATLAS-AML.T0040: ML IP Theft

🏗️ Infrastructure as Code Fix

main.tf
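# Allow ingress to Milvus pods only from namespaces labelled tier=ai-internal.
resource "kubernetes_network_policy" "vector_db_internal_only" {
  metadata {
    name      = "vector-db-no-public-ingress"
    namespace = "ai"
  }
  spec {
    pod_selector { match_labels = { app = "milvus" } }
    policy_types = ["Ingress"] # required argument; unmatched ingress is denied
    ingress {
      from {
        namespace_selector { match_labels = { tier = "ai-internal" } }
      }
    }
  }
}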

⚡ Common Pitfalls

  • Deploying Milvus / Weaviate / Qdrant with default-no-auth config
  • Single global index for multi-tenant data
  • Skipping content sanitisation because 'it's our content'
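
To illustrate the alternative to a single global index, the sketch below uses a toy in-memory stand-in for a vector DB client. All names (`collection_for`, `TenantScopedIndex`) are hypothetical. Every read and write is routed to a collection derived from a validated tenant ID, so one tenant's query cannot reach another tenant's vectors.

```python
import re
from typing import Dict, List

_TENANT_ID = re.compile(r"[a-z0-9][a-z0-9_]{0,62}")


def collection_for(tenant_id: str) -> str:
    """Derive a dedicated collection name per tenant; reject unexpected input."""
    if not _TENANT_ID.fullmatch(tenant_id):
        raise ValueError(f"invalid tenant id: {tenant_id!r}")
    return f"docs_{tenant_id}"


def _dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


class TenantScopedIndex:
    """Toy in-memory index with one collection per tenant."""

    def __init__(self) -> None:
        self._collections: Dict[str, List[dict]] = {}

    def upsert(self, tenant_id: str, doc_id: str, vector: List[float]) -> None:
        name = collection_for(tenant_id)
        self._collections.setdefault(name, []).append({"id": doc_id, "vector": vector})

    def search(self, tenant_id: str, query: List[float], top_k: int = 3) -> List[str]:
        # tenant_id should come from the authenticated session, never from the query payload.
        rows = self._collections.get(collection_for(tenant_id), [])
        ranked = sorted(rows, key=lambda r: _dot(r["vector"], query), reverse=True)
        return [r["id"] for r in ranked[:top_k]]
```

With a real vector DB the same routing would map to separate collections, namespaces, or partitions, depending on what the product offers.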

📈 Business Value

Vector DB hardening prevents the most frequent AI infrastructure incident pattern of 2025 and reduces average incident cost by 70% or more.

⏱️ Effort Estimate

Manual

2-3 weeks for cluster-wide vector DB audit + hardening

With EchelonGraph

EchelonGraph's Shadow AI Radar detects exposed vector DBs and provides hardening guidance.

🔗 Cross-Framework References

MITRE_ATLAS-AML.T0010
