Vector and Embedding Weaknesses
Description
Vulnerabilities in vector DBs (Milvus, Weaviate, Qdrant, Pinecone, Chroma) and RAG embedding pipelines: injection via indexed content, cross-tenant data leakage, and unauthenticated access.
⚠️ Risk Impact
Vector DBs are the new RDBMS for LLM apps, but their security maturity lags far behind. The Shadow AI Radar finds thousands of internet-exposed Milvus/Weaviate instances, many of which require no authentication.
🔍 How EchelonGraph Detects This
EchelonGraph's Tier 1 Cloud Scanner automatically checks for this condition across all connected cloud accounts. Violations are flagged as high-severity findings with remediation guidance.
🖥️ Manual Verification
# Check vector DB authentication
curl -I http://your-milvus-host:19530/v1/health # Should require auth
🔧 Remediation
Authenticate vector DBs. Segregate per-tenant indexes. Sanitise content before embedding (treat indexed content as semi-trusted). Encrypt vector data at rest. Restrict vector DB to internal networks.
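The "sanitise content before embedding" step can be sketched roughly as follows. This is a minimal illustration, not a complete defence: the function name `sanitise_for_embedding` and the deny-list in `SUSPICIOUS_PATTERNS` are hypothetical, and a real pipeline would tune both to its own corpus and route flagged chunks to review rather than rely on pattern matching alone.

```python
import re
import unicodedata

# Hypothetical deny-list for illustration only; tune per corpus.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.IGNORECASE),
    re.compile(r"you are now\b", re.IGNORECASE),
    re.compile(r"system prompt", re.IGNORECASE),
]

# HTML comments are a common place to hide text from human reviewers.
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def sanitise_for_embedding(text):
    """Return (cleaned_text, findings) for one document chunk."""
    # Fold compatibility forms (full-width letters, ligatures) to canonical ones.
    cleaned = unicodedata.normalize("NFKC", text)
    # Strip HTML comments entirely.
    cleaned = HTML_COMMENT.sub("", cleaned)
    # Drop invisible control/format characters (zero-width spaces and similar),
    # keeping newlines and tabs.
    cleaned = "".join(
        ch for ch in cleaned
        if ch in "\n\t" or unicodedata.category(ch) not in ("Cf", "Cc")
    )
    # Record which deny-list patterns still match, so the chunk can be
    # quarantined or reviewed instead of being indexed blindly.
    findings = [p.pattern for p in SUSPICIOUS_PATTERNS if p.search(cleaned)]
    return cleaned, findings
```

A caller might embed only chunks whose `findings` list is empty and send the rest to a review queue, which matches the "semi-trusted" stance described above.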
💀 Real-World Attack Scenario
Wiz Research (Jan 2025) found a DeepSeek ClickHouse database (used for chat logs + vector embeddings) exposed to the internet without authentication. More than a million log lines, including chat history and API keys, were accessible, and the exposure was found in under 60 minutes of internet scanning. The DeepSeek case is the canonical 2025 vector/embedding security incident.
💰 Cost of Non-Compliance
DeepSeek API exposure (Jan 2025): public disclosure, with the reputational damage and exposure to regulatory probes that follow. Average cost of a vector-DB exposure incident in 2024: $1.8M (Wiz AI Threat Report).
📋 Audit Questions
1. Are any of your vector DBs exposed to the public internet?
2. What authentication is required on each vector DB?
3. Is per-tenant index isolation enforced?
4. How is indexed content sanitised before embedding?
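The first two audit questions can be partly answered with a scripted, credential-free probe. The sketch below is an assumption-laden starting point: `classify_status` and `probe` are hypothetical helper names, the URL to test must be supplied by the auditor (endpoint paths differ between products and versions), and a 2xx on a health endpoint does not by itself prove that data endpoints are open. Probe only hosts you own or are authorised to test.

```python
import urllib.error
import urllib.request


def classify_status(status):
    """Map the HTTP status of an unauthenticated request to a coarse finding."""
    if status in (401, 403):
        return "auth-required"
    if 200 <= status < 300:
        return "open"  # the endpoint answered without credentials
    return "inconclusive"  # 404s, 5xx, etc. need manual follow-up


def probe(url, timeout=5.0):
    """Send one GET without credentials and classify the outcome."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return classify_status(resp.status)
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx; the status code is in err.code.
        return classify_status(err.code)
    except OSError:
        # URLError, timeouts, and connection resets are all OSError subclasses.
        return "unreachable"
```

Running `probe` from outside the internal network against each vector DB endpoint in the inventory gives a quick first pass; anything reported as "open" or "inconclusive" deserves a manual check.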
🎯 MITRE ATT&CK Mapping
🏗️ Infrastructure as Code Fix
resource "kubernetes_network_policy" "vector_db_internal_only" {
  metadata {
    name      = "vector-db-no-public-ingress"
    namespace = "ai"
  }
  spec {
    # Apply the policy to the Milvus pods
    pod_selector {
      match_labels = { app = "milvus" }
    }
    # Allow ingress only from namespaces labelled tier = "ai-internal"
    ingress {
      from {
        namespace_selector {
          match_labels = { tier = "ai-internal" }
        }
      }
    }
    policy_types = ["Ingress"]
  }
}
⚡ Common Pitfalls
- ⛔ Deploying Milvus / Weaviate / Qdrant with default-no-auth config
- ⛔ Single global index for multi-tenant data
- ⛔ Skipping content sanitisation because 'it's our content'
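To make the "single global index" pitfall concrete, here is a toy in-memory sketch of the alternative: one partition per tenant, with every search scoped to the caller's partition. `TenantScopedIndex` is a hypothetical class for illustration and is not any vector DB's API; real deployments would map the same idea onto separate collections, namespaces, or partitions in the product they use.

```python
import math
from dataclasses import dataclass, field


def _cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


@dataclass
class TenantScopedIndex:
    """Toy index keeping one isolated partition per tenant (illustrative only)."""
    _partitions: dict = field(default_factory=dict)

    def upsert(self, tenant_id, doc_id, vector):
        # Each tenant writes into its own partition.
        self._partitions.setdefault(tenant_id, {})[doc_id] = vector

    def search(self, tenant_id, query, top_k=3):
        # Only the caller's partition is scanned; there is no global fallback.
        partition = self._partitions.get(tenant_id, {})
        scored = [(doc_id, _cosine(query, vec)) for doc_id, vec in partition.items()]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]
```

The design point is that `tenant_id` is a required argument taken from the authenticated session, not an optional filter the caller can omit, so a forgotten filter cannot leak another tenant's vectors.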
📈 Business Value
Vector DB hardening prevents the most frequent 2025 AI infrastructure incident pattern and reduces average incident cost by 70%+.
⏱️ Effort Estimate
2-3 weeks for cluster-wide vector DB audit + hardening
EchelonGraph's Shadow AI Radar detects exposed vector DBs and provides hardening guidance.
🔗 Cross-Framework References
Automate OWASP LLM Top 10 LLM08 compliance
EchelonGraph continuously monitors this control across all your cloud accounts.