EchelonGraph Tier 3 (EcheDeep) is GA — Continuous, Zero-Knowledge eBPF Detection in Your Cluster

Tier 3 ships an eBPF DaemonSet that runs in your cluster, redacts PII at the kernel boundary, and submits envelope-encrypted findings sealed by a customer-controlled KMS. We never see your plaintext. Here's what's inside the v3.0.0 release.

EchelonGraph Engineering

Tier 3 Platform Team

TL;DR. Tier 3 (EcheDeep) is generally available. It's a Kubernetes DaemonSet that runs eBPF detection on every node in your cluster, redacts PII at the kernel boundary, and submits envelope-encrypted findings sealed by a customer-controlled KMS (AWS, GCP, or Vault). We never see plaintext from your environment.

Why we built this

Cloud security platforms force a hard trade-off: either you ship logs, syscalls, and packets to the vendor's SaaS for analysis (and trust them with the plaintext), or you keep everything on-cluster and lose the graph correlation, IOC enrichment, and compliance scoring that make a SaaS valuable.

EcheDeep refuses the trade-off.

The agent runs in your cluster. Kernel events flow through eBPF into a redaction pipeline. Findings — not raw events — get envelope-encrypted with a per-event Data Encryption Key (DEK) wrapped by your KMS, then submitted to our SaaS. Our backend processes the ciphertext and stores the wrapped DEK alongside it. To decrypt, our backend has to call back into _your_ KMS — which means you can revoke access at any time and lock us out.
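
Concretely, the envelope step looks roughly like the sketch below. The KMS interface and Seal function are illustrative assumptions, not the shipping code; what matters is the shape of the flow: fresh per-event DEK, AES-GCM seal, DEK wrapped by the customer's KEK.

    // Sketch of per-event envelope encryption. Not the shipping
    // implementation; assumes a generic KMS with a WrapKey call
    // backed by the customer-controlled KEK.
    package envelope

    import (
        "context"
        "crypto/aes"
        "crypto/cipher"
        "crypto/rand"
    )

    // KMS wraps a plaintext data key under the customer's KEK.
    // Hypothetical interface for illustration.
    type KMS interface {
        WrapKey(ctx context.Context, plaintextDEK []byte) ([]byte, error)
    }

    type Envelope struct {
        Ciphertext []byte // AES-GCM-sealed finding
        Nonce      []byte
        WrappedDEK []byte // only decryptable via the customer's KMS
    }

    func Seal(ctx context.Context, kms KMS, finding []byte) (*Envelope, error) {
        dek := make([]byte, 32) // fresh 256-bit DEK per event
        if _, err := rand.Read(dek); err != nil {
            return nil, err
        }
        block, err := aes.NewCipher(dek)
        if err != nil {
            return nil, err
        }
        gcm, err := cipher.NewGCM(block)
        if err != nil {
            return nil, err
        }
        nonce := make([]byte, gcm.NonceSize())
        if _, err := rand.Read(nonce); err != nil {
            return nil, err
        }
        ct := gcm.Seal(nil, nonce, finding, nil)
        wrapped, err := kms.WrapKey(ctx, dek) // the DEK only leaves wrapped
        if err != nil {
            return nil, err
        }
        return &Envelope{Ciphertext: ct, Nonce: nonce, WrappedDEK: wrapped}, nil
    }

Decryption reverses the last step: the backend must ask your KMS to unwrap the stored DEK, and that is exactly the call that revoking our grant cuts off.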

> If you yank our KMS access, the ciphertext we already hold becomes useless. That's the contract: you control the key, you control the data.

What ships in v3.0.0

Continuous, on-cluster runtime detection (T3.1 → T3.6)

  • eBPF kernel + bridge: XDP, TC, and tracepoints capturing every network connection, syscall, and sensitive-file open
  • Process pipeline: 8-rule detector covering reverse shells, crypto miners, container escapes, lateral movement, data staging, and privilege escalation; 24-hour rolling baseline with per-namespace comm fingerprinting
  • Anomaly engine: net + DNS + bytes-rate sliding windows with configurable suppression sets for false-positive feedback (a minimal sketch follows this list)
  • Threat intelligence: URLhaus + Feodo Tracker + CISA KEV feeds with 6-hour refresh, optional STIX bundle + TAXII collection
  • Shadow API discovery: HTTP traffic recognition with LRU-evicted inventory, fed into the anomaly engine for endpoint-level detection
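
To make the sliding-window idea concrete, here's a minimal bytes-rate window with a suppression set. Types, names, and the threshold check are illustrative assumptions, not the shipping detector:

    // Illustrative bytes-rate sliding window with a suppression set.
    // Names and structure are assumptions for this sketch.
    package anomaly

    import "time"

    type sample struct {
        t     time.Time
        bytes uint64
    }

    type Window struct {
        span       time.Duration
        samples    []sample
        suppressed map[string]bool // keys muted by false-positive feedback
    }

    func NewWindow(span time.Duration) *Window {
        return &Window{span: span, suppressed: make(map[string]bool)}
    }

    func (w *Window) Suppress(key string) { w.suppressed[key] = true }

    // Observe records a sample and reports whether the rolling bytes/sec
    // rate exceeds threshold, unless key is in the suppression set.
    func (w *Window) Observe(key string, bytes uint64, threshold float64) bool {
        now := time.Now()
        w.samples = append(w.samples, sample{t: now, bytes: bytes})
        cutoff := now.Add(-w.span)
        i := 0
        for i < len(w.samples) && w.samples[i].t.Before(cutoff) {
            i++ // evict samples that fell out of the window
        }
        w.samples = w.samples[i:]
        var total uint64
        for _, s := range w.samples {
            total += s.bytes
        }
        return float64(total)/w.span.Seconds() > threshold && !w.suppressed[key]
    }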

Customer-controlled remediation (T3.7)

  • Per-tenant remediation modes: dry-run (default) → approval → pr (auto-PR via GitHub/GitLab connectors) → auto (sketched after this list)
  • Mark-applied + dismiss endpoints so customers can drive the workflow from their existing ticketing
  • Source-attributed patches with full audit trail back to the detector that fired
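
The mode ladder is easiest to read as a dispatch switch. A sketch under the assumption that each patch carries its originating detector for the audit trail:

    // Sketch of per-tenant remediation gating. Mode names mirror the
    // post; the dispatch wiring is an assumption, not the shipping engine.
    package remediation

    type Mode string

    const (
        ModeDryRun   Mode = "dry-run"  // default: record the patch, touch nothing
        ModeApproval Mode = "approval" // hold the patch for an explicit approve
        ModePR       Mode = "pr"       // open a PR via the GitHub/GitLab connector
        ModeAuto     Mode = "auto"     // apply directly
    )

    type Patch struct {
        Detector string // audit trail back to the detector that fired
        Diff     string
    }

    // Dispatch routes a patch according to the tenant's configured mode.
    func Dispatch(mode Mode, p Patch) string {
        switch mode {
        case ModeAuto:
            return "apply: " + p.Detector
        case ModePR:
            return "open-pr: " + p.Detector
        case ModeApproval:
            return "await-approval: " + p.Detector
        default: // dry-run and anything unrecognized stays read-only
            return "record-only: " + p.Detector
        }
    }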

Hardware KMS & envelope encryption (T3.8)

  • Provider-pluggable KMS: AWS KMS, GCP Cloud KMS, HashiCorp Vault Transit (token / AppRole / Kubernetes auth)
  • Auto-renewing tokens: Vault token-renew goroutine runs when TTL drops below half; AppRole + K8s service-account JWT login flows baked in (see the renewal-loop sketch after this list)
  • Healthcheck at boot: misconfigured KEKs fail loudly; we fall back to legacy single-key encryption rather than block telemetry
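
The renewal loop is roughly this shape; the renew callback stands in for the real auth-backend call (e.g. a Vault token self-renew), abstracted for this sketch:

    // Sketch of the half-TTL renewal loop described above. The renew
    // callback returns the new TTL it was granted by the auth backend.
    package kmsauth

    import (
        "context"
        "time"
    )

    // autoRenew renews a token once more than half its TTL has elapsed,
    // then resets the clock to the freshly granted TTL.
    func autoRenew(ctx context.Context, ttl time.Duration, renew func(context.Context) (time.Duration, error)) {
        for {
            select {
            case <-ctx.Done():
                return
            case <-time.After(ttl / 2): // wake at the half-TTL mark
                newTTL, err := renew(ctx)
                if err != nil {
                    continue // keep the old horizon; retry after another half-TTL
                }
                ttl = newTTL
            }
        }
    }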

Custom Compliance Builder (T3.9)

  • Score snapshots: daily writes to compliance_score_snapshots power 30/60/90-day trend charts
  • Cross-framework suggestions: implement one control, satisfy multiple frameworks (token-overlap ranker, no LLM; sketched after this list)
  • Scheduled reports: weekly / monthly HTML + CSV reports delivered via email; print CSS for PDF generation
  • License feature gate: rollout-grace by default, flips to enforce when LICENSE_GATE=enforce is set
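
The cross-framework ranker is deliberately simple. A token-overlap score in this spirit (the shipping ranker's tokenizer and weighting may differ) is enough to surface implement-once-satisfy-many candidates:

    // Illustrative token-overlap scoring between control descriptions.
    // A Jaccard-style ratio; an assumption, not the shipping ranker.
    package compliance

    import "strings"

    func tokens(s string) map[string]bool {
        set := make(map[string]bool)
        for _, t := range strings.Fields(strings.ToLower(s)) {
            set[strings.Trim(t, ".,;:()")] = true
        }
        return set
    }

    // Overlap returns |A ∩ B| / |A ∪ B| for two control descriptions.
    func Overlap(a, b string) float64 {
        ta, tb := tokens(a), tokens(b)
        inter := 0
        for t := range ta {
            if tb[t] {
                inter++
            }
        }
        union := len(ta) + len(tb) - inter
        if union == 0 {
            return 0
        }
        return float64(inter) / float64(union)
    }

Pairs of controls from different frameworks that score above some cutoff would surface together as a single suggestion.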

Observability + production hardening (T3.10)

  • 50+ Prometheus metrics exposed at /metrics, including cross-package subsystem stats (KMS providers, threat-intel manager, process pipeline, remediation engine)
  • Cross-pod log correlation: every slog line carries correlation_id, pod_name, cluster_id, tenant_id (a minimal sketch follows this list)
  • Top-20 troubleshooting runbook in TIER3_DEPLOYMENT.md
  • Technical SLA targets in TIER3_SLA.md (99.9% ingester uptime, ≤15s P95 process-anomaly detection latency, ≤5min anomaly-baseline RPO)
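
The log-correlation bullet maps directly onto Go's standard log/slog: attach the four fields once and every line the logger emits carries them. How the values are sourced (downward API, env, request headers) is left out of this sketch:

    // Sketch of cross-pod log correlation with the standard library's
    // log/slog. Field names match the post.
    package observability

    import (
        "log/slog"
        "os"
    )

    // NewCorrelatedLogger attaches the four correlation fields once;
    // every line the returned logger emits carries them.
    func NewCorrelatedLogger(correlationID, podName, clusterID, tenantID string) *slog.Logger {
        return slog.New(slog.NewJSONHandler(os.Stdout, nil)).With(
            "correlation_id", correlationID,
            "pod_name", podName,
            "cluster_id", clusterID,
            "tenant_id", tenantID,
        )
    }

A call like logger.Info("finding sealed", "detector", "reverse-shell") then lands in the pod's JSON logs with all four fields attached, ready for cross-pod joins on correlation_id.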

How to install

Helm:

    helm repo add echelongraph https://charts.echelongraph.io
    helm install echelongraph-tier3 echelongraph/echelongraph-tier3 \
      --namespace echelongraph-system --create-namespace \
      --set tier3.tenantId=<your-tenant-id> \
      --set tier3.licenseToken=<your-license> \
      --set kms.provider=aws \
      --set kms.aws.keyArn=arn:aws:kms:us-east-1:...

Per-cloud examples ship in the chart: examples/values-aws-eks.yaml, values-gcp-gke.yaml, values-azure-aks.yaml, values-onprem.yaml. The on-prem example uses Vault AppRole; the Azure example uses Vault Kubernetes auth.

What's next

  • Real-cluster perf benchmarks at 10K, 50K, and 100K nodes
  • Third-party security audit (queued — gates the v3.0.0 git tag)
  • Azure Key Vault provider (currently routed via Vault on Azure)

If you're a Pro or Enterprise customer, Tier 3 is included. Reach out at echelongraph.io/enterprise for a deployment review.

— EchelonGraph Engineering
