Overview

OpenSRE can mask sensitive infrastructure identifiers (pod names, cluster names, hostnames, account IDs, service names, IP addresses, emails) before sending text to external LLMs, and restore the originals in any user-facing output (Slack report, problem MD, ingest). This lets teams use external models while keeping raw identifiers private to the investigation runtime.

Masking is off by default. Enable it per investigation through environment variables; no code changes are required.

How it works

  1. When masking is enabled, the investigation step replaces sensitive identifiers in collected evidence with stable placeholders like <POD_0>, <NAMESPACE_0>, <CLUSTER_1>. The placeholder→original map is stored in investigation state.
  2. The diagnosis model receives masked evidence, so raw identifiers never hit the external LLM.
  3. After the model returns its root-cause analysis, OpenSRE restores real identifiers in downstream state and display output.
  4. Report delivery (for example Slack) runs a final unmask pass before sending, as defence in depth.
The same identifier always maps to the same placeholder within a single investigation, so the LLM’s reasoning about <POD_0> remains coherent.
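
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not OpenSRE's implementation: the `Masker` class, its method names, and the two regex patterns are assumptions made for the example.

```python
import re


class Masker:
    """Hypothetical reversible masker; names and patterns are illustrative only."""

    def __init__(self, patterns):
        self.patterns = patterns              # {kind: compiled regex}
        self.placeholder_to_original = {}     # the map kept in investigation state
        self._original_to_placeholder = {}
        self._counters = {}

    def _placeholder_for(self, kind, value):
        # Reuse the existing placeholder so the same identifier stays stable.
        key = (kind, value)
        if key not in self._original_to_placeholder:
            index = self._counters.get(kind, 0)
            self._counters[kind] = index + 1
            placeholder = f"<{kind.upper()}_{index}>"
            self._original_to_placeholder[key] = placeholder
            self.placeholder_to_original[placeholder] = value
        return self._original_to_placeholder[key]

    def mask(self, text):
        for kind, pattern in self.patterns.items():
            text = pattern.sub(
                lambda m, k=kind: self._placeholder_for(k, m.group(0)), text
            )
        return text

    def unmask(self, text):
        for placeholder, original in self.placeholder_to_original.items():
            text = text.replace(placeholder, original)
        return text


# Assumed patterns, far simpler than real detectors would need to be.
masker = Masker({
    "pod": re.compile(r"\b[a-z0-9-]+-[a-z0-9]{6}-[a-z0-9]{5}\b"),
    "ip_address": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
})

evidence = "pod etl-worker-7d9f8b-xkp2q on 192.168.1.50 restarted; etl-worker-7d9f8b-xkp2q OOMKilled"
masked = masker.mask(evidence)        # what the external model would see
restored = masker.unmask(masked)      # what a report would show
```

Both occurrences of the pod name become `<POD_0>`, which is what keeps the model's reasoning about a single pod consistent across the evidence.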

Environment variables

| Variable | Default | Description |
| --- | --- | --- |
| OPENSRE_MASK_ENABLED | false | Master switch. Set to true / 1 / yes / on to activate masking. |
| OPENSRE_MASK_KINDS | pod,namespace,cluster,hostname,account_id,ip_address,email,service_name | Comma-separated list of identifier kinds to mask. Unknown kinds are ignored with a warning. Empty value uses all defaults. |
| OPENSRE_MASK_EXTRA_REGEX | (empty) | Optional JSON object mapping a label → regex for custom identifiers. Example: '{"jira_key": "\\\\b[A-Z]+-\\\\d+\\\\b"}'. Group 1 of the regex, if present, defines the span to mask. |

Policies are read fresh from the environment at the start of each investigation — changes take effect on the next run without a restart.
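
One way to read these three variables into a policy is sketched below. The function name `load_policy`, the returned dictionary shape, the case-insensitive comparison, and the behaviour when every requested kind is unknown are all assumptions; only the variable names, defaults, and accepted truthy values come from the table above.

```python
import json
import logging
import re

logger = logging.getLogger("masking_sketch")

DEFAULT_KINDS = (
    "pod", "namespace", "cluster", "hostname",
    "account_id", "ip_address", "email", "service_name",
)
TRUTHY = {"true", "1", "yes", "on"}


def load_policy(env):
    """Build a masking policy from a mapping such as os.environ (hypothetical helper)."""
    enabled = env.get("OPENSRE_MASK_ENABLED", "false").strip().lower() in TRUTHY

    raw_kinds = env.get("OPENSRE_MASK_KINDS", "")
    requested = [k.strip().lower() for k in raw_kinds.split(",") if k.strip()]
    if not requested:
        kinds = list(DEFAULT_KINDS)          # empty value uses all defaults
    else:
        kinds = []
        for kind in requested:
            if kind in DEFAULT_KINDS:
                kinds.append(kind)
            else:
                logger.warning("ignoring unknown mask kind: %s", kind)

    extra = {}
    raw_extra = env.get("OPENSRE_MASK_EXTRA_REGEX", "").strip()
    if raw_extra:
        for label, pattern in json.loads(raw_extra).items():
            extra[label] = re.compile(pattern)

    return {"enabled": enabled, "kinds": kinds, "extra_regex": extra}


# json.dumps produces the JSON escaping, so the regex can be written once as a raw string.
env = {
    "OPENSRE_MASK_ENABLED": "yes",
    "OPENSRE_MASK_KINDS": "pod, namespace, bogus",
    "OPENSRE_MASK_EXTRA_REGEX": json.dumps({"jira_key": r"\b[A-Z]+-\d+\b"}),
}
policy = load_policy(env)
```

Calling a loader like this at the start of each run, rather than once at import time, is what would let a changed variable take effect without a restart.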

Built-in identifier kinds

| Kind | Example input | Example placeholder |
| --- | --- | --- |
| pod | etl-worker-7d9f8b-xkp2q | <POD_0> |
| namespace | kube_namespace:tracer-test | kube_namespace:<NAMESPACE_0> |
| cluster | eks_cluster:prod-us-east-1 | eks_cluster:<CLUSTER_0> |
| service_name | service:checkout-api | service:<SERVICE_NAME_0> |
| hostname | kind-control-plane, ip-10-0-1-23.ec2.internal | <HOSTNAME_0> |
| account_id | 123456789012 | <ACCOUNT_ID_0> |
| ip_address | 192.168.1.50 | <IP_ADDRESS_0> |
| email | alice@example.com | <EMAIL_0> |
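
Rows such as namespace and cluster keep their tag prefix and replace only the value. A capture group is one way to get that effect, in line with the "group 1 defines the span" rule for extra regexes. The sketch below is an assumption about how such a rule could work; `mask_span`, `mask_with`, and both patterns are hypothetical.

```python
import re


def mask_span(match):
    """Return the (start, end) to replace: group 1 if the pattern has one, else the whole match."""
    if match.re.groups >= 1 and match.group(1) is not None:
        return match.span(1)
    return match.span(0)


def mask_with(pattern, kind, text):
    """Replace each matched span with a stable placeholder and return (masked, mapping)."""
    mapping = {}   # placeholder -> original
    seen = {}      # original -> placeholder
    pieces = []
    cursor = 0
    for m in pattern.finditer(text):
        start, end = mask_span(m)
        value = text[start:end]
        if value not in seen:
            seen[value] = f"<{kind.upper()}_{len(seen)}>"
            mapping[seen[value]] = value
        pieces.append(text[cursor:start])   # keep everything before the span, prefix included
        pieces.append(seen[value])
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces), mapping


# With a capture group: only the namespace value is masked, the tag prefix survives.
NAMESPACE_TAG = re.compile(r"kube_namespace:([a-z0-9-]+)")
ns_masked, ns_map = mask_with(NAMESPACE_TAG, "namespace", "tags: kube_namespace:tracer-test env:prod")

# Without a capture group: the whole match is masked.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
email_masked, email_map = mask_with(EMAIL, "email", "page alice@example.com now")
```

Keeping the prefix visible tells the model what kind of value the placeholder stands for without exposing the value itself.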

Round-trip guarantee

For the built-in detectors and extra regex patterns, mask → unmask round-trips the original payload byte-for-byte. See tests/masking/test_integration_with_k8s_fixture.py for a worked example against a realistic Datadog k8s alert.
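
A round-trip check of this kind can be expressed as a small helper. The sketch below uses a single assumed IP-address pattern and hypothetical `mask`, `unmask`, and `assert_round_trip` functions; it does not reproduce the referenced test file.

```python
import re

IP = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")   # assumed detector, for illustration only


def mask(text):
    mapping = {}   # placeholder -> original
    reverse = {}   # original -> placeholder

    def repl(m):
        value = m.group(0)
        if value not in reverse:
            placeholder = f"<IP_ADDRESS_{len(reverse)}>"
            reverse[value] = placeholder
            mapping[placeholder] = value
        return reverse[value]

    return IP.sub(repl, text), mapping


def unmask(text, mapping):
    # Placeholders missing from the mapping are left untouched.
    return re.sub(r"<[A-Z_]+_\d+>", lambda m: mapping.get(m.group(0), m.group(0)), text)


def assert_round_trip(payload):
    masked, mapping = mask(payload)
    for original in mapping.values():
        assert original not in masked                 # nothing raw is left behind
    assert unmask(masked, mapping).encode("utf-8") == payload.encode("utf-8")
    return masked


# Whitespace, line endings, and non-ASCII characters must survive unchanged.
payload = "node 192.168.1.50 → 10.0.0.7\n\tretry from 192.168.1.50\r\n"
masked_payload = assert_round_trip(payload)
```

Comparing encoded bytes rather than strings makes the byte-for-byte claim explicit in the assertion.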

Relationship to guardrails

The masking layer is complementary to the one-way GuardrailEngine. Guardrails handle hard-block rules (credit cards, API keys) and replace matches with [REDACTED] irreversibly. Masking handles infrastructure identifiers reversibly so they can be restored for user-facing output. Both can be active together: guardrails apply first at the LLM client layer, then masking at the node layer.
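
The difference between the two layers shows up when output is restored: masked values come back, redacted values do not. The sketch below illustrates that contrast only. The API-key pattern, the function names, and the way the two passes are composed are assumptions, and nothing here reflects the real GuardrailEngine interface.

```python
import re

API_KEY = re.compile(r"\bsk-[A-Za-z0-9]{8,}\b")   # hypothetical hard-block rule
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")   # hypothetical reversible detector


def redact(text):
    """One-way: no mapping is kept, so the secret cannot be restored."""
    return API_KEY.sub("[REDACTED]", text)


def mask_emails(text):
    """Reversible: every placeholder is recorded next to its original."""
    mapping = {}
    reverse = {}

    def repl(m):
        value = m.group(0)
        if value not in reverse:
            reverse[value] = f"<EMAIL_{len(reverse)}>"
            mapping[reverse[value]] = value
        return reverse[value]

    return EMAIL.sub(repl, text), mapping


def unmask(text, mapping):
    for placeholder, original in mapping.items():
        text = text.replace(placeholder, original)
    return text


evidence = "alice@example.com rotated key sk-abcdef123456 on checkout"
safe, mapping = mask_emails(redact(evidence))   # text that would reach the external model
restored = unmask(safe, mapping)                # text that would reach the report
```

In this example the two patterns never overlap, so the result does not depend on which pass runs first.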

Example

export OPENSRE_MASK_ENABLED=true
export OPENSRE_MASK_KINDS=pod,namespace,cluster,hostname
opensre investigate -i tests/e2e/kubernetes/fixtures/datadog_k8s_alert.json
During the investigation the LLM sees masked evidence; the final Slack report shows the original pod, namespace, and cluster names.