Data format chaos costs you weeks of visibility

Why the federal agency breach shows that standardized telemetry formats aren’t optional anymore

When CISA analyzed the federal agency breach that went undetected for three weeks, it identified a familiar pattern: EDR alerts existed but weren’t continuously reviewed. Security teams had visibility tools, but critical signals got lost in the noise.

What the advisory doesn’t detail—but every security practitioner knows—is the infrastructure nightmare hiding behind that simple statement. Those unreviewed alerts likely came from dozens of sources, each speaking its own dialect of security telemetry.

Windows systems alone generate logs in multiple formats: Windows Event Log records, DHCP and DNS server logs written to text files, and XML and JSON structures that vary by component. Add in endpoint agents, network devices, cloud services, and third-party applications, and you’re managing a Babel of data formats that makes continuous monitoring far harder.

This isn’t just a technical inconvenience. Format fragmentation directly impacts your ability to detect threats, route alerts intelligently, and respond before attackers move laterally across your infrastructure.

The hidden cost of format fragmentation

Consider what the federal agency’s security team faced during those three weeks:

Day 1: Attackers exploit CVE-2024-36401 in GeoServer. The application logs the exploit attempt in its custom format. The EDR agent on that system generates an alert in its vendor-specific schema. The network monitor captures unusual traffic patterns in yet another format.

Day 7: Lateral movement to a web server. Windows Event Logs record unusual authentication. Application logs show suspicious requests. The web application firewall generates alerts in its own structure.

Day 14: SQL server compromise. Database audit logs use a different format than system logs. The EDR agent reports anomalous behavior. Network telemetry shows data movement patterns.

Each event generates telemetry. But each source uses different field names, timestamps, severity scales, and data structures. Correlating these events requires:

Custom parsers for each format
Translation layers to normalize data
Complex routing rules based on format compatibility
Manual effort to connect related events across systems
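
To make the parser burden concrete, here is a minimal Python sketch. The three record shapes, their field names, and the timestamps are hypothetical, invented for illustration rather than taken from the advisory or any real product. The point is only that each source needs its own parsing function before its events can be compared with anyone else's.

```python
from datetime import datetime, timezone

# Hypothetical records describing activity on the same host, as three
# different sources might report it. All field names are illustrative.
edr_alert = {"ts": 1721124000, "hostname": "geo-01", "sev": 9}
app_log = "2024-07-16T10:00:05+00:00 WARN host=geo-01 msg=suspicious_request"
net_event = {"time": "07/16/2024 10:00:09", "src_host": "GEO-01", "priority": "high"}


def parse_edr(rec):
    # Epoch seconds and a numeric severity on an assumed 0-10 scale.
    return {
        "time": datetime.fromtimestamp(rec["ts"], tz=timezone.utc),
        "host": rec["hostname"].lower(),
        "severity": "high" if rec["sev"] >= 7 else "low",
    }


def parse_app(line):
    # ISO 8601 timestamp, a text level, then key=value pairs.
    ts, level, *pairs = line.split()
    fields = dict(pair.split("=", 1) for pair in pairs)
    return {
        "time": datetime.fromisoformat(ts),
        "host": fields["host"].lower(),
        "severity": "high" if level in ("WARN", "ERROR") else "low",
    }


def parse_net(rec):
    # Month/day/year timestamp with no zone; assumed here to be UTC.
    ts = datetime.strptime(rec["time"], "%m/%d/%Y %H:%M:%S")
    return {
        "time": ts.replace(tzinfo=timezone.utc),
        "host": rec["src_host"].lower(),
        "severity": rec["priority"],
    }


# Three parsers for three sources; every new source adds another one.
events = [parse_edr(edr_alert), parse_app(app_log), parse_net(net_event)]
```

Each parser also encodes its own guesses (what counts as "high", which timezone an unzoned timestamp is in), and those guesses are where silent correlation errors tend to creep in.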

By the time security analysts can piece together the attack chain from multiple incompatible data sources, three weeks have passed.

The routing problem nobody talks about

Data format fragmentation doesn’t just complicate analysis—it constrains where you can send telemetry data in the first place.

Your SIEM expects Common Event Format (CEF) or Log Event Extended Format (LEEF). Your threat intelligence platform needs Structured Threat Information eXpression (STIX). Your metrics system wants Prometheus format. Your data lake accepts JSON but chokes on Windows Event Log XML. Your compliance reporting tool requires a specific CSV structure.

So you build format converters. Lots of them. Each one is another point of failure, another component to maintain, another source of transformation errors that corrupt the context you need for accurate analysis.

Partway through, the scale of the job becomes clear. Consolidation projects tend to stall before every source is covered, and any software update that changes a log layout can quietly break a converter’s parsing.

When the SentinelOne outage occurred on May 29, 2025, organizations lost access to their management consoles for several hours. Security teams with diverse telemetry sources faced an additional challenge: without their primary platform, could they route SentinelOne data to alternative destinations? Only if they’d already built format translation for every possible failover scenario.

Most hadn’t.

What OpenTelemetry changes

OpenTelemetry addresses format fragmentation at its source: standardized schemas for logs, metrics, and traces across all telemetry sources.

Instead of Windows Event Logs in XML, DHCP logs in text files, DNS logs in custom format, and application logs in JSON—all with different field names and structures—OpenTelemetry provides a unified data model:

Consistent field naming: service.name, host.name, user.id mean the same thing regardless of source
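Standardized severity levels: a single numeric severity scale (1 to 24) replaces translating between "Critical," "High," "1," and "Emergency"
Unified timestamps: every record carries its time as nanoseconds since the Unix epoch in UTC, so there is no per-source format or timezone handling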
Semantic conventions: predefined attributes for common concepts like HTTP requests, database queries, and authentication events
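
As a rough illustration of that data model, the sketch below builds a plain dictionary shaped like an OpenTelemetry log record. The severity ranges in the comment come from the OpenTelemetry log data model; the mapping table, the function names, and the dictionary layout are assumptions made for this example, not the OpenTelemetry SDK API.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# OpenTelemetry's log data model defines SeverityNumber ranges:
# 1-4 TRACE, 5-8 DEBUG, 9-12 INFO, 13-16 WARN, 17-20 ERROR, 21-24 FATAL.
# Which source label maps to which number is a local policy decision;
# this table is an illustrative assumption.
SEVERITY_MAP = {
    "emergency": 21,
    "critical": 21,
    "1": 21,
    "high": 17,
    "warning": 13,
    "info": 9,
}


def to_unix_nano(ts):
    # Integer arithmetic avoids float rounding; ts must be timezone-aware.
    delta = ts - EPOCH
    return (delta.days * 86_400 + delta.seconds) * 1_000_000_000 + delta.microseconds * 1_000


def to_otel_log(ts, severity_label, body, host, service, attributes=None):
    """Return a dict shaped like an OpenTelemetry log record (hypothetical helper)."""
    return {
        "time_unix_nano": to_unix_nano(ts),
        # 0 means "unspecified" when the label is not in the table.
        "severity_number": SEVERITY_MAP.get(str(severity_label).lower(), 0),
        "severity_text": str(severity_label),
        "body": body,
        "resource": {"host.name": host, "service.name": service},
        "attributes": dict(attributes or {}),
    }


record = to_otel_log(
    datetime(2024, 7, 16, 10, 0, 0, tzinfo=timezone.utc),
    "Critical",
    "suspicious request",
    host="geo-01",
    service="geoserver",
    attributes={"user.id": "svc-web"},
)
```

Once every source passes through a function like this, "Critical", "Emergency", and a vendor's numeric "1" all arrive downstream as the same number, and the original label is still preserved in severity_text.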

This isn’t just cleaner—it’s operationally transformative.

From weeks to hours: what standardization enables

Return to the federal agency scenario, but this time with OpenTelemetry-formatted telemetry:

Day 1: GeoServer exploitation generates telemetry in OpenTelemetry format. The EDR agent also outputs OpenTelemetry data. Network monitoring does the same.

Your telemetry pipeline receives all three data streams in the same format. Correlation is straightforward: all three records carry the same host.name, and their timestamps can be compared directly. The enrichment layer adds context (this is a public-facing asset with known CVE exposure) and routes a high-priority alert directly to the security team.

Detection time: hours, not weeks.

The difference isn’t smarter security teams or better tools. It’s infrastructure that doesn’t fight against itself.
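
The correlation step in this scenario can be sketched in a few lines of Python once every record shares the same shape. The record layout, the 60-second window, the host and service names, and the "two or more distinct sources" rule are all assumptions chosen for illustration, not a recommended detection policy.

```python
from collections import defaultdict

WINDOW_NS = 60 * 1_000_000_000  # assumed 60-second correlation window


def correlate(records, window_ns=WINDOW_NS):
    """Group records by host.name, then split each host's records into
    clusters whose consecutive timestamps are within window_ns."""
    by_host = defaultdict(list)
    for rec in records:
        by_host[rec["resource"]["host.name"]].append(rec)

    clusters = []
    for host, recs in by_host.items():
        recs.sort(key=lambda r: r["time_unix_nano"])
        current = [recs[0]]
        for rec in recs[1:]:
            if rec["time_unix_nano"] - current[-1]["time_unix_nano"] <= window_ns:
                current.append(rec)
            else:
                clusters.append((host, current))
                current = [rec]
        clusters.append((host, current))
    return clusters


def multi_source(clusters, min_sources=2):
    # Keep clusters in which at least min_sources distinct services reported.
    return [
        (host, recs)
        for host, recs in clusters
        if len({r["resource"]["service.name"] for r in recs}) >= min_sources
    ]


T0 = 1_721_124_000 * 1_000_000_000  # arbitrary example start time


def rec(host, service, offset_s):
    # Hypothetical record constructor for the sample data below.
    return {
        "time_unix_nano": T0 + offset_s * 1_000_000_000,
        "resource": {"host.name": host, "service.name": service},
    }


records = [
    rec("geo-01", "geoserver", 0),
    rec("geo-01", "edr-agent", 5),
    rec("geo-01", "network-monitor", 9),
    rec("web-02", "edr-agent", 3),
]
suspicious = multi_source(correlate(records))
```

Here the three geo-01 records fall into one cluster reported by three distinct sources, while the lone web-02 record is filtered out. No per-format parsing appears anywhere in the logic.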

Routing without translation layers

Standardized formats transform your routing options. With OpenTelemetry:

Send to multiple destinations without custom converters: When your SIEM, metrics platform, and data lake accept OpenTelemetry Protocol (OTLP), the same stream can feed all of them. No per-destination format translation required. No data loss from conversion errors.
Route based on content, not compatibility: Instead of "Can this destination parse this format?", you ask "Does this data belong here?" Route high-severity security events to your SIEM, performance metrics to your observability platform, and compliance-relevant logs to your audit system—all from the same standardized stream.
Failover without rebuilding infrastructure: When your primary analysis platform experiences an outage (like the SentinelOne incident), you can route data to alternative destinations immediately. The format is already compatible.
Enrich once, use everywhere: Add contextual information (asset criticality, user department, software version or patch level, threat intelligence) to OpenTelemetry data at collection time. That enriched context flows to every destination without additional processing.
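
A minimal sketch of content-based routing with enrichment and failover might look like the following. The destination names, the asset inventory, the custom attribute names (asset.criticality, asset.exposure, compliance.relevant), and the severity threshold are hypothetical; a real pipeline would express the same rules in its collector's configuration.

```python
# Hypothetical asset inventory used for enrichment at collection time.
ASSET_CONTEXT = {
    "geo-01": {"asset.criticality": "high", "asset.exposure": "public"},
}

# Each route is (destination, predicate on the record's content).
ROUTES = [
    ("siem", lambda r: r["severity_number"] >= 17),
    ("audit", lambda r: r["attributes"].get("compliance.relevant", False)),
    ("data_lake", lambda r: True),
]

# Assumed standby destination that accepts the same format as the primary.
FAILOVER = {"siem": "backup_siem"}


def enrich(record, inventory=ASSET_CONTEXT):
    """Return a copy of the record with asset context merged into attributes."""
    host = record["resource"].get("host.name")
    enriched = dict(record)
    enriched["attributes"] = {**record.get("attributes", {}), **inventory.get(host, {})}
    return enriched


def destinations(record, unavailable=frozenset()):
    """Pick destinations by content; swap in a standby if one is down."""
    chosen = []
    for name, matches in ROUTES:
        if not matches(record):
            continue
        if name in unavailable:
            name = FAILOVER.get(name)  # None when no standby is configured
        if name:
            chosen.append(name)
    return chosen


alert = enrich({
    "severity_number": 21,
    "resource": {"host.name": "geo-01"},
    "attributes": {},
})
normal_targets = destinations(alert)
outage_targets = destinations(alert, unavailable={"siem"})
```

Because the record format is the same for every destination, the failover entry is a one-line mapping rather than a new converter, and the enrichment added once is visible to whichever destination receives the record.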

Simplifying alert deduplication and correlation

Alert fatigue is a common problem, but format fragmentation makes it worse. When similar events arrive in different formats, deduplication becomes difficult:

Is severity: critical the same as priority: 1 or level: emergency?
Does src_ip correlate with source.address and network.peer.ip?
Are user, username, user_name, and account referring to the same entity?

Without standardization, your deduplication logic needs format-specific rules for each variation. Miss one, and you generate duplicate alerts for the same incident.

OpenTelemetry’s semantic conventions solve this. When every source uses source.address and user.id, deduplication logic works universally. One rule set, not dozens.
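
To illustrate what "one rule set" can mean in practice, here is a hedged Python sketch of a single deduplication key. It assumes every record already carries source.address, user.id, and event.name attributes; the five-minute bucket, the sample addresses, and the helper names are arbitrary choices for the example.

```python
BUCKET_NS = 300 * 1_000_000_000  # assumed five-minute deduplication bucket


def dedup_key(record, bucket_ns=BUCKET_NS):
    # One key definition works for every source because the attribute
    # names are the same everywhere.
    attrs = record["attributes"]
    return (
        attrs.get("source.address"),
        attrs.get("user.id"),
        attrs.get("event.name"),
        record["time_unix_nano"] // bucket_ns,
    )


def deduplicate(records):
    """Keep the first record seen for each key, preserving arrival order."""
    seen = {}
    for rec in records:
        seen.setdefault(dedup_key(rec), rec)
    return list(seen.values())


T0 = 1_721_124_000 * 1_000_000_000  # arbitrary example start time


def alert(offset_s, address, user):
    # Hypothetical alert constructor for the sample data below.
    return {
        "time_unix_nano": T0 + offset_s * 1_000_000_000,
        "attributes": {
            "source.address": address,
            "user.id": user,
            "event.name": "auth.failure",
        },
    }


alerts = [
    alert(0, "203.0.113.7", "svc-web"),    # reported by one source
    alert(30, "203.0.113.7", "svc-web"),   # same incident, second source
    alert(45, "198.51.100.4", "svc-web"),  # different origin address
]
unique = deduplicate(alerts)
```

Fixed buckets are the simplest possible choice and will occasionally split one incident across a bucket boundary; a sliding window would avoid that at the cost of a little more state.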

The federal agency that missed alerts for three weeks likely faced this problem: important signals buried under duplicates and variations that their systems couldn’t automatically consolidate.

The security posture impact

Format standardization isn’t just about operational efficiency—it directly improves security outcomes:

Faster detection: Correlation happens in real time, not after manual analysis.
Reduced alert fatigue: Deduplication works properly, so security teams see fewer, more accurate alerts.
Better context: Enrichment applies uniformly, giving analysts the information they need to assess severity quickly.
Improved incident response: When an incident spans multiple systems (like the three-week breach), investigators can follow the attack chain without translating between formats.
Audit and compliance: Standardized logs make compliance reporting straightforward. No custom queries for each log format.

The practical path forward

Moving from format chaos to OpenTelemetry standardization doesn’t require replacing your entire infrastructure overnight:

Start with new data sources: Configure new applications and systems to output OpenTelemetry format from day one.
Convert high-value sources first: Identify your most critical security telemetry sources—public-facing systems, authentication services, sensitive data stores—and prioritize converting them to OpenTelemetry.
Use an intelligent collector for translation: For legacy systems that can’t change output format, NXLog Agent can receive data in various formats and output standardized OpenTelemetry data, or vice versa if your bottleneck is on the receiving end.
Enrich during collection: Add contextual attributes—asset tags, environment labels, criticality ratings—as data enters your pipeline, not at analysis time.
Route intelligently: With standardized format, route based on content attributes (severity, service type, data classification) rather than format compatibility.

What this means for your pipeline

If you’re managing telemetry from diverse sources—and if you’re running any modern infrastructure, you are—format fragmentation is costing you visibility, speed, and security posture.

The federal agency that missed three weeks of attacker activity had EDR alerts. They had logs. They had telemetry. What they lacked was infrastructure that could turn that fragmented data into clear, actionable intelligence quickly enough to matter.

OpenTelemetry doesn’t solve every security challenge, but it removes a significant obstacle: the data format chaos that makes correlation slow, routing complex, and alert fatigue inevitable.

If you’re dealing with alert overload, struggling to correlate events across systems, or want to simplify your telemetry routing, our team can help you build a telemetry pipeline that leverages OpenTelemetry’s standardization while maintaining compatibility with your existing infrastructure.

NXLog Platform is an on-premises solution for centralized log management with versatile processing forming the backbone of security monitoring.

With our industry-leading expertise in log collection and agent management, we comprehensively address your security log-related tasks, including collection, parsing, processing, enrichment, storage, management, and analytics.
