Solving log collection challenges with Event Tracing for Windows
Event Tracing for Windows (ETW) logs kernel, application, and other system activity. Compared with other logging services on Windows, it provides more detailed data and uses fewer resources. By understanding the key characteristics of ETW, system administrators can make well-informed decisions about how to use the logs collected via ETW to improve IT security.
About Event Tracing for Windows
Since it became available in Windows 2000, ETW has provided more detailed information on the operating system environment and application interaction than other logging services on Windows. In addition, ETW does this with less overhead and higher efficiency.
The architecture of ETW is straightforward: an event provider (any user-mode application, managed application, or driver) writes events to ETW sessions. When events are written, ETW adds supplementary information about the event time, the process and thread ID that generated the event, the processor ID, and the CPU usage data of the logging thread. Event consumers then ingest this information from log files, by listening to tracing sessions in real time, or both. Consumers then continue with any other configured processing activities.
The following three independently functioning component types determine what is logged, when it is logged, and where the log events are collected, all with relatively little system overhead. Together, these components define an event tracing session.
Controllers | Controllers enable providers to log events to a session. They start, stop, and define event trace sessions, specify the session/log file name, location, and type, and define how to resolve date-time stamps.
Providers | Providers are applications equipped with event tracing instrumentation. When a controller enables them, they write log events to the session.
Consumers | Consumers read events from one or more event tracing sessions, either from stored log files or from sessions that deliver events in real time. In this context, the log collector(s) act as consumers, ingesting generated events from enabled providers.
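The division of labor between the three component types can be sketched as a toy model. This is not the ETW API: the `Session`, `Controller`, and `consume` names, the "Demo-Provider" provider, and the idea that a session silently rejects events from providers that are not enabled are all simplifying assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
import os
import threading


@dataclass
class Session:
    """Hypothetical stand-in for an event tracing session."""
    name: str
    enabled: set = field(default_factory=set)
    events: list = field(default_factory=list)

    def write(self, provider, payload):
        # Assumption of this model: events from providers that no
        # controller has enabled are rejected.
        if provider not in self.enabled:
            return False
        # The session, not the provider, stamps the common metadata.
        self.events.append({
            "provider": provider,
            "time": datetime.now(timezone.utc),
            "pid": os.getpid(),
            "tid": threading.get_ident(),
            "payload": payload,
        })
        return True


class Controller:
    """Starts sessions and enables or disables providers on them."""

    def __init__(self):
        self.sessions = {}

    def start(self, name):
        self.sessions[name] = Session(name)
        return self.sessions[name]

    def enable(self, session, provider):
        session.enabled.add(provider)

    def disable(self, session, provider):
        session.enabled.discard(provider)


def consume(session):
    """Consumer: read every event currently held by the session."""
    return list(session.events)


controller = Controller()
session = controller.start("demo-session")
accepted_before = session.write("Demo-Provider", {"msg": "ignored"})
controller.enable(session, "Demo-Provider")
accepted_after = session.write("Demo-Provider", {"msg": "recorded"})
collected = consume(session)
```

Because the controller, provider, and consumer only meet at the session object, any of them can be swapped or restarted without touching the others, which is the property the component descriptions above emphasize.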
Apart from regular event tracing sessions, special purpose tracing sessions are also available, such as Private Logger, System Trace, NT Kernel Logger, AutoLogger, and Global Logger. These sessions have predefined data and location settings and often provide the only way to access certain data, such as when events are created early in a system’s boot process.
Event trace data
ETW events include rich metadata, localizable message strings, and schematized data payloads.
They can be read from memory or saved to a file after being decoded on the source system.
The Event Trace Log (ETL) file has a binary format, and its content must also be decoded for viewing.
Windows provides a GUI (Event Viewer) and a command-line utility (tracerpt) for this purpose, as well as other event tracing tools such as logman.
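As a rough illustration of the decoding step, the helper below assembles a tracerpt command line that converts an ETL file to another format. The function name and file paths are hypothetical, and the sketch only builds the argument list; it does not run anything. It assumes the `-o` (output file), `-of` (output format), and `-y` (answer yes to prompts) options of tracerpt, which should be checked against the documentation for the Windows version in use.

```python
def tracerpt_command(etl_path, out_path, fmt="XML"):
    """Build an argument list for tracerpt without executing it."""
    if fmt not in {"XML", "CSV", "EVTX"}:
        raise ValueError(f"unsupported format: {fmt}")
    # -o names the output file, -of selects its format,
    # -y answers yes to prompts such as overwrite confirmation.
    return ["tracerpt", etl_path, "-o", out_path, "-of", fmt, "-y"]


# Hypothetical file names, for illustration only.
cmd = tracerpt_command("trace.etl", "trace.xml")
```

On a Windows host, the resulting list could be passed to `subprocess.run(cmd, check=True)`.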
Below is an excerpt from an example of an event trace in XML format exported from an ETL file:
<EventData>
<Data Name="BufferSize">16384</Data>
<Data Name="Version">83951626</Data>
<Data Name="ProviderVersion">14393</Data>
<Data Name="NumberOfProcessors">1</Data>
<Data Name="MaxFileSize">2</Data>
<Data Name="LogFileMode">0x82</Data>
<Data Name="BuffersWritten">3</Data>
<Data Name="StartBuffers"> 1</Data>
<Data Name="PointerSize">8</Data>
<Data Name="CPUSpeed">2400</Data>
<Data Name="SessionNameString">WdiContextLog</Data>
<Data Name="LogFileNameString">C:\Windows\System32\WDI\LogFiles\WdiContextLog.etl.002</Data>
</EventData>
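Once a trace has been exported to XML, the payload is straightforward to turn into structured fields. The following is a minimal sketch using Python's standard library on a shortened copy of the excerpt above; the `parse_event_data` name is made up for this example. A full export may declare an XML namespace, in which case the tag lookup would need adjusting.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the excerpt above.
SAMPLE = """<EventData>
<Data Name="BufferSize">16384</Data>
<Data Name="LogFileMode">0x82</Data>
<Data Name="StartBuffers"> 1</Data>
<Data Name="SessionNameString">WdiContextLog</Data>
</EventData>"""


def parse_event_data(xml_text):
    """Return {name: value}, converting numeric strings to int."""
    fields = {}
    for data in ET.fromstring(xml_text).iter("Data"):
        text = (data.text or "").strip()
        try:
            # base=0 accepts decimal ("16384") and hex ("0x82") literals.
            value = int(text, 0)
        except ValueError:
            # Anything that is not an integer literal stays a string.
            value = text
        fields[data.get("Name")] = value
    return fields


fields = parse_event_data(SAMPLE)
```

Stripping whitespace before conversion handles padded values such as the `StartBuffers` entry.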
Note: Data availability and the functional particulars of an event session may vary depending on the operating system version. Please refer to the Event Tracing for Windows documentation for details.
The advantages of ETW
ETW has matured into a well-established and valuable technology. Below are some of the key points in its favor.
- Provides more in-depth data
ETW provides more thorough, detailed, and timely data than its log management predecessors on Windows. Since the release of Windows Vista, Windows Event Log and ETW have used the same API to log events, giving administrators direct access both to data previously collected via Event Logging and to data that was previously unavailable.
For example, the Microsoft-Windows-PowerShell ETW Provider records which command was run as part of its payload, whereas Windows Event Log merely shows that PowerShell was invoked.
- Widens log data reach
Often, ETW is the only way to access certain logs. DNS logs, for instance, are captured from the Microsoft-Windows-DNSServer ETW provider by enabling Analytical Logging. This is the preferred method for collecting logs from Windows Server 2012 R2 (with hotfix 2956577) and later, but it is not available in earlier versions of Windows Server.
- Consumes fewer system resources
ETW uses kernel-level providers and manages memory buffers per session, consuming considerably fewer resources than other logging technologies on Windows. This is especially important when deployed in production environments.
- Built from independent components
ETW allows the dynamic updating of sessions. As the components (controllers, providers, and consumers) function independently, they can be stopped, started, and updated without rebooting the system or restarting the application(s) of interest. Troubleshooting live environments is practically unthinkable without this flexibility.
The challenges of ETW
These are all good reasons to use ETW, but it also introduces some challenges for system administrators to consider.
- Technology changes
ETW has undergone substantial changes with each major release since its introduction in 2000.
The updates added new functions and data structures, typically available only in the updated version of the operating system. An excellent example of this is the format of log files: the original Managed Object Format (MOF) was mostly superseded by the Trace Message Format (TMF), which in turn gave way to the manifest-based XML format. Some earlier components even required a separate Program Database (PDB) file to decode events. The newest format, TraceLogging (TL), was introduced with Windows 10 and embeds decoding information with the recorded logs.
Managing log collection across multiple file formats increases the overhead for system administrators.
- Tooling complexity
During ETW’s lifespan, tools for collecting, inspecting, analyzing, and processing event tracing data have proliferated to keep up with its evolving functional and structural changes.
Security analysis tasks such as pattern matching and correlation require the data used by these tools to be structured and centralized. However, due to the limited GUI availability and inconsistent CLI tooling options, managing ETW data can quickly become time-consuming and expensive.
- Capturing issues
ETW is a "best effort" framework and does not guarantee that all events are captured. Any factor that reduces the logging performance of a given system, such as low available memory or slow storage, is a potential source of problems for reliably capturing event traces.
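To make the "best effort" behavior concrete, the sketch below models a session with a fixed number of buffer slots and a slow writer. It is a deliberately simplified, hypothetical model rather than a description of how ETW manages its buffers: the `BoundedSession` class, the slot count, and the flush cadence are all assumptions chosen to show how events are lost when they are produced faster than they can be drained.

```python
from collections import deque


class BoundedSession:
    """Toy session with a fixed number of buffer slots (hypothetical)."""

    def __init__(self, slots):
        self.buffer = deque()
        self.slots = slots
        self.events_lost = 0

    def write(self, event):
        # When every slot is full, the event is dropped rather than queued.
        if len(self.buffer) >= self.slots:
            self.events_lost += 1
            return False
        self.buffer.append(event)
        return True

    def flush(self, count):
        """Simulate the writer draining up to `count` events to storage."""
        drained = []
        while self.buffer and len(drained) < count:
            drained.append(self.buffer.popleft())
        return drained


session = BoundedSession(slots=4)
written = []
for i in range(10):
    session.write(i)
    if i % 5 == 4:
        # Slow storage: only one flush for every five events produced.
        written.extend(session.flush(4))
```

In this run, events 4 and 9 arrive while all four slots are occupied and are counted as lost. Larger buffers or faster storage shrink the gap, but they do not turn a best-effort pipeline into a guaranteed one.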
ETW in real-world threat hunting
Administrators are already aware that attack tactics to cover malicious tracks, such as clearing the Windows Event Log, emit their own telemetry in the form of additional Event IDs. Knowing this, administrators proactively monitor and alert on these Event IDs. However, this monitoring can fall short if attackers tamper with ETW itself, because doing so interferes with the flow of logging.
A technique covered by Palantir involves removing an ETW Provider from a trace session. As a result, the targeted Event Log is no longer supplied with the correct system telemetry. For administrators who rely solely on Windows Event Log-based threat detection, this trick will likely go unnoticed.
There are a few options to mitigate the effects of tampering with ETW, as this technique can be detected early if there is a centralized log collection strategy in place to collect logs from the relevant ETW Provider. By collecting event traces with the capabilities NXLog Platform offers, administrators can collect and forward traces to a third-party suite, such as a SIEM, for further monitoring and incident response.
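One way to think about detecting this kind of tampering is as a comparison between the providers a session is expected to carry and the providers it currently carries. The sketch below is illustrative only: the `missing_providers` function, the session names, the "Demo-Provider" entry, and the baseline and snapshot data are all hypothetical, and how the snapshot is gathered on a real system is left out.

```python
def missing_providers(expected, observed):
    """Return expected providers that are absent from each session."""
    findings = {}
    for session, providers in expected.items():
        gone = set(providers) - set(observed.get(session, ()))
        if gone:
            findings[session] = sorted(gone)
    return findings


# Hypothetical baseline and snapshot; names are illustrative only.
baseline = {
    "EventLog-Application": {"Microsoft-Windows-PowerShell", "Demo-Provider"},
    "DnsAudit": {"Microsoft-Windows-DNSServer"},
}
snapshot = {
    "EventLog-Application": {"Demo-Provider"},
    "DnsAudit": {"Microsoft-Windows-DNSServer"},
}
alerts = missing_providers(baseline, snapshot)
```

A non-empty result is a signal worth forwarding to a SIEM, since a provider disappearing from a session would otherwise look like nothing more than a quiet log.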
ETW also supports other attack detection and defense operations. Domain Name System (DNS) log collection helps mitigate attacks involving DNS, and telemetry collected from critical applications such as Windows Firewall further informs threat detection and operational support.
Using NXLog Platform as a single solution to collect and manage ETW logs
To fully leverage the benefits of ETW and address the challenges associated with its usage, NXLog Platform provides a comprehensive solution for centralized log collection. It collects, processes, forwards, and stores ETW logs with the following features:
- It can act both as a controller to manage ETW sessions and as a consumer to ingest and process logs in real time or from log files. This eliminates the need for separate tools and enables dynamic session management without disrupting system operations.
- It can forward ETW trace data without storing it on disk, keeping system resource usage to a minimum. This capability makes NXLog Platform well suited to high-throughput environments where performance is critical.
- It incorporates advanced parsing, formatting, and conversion capabilities, allowing administrators to enrich raw ETW data for further analysis and making it easier to integrate with other security tools, such as SIEMs.
- It collects logs from ETW providers centrally, enabling the correlation of data from multiple sources. This helps administrators detect patterns and respond to incidents quickly, using existing infrastructure like SIEMs or other log management systems.
- It offers easy setup for your ETW log collection, ensuring smooth integration.
- It accommodates high-performance data storage to achieve unparalleled efficiency for your collected ETW log data.
NXLog Platform does it all through its flexible configuration options, providing a scalable solution for managing ETW log collection.
Some solutions rely on Microsoft-provided tools to capture the ETW trace to disk before decoding the trace file into a human-readable format for further processing and analysis. These methods are inefficient and unreliable as they consume system resources in both the storage requirements of capturing the trace to disk and in processing or decoding the trace file.
NXLog Platform allows administrators to use a single solution for all their ETW log collection requirements, which will collect ETW data at the source, then further parse, format, and forward the data as needed.
With structured log enrichment and centralized log collection, administrators can easily correlate data, create patterns, and visualize event data provided by the sources. They can use their already established infrastructure, such as SIEMs, log management suites, and other dashboards, for data analysis, reporting, alerting, and incident response based on the insights provided by NXLog Platform.