Wrong character encodings in localized Windows events

Thorsten1

I am trying to collect Windows 11 events on a localized system (in my case: German) and send them to Logstash.

Sometimes German umlauts in the value fields are converted incorrectly. This means that the JSON is no longer valid and Logstash cannot parse it.

For example, the “Domain” key contains the German word NT-AUTORITÄT (umlaut before the last T). This is written as “AUTORIT0xC4T”, where 0xC4 appears to be the low byte of U+00C4, the Unicode code point of Ä (it is also the single-byte Latin-1/Windows-1252 encoding of Ä). The correct UTF-8 encoding is the two-byte sequence 0xC3 0x84. Interestingly, lowercase umlauts are converted correctly, at least in the “Message” field.

Because 0xC4 alone is not a valid UTF-8 sequence, this cannot work. The parser in Logstash expects a continuation byte after it, does not find one, and fails.
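To make the byte-level difference concrete, here is a minimal Python sketch (an illustration only, not part of the NXLog setup; the helper name `is_valid_utf8` is made up for this example). It encodes the same string once as UTF-8 and once as Latin-1 and checks whether each result decodes as UTF-8.

```python
# Illustration only: how "Ä" (U+00C4) is represented in UTF-8 versus Latin-1.
text = "NT-AUTORITÄT"

utf8_bytes = text.encode("utf-8")      # Ä becomes the two bytes 0xC3 0x84
latin1_bytes = text.encode("latin-1")  # Ä becomes the single byte 0xC4


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(utf8_bytes.hex(" "))          # hex dump of the UTF-8 form
print(latin1_bytes.hex(" "))        # hex dump of the Latin-1 form
print(is_valid_utf8(utf8_bytes))    # the UTF-8 form decodes cleanly
print(is_valid_utf8(latin1_bytes))  # 0xC4 is followed by "T", not a continuation byte
```

The Latin-1 form matches what shows up in the “Domain” value, which would suggest that this field is written in a single-byte code page while the rest of the record is already UTF-8.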

I played around with “AutodetectCharsets” of xm_charconv in the nxlog.conf file: nothing changed. Then I set "convert_fields("auto", "utf-8");" within the <Input> block: that did not change anything either. And then I set "convert_fields("utf-8", "utf-8");" within the <Input> block. That fixed the wrong Ä in AUTORITÄT, but broke all lowercase umlauts.

This is my NXLog configuration:

<Extension json_encoder>
   Module      xm_json
</Extension>
<Input eventlog>
   Module      im_msvistalog
   Exec        $Message = replace($Message, "\r\n", " ");
   <QueryXML>
       <QueryList>
           <Query Id="0" Path="Application">
               <Select Path="Application">*</Select>
           </Query>
           <Query Id="1" Path="System">
               <Select Path="System">*</Select>
           </Query>
           <Query Id="2" Path="Security">
               <Select Path="Security">*</Select>
           </Query>
           <Query Id="3" Path="Setup">
               <Select Path="Setup">*</Select>
           </Query>
       </QueryList>
   </QueryXML>
</Input>
<Output out>
   Module      om_tcp
   Host        10.10.2.10
   Port        5000
   Exec        to_json();
</Output>
<Output localfile>
   Module  om_file
   File 'C:\nxlog.txt'
   Exec to_json();
</Output>
<Route route1>
   Path eventlog => localfile
</Route>

And this is an example of a faulty line in C:\nxlog.txt:

{"EventTime":"2025-05-05 18:27:39","Hostname":"Frodo","Keywords":-9223372036854775808,"EventType":"INFO","SeverityValue":2,"Severity":"INFO","EventID":37,"SourceName":"Microsoft-Windows-Time-Service","ProviderGuid":"{06EDCFEB-0FD0-4E53-ACCA-A6F8BBF81BCB}","Version":0,"Task":0,"OpcodeValue":0,"RecordNumber":163093,"ProcessID":17416,"ThreadID":9680,"Channel":"System","Domain":"NT-AUTORITĔ","AccountName":"Lokaler Dienst","UserID":"S-1-5-19","AccountType":"Well Known Group","Message":"Der Zeitanbieter \"NtpClient\" empfängt derzeit gültige Zeitdaten von pool.ntp.org,0x9 (ntp.m|0x9|0.0.0.0:123->194.164.164.175:123).","Opcode":"Info","TimeSource":"pool.ntp.org,0x9 (ntp.m|0x9|0.0.0.0:123->194.164.164.175:123)","EventReceivedTime":"2025-05-05 18:27:41","SourceModuleName":"eventlog","SourceModuleType":"im_msvistalog"}
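For anyone who wants to check such a line without relying on how an editor renders it, the following is a minimal Python sketch that reports every byte that breaks UTF-8 decoding. The function name `find_invalid_utf8` and the shortened sample bytes are assumptions made for illustration; the sample is not copied from the real file.

```python
def find_invalid_utf8(data: bytes) -> list[tuple[int, int]]:
    """Return (offset, byte value) for each byte that breaks UTF-8 decoding."""
    bad = []
    pos = 0
    while pos < len(data):
        try:
            data[pos:].decode("utf-8")
            break  # the remainder is clean UTF-8
        except UnicodeDecodeError as err:
            offset = pos + err.start        # absolute position of the bad byte
            bad.append((offset, data[offset]))
            pos = offset + 1                # continue scanning after it
    return bad


# Hypothetical, shortened sample: a lone 0xC4 in "Domain",
# but a proper UTF-8 "ä" (0xC3 0xA4) in "Message".
sample = b'"Domain":"NT-AUTORIT\xc4T","Message":"empf\xc3\xa4ngt"'

for offset, value in find_invalid_utf8(sample):
    print(f"invalid byte 0x{value:02X} at offset {offset}")
```

To run this against the real output, the file would have to be opened in binary mode (for example `open(r"C:\nxlog.txt", "rb")`) so that Python does not attempt any decoding of its own.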

It is not easy to show encoding problems in a post, but in Notepad++ the output is shown like this:

And as you can see, there are valid lowercase umlauts in the “Message” field: “empfängt derzeit gültige”.

How can I fix this? :(

Thank you very much!

   Thorsten