Parse log with unicode characters hanging out
Tags:
NXLog Enterprise Edition
#1
cschelin
I'm attempting to parse a Cerberus FTP log file. What I wind up with:
{ "EventReceivedTime": "2024-08-01 16:11:37", "SourceModuleName": "cerberus_log", "SourceModuleType": "im_file", "message": "[\u00002\u00000\u00002\u00004\u0000-\u00000\u00008\u0000-\u00000\u00001\u0000 \u00001\u00006\u0000:\u00001\u00001\u0000:\u00003\u00006\u0000]\u0000:\u0000C\u0000O\u0000N\u0000N\u0000E\u0000C\u0000T\u0000 \u0000[\u00001\u00005\u00002\u00004\u00009\u00002\u0000]\u0000 \u0000-\u0000 \u0000C\u0000o\u0000n\u0000n\u0000e\u0000c\u0000t\u0000i\u0000o\u0000n\u0000 \u0000t\u0000e\u0000r\u0000m\u0000i\u0000n\u0000a\u0000t\u0000e\u0000d\u0000" }
I've tried this, to no avail:
<Input cerberus_log>
    Module  im_file
    File    "C:\ProgramData\Cerberus LLC\Cerberus FTP Server\log\server.1.log"
    <Exec>
        $message = convert($raw_event, "utf-8", "iso8859-2");
        if $message =~ s/(.)\\u0000// $message = $1;
        to_json();
    </Exec>
</Input>
How can I properly parse the log to remove the \u0000 characters before it goes out?
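One possible reading of the output above: a NUL after every visible character is what UTF-16LE text looks like when each byte is treated as its own character. The Python sketch below only illustrates that idea. It assumes the Cerberus log is UTF-16LE on disk, which nothing in this thread confirms, and the sample string is the message above with the \u0000 escapes removed.

```python
# Assumption: the log file is UTF-16LE on disk (not confirmed in this thread).
sample = "[2024-08-01 16:11:36]:CONNECT [152492] - Connection terminated"

# Bytes as they would sit in a UTF-16LE file: each ASCII character
# is followed by a 0x00 byte.
raw = sample.encode("utf-16-le")

# What a byte-oriented reader ends up with: one character per byte,
# so the NULs survive and later show up as \u0000 in JSON.
as_single_bytes = raw.decode("latin-1")

# Decoding with the matching source encoding recovers the clean text.
decoded = raw.decode("utf-16-le")

print(repr(as_single_bytes[:8]))
print(decoded)
```

If that assumption holds, the conversion would need to start from UTF-16LE rather than from utf-8 as in the convert() call above. I have not verified how NXLog splits lines in such a file, so treat this as a direction to check, not a confirmed fix.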
#2
cschelin
Additional note: when I use this instead:
$message = replace($message, "\u0000", "");
I get this:
{"EventReceivedTime":"2024-08-05 15:13:23","SourceModuleName":"cerberus_log","SourceModuleType":"im_file","message":"["}
In fact, I get that same result no matter what I pass as the search string (the second argument to replace()).
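A guess at why only "[" survives: if the string is handled as a NUL-terminated C string somewhere along the way, everything after the first NUL is dropped, whatever the search string is. The Python sketch below mimics that behavior. The helper name c_string_view is hypothetical, and the truncation itself is an assumption about NXLog internals, not something documented in this thread.

```python
# Assumption: somewhere in the pipeline the value is treated as a
# NUL-terminated string, so it ends at the first 0x00.
text = "[2024-08-01 16:11:36]:CONNECT [152492] - Connection terminated"

# Same shape as the message in post #1: a NUL after every character.
message = text.encode("utf-16-le").decode("latin-1")

def c_string_view(s: str) -> str:
    """Return the part of s before the first NUL (hypothetical helper)."""
    return s.split("\x00", 1)[0]

print(c_string_view(message))  # prints "["
```

That would match the output above, where the message is cut right after the opening bracket.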