to_json and special characters

#1 vguyard
Hello,

I have a question regarding the xm_json module of nxlog-ce v2.10. I am sending Windows logs to our syslog server, using a JSON message with a BSD header like so:

```
Module im_msvistalog
* * *
$SyslogFacilityValue = syslog_facility_value("local1");
Module om_udp
Host 10.10.231.62
port 514
$Hostname = string(host_ip());
$Keywords = string($Keywords);
$Message = to_json();
$Message =~ s/}$/,"field":"value"}\n/g;
$Message =~ s/\\[r|n|t]/ /g;
$Message =~ s/\s{2,}/ /g;
to_syslog_bsd();
```

So on output I convert the message to JSON, add an extra field to the end of it, remove the \t, \r, \n characters in the message, and finally clean up the extra whitespace left by the previous substitution.

This has the side effect of modifying any string that contains a \t, \r or \n sequence. Typically the string **"A user DOMAIN\ruser1"** is changed to **"A user DOMAIN\ user1"** (space after the backslash), mangling the JSON string in the process.

To prevent this, I changed the output to the following:

```
Module om_udp
Host 10.10.231.62
port 514
$Hostname = string(host_ip());
$Keywords = string($Keywords);
$Message = replace($Message, "\r", " ");
$Message = replace($Message, "\n", " ");
$Message = replace($Message, "\t", " ");
$Message = to_json();
$Message =~ s/}$/,"field":"value"}\n/g;
$Message =~ s/\\r\\n\\t\\t\\t/ /g;
$Message =~ s/\s{2,}/ /g;
# $Message =~ s/\\[r|n|t]/ /g;
to_syslog_bsd();
```

This time the substitutions are done before converting to JSON. With this configuration, when `to_json();` is executed I see on **eventID 4672** that the **privilegelist** field still contains a **\r\n\t\t\t** sequence. I would have thought that the replace actions would have gotten rid of those. Is this expected behavior, or am I doing this the wrong way?

For the moment I added `$Message =~ s/\\r\\n\\t\\t\\t/ /g;` to get rid of this specific sequence, but how can I be sure that other messages are not affected by another sequence of tabs and carriage returns?

Thanks for your time.

Vincent
#2 b0ti Nxlog ✓

So on output I convert the message to json, then add an extra field to the end of it, then remove the \t, \r, \n characters in the message

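When the record is converted to JSON, the tab, carriage return and newline characters become escape sequences: each one is written as two characters, a backslash followed by t, r or n.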
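To make the escaping concrete, here is a small sketch in Python rather than NXLog (Python's `json` module only stands in for `to_json()`, and the sample record is made up). It shows that a real tab becomes the two characters `\t`, that a literal backslash becomes `\\`, and why a regex substitution applied after serialization turns `DOMAIN\\ruser1` into `DOMAIN\ user1`, which is no longer valid JSON.

```python
import json
import re

# Hypothetical record: a real CR, LF and tab, plus a literal backslash
# followed by the letter r (as in DOMAIN\ruser1).
record = {"Message": "Line one\r\n\tA user DOMAIN\\ruser1"}

# Serialization escapes control characters and doubles the backslash.
encoded = json.dumps(record)

# Substitution applied after serialization, similar in spirit to
# s/\\[r|n|t]/ /g from the question: it cannot tell an escaped control
# character from an escaped backslash followed by r, n or t.
naive = re.sub(r"\\[rnt]", " ", encoded)


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(encoded)
print(naive)
print(is_valid_json(encoded), is_valid_json(naive))
```

The second printed line contains `DOMAIN\ user1`: the regex matched the second backslash of `\\` together with the `r`, leaving a lone backslash before a space.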

Perhaps you are looking for something like this:

```
$Message = to_json();
$Message = replace($Message, '\t', " ");
```

Note that '\t' is two characters (i.e. the escape sequence for the tab character) and "\t" is a single character (i.e. the actual tab).
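The same distinction can be sketched in Python (again only as a stand-in for NXLog syntax), along with one possible way to avoid guessing at specific sequences: clean every string field before serialization instead of patching the JSON text afterwards. This assumes the leftover `\r\n\t\t\t` comes from a field other than `$Message`, which the thread does not confirm. The field names, values and the `clean_record` helper below are hypothetical.

```python
import json
import re

ESCAPE_SEQUENCE = r"\t"  # two characters: backslash and 't' (like '\t' in NXLog)
TAB_CHARACTER = "\t"     # one character: the actual tab (like "\t" in NXLog)


def clean_record(record):
    """Collapse runs of CR, LF and tab to a single space in every string field."""
    return {
        key: re.sub(r"[\r\n\t]+", " ", value) if isinstance(value, str) else value
        for key, value in record.items()
    }


# Hypothetical event; the values are invented for illustration.
event = {
    "EventID": 4672,
    "Message": "Special privileges assigned.\r\n\tA user DOMAIN\\ruser1",
    "PrivilegeList": "SeSecurityPrivilege\r\n\t\t\tSeBackupPrivilege",
}

# Clean the real control characters first, then serialize.
cleaned_json = json.dumps(clean_record(event))

print(len(ESCAPE_SEQUENCE), len(TAB_CHARACTER))
print(cleaned_json)
```

Because the cleanup runs on the actual control characters before they are escaped, the literal backslash in `DOMAIN\ruser1` is never touched and the output remains valid JSON.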