Parsing delimited log files with regex
Hi, I'm using nxlog v2.9.1716.
I've created the following input:
<Input in>
Module im_file
File "C:\Program Files\LogFiles\*.log"
SavePos TRUE
Recursive TRUE
Exec if $raw_event =~ /^#/ drop();
Exec if $raw_event =~ ^([^;]+);([^;]+);([^;]+)(?>;([^;]+);([^;]+);([^;]+);([^;]+);([^;]+);([^;]+);([^;]+);([^;]+);([^;]+);(.+)$)?/gx; \
{ \
$date = $1; \
$time = $2; \
$site-instance = $3; \
$event = $4; \
$client-ip = $5; \
$via-header = $6; \
$http-x-forwarded-for = $7; \
$host-header = $8; \
$additional-info-1 = $9; \
$additional-info-2 = $10; \
$additional-info-3 = $11; \
$additional-info-4 = $12; \
$additional-info = $13; \
$EventTime = parsedate($date + " " + $time); \
$SourceName = "WAF"; \
}
</Input>
The regex being used has been successfully tested with https://regex101.com/
Sample data below:
2018-06-28 ; 10:23:52 ; W3SVC2 ; OnPreprocHeaders ; 10.10.10.10 ; ; 8.8.8.8 ; my.domain.com ; GET ; /account/login ; ALERT: '/account/' not allowed in URL ; HTTP/1.0 ; 0 ; ; Actional Intermediary
When I start the nxlog service, I get the following error:
2018-06-28 16:44:51 ERROR Couldn't parse Exec block at C:\Program Files (x86)\nxlog\conf\nxlog.conf:89; couldn't parse statement at line 89, character 24 in C:\Program Files (x86)\nxlog\conf\nxlog.conf; syntax error
2018-06-28 16:44:51 ERROR module 'in' has configuration errors, not adding to route '2' at C:\Program Files (x86)\nxlog\conf\nxlog.conf:116
2018-06-28 16:44:51 ERROR route 2 is not functional without input modules, ignored at C:\Program Files (x86)\nxlog\conf\nxlog.conf:116
2018-06-28 16:44:51 WARNING not starting unused module in
2018-06-28 16:44:51 INFO nxlog-ce-2.9.1716 started
2018-06-28 16:44:51 INFO reconnecting in 1 seconds
I also tried the following:
<Input in>
Module im_file
File "C:\Program Files\AQTRONIX Webknight\LogFiles\*.log"
SavePos TRUE
Recursive TRUE
<Exec>
if $Message =~ /^#/ drop();
$Message =~ ^(?<date>[^;]+);(?<time>[^;]+);(?<site_instance>[^;]+)(?>;(?<event>[^;]+);(?<client_ip>[^;]+);(?<via_header>[^;]+);(?<http_x_forwarded_for>[^;]+);(?<host_header>[^;]+);(?<additional_info_1>[^;]+);(?<additional_info_2>[^;]+);(?<additional_info_3>[^;]+);(?<additional_info_4>[^;]+);(?<additional_info>.+)$)? /gx;
</Exec>
</Input>
But I receive the following error on starting nxlog:
2018-06-28 17:15:54 ERROR Couldn't parse Exec block at C:\Program Files (x86)\nxlog\conf\nxlog.conf:70; couldn't parse statement at line 72, character 15 in C:\Program Files (x86)\nxlog\conf\nxlog.conf; syntax error
2018-06-28 17:15:54 ERROR module 'in' has configuration errors, not adding to route '2' at C:\Program Files (x86)\nxlog\conf\nxlog.conf:100
2018-06-28 17:15:54 ERROR route 2 is not functional without input modules, ignored at C:\Program Files (x86)\nxlog\conf\nxlog.conf:100
2018-06-28 17:15:54 WARNING not starting unused module in
2018-06-28 17:15:54 INFO nxlog-ce-2.9.1716 started
I tried various syntax changes, but just cannot see the issue.
This is the first time I've tried using a regex with nxlog.
Any help or guidance much appreciated.
As per the user guide on fields:
89.2.3. Fields
Fields are referenced in the NXLog language by prepending a dollar sign ($) to the field name. A field name can contain the characters [a-zA-Z0-9_.] but must begin with a letter or underscore (_), as indicated by the following regular expression:
[[:alpha:]_][[:alnum:]._]*
Fields containing special characters such as the space or minus (-) can be specified using curly braces such as ${file-size} or ${file size}.
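For instance:
$site-instance = $3;
needs to be either ${site-instance} = $3;
or $site_instance = $3;
The same applies to every other hyphenated field name in your config, such as $client-ip and $http-x-forwarded-for.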
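As a minimal sketch of the two options, here are a couple of the assignments from the question written both ways. The statements would sit inside the Exec block of the Input section. Whether the curly-brace form is accepted by a particular NXLog version is an assumption here; the underscore form relies only on the basic naming rule quoted above.

```
# Option 1: keep the hyphenated names and wrap them in curly braces.
${site-instance} = $3;
${client-ip} = $5;

# Option 2: rename the fields so they only use [a-zA-Z0-9_.].
$site_instance = $3;
$client_ip = $5;
```

Only one of the two options is needed; mixing them would create two separate sets of fields.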
Using <Exec> blocks makes it easier to identify the line and position of errors.
Your regex also needs to start with /. Example: if $raw_event =~ /^([^;]
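To make the delimiter point concrete, here is a shortened sketch that matches only the first three fields. The pattern is deliberately cut down from the one in the question, and the underscore field names are an illustrative choice.

```
<Exec>
    # The pattern is wrapped in /.../; $1..$3 hold the captured substrings.
    if $raw_event =~ /^([^;]+);([^;]+);([^;]+)/
    {
        $date = $1;
        $time = $2;
        $site_instance = $3;
    }
</Exec>
```

Because this is an Exec block rather than a single Exec line, no trailing backslashes are needed, and a parse error is reported against the actual line of the failing statement.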
I believe your last issue is the /gx modifiers: $raw_event holds only one event at a time and you are matching the entire event, so /gx shouldn't be needed.
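Putting the points above together, the first Input section from the question might look roughly like the sketch below. It reuses the original pattern unchanged apart from the added / delimiters and the removed /gx. The underscore field names are one possible choice. The semicolon that followed the regex in the original is also dropped, on the assumption that it ended the if statement before its block. This has not been run against 2.9.1716, so treat it as a starting point rather than a verified config.

```
<Input in>
    Module    im_file
    File      "C:\Program Files\LogFiles\*.log"
    SavePos   TRUE
    Recursive TRUE
    <Exec>
        # Skip header lines that start with '#'.
        if $raw_event =~ /^#/ drop();

        # Pattern delimited by /.../, with no modifiers and no ';'
        # between the condition and the block.
        if $raw_event =~ /^([^;]+);([^;]+);([^;]+)(?>;([^;]+);([^;]+);([^;]+);([^;]+);([^;]+);([^;]+);([^;]+);([^;]+);([^;]+);(.+)$)?/
        {
            $date = $1;
            $time = $2;
            $site_instance = $3;
            $event = $4;
            $client_ip = $5;
            $via_header = $6;
            $http_x_forwarded_for = $7;
            $host_header = $8;
            $additional_info_1 = $9;
            $additional_info_2 = $10;
            $additional_info_3 = $11;
            $additional_info_4 = $12;
            $additional_info = $13;
            $EventTime = parsedate($date + " " + $time);
            $SourceName = "WAF";
        }
    </Exec>
</Input>
```

Note that with the sample line the captured values keep the spaces around each semicolon, so $date and $time may need trimming before parsedate can read them; that step is left out here.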