multiline extension not getting the endline regex condition
Hi,
I am trying to parse a log4net file into json.
Here's my sample log4net:
----------------
2015-01-27 01:06:18,859 [7] ERROR Web.Cms.Content.Base.Taxonomy.TaxonomyDetectionProvider [(null)] - Get taxonomy Type Failed for Tools
2015-01-27 06:34:31,051 [26] ERROR www.Status404 [(null)] - ErrorId: 20150127_102b01c6-3208-48c5-8c8b-ae4f92cf2b20
UserAgent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.99 Safari/537.36
HostAddress: 192.168.10.2
RequestUrl: /ErrorPages/404.aspx
MachineName: QA01
Raw Url:/undefined/
Referrer: http://qa1.www.something.com/toolset.aspx
2015-01-27 06:34:33,270 [26] DEBUG Web.Caching.Core.CacheManagerBase [(null)] - Custom CacheProvider:Web.Caching.Core.AppFabricCacheManager,Web.Caching.Core Disabled
Now I am using xm_multiline to capture each log entries.
----------------
<Extension multiline>
Module xm_multiline
HeaderLine /^\d{4}\-\d{2}\-\d{2} \d{2}\:\d{2}\:\d{2},\d{3}/
EndLine /\r?\n\r?\n^\d{4}\-\d{2}\-\d{2} \d{2}\:\d{2}\:\d{2},\d{3}/
</Extension>
I use a regex to capture the timestamp as the header then I use a regex to capture twice newline then the next timestamp as endline. However it still treat the second and last entry as ONE log entry.
Here's the output:
----------------
{
"EventReceivedTime":"2015-01-27 01:06:35",
"SourceModuleName":"log4net",
"SourceModuleType":"im_file",
"time":"2015-01-27 01:06:18,859",
"thread":"7",
"level":"ERROR",
"logger":"Web.Cms.Content.Base.Taxonomy.TaxonomyDetectionProvider",
"ndc":"(null)",
"message":"Get taxonomy Type Failed for Tools"
}{
"EventReceivedTime":"2015-01-27 06:34:35",
"SourceModuleName":"log4net",
"SourceModuleType":"im_file",
"time":"2015-01-27 06:34:31,051",
"thread":"26",
"level":"ERROR",
"logger":"www.Status404",
"ndc":"(null)",
"message":" ErrorId: 20150127_102b01c6-3208-48c5-8c8b-ae4f92cf2b20\r\n UserAgent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.99
Safari/537.36\r\n HostAddress: 192.168.10.2\r\n RequestUrl: /ErrorPages/404.aspx\r\n MachineName: QA01\r\n
Raw Url:/undefined/\r\n Referrer: http://qa1.www.something.com/toolset.aspx\r\n\r\n2015-01-27 06:34:33,270 [26] DEBUG Web.Caching.Core.CacheManagerBase [(null)] - Custom CacheProvider:Web.Caching.Core.AppFabricCacheManager,Web.Caching.Core Disabled"
}
I used this to produce that output:
----------------
Exec if $raw_event =~ /^(\d{4}\-\d{2}\-\d{2} \d{2}\:\d{2}\:\d{2},\d{3}) \[(\S+)\] (\S+) (\S+) \[(\S+)\] \- (.*)/s \
{ \
$time = $1; \
$thread = $2; \
$level = $3; \
$logger = $4; \
$ndc = $5; \
$message = $6; \
to_json(); \
} \
else \
{ \
drop(); \
}
I've also tried to tweak it by using this to avoid the combining the last two entries as one. However I am not able to get the last entry anymore.
----------------
Exec if $raw_event =~ /^(\d{4}\-\d{2}\-\d{2} \d{2}\:\d{2}\:\d{2},\d{3}) \[(\S+)\] (\S+) (\S+) \[(\S+)\] \- ([\s\S]*?)(\r?\n\r?\n|$)/ \
{ \
$time = $1; \
$thread = $2; \
$level = $3; \
$logger = $4; \
$ndc = $5; \
$message = $6; \
to_json(); \
} \
else \
{ \
drop(); \
}
Th sample log shows nothing that would be a marker for closing an event, as such you should only use HeaderLine. EndLine is optional.