multiline bug?

Tags: xml | xm_multiline

#1 pgs

Hi,

I'm trying to use the xm_multiline module with nxlog to forward the content of a log file to Logstash. The log contains different XML elements which are properly indented (opening and closing tags are located at the start of the line), e.g.:


<data
    version="x"
    xmlns:bla="http://www.example.com/bla">
    <val:InfoSet>
      ...
        ...
          ...
    </val:InfoSet>
</data>

<message  ...>
    <ns>bla</ns>
    ...
        ...
</message>

Because the elements have different names, I can only use < and </ to find the start and end lines. I was hoping a filter like this would be enough to select the correct lines:


HeaderLine  /^</
EndLine     /^<//

But somehow nxlog gets confused by the / in the regex pattern. I also tried escaping it, which didn't help. More testing showed that the pattern needs at least one letter. I tried to match all letters with a character range, but that didn't work:


HeaderLine  /^<[a-z]/

The only way that seems to work is to list all letters in the square brackets (with the exception of the letter n, which breaks it).


HeaderLine  /^<[abcdefghijklmopqrstuvwxyz]/    (left out n)

Here are all my test results.

These lines worked:


HeaderLine  /^<m/
EndLine     /^</m/

HeaderLine  /^<m/
EndLine     /^<\/m/

HeaderLine  /^<[abcdefghijklm]/
EndLine     /^<\/[abcdefghijklm]/

HeaderLine  /^<[abcdefghijklmo]/
EndLine     /^<\/[abcdefghijklmo]/

HeaderLine  /^<[abcdefghijklmopqrstuvwxyz]/    (left out n)
EndLine     /^<\/[abcdefghijklmopqrstuvwxyz]/

HeaderLine  /^<[abcdefghijklmopqrstuvwxyz]/    (left out n + not escaped)
EndLine     /^</[abcdefghijklmopqrstuvwxyz]/

These lines didn't work:


HeaderLine  /^</
EndLine     /^</m/

HeaderLine  /^<[a-z]/
EndLine     /^</m/

HeaderLine  /^<\w/
EndLine     /^</m/

HeaderLine  /^<[abcdefghijklmn]/
EndLine     /^<\/[abcdefghijklmn]/

HeaderLine  /^<[bcdefghijklmn]/
EndLine     /^<\/[bcdefghijklmn]/

HeaderLine  /^<[abcdefghijklmopqrstuvwxyzn]/
EndLine     /^<\/[abcdefghijklmopqrstuvwxyzn]/

HeaderLine  /^<[abcdefghijklmnopqrstuvwxyz]/
EndLine     /^</[abcdefghijklmnopqrstuvwxyz]/

Right now I still have a problem because many of my messages start with <n. I think this is a bug in the module. Can you confirm so I can open a ticket? Thanks
 

Fyi, this is a duplicate of http://stackoverflow.com/questions/27429234/which-headerline-and-endline-for-multiline-xml-with-different-elements

 

#2 adm Nxlog ✓

The following error message is correct:

ERROR HeaderLine and Endline both match

You should fix your configuration so that the regular expressions in HeaderLine and EndLine do not match the same line in the input. This is incorrect, because /^</ also matches closing lines such as </data>, which EndLine matches as well:

HeaderLine  /^</
EndLine     /^<//

It should be:

HeaderLine  /^<[^\/]/
EndLine     /^<//

It's unclear what your issue with the character n is about.
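
The overlap between the two patterns can be checked outside nxlog. The following is only a rough sketch in Python, assuming that these simple patterns behave the same in Python's re module as in nxlog's regex engine. The sample lines are taken from the log excerpt in the question, and the helper name overlapping_lines is made up for this illustration.

```python
import re

# Patterns from this thread, rewritten as Python regexes.
# Assumption: for patterns this simple, Python's re matches the same
# lines that nxlog's regex engine would.
BROAD_HEADER = re.compile(r"^<")       # HeaderLine from the question
FIXED_HEADER = re.compile(r"^<[^/]")   # HeaderLine suggested in the answer
END_LINE = re.compile(r"^</")          # EndLine

# Sample lines based on the log excerpt in the question.
SAMPLE_LINES = [
    '<data',
    '    version="x"',
    '    <val:InfoSet>',
    '</data>',
    '<message  ...>',
    '    <ns>bla</ns>',
    '</message>',
]


def overlapping_lines(header, end, lines):
    """Return the lines that both the header and the end pattern match."""
    return [ln for ln in lines if header.search(ln) and end.search(ln)]


# The broad header pattern also matches the closing lines.
print(overlapping_lines(BROAD_HEADER, END_LINE, SAMPLE_LINES))
# The fixed header pattern no longer overlaps with the end pattern.
print(overlapping_lines(FIXED_HEADER, END_LINE, SAMPLE_LINES))
```

With the broad pattern, the closing lines </data> and </message> are reported as matching both expressions; with the [^/] variant, the list is empty.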