Logging suddenly stops for high-volume input but continues to work for low-volume input


#1 progssilb

I'm not sure how to characterize what's going on, but here goes...

My route path has two inputs: an intermittently high-volume input and a low-volume input. The high-volume input can be thousands of entries in a couple of minutes, or it can be practically nothing. The low-volume input is, at most, one or two entries per ~3-4 minutes. There are also three outputs: two HTTP and a rotated file. The HTTP outputs are disconnecting a fair bit, presumably due to timeouts or lack of pipelining or something.

In my current configuration, I consistently find that my log data for the high-volume log gets dropped after a couple minutes. I'm not sure if the timing correlates with the HTTP disconnects, but it might. Sometimes I get just over a thousand log lines through, sometimes I get a couple hundred log lines, sometimes I get a couple thousand. Interestingly, the low-volume log is unaffected.

I do have flow control enabled, and putting a buffer on the inputs did not seem to help. I didn't try disabling flow control because I don't understand it very well. I have to have both inputs going into the same route because how the messages interleave is important to the meaning of the entries.

Here's the path that I use:

  Path       vg_tsw_client, vg_tsw_combat => vg_tsw_pattern => vg_tsw_unparse_finder => vg_tsw_testfile, vg_tsw_es, vg_tsw_cdb

_client is the low-volume input and _combat is the high-volume one. _pattern is a pm_pattern with ~25-30 regexes and a small script on every pattern (0-10 lines). _unparse_finder is a pm_null that uses add_to_route to copy unmatched entries to a new route (0 hits lately) and does some light data enrichment via Exec. _testfile is the rotated file output; _es and _cdb are the HTTP outputs.

Thanks in advance.

#2 b0ti Nxlog ✓

add_to_route() does not work well with flow control as it's not possible to apply back pressure in that case, but it shouldn't result in dropped messages.

I suggest trying to reduce your config until you find what's causing the issue.
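
As a rough sketch of what reducing the config could look like, a first step might be just the high-volume input feeding the file output, with the processors and HTTP outputs added back one at a time. The module instance names are taken from the path above; the use of im_file for the input, the route name, and both file paths are assumptions for illustration only, not the poster's actual settings.

```
# Step 1: high-volume input straight to the file output, nothing else.
# im_file and the path below are assumed; use the real input definition.
<Input vg_tsw_combat>
    Module  im_file
    File    "/path/to/combat.log"
</Input>

# Hypothetical output path; log rotation is left out to keep the test minimal.
<Output vg_tsw_testfile>
    Module  om_file
    File    "/path/to/reduced-test.log"
</Output>

<Route reduced>
    Path    vg_tsw_combat => vg_tsw_testfile
</Route>

# If step 1 keeps logging under load, restore one piece per run, for example:
#   Path  vg_tsw_combat => vg_tsw_pattern => vg_tsw_testfile
#   Path  vg_tsw_combat => vg_tsw_pattern => vg_tsw_unparse_finder => vg_tsw_testfile
#   Path  vg_tsw_client, vg_tsw_combat => vg_tsw_pattern => vg_tsw_unparse_finder => vg_tsw_testfile
#   ...then add vg_tsw_es and vg_tsw_cdb back one at a time.
```

Whichever step makes the high-volume log stop again points at the module to look at more closely.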