6
responses

Hi,

we have a problem with nxlog cpu usage after it loses connection to syslog-server. After losing connection the cpu usage peaks to 100% until nxlog is stopped. We are using om_ssl module and the problem only happens on rhel 8-computers. All the other computers with windows/rhel 7 are working fine with same configurations.

I have tried without om_ssl-module, then it works just fine.

Our system:
Rocky Linux 8.6
Nxlog version ce-3.0.2272_rhel8.x86_64

As my syslog-server, i am using graylog.

Is anyone else having this problem?

AskedSeptember 5, 2022 - 2:12pm

Comments (6)

  • Gray Logger's picture

    Hi,

    Additionally to my first post:

    The cpu-usage thing doesn't seem to happen every time. Sometimes i restart the graylog-server and all the nxlog-computers are fine but sometimes rocky linux-computers CPU-usage jumps to 100%

    nxlog.conf

    #NoFreeOnExit TRUE
    #
    define ROOT /usr/bin
    User nxlog
    Group nxlog
    
    Moduledir /usr/lib/nxlog/modules
    CacheDir /var/spool/nxlog/
    PidFile /usr/lib/nxlog/nxlog.pid
    LogFile /var/log/nxlog/nxlog.log
    LogLevel INFO
    
    <Extension gelfExt>
      Module xm_gelf
      # Avoid truncation of the short_message field to 64 characters.
      ShortMessageLength 65536
    </Extension>
    
    <Extension _fileop>
        Module  xm_fileop
    </Extension>
    
    <Extension json>
        Module      xm_json
    </Extension>
    
    <Extension syslog>
        Module    xm_syslog
    </Extension>
    
    <Input in>
        Module im_tcp
        Port   PORT
        Host   HOST
    </Input>
    
    <Input fileRead>
            Module im_file
            File <path/to/file>
            PollInterval 1
            Recursive FALSE
            SavePos       True
        ReadFromLast  True
            RenameCheck TRUE
            Exec    parse_syslog();
    </Input>
    
    <Output fileWrite>
        Module om_file
        FILE <path/to/file>
        <Schedule>
            # This can likewise be used for `@weekly` or `@monthly` time periods.
               When @daily
             #When "0 0 * * *"
             #When */5 * * * *
             #When  * * * * *
    
            # The `Every` directive could also be used in this case.
            # Every 24 hour
    
            <Exec>
                file_cycle(file_name(), 7);
                reopen();
            </Exec>
        </Schedule>
    </Output>
    
    #<Processor buffer>
    #   Module      pm_buffer
    # 1 GB buffer
    #  MaxSize     50000
    #    Type        Mem
    #    CreateDir   True
    #    Directory   /var/log/nxlog
    #</Processor>
    
    <Output gelf>
            Module om_ssl
            Host HOST
            Port PORT
            OutputType  GELF_TCP
            CAFile <path/to/CAFile>
            CertFile <path/to/CertFile>
            CertKeyFile <path/to/CertKeyFile>
            <Exec>
               if not ($raw_event =~ /^[$&+,:;=?@#|'<>.^*()%!\-a-zA-Z0-9]/) drop();
              if ($raw_event =~ /-->\W\[context]/) drop();
              #if ($raw_event =~ /^(?:[\t ]*(?:\r?\n|\r))+/) drop();
              # These fields are needed for Graylog
              $gl2_source_collector = '';
              $collector_node_id = '';
              $operating_system = 'Linux';
            </Exec>
    </Output>
    
    <Route 1>
        Path   in => fileWrite
    </Route>
    
    <Route 2>
        Path  fileRead => gelf
    </Route>
    

    nxlog.log

    2022-09-09 12:57:26 INFO connecting to x.x.x.x
    2022-09-09 12:57:26 INFO successfully connected to x.x.x.x
    #here I restart the graylog-server
    2022-09-09 12:58:53 INFO remote closed SSL socket
    

    doesnt show much, after the message "remote closed SSL socket" cpu usage jumps to 100%. After graylog-server is back up the nxlog doesn't connect to it and logs show nothing until i restart nxlog. Then it connects to graylog and CPU usage is normal again.

  • Gray Logger's picture

    Hi,

    I tried with loglevel debug and this is what it says after losing connection:

    2022-09-13 08:40:10 DEBUG om_ssl read
    2022-09-13 08:40:10 INFO remote closed SSL socket
    2022-09-13 08:40:10 DEBUG STOP: gelf
    2022-09-13 08:40:10 DEBUG remove socket [21]
    2022-09-13 08:40:10 DEBUG worker 4 got signal for new job
    2022-09-13 08:40:10 DEBUG worker 4 got no event to process
    2022-09-13 08:40:10 DEBUG worker 4 waiting for new event
    2022-09-13 08:40:10 DEBUG worker 2 got signal for new job
    2022-09-13 08:40:10 DEBUG worker 2 got no event to process
    2022-09-13 08:40:10 DEBUG worker 2 waiting for new event
    2022-09-13 08:40:11 DEBUG new event in event_thread [fileWrite:SCHEDULE]
    2022-09-13 08:40:11 DEBUG new event in event_thread [fileRead:MODULE_SPECIFIC]
    2022-09-13 08:40:11 DEBUG new event in event_thread [fileRead:READ]
    2022-09-13 08:40:11 DEBUG nx_event_to_jobqueue: READ (fileRead)
    2022-09-13 08:40:11 DEBUG event added to jobqueue
    2022-09-13 08:40:11 DEBUG future event, event thread sleeping 979685ms in cond_timedwait
    2022-09-13 08:40:11 DEBUG worker 5 got signal for new job
    2022-09-13 08:40:11 DEBUG worker 5 processing event 0x7f9c84000b60
    2022-09-13 08:40:11 DEBUG PROCESS_EVENT: READ (fileRead)
    2022-09-13 08:40:11 DEBUG Module fileRead got EOF from /var/log/message_jurnal/message_jurnal.log
    2022-09-13 08:40:11 DEBUG got EOF for /var/log/message_jurnal/message_jurnal.log
    2022-09-13 08:40:11 DEBUG worker 5 waiting for new event
    2022-09-13 08:40:11 DEBUG new event in event_thread [fileWrite:SCHEDULE]
    2022-09-13 08:40:11 DEBUG new event in event_thread [fileRead:MODULE_SPECIFIC]
    2022-09-13 08:40:11 DEBUG new event in event_thread [fileRead:READ]
    2022-09-13 08:40:11 DEBUG future event, event thread sleeping 979197ms in cond_timedwait
    2022-09-13 08:40:12 DEBUG new event in event_thread [fileWrite:SCHEDULE]
    2022-09-13 08:40:12 DEBUG new event in event_thread [fileRead:MODULE_SPECIFIC]
    2022-09-13 08:40:12 DEBUG nx_event_to_jobqueue: MODULE_SPECIFIC (fileRead)
    2022-09-13 08:40:12 DEBUG event added to jobqueue
    2022-09-13 08:40:12 DEBUG new event in event_thread [fileRead:READ]
    2022-09-13 08:40:12 DEBUG future event, event thread sleeping 10285ms in cond_timedwait
    2022-09-13 08:40:12 DEBUG worker 6 got signal for new job
    2022-09-13 08:40:12 DEBUG worker 6 processing event 0x7f9c7c000b60
    2022-09-13 08:40:12 DEBUG PROCESS_EVENT: MODULE_SPECIFIC (fileRead)
    2022-09-13 08:40:12 DEBUG 'message_jurnal.log' matches 'message_jurnal.log' in directory '/var/log/message_jurnal/'
    2022-09-13 08:40:12 DEBUG im_file_add_file: /var/log/message_jurnal/message_jurnal.log
    2022-09-13 08:40:12 DEBUG check file /var/log/message_jurnal/message_jurnal.log (inode: 34254399)
    2022-09-13 08:40:12 DEBUG worker 6 waiting for new event
    2022-09-13 08:40:12 DEBUG new event in event_thread [fileWrite:SCHEDULE]
    2022-09-13 08:40:12 DEBUG new event in event_thread [fileRead:READ]
    2022-09-13 08:40:12 DEBUG new event in event_thread [fileRead:MODULE_SPECIFIC]
    2022-09-13 08:40:12 DEBUG future event, event thread sleeping 9883ms in cond_timedwait
    2022-09-13 08:40:12 DEBUG new event in event_thread [fileWrite:SCHEDULE]
    2022-09-13 08:40:12 DEBUG new event in event_thread [fileRead:READ]
    2022-09-13 08:40:12 DEBUG nx_event_to_jobqueue: READ (fileRead)
    2022-09-13 08:40:12 DEBUG event added to jobqueue
    2022-09-13 08:40:12 DEBUG new event in event_thread [fileRead:MODULE_SPECIFIC]
    2022-09-13 08:40:12 DEBUG future event, event thread sleeping 1989257ms in cond_timedwait
    2022-09-13 08:40:12 DEBUG worker 1 got signal for new job
    2022-09-13 08:40:12 DEBUG worker 1 processing event 0x7f9c90000c30
    2022-09-13 08:40:12 DEBUG PROCESS_EVENT: READ (fileRead)
    2022-09-13 08:40:12 DEBUG Module fileRead got EOF from /var/log/message_jurnal/message_jurnal.log
    2022-09-13 08:40:12 DEBUG got EOF for /var/log/message_jurnal/message_jurnal.log
    2022-09-13 08:40:12 DEBUG worker 1 waiting for new event
    2022-09-13 08:40:12 DEBUG new event in event_thread [fileWrite:SCHEDULE]
    2022-09-13 08:40:12 DEBUG new event in event_thread [fileRead:MODULE_SPECIFIC]
    2022-09-13 08:40:12 DEBUG new event in event_thread [fileRead:READ]
    2022-09-13 08:40:12 DEBUG future event, event thread sleeping 999943ms in cond_timedwait
    2022-09-13 08:40:13 DEBUG new event in event_thread [fileWrite:SCHEDULE]
    2022-09-13 08:40:13 DEBUG new event in event_thread [fileRead:MODULE_SPECIFIC]
    2022-09-13 08:40:13 DEBUG new event in event_thread [fileRead:READ]
    2022-09-13 08:40:13 DEBUG nx_event_to_jobqueue: READ (fileRead)
    2022-09-13 08:40:13 DEBUG event added to jobqueue
    2022-09-13 08:40:13 DEBUG future event, event thread sleeping 980246ms in cond_timedwait
    2022-09-13 08:40:13 DEBUG worker 3 got signal for new job
    2022-09-13 08:40:13 DEBUG worker 3 processing event 0x7f9c8c000c10
    2022-09-13 08:40:13 DEBUG PROCESS_EVENT: READ (fileRead)
    2022-09-13 08:40:13 DEBUG Module fileRead got EOF from /var/log/message_jurnal/message_jurnal.log
    2022-09-13 08:40:13 DEBUG got EOF for /var/log/message_jurnal/message_jurnal.log
    2022-09-13 08:40:13 DEBUG worker 3 waiting for new event
    2022-09-13 08:40:13 DEBUG new event in event_thread [fileWrite:SCHEDULE]
    2022-09-13 08:40:13 DEBUG new event in event_thread [fileRead:MODULE_SPECIFIC]
    2022-09-13 08:40:13 DEBUG new event in event_thread [fileRead:READ]
    2022-09-13 08:40:13 DEBUG future event, event thread sleeping 979876ms in cond_timedwait
    2022-09-13 08:40:14 DEBUG new event in event_thread [fileWrite:SCHEDULE]
    2022-09-13 08:40:14 DEBUG new event in event_thread [fileRead:MODULE_SPECIFIC]
    2022-09-13 08:40:14 DEBUG nx_event_to_jobqueue: MODULE_SPECIFIC (fileRead)
    2022-09-13 08:40:14 DEBUG event added to jobqueue
    2022-09-13 08:40:14 DEBUG new event in event_thread [fileRead:READ]
    2022-09-13 08:40:14 DEBUG future event, event thread sleeping 19192ms in cond_timedwait
    2022-09-13 08:40:14 DEBUG worker 4 got signal for new job
    2022-09-13 08:40:14 DEBUG worker 4 processing event 0x2385fe0
    2022-09-13 08:40:14 DEBUG PROCESS_EVENT: MODULE_SPECIFIC (fileRead)
    2022-09-13 08:40:14 DEBUG 'message_jurnal.log' matches 'message_jurnal.log' in directory '/var/log/message_jurnal/'
    2022-09-13 08:40:14 DEBUG im_file_add_file: /var/log/message_jurnal/message_jurnal.log
    2022-09-13 08:40:14 DEBUG check file /var/log/message_jurnal/message_jurnal.log (inode: 34254399)
    2022-09-13 08:40:14 DEBUG worker 4 waiting for new event
    2022-09-13 08:40:14 DEBUG new event in event_thread [fileWrite:SCHEDULE]
    2022-09-13 08:40:14 DEBUG new event in event_thread [fileRead:READ]
    2022-09-13 08:40:14 DEBUG new event in event_thread [fileRead:MODULE_SPECIFIC]
    2022-09-13 08:40:14 DEBUG future event, event thread sleeping 18760ms in cond_timedwait
    2022-09-13 08:40:14 DEBUG new event in event_thread [fileWrite:SCHEDULE]
    2022-09-13 08:40:14 DEBUG new event in event_thread [fileRead:READ]
    2022-09-13 08:40:14 DEBUG nx_event_to_jobqueue: READ (fileRead)
    2022-09-13 08:40:14 DEBUG event added to jobqueue
    2022-09-13 08:40:14 DEBUG new event in event_thread [fileRead:MODULE_SPECIFIC]
    2022-09-13 08:40:14 DEBUG future event, event thread sleeping 1978282ms in cond_timedwait
    2022-09-13 08:40:14 DEBUG worker 2 got signal for new job
    2022-09-13 08:40:14 DEBUG worker 2 processing event 0x7f9c80000b60
    2022-09-13 08:40:14 DEBUG PROCESS_EVENT: READ (fileRead)
    2022-09-13 08:40:14 DEBUG Module fileRead got EOF from /var/log/message_jurnal/message_jurnal.log
    2022-09-13 08:40:14 DEBUG got EOF for /var/log/message_jurnal/message_jurnal.log
    2022-09-13 08:40:14 DEBUG worker 2 waiting for new event
    2022-09-13 08:40:14 DEBUG new event in event_thread [fileWrite:SCHEDULE]
    2022-09-13 08:40:14 DEBUG new event in event_thread [fileRead:MODULE_SPECIFIC]
    2022-09-13 08:40:14 DEBUG new event in event_thread [fileRead:READ]
    2022-09-13 08:40:14 DEBUG future event, event thread sleeping 999897ms in cond_timedwait
    2022-09-13 08:40:15 DEBUG new event in event_thread [fileWrite:SCHEDULE]
    2022-09-13 08:40:15 DEBUG new event in event_thread [fileRead:MODULE_SPECIFIC]
    2022-09-13 08:40:15 DEBUG new event in event_thread [fileRead:READ]
    2022-09-13 08:40:15 DEBUG nx_event_to_jobqueue: READ (fileRead)
    2022-09-13 08:40:15 DEBUG event added to jobqueue
    2022-09-13 08:40:15 DEBUG future event, event thread sleeping 977293ms in cond_timedwait
    2022-09-13 08:40:15 DEBUG worker 5 got signal for new job
    2022-09-13 08:40:15 DEBUG worker 5 processing event 0x23858f0
    2022-09-13 08:40:15 DEBUG PROCESS_EVENT: READ (fileRead)
    2022-09-13 08:40:15 DEBUG Module fileRead got EOF from /var/log/message_jurnal/message_jurnal.log
    2022-09-13 08:40:15 DEBUG got EOF for /var/log/message_jurnal/message_jurnal.log
    2022-09-13 08:40:15 DEBUG worker 5 waiting for new event
    2022-09-13 08:40:15 DEBUG new event in event_thread [fileWrite:SCHEDULE]
    2022-09-13 08:40:15 DEBUG new event in event_thread [fileRead:MODULE_SPECIFIC]
    2022-09-13 08:40:15 DEBUG new event in event_thread [fileRead:READ]
    2022-09-13 08:40:15 DEBUG future event, event thread sleeping 976908ms in cond_timedwait
    2022-09-13 08:40:16 DEBUG new event in event_thread [fileWrite:SCHEDULE]
    2022-09-13 08:40:16 DEBUG new event in event_thread [fileRead:MODULE_SPECIFIC]
    2022-09-13 08:40:16 DEBUG nx_event_to_jobqueue: MODULE_SPECIFIC (fileRead)
    2022-09-13 08:40:16 DEBUG event added to jobqueue
    2022-09-13 08:40:16 DEBUG new event in event_thread [fileRead:READ]
    2022-09-13 08:40:16 DEBUG future event, event thread sleeping 16180ms in cond_timedwait
    2022-09-13 08:40:16 DEBUG worker 6 got signal for new job
    2022-09-13 08:40:16 DEBUG worker 6 processing event 0x7f9c88000b60
    2022-09-13 08:40:16 DEBUG PROCESS_EVENT: MODULE_SPECIFIC (fileRead)
    2022-09-13 08:40:16 DEBUG 'message_jurnal.log' matches 'message_jurnal.log' in directory '/var/log/message_jurnal/'
    2022-09-13 08:40:16 DEBUG im_file_add_file: /var/log/message_jurnal/message_jurnal.log
    2022-09-13 08:40:16 DEBUG check file /var/log/message_jurnal/message_jurnal.log (inode: 34254399)
    2022-09-13 08:40:16 DEBUG worker 6 waiting for new event
    2022-09-13 08:40:16 DEBUG new event in event_thread [fileWrite:SCHEDULE]
    2022-09-13 08:40:16 DEBUG new event in event_thread [fileRead:READ]
    2022-09-13 08:40:16 DEBUG new event in event_thread [fileRead:MODULE_SPECIFIC]
    2022-09-13 08:40:16 DEBUG future event, event thread sleeping 15802ms in cond_timedwait
    2022-09-13 08:40:16 DEBUG new event in event_thread [fileWrite:SCHEDULE]
    2022-09-13 08:40:16 DEBUG new event in event_thread [fileRead:READ]
    2022-09-13 08:40:16 DEBUG nx_event_to_jobqueue: READ (fileRead)
    2022-09-13 08:40:16 DEBUG event added to jobqueue
    2022-09-13 08:40:16 DEBUG new event in event_thread [fileRead:MODULE_SPECIFIC]
    2022-09-13 08:40:16 DEBUG future event, event thread sleeping 1978244ms in cond_timedwait
    2022-09-13 08:40:16 DEBUG worker 1 got signal for new job
    2022-09-13 08:40:16 DEBUG worker 1 processing event 0x7f9c84000b60
    2022-09-13 08:40:16 DEBUG PROCESS_EVENT: READ (fileRead)
    2022-09-13 08:40:16 DEBUG Module fileRead got EOF from /var/log/message_jurnal/message_jurnal.log
    2022-09-13 08:40:16 DEBUG got EOF for /var/log/message_jurnal/message_jurnal.log
    2022-09-13 08:40:16 DEBUG worker 1 waiting for new event
    2022-09-13 08:40:16 DEBUG new event in event_thread [fileWrite:SCHEDULE]
    2022-09-13 08:40:16 DEBUG new event in event_thread [fileRead:MODULE_SPECIFIC]
    2022-09-13 08:40:16 DEBUG new event in event_thread [fileRead:READ]
    2022-09-13 08:40:16 DEBUG future event, event thread sleeping 999943ms in cond_timedwait
    2022-09-13 08:40:17 DEBUG new event in event_thread [fileWrite:SCHEDULE]
    2022-09-13 08:40:17 DEBUG new event in event_thread [fileRead:MODULE_SPECIFIC]
    2022-09-13 08:40:17 DEBUG new event in event_thread [fileRead:READ]
    2022-09-13 08:40:17 DEBUG nx_event_to_jobqueue: READ (fileRead)
    2022-09-13 08:40:17 DEBUG event added to jobqueue
    2022-09-13 08:40:17 DEBUG future event, event thread sleeping 977225ms in cond_timedwait
    2022-09-13 08:40:17 DEBUG worker 3 got signal for new job
    2022-09-13 08:40:17 DEBUG worker 3 processing event 0x7f9c90000c30
    2022-09-13 08:40:17 DEBUG PROCESS_EVENT: READ (fileRead)
    2022-09-13 08:40:17 DEBUG Module fileRead got EOF from /var/log/message_jurnal/message_jurnal.log
    2022-09-13 08:40:17 DEBUG got EOF for /var/log/message_jurnal/message_jurnal.log
    2022-09-13 08:40:17 DEBUG worker 3 waiting for new event
    2022-09-13 08:40:17 DEBUG new event in event_thread [fileWrite:SCHEDULE]
    2022-09-13 08:40:17 DEBUG new event in event_thread [fileRead:MODULE_SPECIFIC]
    2022-09-13 08:40:17 DEBUG new event in event_thread [fileRead:READ]
    2022-09-13 08:40:17 DEBUG future event, event thread sleeping 976903ms in cond_timedwait
    

    Then it just loops that forever and CPU usage is high.

  • NenadM's picture
    (NXLog)

    It seems like you are using the same file to write the logs to and read them out...basically, you use it as a buffer. But NXLog already has it's built in buffers and the mechanism called Flow Control. Because of it - when sending the logs to the gelf output stops, it stops reading from the file (both the modules, the im_file and the om_file). It's very likely that this CPU usage peaks are caused by this configuration issue.
    Please check the following document that explains the buffering and flow control in NXLog:

    https://docs.nxlog.co/userguide/configure/using-buffers.html

    PS. The Debug level logs snippet covers only 6 or 7 seconds after the gelf output gets blocked and it doesn't really contain anything suspicios.

    Please try with pm_buffer module or multiple routes.

  • Gray Logger's picture

    Hi,

    I tried without our buffering file with configuration below. The CPU usage peak still occurred, after losing connection to syslog server. I couldn't replicate this issue rhel7 or other systems. I tested this with two separate rhel 8 computers and the bug still happened.

    Also, in case the problem happens, there is huge amounts of logs in nxlog.log-file, but when it doesn't happen, log generation stops.

    Also found this interesting gitlab issue: https://gitlab.com/nxlog-public/nxlog-ce/-/issues/24

    define ROOT /usr/bin
    
    <Extension gelfExt>
      Module xm_gelf
      # Avoid truncation of the short_message field to 64 characters.
      ShortMessageLength 65536
    </Extension>
    
    <Extension _json>
        Module      xm_json
    </Extension>
    
    <Extension syslogExt>
      Module xm_syslog
    </Extension>
    
    User nxlog
    Group nxlog
    
    Moduledir /usr/lib/nxlog/modules
    CacheDir /var/spool/nxlog/
    PidFile /usr/lib/nxlog/nxlog.pid
    LogFile /var/log/nxlog/nxlog.log
    LogLevel INFO
    
    
    <Input file>
        Module im_file
        File '/var/log/messages'
        File '/var/log/graylog-server/server.log'
        PollInterval 1
        Recursive FALSE
        SavePos       True
        ReadFromLast  True
        RenameCheck FALSE
        #Exec $FileName = file_name(); # Send file name with each message
        <Exec>
             parse_syslog();
        </Exec>
    </Input>
    
    <Input audit>
        Module      im_file
        File        "/var/log/audit/audit.log"
        <Exec>
             parse_syslog();
        </Exec>
    </Input>
    
    <Output gelf>
        Module om_ssl
        Host HOST
        Port 12201
        OutputType  GELF_TCP
        CAFile /etc/graylog/server/newgraylogcert/cert.crt
        CertFile /etc/graylog/server/newgraylogcert/cert.crt
        CertKeyFile /etc/graylog/server/newgraylogcert/key.key
        <Exec>
          # These fields are needed for Graylog
          $gl2_source_collector = '${sidecar.nodeId}';
          $collector_node_id = '${sidecar.nodeName}';
          $operating_system = '${sidecar.operatingSystem}';
          $raw_event = to_json();
        </Exec>
    </Output>
    
    <Route route-1>
      Path file => gelf
    </Route>
    

    Logs after losing connection:

    https://controlc.com/2111513c

Answers (0)