Help! NXLog handling a large number of files

#1 kullboys

Hello,

I have an application that logs some API requests and responses. Each request is logged in a different file, as a single line. In the system there are thousands of files, and Nxlog seems to have issues sending the logs to Elasticsearch. It reads the file, I can see im_file_add_file command in logs, but it takes a long time to actually send the message.

How does Nxlog process multiple files in a single directory? Thanks

#2 Zhengshi Nxlog ✓

There seem to be a couple of items to unpack in this question. I will try to address them all.

Nxlog seems to have issues sending the logs to Elasticsearch

What do you mean by this one?
Whether you are using om_elasticsearch in NXLog EE or one of the networking modules from NXLog CE, I would verify each step in the process. Make sure the output module is getting the events, then make sure the events make it to your Elasticsearch box, using Wireshark or tcpdump. To verify the output module, you could add Exec log_info("Out: " + $raw_event); to the Output module. This can bloat your NXLog log file, so only use it for testing. I also like to run NXLog in foreground mode when testing, just because it is a little easier for me: nxlog -f
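
As a rough sketch of where that debug line would go, assuming a plain om_tcp output (the instance name, host, and port below are placeholders, not values from your setup):

```
<Output out>
    Module  om_tcp
    # Placeholder destination; replace with your own host and port
    Host    192.0.2.10
    Port    5140
    # Testing only: writes every outgoing event to the NXLog log file
    Exec    log_info("Out: " + $raw_event);
</Output>
```

If the "Out:" lines show up promptly in the NXLog log but the events arrive late in Elasticsearch, the delay is on the network or Elasticsearch side rather than in the file reading.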

it takes a long time to actually send the message

A continuation from above. This may be due to the size of the directory in question with a delay in reading each file. See below for suggestions.

How does Nxlog process multiple files in a single directory?

NXLog scans the directory and compares the modification time and size of each file to see whether anything has changed. The files that have changed are then opened and read/processed.
PollInterval, DirCheckInterval, and ActiveFiles are the related directives, though see below for our suggested first step on solving the issue you are seeing.
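
For reference, here is a sketch of where those directives sit in an im_file input. The path and all of the values are illustrative assumptions, not tuned recommendations:

```
<Input in>
    Module            im_file
    # Hypothetical path; the wildcard matches the files in that directory
    File              '/var/log/myapp/*.log'
    # Roughly: seconds between checks of monitored files for new data
    PollInterval      1
    # Roughly: seconds between scans of the directory for new or changed files
    DirCheckInterval  10
    # Roughly: maximum number of files actively monitored at once
    ActiveFiles       50
</Input>
```

With thousands of files, these only change how often and how widely NXLog looks; they do not reduce the number of files that have to be checked.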

Since every file in the directory has to be checked on each scan, we would suggest trimming down the directory so that there are fewer files to check. This example removes each file after it has reached EOF.

<Input in>
    Module          im_file
    File            '/opt/nxlog/etc/input.log'
    SavePos         True
    ReadFromLast    True
    <OnEOF>
        Exec file_remove(file_name());
    </OnEOF>
</Input>

If you need to keep the files for some reason, you could instead use something like this in your OnEOF block to move them to a location outside of the log path, so that they are not read again after they are processed.
Exec file_rename(file_name(), "/tmp/"+file_basename(file_name())+".bak");
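
Put together, the move variant might look like the sketch below. It assumes the xm_fileop extension is loaded, since file_rename and file_basename come from that module; the extension instance name and the /tmp/ destination are only examples:

```
<Extension fileop>
    Module          xm_fileop
</Extension>

<Input in>
    Module          im_file
    File            '/opt/nxlog/etc/input.log'
    SavePos         True
    ReadFromLast    True
    <OnEOF>
        # Move the finished file out of the watched path and add a .bak suffix
        Exec file_rename(file_name(), "/tmp/" + file_basename(file_name()) + ".bak");
    </OnEOF>
</Input>
```

The same Extension block is needed for the file_remove example above.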

These solutions would help to keep the directory file count down and reduce the time needed to check all of those files.

Hope that helps!
-Jesse