Debugging NXLog

When other troubleshooting fails to identify or resolve an issue, inspecting the NXLog agent itself can prove useful. Some techniques are outlined below.

Generate core dumps

Core dumps can act as a helpful resource for the NXLog development and support teams when debugging issues. Contact support to find out the level of work available for your installation.

Core dumps on Linux

It is necessary to install the NXLog debug symbols package in order to produce useful core dump files.

Remove the User and Group directives from the configuration. NXLog needs to be running as root:root to produce a core dump.
Use ulimit to remove the core file size limit.
```
# ulimit -c unlimited
```
Run NXLog manually to test that it can create a core dump.
```
# /opt/nxlog/bin/nxlog -f
```

Find the NXLog process and kill it with the SIGABRT signal.

```
# kill -ABRT `ps aux | grep [/]opt/nxlog/bin/nxlog | awk '{print $2}'`
```

Verify that a core dump file was created at /opt/nxlog/var/spool/nxlog/core.

```
# ls -l /opt/nxlog/var/spool/nxlog/
total 26708
-rw------- 1 root root 27348992 Oct 30 08:51 core
```

If the core dump file was created successfully, run NXLog again as root in order to catch the next crash.
```
# /opt/nxlog/bin/nxlog -f
```
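Before working through the steps above, it can help to confirm that the current shell is actually allowed to write core files. The following is a minimal sketch, not part of NXLog: `core_limit_ok` and `core_preflight` are hypothetical helper names, and the script assumes a Linux system that exposes `/proc/sys/kernel/core_pattern`. If the reported pattern begins with `|`, the kernel pipes core dumps to a handler program (for example systemd-coredump), so the core file may not appear in /opt/nxlog/var/spool/nxlog/.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given `ulimit -c` value
# permits core files to be written.
core_limit_ok() {
    case "$1" in
        unlimited)   return 0 ;;
        ''|*[!0-9]*) return 1 ;;   # empty or not a number
        0)           return 1 ;;   # core files disabled
        *)           return 0 ;;   # any other numeric limit
    esac
}

# Hypothetical helper: report the core dump settings of the current shell.
core_preflight() {
    limit=$(ulimit -c)
    if core_limit_ok "$limit"; then
        echo "core file size limit: $limit (ok)"
    else
        echo "core file size limit: $limit (run 'ulimit -c unlimited' first)"
    fi
    # kernel.core_pattern decides the name and location of core files.
    if [ -r /proc/sys/kernel/core_pattern ]; then
        echo "core_pattern: $(cat /proc/sys/kernel/core_pattern)"
    fi
}
```

Source the file in the root shell that will start NXLog and run `core_preflight` there, because the `ulimit` setting applies per shell session.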

Core dumps on Microsoft Windows

Core dumps can be generated on Windows by using ProcDump from Microsoft Sysinternals.

ProcDump runs on Windows Vista and higher, and Windows Server 2008 and higher.

For example, run the following to write a full dump of the nxlog process when its handle count exceeds 10,000:

```
> procdump -ma nxlog -p "\Process(nxlog)\Handle Count" 10000
```

Inspect memory leaks

To investigate a suspected memory leak, observe the log queue and the memory use at the same time. First, check the state and dynamics of the log queue. While the queue is filling up, memory use may rise steadily until it plateaus at a maximum value. In some cases, an out-of-memory (OOM) error may occur during this time and the NXLog process may be killed. However, this alone does not prove a memory leak; it only indicates over-allocation.

Once the log queue is full (that is, it has reached its LogqueueSize limit), memory consumption should stabilize with only minor fluctuations. If memory use keeps growing and NXLog never reaches an equilibrium, there is a legitimate reason to suspect a leak.

For more details on the calculations used to determine the impact that LogqueueSize has on memory consumption, refer to BatchSize/LogqueueSize for memory usage.
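To watch memory use over time while the log queue fills, the resident set size of the NXLog process can be sampled at a fixed interval. This is a minimal sketch for Linux, assuming `/proc/<pid>/status` is readable; `rss_kb` and `sample_rss` are hypothetical helper names, not NXLog tools. It covers only the memory side, so the state of the log queue still has to be observed separately.

```shell
#!/bin/sh
# Print the resident set size (VmRSS, in kB) of process $1, read from /proc.
rss_kb() {
    while read -r key value rest; do
        if [ "$key" = "VmRSS:" ]; then
            echo "$value"
            return 0
        fi
    done < "/proc/$1/status"
    return 1   # no VmRSS line found, or the process does not exist
}

# Print "<unix-time> <rss-kB>" for process $1 every $2 seconds, $3 times.
sample_rss() {
    pid=$1 interval=$2 count=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        rss=$(rss_kb "$pid") || return 1   # stop if the process is gone
        echo "$(date +%s) $rss"
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

For example, `sample_rss 12345 60 120 > rss.log` records one sample per minute for two hours, where 12345 stands in for the PID of the NXLog process. Values that level off after the queue fills point to normal buffering; values that keep climbing are worth a closer look.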

Inspecting memory leaks on Linux

We recommend using Valgrind on GNU/Linux to debug memory leaks.

Install the debug symbols (-dbg) package (for example, nxlog-dbg_3.0.1759_amd64.deb).

The NXLog debug symbols package is currently only available for Linux. This package is not included with NXLog by default, but can be provided on request.
Install Valgrind.
Set the NoFreeOnExit directive to TRUE in the NXLog configuration file. This directive ensures that modules are not unloaded when NXLog is stopped, which allows Valgrind to properly resolve backtraces into modules.
Stop the NXLog service if it’s currently running:
```
# systemctl stop nxlog
```

Start NXLog under Valgrind with one of the following commands, depending on whether sudo is available on the system:

```
# sudo valgrind --tool=massif --pages-as-heap=yes --massif-out-file=/path/to/massif.out /opt/nxlog/bin/nxlog -f

# su -c "valgrind --tool=massif --pages-as-heap=yes --massif-out-file=/path/to/massif.out /opt/nxlog/bin/nxlog -f"
```

Let NXLog run for a while until the Valgrind process shows the memory increase, then interrupt it with Ctrl+C. The output is written to the path specified by the --massif-out-file argument.
Send the massif.out file with a bug report.
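Optionally, create a report from the output file with the ms_print command. Use the path given to --massif-out-file; if that argument is omitted, Massif writes to massif.out.xxxx in the current directory, where xxxx is the process ID.
```
# ms_print /path/to/massif.out
```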
The output of the ms_print report contains an ASCII chart at the top showing the increase in memory usage. The chart shows the sample number with the highest memory usage, marked with (peak). This is normally at the end of the chart (the last sample). The backtrace from this sample indicates where the most memory is allocated.
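For a quick look at the peak without reading the full ms_print report, the snapshot values can be pulled straight from the Massif output file. This sketch assumes the file contains one `mem_heap_B=<bytes>` line per snapshot, as Massif output files normally do; that layout is an assumption about Valgrind's file format rather than a documented interface, so treat the result as a convenience check only. `massif_peak_bytes` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the largest mem_heap_B value (in bytes) found in Massif output file $1.
massif_peak_bytes() {
    awk -F= '
        $1 == "mem_heap_B" && $2 + 0 > max { max = $2 + 0 }
        END { printf "%.0f\n", max }
    ' "$1"
}
```

Running `massif_peak_bytes /path/to/massif.out` prints a single byte count. It does not replace the backtrace in the ms_print report, which is still needed to see where the memory was allocated.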

Inspecting memory leaks on Microsoft Windows

Windows Process Explorer from Microsoft Sysinternals can be used to inspect memory use of all running programs.

Once a potential source of excessive memory use has been determined, use DebugView from Microsoft Sysinternals to inspect the application’s debug output.

Inspecting open file handles of NXLog agents

Another method of debugging an NXLog agent is to determine the open file handles associated with it. With this information, you can discern whether NXLog is accessing the correct log and configuration files.

Inspecting open file handles on Linux

Find the process number of the running NXLog instance. Run the ps utility program to list processes, and then search for the NXLog process with grep.
```
# ps -e | grep nxlog
12345 ?         0:00.00 nxlog
```
Search for all open file handles of the NXLog process with the lsof utility program.
```
# lsof -p 12345
```

To count the number of open file handles, send the output of lsof to the wc utility program.
```
# lsof -p 12345 | wc -l
```
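Note that lsof prints a header line and also lists entries that are not file descriptors, such as the working directory (cwd), the program text (txt), and memory-mapped libraries (mem), so the `wc -l` figure is higher than the number of open descriptors. As a cross-check, the descriptors can be counted from `/proc`. This is a minimal sketch for Linux; `count_fds` is a hypothetical helper name, and reading the `/proc/<pid>/fd` directory of another user's process requires root.

```shell
#!/bin/sh
# Print the number of open file descriptors of process $1,
# counted from the entries in /proc/<pid>/fd.
count_fds() {
    n=0
    for entry in "/proc/$1/fd"/*; do
        # Skip the unexpanded pattern left behind when the glob matches nothing.
        if [ -e "$entry" ] || [ -L "$entry" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}
```

For example, `count_fds 12345` prints one number for the process found in the first step. Running it repeatedly over time shows whether the descriptor count is growing.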

Inspecting open file handles on Windows

Open file handles on Windows can be listed with the Handle utility from Microsoft Sysinternals.

```
> handle -p nxlog
```

To count the number of open file handles, pass the -s parameter to handle.
```
> handle -s -p nxlog
```