Agent-based versus agentless log collection - which option is best?

One of the harder decisions revolves around implementing agent-based vs agentless log collection. This post covers the two methods, with their advantages and disadvantages, and provides some quick and actionable implementation notes.

Why does log collection agent choice matter?

When deploying a log collection strategy, administrators tend to zero in on already-selected solutions that answer fundamental questions, such as "Will this solution collect and ship these types of log sources?" and "Will this solution integrate with our systems and applications?". There is an expectation that, whichever log management capability is used, the components will somehow fall into place provided that a log shipper can be integrated with a log source. There is also the assumption that the best and most flexible way to ship logs from point A to the log manager is the agent-based method, which involves installing a log agent that collects, parses and forwards the logs.

This notion is not entirely correct, nor is the decision determined by the log collector alone. There is still the choice of which sources should or could be collected in agent-based or agentless mode. While the default is to opt for agent-based logging, there are cases where agentless logging is the preferred option.

Agent-based Log Collection

Agent-based log collection tends to be the default choice. Despite being a specialized application, the agent has multiple functions. It not only collects and filters events, but it can also parse and convert the logs into other formats before forwarding. The following points should be considered when implementing agent-based log collection.

Agent software is required on all devices

Agent-based collection requires agent software on every device from which logs are collected. While this compact software takes up minimal space and makes the work of data collection a good deal easier, the implementation plan needs to take into account how each agent will be deployed and maintained in the network.

Deploying agent software is a learning curve

One consideration when deploying agents is technical skill sets. The system administrators deploying the agents on each device may not be overly happy about having to learn new skills to roll out each agent on the network. Despite the importance of centralized log collection for better enterprise security, they may prefer to minimize device changes and use the tools they already know.

Agent-based collection requires additional work to meet extra security demands

Compliance regulations may also set strict limits on the kind of agents that can be deployed on production systems. Security operations will tend to plan the implementation at a higher level, while the grunt work and hassle reside with the sysadmins, so implementing agent-based log collection requires additional work on their part.

Agents act as efficient log collection filters

Placing agents on each system can reduce the amount of unnecessary data sent to the centralized logging server through the use of filters. Rather than sending everything received from system logs with no real idea of what is important, the system agent will make those decisions from the outset, avoiding processing and storage costs further up the event path.
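As a rough illustration of filtering at the source, the sketch below drops low-severity events before they would be forwarded. It is not how any particular agent implements filtering; the event shape (a dict with a "severity" key) and the function names are assumptions made for this example. The severity names and numbers follow the standard syslog levels, where a lower number means a more severe event.

```python
# Standard syslog severity levels: 0 is the most severe, 7 the least.
SEVERITY = {
    "emerg": 0, "alert": 1, "crit": 2, "err": 3,
    "warning": 4, "notice": 5, "info": 6, "debug": 7,
}


def should_forward(event: dict, threshold: str = "warning") -> bool:
    """Return True if the event is at least as severe as the threshold.

    Events with a missing or unknown severity are treated as "debug"
    (an assumption for this sketch) and are therefore dropped by default.
    """
    level = SEVERITY.get(event.get("severity", "debug"), SEVERITY["debug"])
    return level <= SEVERITY[threshold]


def filter_events(events, threshold: str = "warning"):
    """Keep only the events worth sending to the central log server."""
    return [event for event in events if should_forward(event, threshold)]
```

With the default threshold, an "err" or "warning" event is kept while "info" and "debug" events never leave the host.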

Agents have cross-platform reporting capabilities

Agents can take system logs from Windows, Linux, and other compatible systems and convert them into a usable format. After filtering for only relevant data, the agent processes the information and converts it into structured data.
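To make the idea of structured data concrete, here is a minimal sketch that parses a BSD-style syslog line into a JSON record. The regular expression and the field names are assumptions chosen for this example and only handle the simple case; real agents cope with many more formats. The one fixed rule used is that the syslog priority value equals facility * 8 + severity.

```python
import json
import re

# Matches lines such as "<34>Oct 11 22:14:15 mymachine su: message text".
SYSLOG_LINE = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<tag>[^:\[\s]+)(?:\[(?P<pid>\d+)\])?: "
    r"(?P<message>.*)$"
)


def parse_syslog_line(line: str):
    """Return a dict of fields, or None if the line does not match."""
    match = SYSLOG_LINE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    pri = int(fields.pop("pri"))
    # The priority value encodes both facility and severity.
    fields["facility"], fields["severity"] = divmod(pri, 8)
    return fields


def to_json(line: str) -> str:
    """Convert one raw line to a JSON string; unparsed lines are kept raw."""
    record = parse_syslog_line(line) or {"raw": line}
    return json.dumps(record)
```

Once events are in this form, downstream filtering and searching can work on named fields rather than on free text.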

Agents take up less network bandwidth and resources

Filters and compact messages mean less data is sent. System logging can take up considerably less bandwidth, as well as processing power and storage. Bottom line: in the long run, agents can help control the costs of centralized log collection.

Agents provide more secure and reliable log transmission

Agents can communicate with the centralized logging server using secure transmission methods such as TLS/SSL over TCP. Log data can be sent in compressed batches and can be buffered, making sure no events are lost in transmission, even on intermittent or saturated links.
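The buffering and batching behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not any agent's actual implementation: the class name BatchingForwarder, the newline-delimited JSON payload and the injected send callable are all hypothetical. In a real deployment, send would typically write to a TCP socket wrapped with TLS (for example via Python's ssl module), and the buffer would usually be persisted to disk rather than held in memory.

```python
import json
import zlib
from collections import deque
from itertools import islice


class BatchingForwarder:
    """Buffers events and ships them in compressed batches."""

    def __init__(self, send, batch_size: int = 100):
        self._send = send              # callable(bytes); raises OSError on failure
        self._batch_size = batch_size
        self._buffer = deque()

    def enqueue(self, event: dict) -> None:
        self._buffer.append(event)

    def pending(self) -> int:
        return len(self._buffer)

    def flush(self) -> int:
        """Send as many batches as possible; return the number of events sent."""
        sent = 0
        while self._buffer:
            batch = list(islice(self._buffer, self._batch_size))
            lines = "\n".join(json.dumps(event) for event in batch)
            payload = zlib.compress(lines.encode("utf-8"))
            try:
                self._send(payload)
            except OSError:
                break                  # link is down: keep the events for a later retry
            # Only discard events once the batch has been handed off.
            for _ in batch:
                self._buffer.popleft()
            sent += len(batch)
        return sent
```

Events are removed from the buffer only after a successful send, so a failed link delays delivery instead of dropping data. The trade-off is that a batch may be sent twice if the failure happens after the receiver got it, which is why receivers in such designs often deduplicate.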

Agentless Log Collection

Where agent-based collection is not viable (for technical, administrative or compliance reasons), agentless log collection tends to be adopted. This is where a client, host, system or device forwards its logs to a log collection instance using its native protocols (such as SNMP traps, WECS, WMI, Syslog) or stores them in a remotely accessible store such as a database table. There is little additional functionality involved compared to agent-based log collection. The following are items to consider when implementing this mode of log collection.

An option for when agent-based log collection is not feasible

Deploying log collection agents may not be feasible for all required devices in an environment. Examples where it is not possible to install an agent include embedded devices such as routers, printers, switches and firewalls where third party software installation is not supported, or highly regulated systems where installation of additional software is not permitted. An agentless log collection approach can be implemented instead, allowing devices to send logs to a remote data collector.

Agentless collection can be utilized without noticeable limitations

Installing and deploying an agent on each host is not necessarily the most efficient option. Agentless collection can instead be utilized on systems without noticeable limitations, since a device or system only requires minimal configuration to send log data over the network. In large-scale enterprise networks, where multiple system administrators are involved in implementing log management deployments, the advantage of agentless implementations is that they may have noticeably flatter learning curves.

Use agentless collection in cloud environments to poll logs

Cloud environments, such as AWS, provide monitoring APIs. These APIs can be polled for log data at regular intervals, without the need to install agents on each of the instances.
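Polling an API at regular intervals usually comes down to a loop that remembers where the previous request left off. The sketch below shows that pattern only; it does not call any real cloud API. The fetch callable, its cursor argument and the (events, next_cursor) return shape are assumptions made for this example and would need to be adapted to whatever the provider's API actually returns.

```python
import time
from typing import Callable, Iterable, Optional, Tuple

# Hypothetical fetch signature: takes the last cursor, returns new events
# plus the cursor to use on the next call (or None if nothing advanced).
FetchFn = Callable[[Optional[str]], Tuple[Iterable[dict], Optional[str]]]


def poll_logs(
    fetch: FetchFn,
    handle: Callable[[dict], None],
    interval_seconds: float,
    max_polls: int,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Poll `fetch` up to `max_polls` times and return the last cursor seen."""
    cursor: Optional[str] = None
    for attempt in range(max_polls):
        events, next_cursor = fetch(cursor)
        for event in events:
            handle(event)
        # Keep the old cursor when the API reports no new position.
        if next_cursor is not None:
            cursor = next_cursor
        if attempt < max_polls - 1:
            sleep(interval_seconds)
    return cursor
```

Persisting the returned cursor between runs lets a collector resume after a restart without re-reading events it has already handled.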

Virtualization and virtual machines provide APIs for remote collection

Virtualization software such as VMware provides APIs or SDKs that allow for remote collection. For example, the vSphere Perl SDK allows for agentless log collection from vCenter.

Agentless collection has trade-offs in terms of security and reliability

Agentless collection is commonly used with Syslog protocols where data transfer occurs over unencrypted UDP. UDP itself is also unreliable, since it provides no delivery guarantees. Even where TCP syslog is used, there is often limited support for buffering and flow control.
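The following sketch shows why plain UDP syslog carries these trade-offs: the sender formats a line and fires a single datagram, with no encryption, no acknowledgement and no retry. The function names and the simplified BSD-style message layout are assumptions for this example; port 514 is the conventional syslog UDP port.

```python
import socket


def format_syslog(facility: int, severity: int, timestamp: str,
                  hostname: str, tag: str, message: str) -> bytes:
    """Build a simplified BSD-style syslog line as bytes."""
    pri = facility * 8 + severity   # priority value encodes both fields
    return f"<{pri}>{timestamp} {hostname} {tag}: {message}".encode("utf-8")


def send_udp(payload: bytes, host: str, port: int = 514) -> None:
    """Send one datagram in clear text and return immediately.

    Nothing here confirms that the collector received the message:
    if the datagram is dropped on the way, the sender never finds out.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
```

Compare this with a buffered, TLS-protected sender: here a dropped packet is simply lost, and anyone on the network path can read the message.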

How NXLog helps

The NXLog Enterprise Edition log collection suite provides both agent-based and agentless collection modes. Administrators can collect data from common system logs and log formats including Syslog, Windows Event Log, file-based logs and databases. In addition, specialized APIs and SDKs allow for remote collection, provided there is integration support from NXLog. The mode of log collection, whether agent-based or agentless, is flexible and open to change over time depending on the individual factors and requirements of a log collection deployment strategy. With NXLog, administrators have the choice of agent-based collection, agentless collection or a combination of both modes to suit whichever requirements need to be fulfilled.

To read more about Log Processing Modes using specific NXLog modules, including agent-based and agentless monitoring, please see the User Guide. To read more about the advantages and disadvantages of both modes specific to the NXLog Enterprise Edition, please see our earlier post here.

Download a fully functional trial of the Enterprise Edition for free