
Good Morning All,

I was hoping to get some direction on a log file we want to parse. We have a directory of log files on a network share, and a new log file is created every day. The problem is that these logs contain no newlines or carriage returns: each file is one gigantic line, and new events are simply appended to the end of the string. I am familiar with NXLog to the point of inputs, outputs, and routes, but I am not sure what my next step is. I have a regex that, when run manually, breaks the log into individual lines, but I am not sure how to implement it in NXLog so that it reads the log file, splits the whole thing into individual lines, and exports those lines without a lot of duplication, or in which blocks each step belongs.

Some broad stroke guidance (or details) would be appreciated.

Asked June 19, 2020 - 6:34pm

Answer (1)

Hi Ian, do you want to process the file once it is fully populated, or do you need to read it in real time?

Comments (11)

  • ian.lee's picture

    Good Morning Manuel.

    Ideally I would like to process it in real time, but if that is a limitation, then once per day (after the file rolls over) would be acceptable.

  • ian.lee's picture

    All fields are fixed width, and each "event" is 200 characters long. New events generally start with 2AUW (the regex "[23][ABCD][U]\S" covers most of them).

    Here is a short sample.

    2AUW20200622000008001707600020B1 SAPSYS ARTJYRDG 0001ARTJYRDG& 2AUW20200622000008000860000014Be SAPSYS ARTJYRDG 0001ARTJYRDG& 2AU120200622000008001707600020B1 WF-BATCH ARTJYRDG 3001B&0&A 2AUW20200622000008001707600020B1 WF-BATCH RSCONN01 3001RSCONN01& 2AU120200622000008000860000014Be WF-BATCH ARTJYRDG 3001B&0&A 2AUW20200622000008002368000015Bf SAPSYS ARTJYRDG 0001ARTJYRDG& 2AUW20200622000008000860000014Be WF-BATCH RWSSSMI 3001RWSSSMI& 2AUW20200622000008001483200016B1 SAPSYS ARTJYRDG 0001ARTJYRDG& 2AU120200622000008002368000015Bf USERNAME ARTJYRDG 3001B&0&A 2AU120200622000008001483200016B1 USERNAME ARTJYRDG 3001B&0&A 2AUW20200622000008002368000015Bf USERNAME ZFIP020 3001ZFIP020& 2AUW20200622000008001851600019B1 SAPSYS ARTJYRDG 0001ARTJYRDG& 2AUW20200622000008001483200016B1 USERNAME ZPMIF_MONITOR_CREAWO 3001ZPMIF_MONITOR_CREAWO& 2AUW20200622000008002178400013Bd SAPSYS ARTJYRDG 0001ARTJYRDG& 2AU120200622000008001851600019B1 WF-BATCH ARTJYRDG 3001B&0&A
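Since the events are described as fixed width, one way to picture the splitting step outside NXLog is plain slicing. The following Python sketch is illustrative only and is not an NXLog configuration: it assumes every event is exactly 200 characters wide, and the names `split_events` and `looks_like_event` are hypothetical helpers. The start-marker check reuses the regex quoted above, which is said to cover most events, so a failed check is a hint rather than proof of a bad record.

```python
import re

EVENT_WIDTH = 200  # assumption: every event is exactly 200 characters wide
EVENT_START = re.compile(r"[23][ABCD][U]\S")  # start marker quoted in the thread


def split_events(blob, width=EVENT_WIDTH):
    """Slice one long unbroken string into fixed-width events.

    Returns (events, leftover), where leftover is a trailing partial
    event that has not been fully written yet.
    """
    usable = len(blob) - len(blob) % width
    events = [blob[i:i + width] for i in range(0, usable, width)]
    return events, blob[usable:]


def looks_like_event(event):
    """Check that an event begins with the expected start marker."""
    return EVENT_START.match(event) is not None


if __name__ == "__main__":
    # Two made-up events padded to full width, plus a partial third one.
    blob = "2AUW20200622 SAPSYS".ljust(EVENT_WIDTH)
    blob += "2AU120200622 WF-BATCH".ljust(EVENT_WIDTH)
    blob += "2AUW2020"
    events, leftover = split_events(blob)
    for event in events:
        print(looks_like_event(event), event.rstrip())
    print("leftover:", repr(leftover))
```

Keeping the leftover separate means a half-written event at the end of the file is not emitted until the rest of it arrives.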

  • manuel.munoz's picture
    (NXLog)

    Well, I think I can get a sense of it.

    This regexp seems to do the trick of capturing the last event in the string each time. I used 57 because your events appear to be 60 characters long (3 for the prefix plus 57), but if whitespace was removed from the sample and the real size is 200, you should replace 57 with 197.

    /^.*([23][ABCD][U].{57})$/

    I don't know how this will behave performance-wise, or how many events per second you expect to receive.
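To see what the pattern above actually captures, here is a small Python sketch that applies the same regex to a made-up line. It assumes 60-character events, as in the pasted sample; `last_event` and `pad` are hypothetical helper names, and Python's `re` module stands in for NXLog's regex engine, so behavior in NXLog may differ in details.

```python
import re

EVENT_WIDTH = 60  # assumption: matches the pasted sample; use 200 for the real file

# Same pattern as in the thread: 3 marker characters plus (EVENT_WIDTH - 3) more.
LAST_EVENT = re.compile(r"^.*([23][ABCD][U].{%d})$" % (EVENT_WIDTH - 3))


def last_event(line):
    """Return the final fixed-width event in the line, or None if no match."""
    match = LAST_EVENT.match(line)
    return match.group(1) if match else None


def pad(text, width=EVENT_WIDTH):
    """Pad a made-up event body with spaces to the fixed width."""
    return text.ljust(width)


if __name__ == "__main__":
    line = pad("2AUW20200622000008 SAPSYS") + pad("2AU120200622000008 WF-BATCH")
    print(repr(last_event(line)))
```

Because the capture group is anchored to the end of the string, only the final event is returned. If several events are appended between two reads, the earlier ones would not be captured by this pattern alone.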

  • manuel.munoz's picture
    (NXLog)

    As far as I understand, you need to create a config with one im_file module to read from your file, an output module (say, om_tcp) to send those events elsewhere, and a route to connect the two. At the end of your input module you could add something like the following, so that only the last event per line is taken into account.

    Exec if $raw_event =~ /^.*([23][ABCD][U].{57})$/ $raw_event = $1;
    

    Please paste your current config here.

    One question: if events keep being added to the same string, making it longer and longer, when does it stop? The longer the line, the worse the performance will be.
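One general way to keep the cost independent of how long the line has grown, and to avoid emitting duplicates, is to remember how far the file has already been read and only process what was appended since. The Python sketch below illustrates that idea outside NXLog. It assumes 200-byte events in a single-byte encoding (so 200 characters equal 200 bytes); `read_new_events` is a hypothetical helper, and the file name is only a stand-in.

```python
import os
import tempfile

EVENT_WIDTH = 200  # assumption: 200 single-byte characters per event


def read_new_events(path, offset, width=EVENT_WIDTH):
    """Read complete events appended after `offset`.

    Returns (events, new_offset). A trailing partial event is left
    unread so it can be picked up on the next call.
    """
    with open(path, "rb") as fh:
        fh.seek(offset)
        data = fh.read()
    usable = len(data) - len(data) % width
    events = [
        data[i:i + width].decode("ascii", errors="replace")
        for i in range(0, usable, width)
    ]
    return events, offset + usable


if __name__ == "__main__":
    # Demo with a temporary stand-in file, not the real network share.
    path = os.path.join(tempfile.mkdtemp(), "File.AUD")
    with open(path, "wb") as fh:
        fh.write("2AUW first event".ljust(EVENT_WIDTH).encode("ascii"))
    events, offset = read_new_events(path, 0)
    print(len(events), offset)
```

The saved offset would need to be reset to zero when the daily file rolls over, and persisted somewhere if the reader restarts.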

  • ian.lee's picture

    Manuel,

    Here is what I have so far. It is pretty simple. Once I get events flowing into a test file one by one, I'll work on extracting fields and so on. The log file rotates once per day, so that will have to be allowed for in the input, but I think I can handle that part once the line extraction is working.

    Panic Soft
    #NoFreeOnExit TRUE

    define ROOT C:\Program Files (x86)\nxlog
    define CERTDIR %ROOT%\cert
    define CONFDIR %ROOT%\conf
    define LOGDIR %ROOT%\data
    define LOGFILE %LOGDIR%\nxlog.log
    LogFile %LOGFILE%

    Moduledir %ROOT%\modules
    CacheDir %ROOT%\data
    Pidfile %ROOT%\data\nxlog.pid
    SpoolDir %ROOT%\data

    <Extension exec>
    Module xm_exec
    </Extension>

    <Input in>
    Module im_file
    File "\\\Network_Path_to_File\File.AUD"
    <Exec>
    if $raw_event =~ /^.*([23][ABCD][U].{197})$/
    $raw_event = $1;
    </Exec>
    </Input>

    <Output out>
    Module om_file
    File "C:\logs\logrewrite.log"
    </Output>

    <Route 1>
    Path in => out
    </Route>

  • ian.lee's picture

    Good Morning Manuel,

    That does not seem to have made a difference. I also tested with a small local file, manually adding one of the 200-character events at the end, and got the same results. I'm not sure if you have any other suggestions, but I may have to get an enterprise license with some implementation hours to get this done.