26.4. Character Set Conversion
Normalizing logs to UTF-8 is recommended. The xm_charconv module provides character set conversion through the convert_fields() procedure, which converts all fields of an event, and the convert() function, which converts a single string.
Example 93. Character Set Auto-Detection of Various Input Encodings
This configuration demonstrates character set auto-detection. The input file may contain lines in different encodings. Calling the convert_fields() procedure with "auto" as the source encoding detects the encoding of each message's fields, trying the character sets listed in AutodetectCharsets, and converts them to UTF-8 as needed.
<Extension _charconv>
    Module              xm_charconv
    # Character sets tried when the source encoding is given as "auto"
    AutodetectCharsets  utf-8, euc-jp, utf-16, utf-32, iso8859-2
</Extension>

<Input filein>
    Module  im_file
    File    "tmp/input"
    # Detect the encoding of each event's fields and convert them to UTF-8
    Exec    convert_fields("auto", "utf-8");
</Input>

<Output fileout>
    Module  om_file
    File    "tmp/output"
</Output>

<Route r>
    Path    filein => fileout
</Route>
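When the source encoding is already known, or only one value needs converting, the convert() function mentioned above may be a better fit than convert_fields(). The following is a minimal sketch rather than a tested configuration. It assumes that every line of the input file is encoded as ISO 8859-2 (an assumption made for illustration) and that convert() takes the source string, the source encoding, and the destination encoding, in that order.

```
<Extension _charconv>
    Module  xm_charconv
</Extension>

<Input filein>
    Module  im_file
    File    "tmp/input"
    # Assumption: the whole file is ISO 8859-2 encoded, so no
    # auto-detection is requested. convert() returns the converted
    # string, which is assigned back to $raw_event; no other field
    # is modified.
    Exec    $raw_event = convert($raw_event, "iso8859-2", "utf-8");
</Input>
```

Because the source encoding is fixed, the AutodetectCharsets directive is omitted from this sketch. If the input can contain mixed encodings, the auto-detection approach shown in the example above is likely the safer choice.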