130.6. Character Set Conversion (xm_charconv)
This module provides tools for converting strings between different character sets (code pages). All the encodings available to iconv are supported. On GNU/Linux systems, execute iconv -l for a list of encoding names.
Note
To examine the supported platforms, see the list of installer packages in the Available Modules chapter.
130.6.1. Configuration
The xm_charconv module accepts the following directives in addition to the common module directives.
- AutodetectCharsets
  This optional directive accepts a comma-separated list of character set names. When auto is specified as the source encoding for convert() or convert_fields(), these character sets will be tried for conversion. This directive is not related to the LineReader directive or the corresponding InputType registered when LineReader is specified.
- BigEndian
  This optional boolean directive specifies the endianness to use during the encoding conversion. If this directive is not specified, it defaults to the host’s endianness. This directive only affects the registered InputType, and is only applicable if the LineReader directive is set to a non-Unicode encoding and the CharBytes directive is set to 2 or 4.
- CharBytes
  This optional integer directive specifies the byte-width of the encoding to use during conversion. Accepted values are 1 (the default), 2, and 4. Most variable-width encodings will work with the default value. This directive only affects the registered InputType, and is only applicable if the LineReader directive is set to a non-Unicode encoding.
- LineReader
  If this optional directive is specified with an encoding, an InputType will be registered using the name of the xm_charconv module instance. The following Unicode encodings are supported: UTF-8, UCS-2, UCS-2BE, UCS-2LE, UCS-4, UCS-4BE, UCS-4LE, UTF-16, UTF-16BE, UTF-16LE, UTF-32, UTF-32BE, UTF-32LE, and UTF-7. For other encodings, it may be necessary to also set the BigEndian and/or CharBytes directives.
130.6.4. Examples
This configuration shows an example of character set auto-detection. The input file can contain lines with different encodings, and the module normalizes output to UTF-8.
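A minimal sketch of such a configuration might look like the following. The instance names (charconv, filein, fileout, r), the file paths, and the particular list of character sets are illustrative assumptions, and im_file and om_file are used here only as a convenient source and destination.

```conf
# Register the extension and list the encodings to try when the
# source encoding is given as "auto" (the list is an assumption).
<Extension charconv>
    Module              xm_charconv
    AutodetectCharsets  utf-8, euc-jp, utf-16, utf-32, iso8859-2
</Extension>

<Input filein>
    Module  im_file
    # Hypothetical input path
    File    "tmp/input"
    # Detect the source encoding per record and convert all fields to UTF-8
    Exec    convert_fields("auto", "utf-8");
</Input>

<Output fileout>
    Module  om_file
    # Hypothetical output path
    File    "tmp/output"
</Output>

<Route r>
    Path    filein => fileout
</Route>
```

The order of the names in AutodetectCharsets is part of the assumption here; adjust the list to the encodings actually expected in the input.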
This configuration uses the InputType registered via the LineReader directive to read a file with the ISO-8859-2 encoding.
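One way to write this is sketched below. The instance names and file paths are assumptions chosen for illustration; the key point is that the InputType value in the input instance matches the name of the xm_charconv instance that sets LineReader.

```conf
# Setting LineReader registers an InputType named after this
# instance ("charconv" is an assumed instance name).
<Extension charconv>
    Module      xm_charconv
    LineReader  ISO-8859-2
</Extension>

<Input filein>
    Module     im_file
    # Hypothetical input path
    File       "tmp/input"
    # Read lines through the reader registered by the extension above
    InputType  charconv
</Input>

<Output fileout>
    Module  om_file
    # Hypothetical output path
    File    "tmp/output"
</Output>

<Route r>
    Path    filein => fileout
</Route>
```

Because ISO-8859-2 is a single-byte encoding, this sketch leaves CharBytes at its default of 1 and does not set BigEndian.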