This input should be line oriented by default and use the line codec. #8

@guyboertje

See logstash-plugins/logstash-codec-multiline#14

Preliminary:

  • the line codec has been replaced by the multiline codec.
  • stdin is given more than 32K of data.

Fault:

  • the multiline codec expects line oriented data
  • stdin input reads data in 32K chunks
  • it is highly unlikely that a newline character falls exactly on a 32K boundary.
  • when the last character of a chunk is not a newline, the multiline codec assumes that \npiece_of_line_in_this_side_of_32K_block is a full line and buffers it as such. The remaining piece of the line, in the next 32K block, is also treated as a line.
  • when the multiline codec combines these 'lines', one sees an extra newline in the middle of a natural line.
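
The fault described above can be reproduced in miniature. The following is only an illustrative sketch, not the stdin input or multiline codec code: the 8-byte CHUNK_SIZE is a hypothetical stand-in for the 32K read size, and the per-chunk split stands in for a consumer that assumes every chunk contains only whole lines.

```ruby
# Hypothetical illustration: a tiny chunk size stands in for the 32K read size.
CHUNK_SIZE = 8

data = "first line\nsecond line\n"

# Cut the data into fixed-size chunks, as a chunked reader would deliver it.
chunks = data.scan(/.{1,#{CHUNK_SIZE}}/m)

# A consumer that assumes each chunk holds only whole lines splits every
# chunk on newlines independently of its neighbours.
naive_lines = chunks.flat_map { |chunk| chunk.split("\n") }

# Re-joining those 'lines' puts a newline wherever a chunk boundary fell.
naive_output = naive_lines.join("\n")

puts naive_output
```

With these values the two natural lines come out as four pieces, because both chunk boundaries land in the middle of a line.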

Proposal:

  • use FileWatch::BufferedTokenizer to line-orient the data before it is fed to the codec, and make the plain codec the default.
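
The idea behind the proposal can be sketched as follows. This is a minimal stand-in written for illustration, not the FileWatch::BufferedTokenizer implementation: the class name SketchTokenizer and its internals are hypothetical. It holds back the trailing partial line of each chunk until the rest of that line arrives, so the codec only ever sees complete lines.

```ruby
# Hypothetical stand-in for a buffered tokenizer; not the FileWatch code.
class SketchTokenizer
  def initialize(delimiter = "\n")
    @delimiter = delimiter
    @buffer = String.new
  end

  # Accept a chunk and return only the lines completed so far.
  def extract(chunk)
    @buffer << chunk
    # The -1 limit keeps a trailing empty string when the chunk ends on the
    # delimiter, so the last element is always the unfinished remainder.
    parts = @buffer.split(@delimiter, -1)
    @buffer = parts.pop || String.new
    parts
  end

  # Return whatever partial line is still buffered and reset the buffer.
  def flush
    rest = @buffer
    @buffer = String.new
    rest
  end
end

tokenizer = SketchTokenizer.new
chunks = ["first li", "ne\nsecon", "d line\n"]

# Only whole lines are handed on, regardless of where the chunks were cut.
lines = chunks.flat_map { |chunk| tokenizer.extract(chunk) }
leftover = tokenizer.flush

puts lines.inspect
```

Because the split happens on the accumulated buffer rather than on each chunk, the chunk boundaries no longer show up as extra newlines in the data passed to the codec.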
