[filebeat][ABS] - Fix CSV decoder JSON escaping in azure-blob-storage input #50097
ShourieG merged 4 commits into elastic:main
…e input

The azure-blob-storage input's decode path only matched the decoder.Decoder interface, calling Decode(), which builds JSON via string concatenation without escaping field values. CSV values containing double quotes (e.g. RFC 2045 MIME type parameters) produce malformed JSON, causing downstream ingest pipeline failures.

Add a decoder.ValueDecoder switch case ahead of decoder.Decoder, matching the pattern already used by the GCS input. ValueDecoder's DecodeValue() uses json.Marshal, which handles escaping correctly.
Pinging @elastic/security-service-integrations (Team:Security-Service Integrations)
Walkthrough

Fixes a CSV-decoding bug in the Azure Blob Storage input that produced malformed JSON when CSV fields contained double quotes. The decoder now escapes values via …
@Mergifyio backport 8.19 9.3 9.4
✅ Backports have been created
Cherry-pick of 34634a3 has failed: To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally
… input (#50097)

x-pack/filebeat/input/azureblobstorage: Fix CSV decoder JSON escaping in azure-blob-storage input

The azure-blob-storage input's decode path only matched the decoder.Decoder interface, calling Decode(), which builds JSON via string concatenation without escaping field values. CSV values containing double quotes (e.g. RFC 2045 MIME type parameters) produce malformed JSON, causing downstream ingest pipeline failures.

Added a decoder.ValueDecoder switch case ahead of decoder.Decoder, matching the pattern already used by the GCS input. ValueDecoder's DecodeValue() uses json.Marshal, which handles escaping correctly.

(cherry picked from commit 34634a3)

# Conflicts:
# x-pack/filebeat/input/azureblobstorage/job.go
… input (#50097) (#50110)
(cherry picked from commit 34634a3)
Co-authored-by: Shourie Ganguly <shourie.ganguly@elastic.co>

… input (#50097) (#50109)
(cherry picked from commit 34634a3)
Co-authored-by: Shourie Ganguly <shourie.ganguly@elastic.co>
…ng in azure-blob-storage input (#50108)

* [filebeat][ABS] - Fix CSV decoder JSON escaping in azure-blob-storage input (#50097)

Fix CSV decoder type references and broken decodeValue in azure-blob-storage input

The azure-blob-storage input's CSV decode path builds JSON via string concatenation without escaping field values. CSV values containing double quotes (e.g. RFC 2045 MIME type parameters) produce malformed JSON, causing downstream ingest pipeline failures.

Add a valueDecoder switch case ahead of the plain decoder case so the CSV decoder's decodeValue() method is used, which serializes via json.Marshal for correct escaping. Also fix decodeValue() itself, which previously cleared its state and then called decode(), causing a 'decode called before next' error.

---------

Co-authored-by: Shourie Ganguly <shourie.ganguly@elastic.co>
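Why the new case has to come ahead of the plain decoder case can be shown with a small sketch. It assumes, purely for illustration, that valueDecoder embeds decoder; the interface shapes, method signatures, and the csvDecoder and plainDecoder types here are hypothetical and are not copied from the Beats source. A Go type switch takes the first case that matches, so a type satisfying both interfaces only reaches decodeValue() when that case is listed first.

```go
package main

import "fmt"

// decoder and valueDecoder are hypothetical stand-ins for the
// interfaces named in the commit message.
type decoder interface {
	decode() ([]byte, error)
}

type valueDecoder interface {
	decoder
	decodeValue() (map[string]any, error)
}

// csvDecoder satisfies both interfaces.
type csvDecoder struct{}

func (csvDecoder) decode() ([]byte, error)              { return []byte(`{}`), nil }
func (csvDecoder) decodeValue() (map[string]any, error) { return map[string]any{}, nil }

// plainDecoder satisfies only decoder.
type plainDecoder struct{}

func (plainDecoder) decode() ([]byte, error) { return []byte(`{}`), nil }

// pickFixed lists the more specific interface first.
func pickFixed(d decoder) string {
	switch d.(type) {
	case valueDecoder:
		return "decodeValue"
	case decoder:
		return "decode"
	}
	return "none"
}

// pickBroken lists the general interface first, so the
// valueDecoder case can never be reached.
func pickBroken(d decoder) string {
	switch d.(type) {
	case decoder:
		return "decode"
	case valueDecoder:
		return "decodeValue"
	}
	return "none"
}

func main() {
	fmt.Println(pickFixed(csvDecoder{}))   // csvDecoder takes the valueDecoder case
	fmt.Println(pickFixed(plainDecoder{})) // plainDecoder falls through to decoder
	fmt.Println(pickBroken(csvDecoder{}))  // general case shadows the specific one
}
```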
NOTE
We changed an existing test case because decodeValue builds a map[string]any and then calls json.Marshal. Go's json.Marshal emits map keys in sorted order, so the expected test output had to be reordered to match.
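The key-ordering effect is easy to see in a standalone sketch. encodeRow is a hypothetical helper and the column names and values are invented for illustration; the only behaviour relied on is that encoding/json sorts map keys when marshalling.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// encodeRow is a hypothetical helper that pairs CSV header names
// with record values and serializes the result with json.Marshal.
func encodeRow(header, record []string) (string, error) {
	m := make(map[string]any, len(header))
	for i, h := range header {
		m[h] = record[i]
	}
	out, err := json.Marshal(m)
	return string(out), err
}

func main() {
	// Columns appear as id, name, age in the CSV header.
	out, err := encodeRow(
		[]string{"id", "name", "age"},
		[]string{"1", "alice", "30"},
	)
	if err != nil {
		panic(err)
	}
	// Keys come out sorted, not in header order.
	fmt.Println(out) // {"age":"30","id":"1","name":"alice"}
}
```

A test that compares raw JSON strings therefore has to list the expected keys in sorted order rather than CSV column order.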
Checklist
… stresstest.sh script to run them under stress conditions and race detector to verify their stability.
… ./changelog/fragments using the changelog tool.
Disruptive User Impact
None