Option to analyze/load local paths, not just stdin #94

While drafting the "Custom Brimcap Configuration" article in #72, I found myself having to create tiny wrapper scripts because a Brimcap analyzer expects its pcap input to be streamed on stdin. For instance, my config YAML looked like:

```
analyzers:
  - cmd: /usr/local/bin/zeek-wrapper.sh
  - cmd: /usr/local/bin/suricata-wrapper.sh
```

And those wrapper scripts looked like:

```
$ cat zeek-wrapper.sh
#!/bin/bash
exec /opt/zeek/bin/zeek -C -r - --exec "event zeek_init() { Log::disable_stream(PacketFilter::LOG); Log::disable_stream(LoadedScripts::LOG); }" local

$ cat suricata-wrapper.sh 
#!/bin/bash -e
exec /usr/local/bin/suricata -r /dev/stdin
```

If the user's intent is just to run `brimcap load` or `brimcap analyze` on pcap file paths on their local workstation (as I expect will be most common), this extra layer of indirection isn't buying them much. What follows is just a straw man proposal, but I imagined we could add an option in the YAML so the full analyzer command line could be brought into the config, with the provided file path substituted for a placeholder, e.g.:

```
analyzers:
  - cmd: /opt/zeek/bin/zeek -C -r %PCAPPATH% --exec "event zeek_init() { Log::disable_stream(PacketFilter::LOG); Log::disable_stream(LoadedScripts::LOG); }" local
    inputmode: filepath
  - cmd: /usr/local/bin/suricata -r %PCAPPATH%
    inputmode: filepath
```
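To make the substitution idea a bit more concrete, here is a minimal shell sketch of how the `%PCAPPATH%` placeholder from the straw man might be expanded. The `expand_pcappath` name is hypothetical and nothing like it exists in Brimcap today; it only illustrates the string replacement.

```shell
# Hypothetical helper: replace every %PCAPPATH% in a command template
# with the given pcap path and print the resulting command line.
expand_pcappath() {
  local template path out
  template=$1
  path=$2
  out=
  while :; do
    case $template in
      *%PCAPPATH%*)
        # Keep the text before the first placeholder, then append the path.
        out=$out${template%%"%PCAPPATH%"*}$path
        # Continue with whatever follows that placeholder.
        template=${template#*"%PCAPPATH%"}
        ;;
      *)
        break
        ;;
    esac
  done
  printf '%s\n' "$out$template"
}
```

For example, `expand_pcappath '/usr/local/bin/suricata -r %PCAPPATH%' /tmp/sample.pcap` prints `/usr/local/bin/suricata -r /tmp/sample.pcap`. A real implementation would probably substitute after splitting the command into arguments, so that paths containing spaces survive intact.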

The possible advantages I see with offering this approach:

1.  It keeps the config consolidated by avoiding the proliferation of wrapper scripts
2.  For analyzers that aren't prepared to accept input on stdin (such as the NetFlow example shown in the same article, or off-the-shelf Suricata on Windows, for which we maintain a separate build exclusively to add the stdin support), the user would avoid needing to create wrapper scripts that push stdin to a tmpfile just to pass it off to the analyzer
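For reference, the kind of tmpfile wrapper described in the second point could look roughly like the sketch below. The `spool_and_run` name is made up for illustration, and it assumes the analyzer takes the pcap path as its final argument.

```shell
# Hypothetical helper: spool the pcap arriving on stdin to a temp file,
# then run the given analyzer command with that file path appended as
# its last argument.
spool_and_run() {
  local tmp rc
  tmp=$(mktemp) || return 1
  cat > "$tmp"              # drain stdin to disk
  "$@" "$tmp" < /dev/null   # the analyzer reads the file, not stdin
  rc=$?
  rm -f "$tmp"              # clean up the spooled copy
  return "$rc"
}
```

A wrapper script for an analyzer without stdin support would then end with something like `spool_and_run /path/to/analyzer -r`, which is exactly the boilerplate a file path input mode would make unnecessary.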

I bounced some of this off @mattnibs, and he had some valid rebuttals about why we'd not want to make this our _only_ approach. One of the advantages he pointed out about being stream-focused is that it offers the user the ability to analyze pcaps large enough that they'd be unwieldy to download in full before analysis. For instance, if my Brim app is running locally, this is a way to turn an S3-stored pcap into Zeek+Suricata logs and load those logs directly to a Pool in the Zed Lake behind my app, all without an explicit download of the pcap to a local file:

```
$ aws s3 cp s3://brim-sampledata/wrccdc.pcap - | brimcap analyze - | zapi load -p wrccdc -
1tgSXaWvlzFDG4dcKfeI2nWo3Ax committed
```

He also noted the efficiency of a single pcap stream being forked to multiple analyzers, rather than each having to open and analyze a file separately.
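As a purely conceptual illustration of that fan-out (not a description of how Brimcap implements it), a single stream can be copied to two consumers with `tee` and named pipes. The `fan_out` function and its command-string arguments are assumptions made up for this sketch.

```shell
# Hypothetical helper: copy stdin to two commands at once, so the input
# is read only a single time. Each argument is a shell command string.
fan_out() {
  local dir
  dir=$(mktemp -d) || return 1
  mkfifo "$dir/a" "$dir/b"
  sh -c "$1" < "$dir/a" &       # first consumer reads from pipe a
  sh -c "$2" < "$dir/b" &       # second consumer reads from pipe b
  tee "$dir/a" > "$dir/b"       # duplicate stdin into both pipes
  wait                          # let both consumers finish
  rm -rf "$dir"
}
```

With the file-path approach, each analyzer would instead open and read the pcap on its own, which is the efficiency trade-off being pointed out here.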

All that said, I do still see value in avoiding the proliferation of wrapper scripts if a user is truly working with local pcaps and doesn't need the full efficiency benefits of the streamed approach, so I'm filing this one to possibly reconsider in the future.
