|
| 1 | +# Approach Glossary |
| 2 | + |
| 3 | +DFIQ's Approaches contain the important "how" to answer the Questions. Approaches are also the most complicated |
| 4 | +part of DFIQ, due to the amount of structured information they contain. DFIQ has a detailed |
| 5 | +[specification](https://dfiq.org/contributing/specification) that is a useful reference for |
| 6 | +creating new Approaches. However, some parts of Approaches need user-defined values that are beyond the specification. |
| 7 | +This page is a glossary of currently-used values, generated from the |
| 8 | +[DFIQ YAML files](https://github.com/google/dfiq/tree/main/data). |
| 9 | + |
| 10 | +When writing new Approaches, check this glossary first to see whether an existing term already fits what |
| 11 | +you're trying to do. If not, you are free to create a new one, but reusing existing terms where possible will increase |
| 12 | +consistency throughout DFIQ. These concepts (data types, processors, analysis steps) may also not be straightforward at |
| 13 | +first; the hope is that seeing some common values (and the linked usages) will help make them clearer. |
| 14 | + |
| 15 | +## Data |
| 16 | + |
| 17 | +This section (`view.data`) can list multiple ways of describing the data needed for this approach. They should be thought |
| 18 | +of as complementary or as alternatives to each other (they are "OR"d together; they do not need to be "AND"d). |
| 19 | +Each is specified by a pair of `type` and `value`. |
| 20 | + |
| 21 | +Example (from [Q1001.10](https://github.com/google/dfiq/blob/main/data/approaches/Q1001.10.yaml#L39)): |
| 22 | + |
| 23 | +``` |
| 24 | +view: |
| 25 | + data: |
| 26 | + - type: ForensicArtifact |
| 27 | + value: BrowserHistory |
| 28 | +``` |
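
Because the entries under `view.data` are alternatives, a consumer only needs one of them to be collectable. The following is a minimal sketch of that "OR" reading, not DFIQ's own tooling: the helper `first_available`, the `available` set, and the pairing of these two entries in a single Approach are all hypothetical.

```python
def first_available(data_entries, available):
    """Return the first data entry whose (type, value) pair can be collected.

    Entries are treated as alternatives ("OR"), so one match is enough.
    Returns None when no entry can be satisfied.
    """
    for entry in data_entries:
        if (entry["type"], entry["value"]) in available:
            return entry
    return None


# Hypothetical `view.data` list: both values appear in this glossary, but
# combining them in one Approach is an assumption made for illustration.
data_entries = [
    {"type": "ForensicArtifact", "value": "BrowserHistory"},
    {"type": "CrowdStrike", "value": "DnsRequest"},
]

# Assumption: this environment can only provide CrowdStrike DNS telemetry.
available = {("CrowdStrike", "DnsRequest")}

match = first_available(data_entries, available)
print(match)
```

Here the second entry is selected even though the first cannot be collected, which is the behaviour the "OR" wording describes.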
| 29 | + |
| 30 | +Below are the current values of `type`, along with the `value`s set for each. |
| 31 | + |
| 32 | + |
| 33 | +#### CrowdStrike |
| 34 | + |
| 35 | +For `type: CrowdStrike`, current entries for `value`: |
| 36 | + |
| 37 | +- DnsRequest |
| 38 | +- PlatformEvents |
| 39 | +- ProcessRollup |
| 40 | + |
| 41 | +#### ForensicArtifact |
| 42 | +**Description**: This corresponds to the name of a ForensicArtifact, an existing repository of machine-readable digital forensic artifacts (https://github.com/ForensicArtifacts/artifacts). Using this type is preferred when the data is a host-based file/artifact, but other methods are available as well (if there isn't an existing relevant ForensicArtifact). |
| 43 | + |
| 44 | +For `type: ForensicArtifact`, current entries for `value`: |
| 45 | + |
| 46 | +- BrowserHistory |
| 47 | +- NTFSUSNJournal |
| 48 | +- SantaLogs |
| 49 | +- WindowsEventLogs |
| 50 | +- WindowsPrefetchFiles |
| 51 | +- WindowsXMLEventLogSysmon |
| 52 | + |
| 53 | +#### description |
| 54 | +**Description**: Text description of the data type. `description` is often used in conjunction with another data type to provide more context. It can also be used alone, either as a placeholder or when more robust, programmatic data types do not fit. |
| 55 | + |
| 56 | +For `type: description`, current entries for `value`: |
| 57 | + |
| 58 | +- Collect local browser history artifacts. These are often in the form of SQLite databases and JSON files in multiple directories. |
| 59 | +- Files used by the Windows Prefetch service. |
| 60 | +- Santa logs stored on the local disk; they may also be centralized off-system, but this artifact does not include those. |
| 61 | +- The NTFS $UsnJrnl file system metadata file. This ForensicArtifact definition does not include the $J alternate data stream, but many tools collect it anyway. |
| 62 | +- Windows Event Log files |
| 63 | + |
| 64 | + |
| 65 | +## Processors |
| 66 | + |
| 67 | +A processor takes the collected data and processes it in some way to produce structured data for an investigator |
| 68 | +to review. Multiple processors can be defined, as there are often multiple programs capable of doing similar processing |
| 69 | +(for example, log2timeline, Magnet Axiom, and Hindsight can all process browser history artifacts and deliver similar |
| 70 | +results). |
| 71 | + |
| 72 | +Example (from [Q1001.10](https://github.com/google/dfiq/blob/main/data/approaches/Q1001.10.yaml#L58)): |
| 73 | + |
| 74 | +``` |
| 75 | + processors: |
| 76 | + - name: Plaso |
| 77 | +``` |
| 78 | + |
| 79 | +Below are the currently-defined processors: |
| 80 | + |
| 81 | +- Crowdstrike Investigate (UI) [🔎](https://github.com/google/dfiq/search?q="name:%20Crowdstrike%20Investigate%20%28UI%29"+language%3AYAML) |
| 82 | +- Hindsight [🔎](https://github.com/google/dfiq/search?q="name:%20Hindsight"+language%3AYAML) |
| 83 | +- Plaso [🔎](https://github.com/google/dfiq/search?q="name:%20Plaso"+language%3AYAML) |
| 84 | +- Splunk [🔎](https://github.com/google/dfiq/search?q="name:%20Splunk"+language%3AYAML) |
| 85 | + |
| 86 | +## Analysis Steps |
| 87 | + |
| 88 | +Each analysis method contains a sequence of one or more maps with the keys `description`, `type`, and `value`. |
| 89 | +If there is more than one map, they should be processed in order within that analysis method (if applicable). In this |
| 90 | +way, we can describe multiple chained steps of analysis (with the `description` being a way to communicate to the user |
| 91 | +what exactly each "step" is doing, enabling a "show-your-work"-type capability). |
| 92 | + |
| 93 | +Example (from [Q1001.10](https://github.com/google/dfiq/blob/main/data/approaches/Q1001.10.yaml#L63)): |
| 94 | + |
| 95 | +``` |
| 96 | + analysis: |
| 97 | + - name: OpenSearch |
| 98 | + steps: |
| 99 | + - description: &filter-desc Filter the results to just file downloads |
| 100 | + type: opensearch-query |
| 101 | + value: data_type:("chrome:history:file_downloaded" OR "safari:downloads:entry") |
| 102 | + - name: Python Notebook |
| 103 | + steps: |
| 104 | + - description: *filter-desc |
| 105 | + type: pandas |
| 106 | + value: query('data_type in ("chrome:history:file_downloaded", "safari:downloads:entry")') |
| 107 | +``` |
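
The "show-your-work" idea can be sketched as a walk over one method's steps in their defined order. This is an illustration only: the `method` dictionary is a hand-written stand-in for what a YAML loader would return for the OpenSearch method above, and `describe_steps` is a hypothetical helper, not part of DFIQ.

```python
# Hand-written stand-in for one parsed analysis method (assumption: a YAML
# loader would produce an equivalent dictionary from the Approach file).
method = {
    "name": "OpenSearch",
    "steps": [
        {
            "description": "Filter the results to just file downloads",
            "type": "opensearch-query",
            "value": 'data_type:("chrome:history:file_downloaded" OR "safari:downloads:entry")',
        },
    ],
}


def describe_steps(method):
    """Return one human-readable line per step, in the order they are defined."""
    lines = []
    for number, step in enumerate(method["steps"], start=1):
        lines.append(
            f"{method['name']} step {number} [{step['type']}]: {step['description']}"
        )
    return lines


for line in describe_steps(method):
    print(line)
```

The `&filter-desc` anchor and `*filter-desc` alias in the YAML example are standard YAML features: the alias reuses the anchored description string, so both methods share identical wording.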
| 108 | + |
| 109 | +#### `type` |
| 110 | + |
| 111 | +The contents of the `description` and `value` fields will vary wildly with little repetition, depending on what the |
| 112 | +analysis step is doing, but the step `type` should be one of a few common values. |
| 113 | + |
| 114 | +Below are the currently-defined values of `type`: |
| 115 | + |
| 116 | +- GUI [🔎](https://github.com/google/dfiq/search?q="type:%20GUI"+language%3AYAML) |
| 117 | +- manual [🔎](https://github.com/google/dfiq/search?q="type:%20manual"+language%3AYAML) |
| 118 | +- opensearch-query [🔎](https://github.com/google/dfiq/search?q="type:%20opensearch-query"+language%3AYAML) |
| 119 | +- pandas [🔎](https://github.com/google/dfiq/search?q="type:%20pandas"+language%3AYAML) |
| 120 | +- splunk-query [🔎](https://github.com/google/dfiq/search?q="type:%20splunk-query"+language%3AYAML) |
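
Since reusing an existing `type` is encouraged but not required, a new Approach could be checked against this list with a warning rather than a hard failure. The sketch below assumes a snapshot of the values listed above; `KNOWN_STEP_TYPES`, `unfamiliar_step_types`, and the sample steps are hypothetical names introduced for illustration.

```python
# Snapshot copied from the list above. The glossary is regenerated from the
# YAML files, so this set is an assumption that may go stale.
KNOWN_STEP_TYPES = {"GUI", "manual", "opensearch-query", "pandas", "splunk-query"}


def unfamiliar_step_types(steps):
    """Return the step types that are not in the glossary snapshot, sorted."""
    return sorted({step["type"] for step in steps} - KNOWN_STEP_TYPES)


# Hypothetical steps from a draft Approach.
steps = [
    {"type": "pandas"},
    {"type": "Pandas"},     # near-duplicate that differs only by case
    {"type": "sql-query"},  # made-up new type
]

for step_type in unfamiliar_step_types(steps):
    print(f"note: '{step_type}' is not in the glossary; reuse an existing type if one fits")
```

A check like this catches accidental variants (such as a different capitalization) while still leaving room for deliberately new types.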
| 121 | + |
| 122 | +#### Variable Substitution in step `value` |
| 123 | + |
| 124 | +The step's `value` may benefit from using a specific term to make the step more precise. Common examples of this |
| 125 | +include adding time bounds and filtering down to a specific identifier (user name, host, FQDN, or PID, for example). |
| 126 | + |
| 127 | +DFIQ's convention for denoting a variable to be substituted when used is to wrap the term in **{ }**. |
| 128 | + |
| 129 | +==More standardization is needed here to define common variables (such as timestamps in a particular format).== |
| 130 | + |
| 131 | +Below are the currently-used variables in analysis steps: |
| 132 | + |
| 133 | +- {file_reference value} [🔎](https://github.com/google/dfiq/search?q="%7Bfile_reference%20value%7D"+language%3AYAML) |
| 134 | +- {hostname} [🔎](https://github.com/google/dfiq/search?q="%7Bhostname%7D"+language%3AYAML) |
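
One way a consumer might apply the **{ }** convention is a small regular-expression substitution. This is a sketch under assumptions, not a DFIQ implementation: `substitute` is a hypothetical helper and the query text in `step_value` is made up. A regular expression is used instead of `str.format` because a name such as `{file_reference value}` contains a space, which `str.format` would not accept as a simple field name.

```python
import re

# Matches any term wrapped in { }, including names that contain spaces.
PLACEHOLDER = re.compile(r"\{([^{}]+)\}")


def substitute(value, variables):
    """Replace each {name} in `value` with variables[name].

    Placeholders without a supplied value are left untouched, so a missing
    variable stays visible instead of being silently dropped.
    """
    def replace(match):
        name = match.group(1)
        return str(variables.get(name, match.group(0)))

    return PLACEHOLDER.sub(replace, value)


# Hypothetical step value; the query itself is invented for illustration.
step_value = 'hostname:"{hostname}" AND file_reference:"{file_reference value}"'

print(substitute(step_value, {"hostname": "workstation-01"}))
```

Only `{hostname}` is filled in here; `{file_reference value}` remains in the output because no value was provided for it.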