@tausbn tausbn commented Aug 25, 2025

The `flask.request` global object is commonly used in request handlers to access data in the active request. In our modelling, we handled this by treating the initial (module-local) definition of `request` as a source of remote flow. In practice, this meant that many alerts reported `from flask import request` as the ultimate "source" of remote flow, and to find the actual request-handler-local instance of `request` one had to inspect the data-flow path between source and sink.

To improve this state of affairs, I have made the following changes to the definition of `FlaskRequestSource`:

  • We no longer consider `from flask import request` to be a source.
  • Instead, we look at all places where that `request` value can flow, and include only the ones that are `LocalSourceNode`s (so that inside a request handler, the first occurrence of the `request` object is the source).

In practice, this leads to alerts that are much easier to decipher.
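
To illustrate the difference in reported locations, here is a minimal sketch using Python's standard `ast` module. It is not the CodeQL implementation and does not model `LocalSourceNode`; it only approximates the idea of "the first occurrence of `request` inside each handler". The sample application and the helper names (`SAMPLE`, `is_request_ref`, `first_request_per_function`) are hypothetical and exist only for this example.

```python
import ast

# Hypothetical Flask application, written only for this illustration.
# It is parsed as text, so Flask does not need to be installed.
SAMPLE = '''\
from flask import Flask, request
import flask

app = Flask(__name__)

@app.route("/run")
def run():
    code = request.args.get("code")
    other = request.args.get("other")
    return code + other

@app.route("/show")
def show():
    return flask.request.args.get("name")
'''


def is_request_ref(node):
    """True for a bare `request` name or a `flask.request` attribute access."""
    if isinstance(node, ast.Name):
        return node.id == "request"
    if isinstance(node, ast.Attribute):
        return (
            node.attr == "request"
            and isinstance(node.value, ast.Name)
            and node.value.id == "flask"
        )
    return False


def first_request_per_function(source):
    """Map each function name to the (line, column) of its first `request` use.

    Columns are 1-based. Nested functions are not treated specially;
    this is a rough approximation, not a data-flow analysis.
    """
    result = {}
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, ast.FunctionDef):
            continue
        refs = [
            (n.lineno, n.col_offset + 1)
            for n in ast.walk(func)
            if is_request_ref(n)
        ]
        if refs:
            result[func.name] = min(refs)
    return result


print(first_request_per_function(SAMPLE))
```

In this sketch the module-level `from flask import request` on line 1 is never reported. Each handler instead gets its own location (line 8 for `run`, line 14 for `show`), which is the kind of handler-local source position the new alerts point at.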

tausbn added 2 commits August 28, 2025 13:23
As it turns out, referring to the request object using `flask.request`
is not uncommon, and this meant restricting to `Name` nodes was too
strong. With the changes in this commit, we now include those
occurrences as well.
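
The distinction can be seen with Python's standard `ast` module, assuming its node classes are a reasonable stand-in for the `Name` nodes mentioned above (the CodeQL library itself is not used here, and `node_kind` is a hypothetical helper). A bare `request` parses as a `Name`, while `flask.request` parses as an `Attribute`, so a filter that accepts only `Name` nodes misses the second form.

```python
import ast


def node_kind(expr):
    """Return the AST class name of a single Python expression."""
    return type(ast.parse(expr, mode="eval").body).__name__


print(node_kind("request"))        # Name
print(node_kind("flask.request"))  # Attribute
```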
Really starting to regret our widespread use of `flask.request` as _the_
example of a remote flow source.
@tausbn tausbn force-pushed the tausbn/python-refine-location-of-flask-request-sources branch from a842023 to 0f4f909 August 29, 2025 12:01
@tausbn tausbn marked this pull request as ready for review September 2, 2025 13:00
@tausbn tausbn requested a review from a team as a code owner September 2, 2025 13:00
@Copilot Copilot AI review requested due to automatic review settings September 2, 2025 13:00

@Copilot Copilot AI left a comment

Pull Request Overview

This pull request refines the location of `flask.request` flow sources to provide more precise security vulnerability detection. Instead of treating the module-level import `from flask import request` as the ultimate source of remote flow, the change identifies the first occurrence of the `request` object within specific request handlers as the source.

Key changes:

  • Modified `FlaskRequestSource` definition to exclude module-level imports as sources
  • Added logic to identify `LocalSourceNode`s where `request` flows within request handlers
  • Updated expected test results to reflect the more precise source locations

Comment on lines +1 to +5
#select
| code_injection.py:7:10:7:13 | ControlFlowNode for code | code_injection.py:6:12:6:18 | ControlFlowNode for request | code_injection.py:7:10:7:13 | ControlFlowNode for code | This code execution depends on a $@. | code_injection.py:6:12:6:18 | ControlFlowNode for request | user-provided value |
| code_injection.py:8:10:8:13 | ControlFlowNode for code | code_injection.py:6:12:6:18 | ControlFlowNode for request | code_injection.py:8:10:8:13 | ControlFlowNode for code | This code execution depends on a $@. | code_injection.py:6:12:6:18 | ControlFlowNode for request | user-provided value |
| code_injection.py:10:10:10:12 | ControlFlowNode for cmd | code_injection.py:6:12:6:18 | ControlFlowNode for request | code_injection.py:10:10:10:12 | ControlFlowNode for cmd | This code execution depends on a $@. | code_injection.py:6:12:6:18 | ControlFlowNode for request | user-provided value |
| code_injection.py:21:20:21:27 | ControlFlowNode for obj_name | code_injection.py:18:16:18:22 | ControlFlowNode for request | code_injection.py:21:20:21:27 | ControlFlowNode for obj_name | This code execution depends on a $@. | code_injection.py:18:16:18:22 | ControlFlowNode for request | user-provided value |
Copilot AI Sep 2, 2025

[nitpick] The #select section appears at the beginning of the file rather than at the end, which is unusual for CodeQL test expected files. This ordering deviation could indicate a formatting issue or test structure problem that should be investigated.

Suggested change
#select
| code_injection.py:7:10:7:13 | ControlFlowNode for code | code_injection.py:6:12:6:18 | ControlFlowNode for request | code_injection.py:7:10:7:13 | ControlFlowNode for code | This code execution depends on a $@. | code_injection.py:6:12:6:18 | ControlFlowNode for request | user-provided value |
| code_injection.py:8:10:8:13 | ControlFlowNode for code | code_injection.py:6:12:6:18 | ControlFlowNode for request | code_injection.py:8:10:8:13 | ControlFlowNode for code | This code execution depends on a $@. | code_injection.py:6:12:6:18 | ControlFlowNode for request | user-provided value |
| code_injection.py:10:10:10:12 | ControlFlowNode for cmd | code_injection.py:6:12:6:18 | ControlFlowNode for request | code_injection.py:10:10:10:12 | ControlFlowNode for cmd | This code execution depends on a $@. | code_injection.py:6:12:6:18 | ControlFlowNode for request | user-provided value |
| code_injection.py:21:20:21:27 | ControlFlowNode for obj_name | code_injection.py:18:16:18:22 | ControlFlowNode for request | code_injection.py:21:20:21:27 | ControlFlowNode for obj_name | This code execution depends on a $@. | code_injection.py:18:16:18:22 | ControlFlowNode for request | user-provided value |
