
Enhance StreamResponse behaviour #2020

@pdavid-cssopra

Description


Improvement suggestions for the StreamResponse component.

Currently, the StreamResponse component acts as a RAM buffer between a source file and the destination file being downloaded.
This minimalist approach creates several problems and shortcomings in resource management.

The data consumption rate is not controlled.

Indeed, every output server has a maximum output bandwidth.
Retrieving data from the source at a rate higher than the service rate overuses available resources.
Furthermore, when the input rate exceeds the output rate,
data accumulates in the RAM buffer, which is far from unlimited and is needed by all other processes.

Only data is transferred between input and output, not state.

An interruption of the input stream cannot propagate cleanly into the output using the protocol.
An interruption of the output stream does not stop reading from the input stream, which needlessly wastes resources.

Using a generator as a data source forces unstoppable, synchronous data-reading behavior, with errors getting lost in hidden scopes.
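The pattern being criticized can be sketched as follows (a minimal illustration, not FastAPI's actual source): a synchronous generator feeds the response body, so once iteration has started, the stream cannot cleanly report a failure, and any exception raised inside the generator surfaces only after the HTTP headers have already been sent.

```python
import os
import tempfile

def file_chunks(path, chunk_size=8192):
    # Generator used as a streaming response body: reading is driven
    # synchronously by iteration, and exceptions raised in here surface
    # only after the response has started being written.
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            yield chunk

# Demonstration with a temporary file standing in for the data source.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"x" * 20000)
chunks = list(file_chunks(path))
os.remove(path)
assert len(chunks) == 3 and len(chunks[0]) == 8192
```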

This behavior is out of place on modern systems, which no longer have such constraints and can communicate fully asynchronously between resources of varying speeds (network, RAM, disk) to avoid global latency.

Although this approach is described in the FastAPI documentation, it is no longer used in event-driven environments.

The lack of support for clearly localized, capturable error events prevents transmitting state from the input source protocol to the output source protocol.

How to resolve these issues?

This will depend on the functionalities provided by the data source.

If the source supports partial content (client-driven transfer):

It is then possible to create a fake file pointer which, on each "read" operation, fetches a chunk of data via a partial-content request. If the source is not available for reading, the fake pointer can raise a "permission" or "existence" error to simulate the corresponding problem on the source.

By retrieving data on the fly, there is no internal storage: if the output stops pulling data, no space is wasted and there are no residual memory objects to clean up. As long as the object emulating a file pointer exposes the expected methods, it can be used directly by FastAPI, based on the FileResponse object, replacing the input "path" parameter with a "file pointer".
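A minimal sketch of such a fake file pointer, assuming a hypothetical `fetch_range(start, length)` callable that performs the partial-content request (simulated below with an in-memory byte string; a real implementation would issue an HTTP GET with a Range header):

```python
import io

class RangeReader(io.RawIOBase):
    """File-like object that pulls data on demand via partial-content
    requests. `fetch_range` is a hypothetical callable returning the
    bytes for [start, start + length)."""

    def __init__(self, fetch_range, total_size):
        self._fetch = fetch_range
        self._size = total_size
        self._pos = 0

    def readable(self):
        return True

    def read(self, size=-1):
        if size < 0:
            size = self._size - self._pos
        if self._pos >= self._size:
            return b""  # EOF: nothing was ever buffered internally
        chunk = self._fetch(self._pos, min(size, self._size - self._pos))
        self._pos += len(chunk)
        return chunk

# Simulated backend standing in for a remote source that supports ranges.
backend = b"0123456789" * 10

def fake_fetch(start, length):
    # A real implementation could raise PermissionError / FileNotFoundError
    # here to propagate source-side failures into the output protocol.
    return backend[start:start + length]

reader = RangeReader(fake_fetch, len(backend))
assert reader.read(7) == b"0123456"
```

Each `read` maps directly to one partial-content fetch, so the output side's own pace drives the input side and nothing accumulates in RAM.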


If the source does not support partial content (lazy buffering):

In this case, the input throughput cannot be controlled, and without memory control, RAM should not be used.
The most resource-efficient approach is multiple chunk files: store the incoming HTTP chunks in separate files.
On each read from the source, create a new chunk file.
On each read from the output, retrieve the chunk's contents, then delete the file to free up space.

In this way, if the input and output speeds are comparable, only a few chunks are stored at any time. It is important to keep the chunk files properly separated, both to minimize concurrent data access and to be able to serve data that is not currently being written.

Here, the file pointer emulation reads the available chunk files on each new "read" operation, and can likewise be wrapped in an object similar to FastAPI's FileResponse.


The main difficulty with this solution is ensuring that the output loop does not serve chunks faster than the input loop collects them, in order to avoid starvation.
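One way to avoid that starvation is to make the reader block until the writer has published a complete chunk file. A sketch using separate chunk files and a condition variable (all names here are illustrative, not part of FastAPI):

```python
import os
import tempfile
import threading

class ChunkSpool:
    """Spools incoming chunks to separate temp files; the reader consumes
    and deletes them. A Condition makes the reader wait instead of
    starving when it outpaces the writer."""

    def __init__(self):
        self._dir = tempfile.mkdtemp(prefix="spool-")
        self._cond = threading.Condition()
        self._queue = []   # paths of completed chunk files, in order
        self._seq = 0
        self._done = False

    def write_chunk(self, data: bytes):
        # Write fully, then publish: the reader never sees a half-written
        # file, which keeps concurrent access to each chunk to a minimum.
        path = os.path.join(self._dir, f"chunk-{self._seq:06d}")
        self._seq += 1
        with open(path, "wb") as f:
            f.write(data)
        with self._cond:
            self._queue.append(path)
            self._cond.notify()

    def close(self):
        with self._cond:
            self._done = True
            self._cond.notify_all()

    def read_chunk(self):
        """Block until a chunk is available; return b"" at end of stream."""
        with self._cond:
            while not self._queue and not self._done:
                self._cond.wait()
            if not self._queue:
                return b""
            path = self._queue.pop(0)
        with open(path, "rb") as f:
            data = f.read()
        os.remove(path)  # free disk space as soon as the chunk is served
        return data

spool = ChunkSpool()
spool.write_chunk(b"part-1")
spool.write_chunk(b"part-2")
spool.close()
assert spool.read_chunk() == b"part-1"
```

If input and output speeds are comparable, the queue stays short, so only a few chunk files exist on disk at any given moment.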

Labels: enhancement (New feature or request)