Commit 6a60fb7

[PR #10970/bb5fc59 backport][3.12] Add warning about consuming the payload in middleware (#10984)
1 parent 12ce811 commit 6a60fb7

2 files changed: +97 -6 lines changed


CHANGES/2914.doc.rst

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+Improved documentation for middleware by adding warnings and examples about
+request body stream consumption. The documentation now clearly explains that
+request body streams can only be read once and provides best practices for
+sharing parsed request data between middleware and handlers -- by :user:`bdraco`.

docs/web_advanced.rst

Lines changed: 93 additions & 6 deletions
@@ -569,10 +569,14 @@ A *middleware* is a coroutine that can modify either the request or
 response. For example, here's a simple *middleware* which appends
 ``' wink'`` to the response::
 
-    from aiohttp.web import middleware
+    from aiohttp import web
+    from typing import Callable, Awaitable
 
-    @middleware
-    async def middleware(request, handler):
+    @web.middleware
+    async def middleware(
+        request: web.Request,
+        handler: Callable[[web.Request], Awaitable[web.StreamResponse]]
+    ) -> web.StreamResponse:
         resp = await handler(request)
         resp.text = resp.text + ' wink'
         return resp
@@ -614,20 +618,27 @@ post-processing like handling *CORS* and so on.
 The following code demonstrates middlewares execution order::
 
     from aiohttp import web
+    from typing import Callable, Awaitable
 
-    async def test(request):
+    async def test(request: web.Request) -> web.Response:
         print('Handler function called')
         return web.Response(text="Hello")
 
     @web.middleware
-    async def middleware1(request, handler):
+    async def middleware1(
+        request: web.Request,
+        handler: Callable[[web.Request], Awaitable[web.StreamResponse]]
+    ) -> web.StreamResponse:
         print('Middleware 1 called')
         response = await handler(request)
         print('Middleware 1 finished')
         return response
 
     @web.middleware
-    async def middleware2(request, handler):
+    async def middleware2(
+        request: web.Request,
+        handler: Callable[[web.Request], Awaitable[web.StreamResponse]]
+    ) -> web.StreamResponse:
         print('Middleware 2 called')
         response = await handler(request)
         print('Middleware 2 finished')
@@ -646,6 +657,82 @@ Produced output::
     Middleware 2 finished
     Middleware 1 finished
 
+Request Body Stream Consumption
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. warning::
+
+   When middleware reads the request body (using :meth:`~aiohttp.web.BaseRequest.read`,
+   :meth:`~aiohttp.web.BaseRequest.text`, :meth:`~aiohttp.web.BaseRequest.json`, or
+   :meth:`~aiohttp.web.BaseRequest.post`), the body stream is consumed. However, these
+   high-level methods cache their result, so subsequent calls from the handler or other
+   middleware will return the same cached value.
+
+   The important distinction is:
+
+   - High-level methods (:meth:`~aiohttp.web.BaseRequest.read`, :meth:`~aiohttp.web.BaseRequest.text`,
+     :meth:`~aiohttp.web.BaseRequest.json`, :meth:`~aiohttp.web.BaseRequest.post`) cache their
+     results internally, so they can be called multiple times and will return the same value.
+   - Direct stream access via :attr:`~aiohttp.web.BaseRequest.content` does NOT have this
+     caching behavior. Once you read from ``request.content`` directly (e.g., using
+     ``await request.content.read()``), subsequent reads will return empty bytes.
+
+Consider this middleware that logs request bodies::
+
+    from aiohttp import web
+    from typing import Callable, Awaitable
+
+    async def logging_middleware(
+        request: web.Request,
+        handler: Callable[[web.Request], Awaitable[web.StreamResponse]]
+    ) -> web.StreamResponse:
+        # This consumes the request body stream
+        body = await request.text()
+        print(f"Request body: {body}")
+        return await handler(request)
+
+    async def handler(request: web.Request) -> web.Response:
+        # This will return the same value that was read in the middleware
+        # (i.e., the cached result, not an empty string)
+        body = await request.text()
+        return web.Response(text=f"Received: {body}")
+
+In contrast, when accessing the stream directly (not recommended in middleware)::
+
+    async def stream_middleware(
+        request: web.Request,
+        handler: Callable[[web.Request], Awaitable[web.StreamResponse]]
+    ) -> web.StreamResponse:
+        # Reading directly from the stream - this consumes it!
+        data = await request.content.read()
+        print(f"Stream data: {data}")
+        return await handler(request)
+
+    async def handler(request: web.Request) -> web.Response:
+        # This will return empty bytes because the stream was already consumed
+        data = await request.content.read()
+        # data will be b'' (empty bytes)
+
+        # However, high-level methods would still work if called for the first time:
+        # body = await request.text()  # This would read from internal cache if available
+        return web.Response(text=f"Received: {data}")
+
+When working with raw stream data that needs to be shared between middleware and handlers::
+
+    async def stream_parsing_middleware(
+        request: web.Request,
+        handler: Callable[[web.Request], Awaitable[web.StreamResponse]]
+    ) -> web.StreamResponse:
+        # Read stream once and store the data
+        raw_data = await request.content.read()
+        request['raw_body'] = raw_data
+        return await handler(request)
+
+    async def handler(request: web.Request) -> web.Response:
+        # Access the stored data instead of reading the stream again
+        raw_data = request.get('raw_body', b'')
+        return web.Response(body=raw_data)
+
 Example
 ^^^^^^^

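The read-once behavior that the new documentation section describes can be tried out without aiohttp. The sketch below is a toy model under stated assumptions, not aiohttp's implementation: `ToyRequest` and its `_cached_body` attribute are hypothetical names, and the standard library's `asyncio.StreamReader` stands in for the request body stream. It contrasts a caching accessor with direct stream reads.

```python
import asyncio
from typing import Dict, Optional


class ToyRequest:
    """Hypothetical stand-in for a request object (not aiohttp code)."""

    def __init__(self, payload: bytes) -> None:
        # StreamReader models a body stream that can only be drained once.
        self.content = asyncio.StreamReader()
        self.content.feed_data(payload)
        self.content.feed_eof()
        self._cached_body: Optional[bytes] = None

    async def read(self) -> bytes:
        # Drain the stream on the first call; later calls reuse the cache.
        if self._cached_body is None:
            self._cached_body = await self.content.read()
        return self._cached_body


async def demo() -> Dict[str, bytes]:
    results: Dict[str, bytes] = {}

    # Caching accessor: the "middleware" and the "handler" both see the body.
    cached = ToyRequest(b"payload")
    results["middleware_read"] = await cached.read()
    results["handler_read"] = await cached.read()

    # Direct stream access: the second reader finds the stream at EOF.
    raw = ToyRequest(b"payload")
    results["middleware_stream"] = await raw.content.read()
    results["handler_stream"] = await raw.content.read()

    # Draining the stream first leaves nothing for the caching accessor.
    drained = ToyRequest(b"payload")
    await drained.content.read()
    results["read_after_drain"] = await drained.read()
    return results


results = asyncio.run(demo())
print(results)
```

In this toy model the caching accessor also yields empty bytes once the stream has been drained directly, which is one reason to store raw data on the request as the last documented example does. Whether a particular aiohttp version behaves the same way in that case is best checked against its own source.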