[PR #10552/44e669be backport][3.11] Cache parsing of the content-type (#10557)

patchback[bot] · bdraco · web-flow · commit 928e6d70480c · 2025-03-15T23:21:49.000Z
**This is a backport of PR #10552 as merged into master (44e669b).**

## What do these changes do?

When profiling some frequent POST requests, I found the bulk of the time was spent parsing the content-type string. Use the same strategy as we do for `parse_mimetype` to cache the parsing.

## Are there changes in behavior for the user?

Performance improvement.

## Is it a substantial burden for the maintainers to support this?

No.

## Related issue number

## Checklist

- [x] I think the code is well written
- [ ] Unit tests for the changes exist
- [ ] Documentation reflects the changes
- [ ] If you provide code modification, please add yourself to `CONTRIBUTORS.txt`
  * The format is `<Name> <Surname>`.
  * Please keep alphabetical order, the file is sorted by names.
- [ ] Add a new news fragment into the `CHANGES/` folder
  * name it `<issue_or_pr_num>.<type>.rst` (e.g. `588.bugfix.rst`)
  * if you don't have an issue number, change it to the pull request number after creating the PR
    * `.bugfix`: A bug fix for something the maintainers deemed an improper undesired behavior that got corrected to match pre-agreed expectations.
    * `.feature`: A new behavior, public APIs. That sort of stuff.
    * `.deprecation`: A declaration of future API removals and breaking changes in behavior.
    * `.breaking`: When something public is removed in a breaking way. Could be deprecated in an earlier release.
    * `.doc`: Notable updates to the documentation structure or build process.
    * `.packaging`: Notes for downstreams about unobvious side effects and tooling. Changes in the test invocation considerations and runtime assumptions.
    * `.contrib`: Stuff that affects the contributor experience. e.g. Running tests, building the docs, setting up the development environment.
    * `.misc`: Changes that are hard to assign to any of the above categories.
  * Make sure to use full sentences with correct case and punctuation, for example:

    ```rst
    Fixed issue with non-ascii contents in doctest text files -- by :user:`contributor-gh-handle`.
    ```

    Use the past tense or the present tense in a non-imperative mood, referring to what's changed compared to the last released version of this project.

<img width="570" alt="Screenshot 2025-03-15 at 11 25 10 AM" src="https://github.com/user-attachments/assets/cabaaa7c-3a39-4f90-b450-a6a0559d22d6" />

Co-authored-by: J. Nick Koston <nick@koston.org>
diff --git a/CHANGES/10552.misc.rst b/CHANGES/10552.misc.rst
@@ -0,0 +1 @@
+Improved performance of parsing content types by adding a cache in the same manner currently done with mime types -- by :user:`bdraco`.
diff --git a/aiohttp/helpers.py b/aiohttp/helpers.py
@@ -21,7 +21,7 @@
 from email.utils import parsedate
 from math import ceil
 from pathlib import Path
-from types import TracebackType
+from types import MappingProxyType, TracebackType
 from typing import (
     Any,
     Callable,
@@ -357,6 +357,20 @@ def parse_mimetype(mimetype: str) -> MimeType:
     )
 
 
+@functools.lru_cache(maxsize=56)
+def parse_content_type(raw: str) -> Tuple[str, MappingProxyType[str, str]]:
+    """Parse Content-Type header.
+
+    Returns a tuple of the parsed content type and a
+    MappingProxyType of parameters.
+    """
+    msg = HeaderParser().parsestr(f"Content-Type: {raw}")
+    content_type = msg.get_content_type()
+    params = msg.get_params(())
+    content_dict = dict(params[1:])  # First element is content type again
+    return content_type, MappingProxyType(content_dict)
+
+
 def guess_filename(obj: Any, default: Optional[str] = None) -> Optional[str]:
     name = getattr(obj, "name", None)
     if name and isinstance(name, str) and name[0] != "<" and name[-1] != ">":
@@ -710,10 +724,10 @@ def _parse_content_type(self, raw: Optional[str]) -> None:
             self._content_type = "application/octet-stream"
             self._content_dict = {}
         else:
-            msg = HeaderParser().parsestr("Content-Type: " + raw)
-            self._content_type = msg.get_content_type()
-            params = msg.get_params(())
-            self._content_dict = dict(params[1:])  # First element is content type again
+            content_type, content_mapping_proxy = parse_content_type(raw)
+            self._content_type = content_type
+            # _content_dict needs to be mutable so we can update it
+            self._content_dict = content_mapping_proxy.copy()
 
     @property
     def content_type(self) -> str:
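
For readers who want to try the pattern outside of aiohttp, here is a minimal standalone sketch of what the diff above does: the parse result is cached per raw header string, the cached parameters are wrapped in a read-only `MappingProxyType`, and each caller takes its own mutable copy. `parse_content_type` mirrors the function added in the diff; `HeadersSketch` is a hypothetical stand-in for the class that owns `_parse_content_type`, not aiohttp's actual class.

```python
import functools
from email.parser import HeaderParser
from types import MappingProxyType
from typing import Dict, Optional, Tuple


@functools.lru_cache(maxsize=56)
def parse_content_type(raw: str) -> Tuple[str, "MappingProxyType[str, str]"]:
    """Parse a Content-Type value once; repeat calls hit the cache."""
    msg = HeaderParser().parsestr(f"Content-Type: {raw}")
    content_type = msg.get_content_type()
    params = msg.get_params(())
    # The first element repeats the content type itself, so skip it.
    # The proxy is read-only, so a cached entry cannot be mutated by callers.
    return content_type, MappingProxyType(dict(params[1:]))


class HeadersSketch:
    """Hypothetical stand-in for the class that owns _parse_content_type."""

    def __init__(self, raw: Optional[str]) -> None:
        if raw is None:
            self.content_type = "application/octet-stream"
            self.content_dict: Dict[str, str] = {}
        else:
            content_type, params = parse_content_type(raw)
            self.content_type = content_type
            # copy() returns a plain dict, so each instance can update its
            # own parameters without touching the shared cached mapping.
            self.content_dict = params.copy()


if __name__ == "__main__":
    first = HeadersSketch("text/html; charset=utf-8")
    second = HeadersSketch("text/html; charset=utf-8")
    first.content_dict["charset"] = "latin-1"  # only affects `first`
    print(first.content_dict, second.content_dict)
    print(parse_content_type.cache_info())
```

Returning the proxy rather than a plain `dict` matters because `lru_cache` hands the same object to every caller; without it, one caller mutating the parameters would silently change what later callers receive for the same header value.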
