gh-113775: Fix HttpOnly Prefix Issue in MozillaCookieJar.save method #113795

kairi003 · 2024-01-07T18:27:21Z

MozillaCookieJar recognizes and writes the #HttpOnly_ prefix, based on curl's specifications. However, it is uncommon for cookies obtained via HTTP to be written with the HttpOnly prefix.

This issue stems from the case-sensitive nature of the HttpOnly attribute check and the fact that the value of the constant HTTPONLY_ATTR is "HTTPOnly" instead of the more commonly used "HttpOnly".

In accordance with RFC6265, this PR proposes the addition of a case_insensitive option to Cookie.has_nonstandard_attr. Additionally, it modifies the value of HTTPONLY_ATTR to the more widely recognized "HttpOnly".

This PR includes code changes, docs fixes, and additional tests.

Issue: HttpOnly prefix appending feature of MozillaCookieJar almost never works #113775

📚 Documentation preview 📚: https://cpython-previews--113795.org.readthedocs.build/

ghost · 2024-01-07T18:27:24Z

All commit authors signed the Contributor License Agreement.

Cookie.has_nonstandard_attr method is case-sensitive, but HTTP is case-insensitive for cookie attribute names. This option is useful for MozillaCookieJar to check the HttpOnly flag when saving.

This commit updates the value of the HTTPONLY_ATTR constant from "HTTPOnly" to the more commonly used "HttpOnly". It makes HTTP communication more consistent.

Modified attribute checking in MozillaCookieJar.save to be case-insensitive, aligning with HTTP standards. This change resolves the issue where HttpOnly prefix was not correctly appended due to case-sensitive checks.

bedevere-app · 2025-03-23T08:18:17Z

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

bedevere-app · 2025-03-23T08:26:17Z

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

picnixz

I don't really know whether it should considered a feature or a bug fix as we're aligning with the RFC now. I think it could be considered a bug fix but OTOH, we're changing the HTTPONLY_ATTR.

Now, be careful that the value is also used in _really_load:

if line.startswith(HTTPONLY_PREFIX):
    rest[HTTPONLY_ATTR] = ""
    line = line[len(HTTPONLY_PREFIX):]

In this case, shouldn't we do something as well with the case? because otherwise, files that were successfully parsed, wouldn't be now.

cc @vadmium as my favorite web RFC expert c:

picnixz · 2025-03-23T12:01:36Z

Lib/test/test_http_cookiejar.py

+        for key in ["foo1", "foo2", "foo3"]:
+            matches = [x for x in lines if key in x]
+            self.assertEqual(len(matches), 1,
+                             "Incorrect number of matches for cookie with value %r" % key)
+            self.assertTrue(matches[0].startswith("#HttpOnly_"),
+                            "Cookie with value %r is missing the HttpOnly prefix" % key)
+
+        # Check that the HttpOnly prefix is not added to the correct cookies
+        for key in ["foo4"]:
+            matches = [x for x in lines if key in x]
+            self.assertEqual(len(matches), 1,
+                             "Incorrect number of matches for cookie with value %r" % key)
+            self.assertFalse(matches[0].startswith("#HttpOnly_"),
+                             "Cookie with value %r has the HttpOnly prefix" % key)


You can use self.subTest(key=key) and drop the assertion messages. It could be easier.

Thank you, I did it in 7f72d4e

kairi003 · 2025-03-23T14:04:44Z

@picnixz
Thank you for your review!

In this case, shouldn't we do something as well with the case? Because otherwise, files that were successfully parsed wouldn't be now.

You're right, HTTPONLY_ATTR is used elsewhere and risks breaking existing code.
It reflects the HttpOnly attribute in the Cookies._rest field when loading cookies from files.

Here's what I've found upon further review:

Cookie._rest is a field with an underscore prefix, which implies it is not expected to be accessed directly.
The only 3 methods that manipulate Cookie._rest are has/get/set_nonstandard_attr(), and in the standard library, they are only used for managing the HttpOnly attribute (which is the scope of this change).
Generally, HttpOnly is checked for its presence rather than its value, so the only method affected by this fix should be has_nonstandard_attr().
HttpOnly is an attribute meant for browser security and is not considered in requests made by Python.

Therefore, this change only affects users who directly reference Cookie._rest or call Cookie.get_nonstandard_attr('HTTPOnly'). This is not a common operation.

If full backward compatibility is required, we could keep HTTPONLY_ATTR = "HTTPOnly".
Since the logic has been modified to be case-insensitive, this should be sufficient for RFC compliance.
However, the RFC shows the case of this attribute as "HttpOnly", and it is more common on the web.

I am not a specialist in this area, so I will defer to the judgment of the reviewers and those who have more expertise.

Thank you :)

kairi003 · 2025-03-23T14:37:47Z

As a side note, when you make a typical request, Cookie._rest will be set to "HttpOnly" in most cases.

import requests
resp = requests.get('https://google.com')
cookie = list(resp.cookies)[0]
print(cookie._rest)
# -> {'HttpOnly': None, 'SameSite': 'lax'}

picnixz · 2025-03-23T14:50:52Z

HttpOnly

That I know. I encountered this kind of issues when writing my own crawler so I'm not surprised. However, the best way to do make it smooth is to support both names, and emit a warning if HTTPOnly is encountered.

kairi003 · 2025-03-23T18:20:35Z

@picnixz
That is a very good idea.
Which do you think is better, actually having both values in _rest dict or automatically replacing HttpOnly when HTTPOnly is encountered?

Both values in `_rest` dict

HTTPONLY_ATTR = "HttpOnly"
HTTPONLY_ATTR_OLD = "HTTPOnly"

class MozillaCookieJar(FileCookieJar):
    def _really_load(self, f, filename, ignore_discard, ignore_expires):
        ...
        if line.startswith(HTTPONLY_PREFIX):
            rest[HTTPONLY_ATTR] = ""
            rest[HTTPONLY_ATTR_OLD] = ""
        ...

class Cookie:
    def has_nonstandard_attr(self, name, case_insensitive=False):
        if name == HTTPONLY_ATTR_OLD:
            warnings.warn(warning_message)
        ...
    
    def get_nonstandard_attr(self, name, default=None):
        if name == HTTPONLY_ATTR_OLD:
            warnings.warn(warning_message)
        ...
    
    def set_nonstandard_attr(self, name, value):
        if name == HTTPONLY_ATTR_OLD:
            warnings.warn(warning_message)
        ...

Automatically replacing

class MozillaCookieJar(FileCookieJar):
    def _really_load(self, f, filename, ignore_discard, ignore_expires):
        ...
        if line.startswith(HTTPONLY_PREFIX):
            rest[HTTPONLY_ATTR] = ""
        ...

class Cookie:
    def has_nonstandard_attr(self, name, case_insensitive=False):
        if name == HTTPONLY_ATTR_OLD:
            warnings.warn(warning_message)
            name = HTTPONLY_ATTR
        ...
    
    def get_nonstandard_attr(self, name, default=None):
        if name == HTTPONLY_ATTR_OLD:
            warnings.warn(warning_message)
            name = HTTPONLY_ATTR
        ...
    
    def set_nonstandard_attr(self, name, value):
        if name == HTTPONLY_ATTR_OLD:
            warnings.warn(warning_message)
            name = HTTPONLY_ATTR
        ...

picnixz · 2025-03-23T18:24:43Z

I'm more worried with APIs overriding those methods so I think we should rather not change it automatically. We should either way document this change but I don't know which one would be the best (which is why I'd like to wait for another core dev). Personally, I would just replace HTTPOnly by HttpOnly to align with the RFC, but I don't know if this can break existing clients or apps in the wild. Agreed that currently it doesn't work well, but if you can find evidence that HTTPOnly is barely used, we could perhaps outright support it in _really_load for legacy reasons but without warning the end-user at all.

vadmium · 2025-04-02T01:05:18Z

I think adding the new case_insensitive public parameter is not great for a bug fix. I presume there should be an “Added/changed in version X” notice for it in the documentation.

As a new feature, I would make it a keyword-only parameter and probably also add it to the get_nonstandard_attr method. And I wonder if case_insensitive=True should be the default or only behaviour. On one hand this would be a change in behaviour; on the other hand it might mitigate compatibility concerns with changing HTTPONLY_ADDR.

Regarding the letter case of HTTPONLY_ATTR, changing it seems unnecessary to fix the bug. I understand changing it would only affect the “nonstandard attr” name in Cookie objects read from a file. That seems to have minimal reward and minimal risk, but it does seem nicer to use the casing from RFC 6265. I would only make the change in the next Python version.

bedevere-app bot added the awaiting review label Jan 7, 2024

bedevere-app bot mentioned this pull request Jan 7, 2024

HttpOnly prefix appending feature of MozillaCookieJar almost never works #113775

Open

kairi003 added 4 commits March 23, 2025 17:15

Add case_insensitive option to Cookie.has_nonstandard_attr method

ad24492

Cookie.has_nonstandard_attr method is case-sensitive, but HTTP is case-insensitive for cookie attribute names. This option is useful for MozillaCookieJar to check the HttpOnly flag when saving.

Change HTTPONLY_ATTR value from "HTTPOnly" to "HttpOnly"

eb730a3

This commit updates the value of the HTTPONLY_ATTR constant from "HTTPOnly" to the more commonly used "HttpOnly". It makes HTTP communication more consistent.

Fix HttpOnly Prefix Issue in MozillaCookieJar.save

028b607

Modified attribute checking in MozillaCookieJar.save to be case-insensitive, aligning with HTTP standards. This change resolves the issue where HttpOnly prefix was not correctly appended due to case-sensitive checks.

Fix loop variable name in test

6329495

kairi003 force-pushed the fix-issue-113775 branch from 145e60e to 6329495 Compare March 23, 2025 08:18

kairi003 force-pushed the fix-issue-113775 branch from 4f99b12 to 6329495 Compare March 23, 2025 08:26

📜🤖 Added by blurb_it.

2d1ea7a

picnixz reviewed Mar 23, 2025

View reviewed changes

Refactor test_mozilla_httponly_prefix to use subTest

7f72d4e

Merge branch 'main' into fix-issue-113775

96097a5

Uh oh!

gh-113775: Fix HttpOnly Prefix Issue in MozillaCookieJar.save method #113795

Are you sure you want to change the base?

gh-113775: Fix HttpOnly Prefix Issue in MozillaCookieJar.save method #113795

Conversation

kairi003 commented Jan 7, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ghost commented Jan 7, 2024 • edited by ghost Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

bedevere-app bot commented Mar 23, 2025

Uh oh!

bedevere-app bot commented Mar 23, 2025

Uh oh!

picnixz left a comment

Choose a reason for hiding this comment

Uh oh!

picnixz Mar 23, 2025

Choose a reason for hiding this comment

Uh oh!

kairi003 Mar 23, 2025

Choose a reason for hiding this comment

Uh oh!

kairi003 commented Mar 23, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

kairi003 commented Mar 23, 2025

Uh oh!

picnixz commented Mar 23, 2025

Uh oh!

kairi003 commented Mar 23, 2025

Uh oh!

picnixz commented Mar 23, 2025

Uh oh!

vadmium commented Apr 2, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

kairi003 commented Jan 7, 2024 •

edited

Loading

ghost commented Jan 7, 2024 •

edited by ghost

Loading

kairi003 commented Mar 23, 2025 •

edited

Loading