optimize code for proxy handling of urllib.request  #127753

@NewUserHa

Feature or enhancement

Proposal:

  1. The second loop, which starts at line 1845, duplicates work already done by the first loop.

    cpython/Lib/urllib/request.py

    Lines 1817 to 1853 in 2041a95

    # Proxy handling
    def getproxies_environment():
        """Return a dictionary of scheme -> proxy server URL mappings.
        Scan the environment for variables named <scheme>_proxy;
        this seems to be the standard convention.
        """
        # in order to prefer lowercase variables, process environment in
        # two passes: first matches any, second pass matches lowercase only
        # select only environment variables which end in (after making lowercase) _proxy
        proxies = {}
        environment = []
        for name in os.environ:
            # fast screen underscore position before more expensive case-folding
            if len(name) > 5 and name[-6] == "_" and name[-5:].lower() == "proxy":
                value = os.environ[name]
                proxy_name = name[:-6].lower()
                environment.append((name, value, proxy_name))
                if value:
                    proxies[proxy_name] = value
        # CVE-2016-1000110 - If we are running as CGI script, forget HTTP_PROXY
        # (non-all-lowercase) as it may be set from the web server by a "Proxy:"
        # header from the client
        # If "proxy" is lowercase, it will still be used thanks to the next block
        if 'REQUEST_METHOD' in os.environ:
            proxies.pop('http', None)
        for name, value, proxy_name in environment:
            # not case-folded, checking here for lower-case env vars only
            if name[-6:] == '_proxy':
                if value:
                    proxies[proxy_name] = value
                else:
                    proxies.pop(proxy_name, None)
        return proxies
  2. The registry access should be split out into a standalone function so that its result can be cached and reused by code such as https://github.com/aio-libs/aiohttp/blob/e79b2d5df70a2644e81925cc49558962af91848d/aiohttp/client.py#L609-L617

    cpython/Lib/urllib/request.py

    Lines 2071 to 2104 in 2041a95

    def proxy_bypass_registry(host):
        try:
            import winreg
        except ImportError:
            # Std modules, so should be around - but you never know!
            return False
        try:
            internetSettings = winreg.OpenKey(winreg.HKEY_CURRENT_USER,
                r'Software\Microsoft\Windows\CurrentVersion\Internet Settings')
            proxyEnable = winreg.QueryValueEx(internetSettings,
                                              'ProxyEnable')[0]
            proxyOverride = str(winreg.QueryValueEx(internetSettings,
                                                    'ProxyOverride')[0])
            # ^^^^ Returned as Unicode but problems if not converted to ASCII
        except OSError:
            return False
        if not proxyEnable or not proxyOverride:
            return False
        return _proxy_bypass_winreg_override(host, proxyOverride)

    def proxy_bypass(host):
        """Return True, if host should be bypassed.
        Checks proxy settings gathered from the environment, if specified,
        or the registry.
        """
        proxies = getproxies_environment()
        if proxies:
            return proxy_bypass_environment(host, proxies)
        else:
            return proxy_bypass_registry(host)
  3. Should the proxy settings from the registry be ignored when proxies are set in the environment, as proxy_bypass() already does?
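
To make item 1 concrete, here is a rough sketch of how the two loops might be folded into a single pass. It is only an illustration, not a proposed patch: the function name `getproxies_environment_single_pass`, the `environ` parameter, and the `decided` set are all hypothetical, and the sketch assumes the intended behavior is "a variable ending in lowercase `_proxy` always wins, and an empty one removes the scheme".

```python
import os


def getproxies_environment_single_pass(environ=None):
    """Hypothetical single-pass variant of getproxies_environment()."""
    if environ is None:
        environ = os.environ
    # CVE-2016-1000110: under CGI, HTTP_PROXY may come from a client header.
    is_cgi = 'REQUEST_METHOD' in environ
    proxies = {}
    decided = set()  # schemes already settled by a lowercase *_proxy variable
    for name, value in environ.items():
        # same fast screen as the original before the case-folding check
        if len(name) > 5 and name[-6] == "_" and name[-5:].lower() == "proxy":
            scheme = name[:-6].lower()
            if name[-6:] == '_proxy':
                # lowercase suffix: overrides anything seen so far
                decided.add(scheme)
                if value:
                    proxies[scheme] = value
                else:
                    proxies.pop(scheme, None)
            elif scheme not in decided:
                # non-lowercase suffix: only a fallback, and never for
                # 'http' when running as a CGI script
                if is_cgi and scheme == 'http':
                    continue
                if value:
                    proxies[scheme] = value
    return proxies


example = {'HTTP_PROXY': 'http://upper:3128', 'http_proxy': 'http://lower:3128'}
print(getproxies_environment_single_pass(example))
# {'http': 'http://lower:3128'}
```

Whether this is actually faster than the current two-pass code would need measuring; the main gain is that the intermediate `environment` list goes away.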

Should I open a PR?
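
For item 2, the split could look roughly like the sketch below. The helper name `read_registry_proxy_settings` and the choice of `functools.lru_cache` are my assumptions for illustration; nothing like this exists in `urllib.request` today. On platforms without `winreg` it simply returns `None`.

```python
import functools


@functools.lru_cache(maxsize=1)
def read_registry_proxy_settings():
    """Hypothetical helper: return (proxy_enable, proxy_override), or None.

    None means the registry is unavailable or the values could not be read.
    """
    try:
        import winreg
    except ImportError:
        # not on Windows, so there is nothing to read
        return None
    try:
        with winreg.OpenKey(
            winreg.HKEY_CURRENT_USER,
            r'Software\Microsoft\Windows\CurrentVersion\Internet Settings',
        ) as internet_settings:
            proxy_enable = winreg.QueryValueEx(internet_settings,
                                               'ProxyEnable')[0]
            proxy_override = str(winreg.QueryValueEx(internet_settings,
                                                     'ProxyOverride')[0])
    except OSError:
        return None
    return bool(proxy_enable), proxy_override


settings = read_registry_proxy_settings()   # first call reads the registry
settings = read_registry_proxy_settings()   # second call is served from cache
```

With something like this, `proxy_bypass_registry(host)` would only need to check the cached values and pass the override string on to `_proxy_bypass_winreg_override(host, proxyOverride)`. The trade-off is that a cached result goes stale if the user changes the proxy settings while the process is running, so callers would need `read_registry_proxy_settings.cache_clear()` or some expiry policy.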

Has this already been discussed elsewhere?

This is a minor feature, which does not need previous discussion elsewhere

Links to previous discussion of this feature:

No response


Labels: performance, stdlib, type-feature
