Wrap urllib3 LocationParseError from create_connection in InvalidURL (#5744) #7471

Open
jbbqqf wants to merge 1 commit into psf:main from jbbqqf:fix/5744-locationparseerror-invalid-host

@jbbqqf jbbqqf commented May 23, 2026

Summary

Wrap urllib3.exceptions.LocationParseError raised from inside
urllib3.util.connection.create_connection in requests.exceptions.InvalidURL,
so callers see a requests-native exception instead of a raw urllib3 one. Fixes #5744.

PreparedRequest.prepare_url already catches LocationParseError at parse time
and converts it to InvalidURL. The same exception class can also be raised
later from urllib3's create_connection when the hostname survives parse_url
but contains a label that fails host.encode("idna") (e.g. a DNS label longer
than 63 characters per RFC 1035). The (_SSLError, _HTTPError) except clause
in HTTPAdapter.send does catch it, because LocationParseError is a
urllib3.exceptions.HTTPError, but none of its isinstance branches match, so
it falls through to the trailing else: raise and is re-raised bare. The fix
adds an explicit isinstance(e, LocationValueError) branch that wraps it in
InvalidURL, mirroring how send already handles LocationValueError from
get_connection_with_tls_context earlier in the same method.
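
The fall-through described above can be modelled in a few lines. This is a minimal sketch rather than the actual diff: the exception classes are stand-ins that only reproduce the inheritance relationships named in this description, and `send_sketch`, `bad_host_connect` and `other_failure_connect` are hypothetical names that do not exist in requests or urllib3.

```python
# Stand-in exception classes (assumption: only the inheritance matters here).
# The real ones live in urllib3.exceptions and requests.exceptions.
class HTTPError(Exception):
    """Stand-in for urllib3.exceptions.HTTPError."""


class LocationValueError(ValueError, HTTPError):
    """Stand-in for urllib3.exceptions.LocationValueError."""


class LocationParseError(LocationValueError):
    """Stand-in for urllib3.exceptions.LocationParseError."""


class InvalidURL(ValueError):
    """Stand-in for requests.exceptions.InvalidURL."""


def send_sketch(connect):
    """Hypothetical, heavily reduced model of the except clause in HTTPAdapter.send."""
    try:
        return connect()
    except HTTPError as e:
        if isinstance(e, LocationValueError):
            # The added branch: surface a requests-native exception.
            raise InvalidURL(e)
        else:
            # Anything unrecognised is still re-raised unchanged.
            raise


def bad_host_connect():
    # Simulates create_connection rejecting an over-long DNS label.
    raise LocationParseError("label empty or too long")


def other_failure_connect():
    # Simulates an unrelated urllib3 error that no branch recognises.
    raise HTTPError("some other urllib3 failure")
```

With the branch in place, `send_sketch(bad_host_connect)` raises `InvalidURL`, while `send_sketch(other_failure_connect)` still propagates the original `HTTPError`. Checking against the parent class `LocationValueError` also covers `LocationParseError`, which is why a single branch is enough.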

Reproduce BEFORE/AFTER yourself (copy-paste)

git clone https://github.com/psf/requests.git /tmp/requests-5744
cd /tmp/requests-5744
python -m venv .venv && . .venv/bin/activate
pip install -e .

# BEFORE — on origin/main, the urllib3 exception leaks.
git checkout main
python -c '
import requests, urllib3.exceptions
try:
    requests.get("http://" + "a"*64 + ".example.com", timeout=2)
except requests.exceptions.InvalidURL as e:
    print("OK requests-native:", type(e).__name__, e)
except urllib3.exceptions.LocationParseError as e:
    print("LEAK urllib3 exception:", type(e).__name__, e)
'
# Expected: "LEAK urllib3 exception: LocationParseError ..."

# AFTER — checkout this PR's branch and re-run the exact same script.
git fetch https://github.com/jbbqqf/requests fix/5744-locationparseerror-invalid-host
git checkout FETCH_HEAD
python -c '
import requests, urllib3.exceptions
try:
    requests.get("http://" + "a"*64 + ".example.com", timeout=2)
except requests.exceptions.InvalidURL as e:
    print("OK requests-native:", type(e).__name__, e)
except urllib3.exceptions.LocationParseError as e:
    print("LEAK urllib3 exception:", type(e).__name__, e)
'
# Expected: "OK requests-native: InvalidURL Failed to parse: ..., label empty or too long"

What I ran locally

$ python -m pytest tests/ 2>&1 | tail -3
========== 1 failed, 617 passed, 15 skipped, 1 xfailed, 18 warnings in 80.79s ==========

The single failure is test_proxy_error, which is environment-dependent
(it relies on a network resolution behaviour) and reproduces unchanged on
origin/main with the same setup — unrelated to this PR.

The new parametrize row in test_errors:

$ python -m pytest "tests/test_requests.py" -k "test_errors and example" -v
tests/test_requests.py::TestRequests::test_errors[http://aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa.example.com-InvalidURL] PASSED

And the same test on origin/main (regression):

tests/test_requests.py::TestRequests::test_errors[...-InvalidURL] FAILED
> urllib3.exceptions.LocationParseError: Failed to parse: 'aaaa...aaaa.example.com', label empty or too long

Edge cases

| Input | Before | After | Reason |
| --- | --- | --- | --- |
| http://aaa...64a.example.com (label > 63 chars) | urllib3.exceptions.LocationParseError leaks | requests.exceptions.InvalidURL | host.encode("idna") in create_connection raises LocationParseError |
| http://*.example.com | InvalidURL (caught in prepare_url) | InvalidURL (unchanged) | already-existing path, validated by test_preparing_bad_url |
| http://fe80::5054:ff:fe5a:fc0 (zone-id-less IPv6) | InvalidURL | InvalidURL (unchanged) | already covered by test_errors |
| Normal URL → ConnectTimeoutError | ConnectTimeout | ConnectTimeout (unchanged) | MaxRetryError branch still catches |
| Normal URL → ProtocolError | ConnectionError | ConnectionError (unchanged) | dedicated (ProtocolError, OSError) branch still catches |
| Normal URL → urllib3 _InvalidHeader | InvalidHeader | InvalidHeader (unchanged) | branch ordered before the new one |
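
The trigger for the first edge case can be checked with the standard library alone, without requests or urllib3 installed. The sketch below only exercises Python's built-in "idna" codec; `idna_encodable` is a hypothetical helper name, and the step where urllib3 turns the failure into LocationParseError is taken from the description above rather than shown here.

```python
def idna_encodable(host: str) -> bool:
    """Return True if the stdlib "idna" codec accepts every label in host."""
    try:
        host.encode("idna")
    except UnicodeError:
        # Raised for empty labels and for labels longer than 63 characters.
        return False
    return True


ok_host = "a" * 63 + ".example.com"   # longest label is exactly 63 characters
bad_host = "a" * 64 + ".example.com"  # one character over the limit

print(idna_encodable(ok_host))   # True
print(idna_encodable(bad_host))  # False
```

The boundary sits between 63 and 64 characters, which is why the reproducer and the new test_errors row both use a 64-character label.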

PR drafted with assistance from Claude Code (Anthropic). The change was reviewed manually against requests' source. The reproducer block above is the one I used during development; reviewers can paste it verbatim.

…lidURL

When a hostname has labels that fail validation (e.g. a label longer than
63 characters, as constrained by RFC 1035), prepare_url's existing handler
catches LocationParseError at parse time and surfaces it as InvalidURL.

The same exception can also be raised later from urllib3's
util.connection.create_connection, where it bubbles up through
HTTPAdapter.send. The (_SSLError, _HTTPError) except clause in send
catches it (LocationParseError is a urllib3.HTTPError) but the final
`else: raise` re-raises it bare, so callers see a raw urllib3 exception
instead of a requests-native one.

Add an explicit branch that converts LocationValueError (parent of
LocationParseError) to InvalidURL, mirroring the behaviour of
get_connection_with_tls_context earlier in send.

Regression test: parametrize test_errors with a 64-character DNS label
and assert that requests.exceptions.InvalidURL is raised.

Fixes psf#5744

Successfully merging this pull request may close these issues.

urllib3 LocationParseError (label empty or too long) uncaught by requests
