Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 7 additions & 1 deletion python/pyhive/hive.py
Original file line number Diff line number Diff line change
Expand Up @@ -160,7 +160,8 @@ def __init__(
check_hostname=None,
ssl_cert=None,
thrift_transport=None,
ssl_context=None
ssl_context=None,
connection_timeout=None,
Copy link

Copilot AI Jan 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The connection_timeout parameter should be validated to ensure it's a positive value. Negative or zero timeout values could cause unexpected behavior. Consider adding validation like:

if connection_timeout is not None and connection_timeout <= 0:
    raise ValueError("connection_timeout must be a positive value")

Copilot uses AI. Check for mistakes.
):
"""Connect to HiveServer2

Expand All @@ -175,6 +176,7 @@ def __init__(
Incompatible with host, port, auth, kerberos_service_name, and password.
:param ssl_context: A custom SSL context to use for HTTPS connections. If provided,
this overrides check_hostname and ssl_cert parameters.
:param connection_timeout: Millisecond timeout for Thrift connections.
Copy link

Copilot AI Jan 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation should clarify that connection_timeout is ignored when using a custom thrift_transport. Users need to configure the timeout on their custom transport object directly. Consider adding a note to the docstring like: "Note: connection_timeout is ignored when thrift_transport is provided. Configure timeout on the custom transport object instead."

Suggested change
:param connection_timeout: Millisecond timeout for Thrift connections.
:param connection_timeout: Millisecond timeout for Thrift connections when using the
built-in HTTP/HTTPS transport.
Note: connection_timeout is ignored when thrift_transport is provided. Configure
timeout on the custom transport object instead.

Copilot uses AI. Check for mistakes.
The way to support LDAP and GSSAPI is originated from cloudera/Impyla:
https://github.com/cloudera/impyla/blob/255b07ed973d47a3395214ed92d35ec0615ebf62
/impala/_thrift_api.py#L152-L160
Expand All @@ -193,6 +195,8 @@ def __init__(
),
ssl_context=ssl_context,
)
if connection_timeout:
thrift_transport.setTimeout(connection_timeout)

if auth in ("BASIC", "NOSASL", "NONE", None):
# Always needs the Authorization header
Expand Down Expand Up @@ -236,6 +240,8 @@ def __init__(
if auth is None:
auth = 'NONE'
socket = thrift.transport.TSocket.TSocket(host, port)
if connection_timeout:
socket.setTimeout(connection_timeout)
if auth == 'NOSASL':
# NOSASL corresponds to hive.server2.authentication=NOSASL in hive-site.xml
self._transport = thrift.transport.TTransport.TBufferedTransport(socket)
Expand Down
11 changes: 11 additions & 0 deletions python/pyhive/tests/test_hive.py
Original file line number Diff line number Diff line change
Expand Up @@ -259,6 +259,17 @@ def test_basic_ssl_context(self):
cursor.execute('SELECT 1 FROM one_row')
self.assertEqual(cursor.fetchall(), [(1,)])

def test_connection_timeout(self):
"""Test that a connection timeout is set without error."""
with contextlib.closing(hive.connect(
host=_HOST,
port=10000,
connection_timeout=10 * 1000
)) as connection:
with contextlib.closing(connection.cursor()) as cursor:
# Use the same query pattern as other tests
cursor.execute('SELECT 1 FROM one_row')
self.assertEqual(cursor.fetchall(), [(1,)])
Comment on lines +262 to +272
Copy link

Copilot AI Jan 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test only verifies that a connection can be established with the timeout parameter, but doesn't actually test timeout behavior. Consider adding a test that verifies the timeout is properly set on the underlying socket/transport. This could be done by checking the socket's timeout value after connection, or by testing that a connection attempt to an unreachable host times out as expected.

Copilot uses AI. Check for mistakes.

def _restart_hs2():
subprocess.check_call(['sudo', 'service', 'hive-server2', 'restart'])
Expand Down
Loading