Session refactor #6
Conversation
Make session management explicitly managed by callers to `access_url`. This enables each caller to protect the session from threading issues in a way that best matches its particular model. For `SnowflakeRestful` this is managed as a pool of sessions; for `chunk_downloader` I will implement something similar that provides good reuse and cleanup of each session while achieving as much connection pooling as can be done safely.
Fairly large refactor in the spirit of consolidating HTTP call patterns and configuration. Changed `SnowflakeRestful.access_url` from a staticmethod to a regular instance method, reborn as `fetch`. The name change helps distinguish the type of activity being performed, but is otherwise cosmetic. Requests session pooling is managed in one place by the `SnowflakeRestful` instance regardless of the type of HTTP activity. This provides the best session reuse possible with consistent thread-safety characteristics. Note that the `requests.Session` objects are created outside their final execution thread, but they are never used by more than one request thread at a time. Other changes:
- Moved the network.py logger instance to module level.
- DRY optimizations for `_get_request`, `_post_request`, etc., namely consolidation of proxy and timeout settings.
- Defaulted `fetch`'s keyword arg `token` to `network.NOTOKEN`. I believe this is semantically acceptable (or desirable) but it should be reviewed.
- Renamed `chunk_downloader`'s `_get_request` to `_fetch_chunk` to more aptly associate it with `SnowflakeRestful.fetch()`.
- Renamed the `request_thread` function to `request_exec` as it is not always a thread target.
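For readers skimming the diff, the pooling pattern described above amounts to roughly the sketch below. The class name, `_idle_sessions`, and `_use_session` are placeholders made up for illustration and do not necessarily match the PR's actual code:

```python
import contextlib
import itertools

import requests
from requests.adapters import HTTPAdapter

REQUESTS_RETRY = 1  # illustrative; the real retry policy lives in network.py


class SnowflakeRestfulSketch(object):
    """Illustrative sketch of instance-level session pooling, not the actual class."""

    def __init__(self):
        self._idle_sessions = []  # sessions not currently handed out to a request

    def _make_session(self):
        s = requests.Session()
        s.mount(u'http://', HTTPAdapter(max_retries=REQUESTS_RETRY))
        s.mount(u'https://', HTTPAdapter(max_retries=REQUESTS_RETRY))
        s._reuse_count = itertools.count()
        return s

    @contextlib.contextmanager
    def _use_session(self):
        # A session may be created on one thread and later used on another,
        # but it is only ever handed to one request at a time.
        session = self._idle_sessions.pop() if self._idle_sessions else self._make_session()
        try:
            yield session
        finally:
            self._idle_sessions.append(session)

    def fetch(self, method, full_url, headers, data=None, timeout=None, **kwargs):
        with self._use_session() as session:
            return session.request(method, full_url, headers=headers,
                                   data=data, timeout=timeout, **kwargs)
```

In a real implementation the idle list would need a lock (or a thread-safe queue) if multiple threads can check sessions in and out concurrently.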
Thanks @mayfield! I'm OOO this week, so I'll check it out by next week.
network.py
Outdated
self._session = None
sessions = list(self._active_sessions)
if sessions:
    self.logger.warn("Closing %d active sessions" % len(sessions))
Let's use parameterized logging, e.g., `self.logger.warn("Closing %s active sessions", len(sessions))`
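For reference, the difference is eager string formatting versus formatting deferred to the logging framework (a generic example, not code from this PR):

```python
import logging

logger = logging.getLogger(__name__)
sessions = ['s1', 's2']

# Eager: the message string is built even if WARNING is filtered out.
logger.warning("Closing %d active sessions" % len(sessions))

# Parameterized: formatting is deferred until a handler actually emits the record.
logger.warning("Closing %s active sessions", len(sessions))
```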
I only avoid that syntax because it only works for logging functions and I sometimes switch back and forth with other printers like print(). But I can use printf style formatting from now on.
Overall looks good to me. I'll fix the logger and syntax error later.
s = requests.Session()
s.mount(u'http://', HTTPAdapter(max_retries=REQUESTS_RETRY))
s.mount(u'https://', HTTPAdapter(max_retries=REQUESTS_RETRY))
s._reuse_count = itertools.count()
Would this be used later, or is it just for monitoring?
Yeah, I was using it before these commits were pushed. I'm probably going to implement session garbage collection and may use `_reuse_count` as a mechanism for deciding when to close out old sessions. But if not, I'll yank it before the next PR. Thanks!
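Purely as an illustration of that idea (the threshold, function name, and `idle_sessions` pool are hypothetical, not part of this PR), reuse-count-based retirement could look something like:

```python
MAX_SESSION_REUSES = 100  # hypothetical threshold, not from the PR


def release_session(idle_sessions, session):
    """Return a session to the idle pool, or close it once it has been reused enough."""
    # next() reads and advances the per-session itertools.count() attached at
    # creation time, so each release increments the session's reuse tally by one.
    if next(session._reuse_count) >= MAX_SESSION_REUSES:
        session.close()  # retire the session so old connections get cleaned up
    else:
        idle_sessions.append(session)
```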
request_thread_timeout = 60  # one request thread timeout

def request_thread(result_queue):
def fetch(self, method, full_url, headers, *, data=None, timeout=None, **kwargs):
This causes an exception in python 2.7.
Traceback (most recent call last):
File "/home/stakeda/basic_test.py", line 13, in <module>
import snowflake.connector
File "/home/stakeda/Snowflake/trunk/Python/pvenv_2.7/lib/python2.7/site-packages/snowflake/connector/__init__.py", line 21, in <module>
from .connection import SnowflakeConnection
File "/home/stakeda/Snowflake/trunk/Python/pvenv_2.7/lib/python2.7/site-packages/snowflake/connector/connection.py", line 14, in <module>
from . import network
File "/home/stakeda/Snowflake/trunk/Python/pvenv_2.7/lib/python2.7/site-packages/snowflake/connector/network.py", line 607
def fetch(self, method, full_url, headers, *, data=None, timeout=None, **kwargs):
^
SyntaxError: invalid syntax
Maybe just remove `*` for now.
Interesting, I was pretty sure I tested that particular signature on 2.7, but I do recall that keyword-only argument handling changed in py3k. (Disclaimer: I rarely use Python 2.)
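Aside from simply dropping the `*`, another Python 2-compatible option (shown here only as a rough sketch, with `self` and the real body left out) is to pop the would-be keyword-only arguments out of `**kwargs`:

```python
def fetch(method, full_url, headers, **kwargs):
    """Sketch of a Python 2.7-compatible signature; body omitted for brevity."""
    # Emulate keyword-only arguments by popping them from **kwargs;
    # callers keep passing data= and timeout= by keyword exactly as before.
    data = kwargs.pop('data', None)
    timeout = kwargs.pop('timeout', None)
    return method, full_url, headers, data, timeout, kwargs


# Call sites are unchanged relative to the py3-only signature:
print(fetch('get', 'https://example.com/session', {}, timeout=30))
```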
This is a review-seeking PR for @smtakeda (and @blandonnimrat).
It does test well for me on Python 3.5 and Python 2.7, but I would consider it part of ongoing work to increase download concurrency safely. Please feel free to provide feedback and criticism here, and/or merge if you are happy enough with the intent.
I took some liberty in renaming existing mechanisms where it helped me make sense of the various pieces falling together, but I can backtrack on them if they're undesired (e.g. `access_url` -> `fetch`). There is more overview in the commit logs as well.