Commit 0681f03

UN-2793 [FEAT] Added exponential backoff retry mechanism for platform service connections (#199)
* feat/added-retries-for-platform-service-calls [FEAT] Added exponential backoff retry mechanism for platform service connections
  - Implemented retry_utils module with configurable retry behavior
  - Added @retry_on_connection_error decorator to platform service calls
  - Supports exponential backoff with jitter to prevent thundering herd
  - Configuration via environment variables for max retries, delays, and backoff factor
  - Automatically retries on ConnectionError and errno 111 (Connection refused)
  - Improved error handling and logging for better debugging
  - Bumped SDK version to v0.78.0
* Delete mypy-errors.txt
  Signed-off-by: Chandrasekharan M <[email protected]>
* Apply suggestions from code review
  Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
  Signed-off-by: Chandrasekharan M <[email protected]>
* UN-2793 Moved ConnectionError handling to allow the retry decorator as expected
* UN-2793 [FEAT] Refactor retry mechanism to use backoff library with configurable exceptions
  - Replace custom retry implementation with battle-tested backoff library
  - Add configurable exception types while preserving existing OSError errno logic
  - Implement generic retry decorator factory supporting multiple service prefixes
  - Maintain backward compatibility with existing platform service retry behavior
  - Add comprehensive environment variable configuration for retry parameters
  - Improve logging with structured backoff details and exception context
  - Reduce codebase complexity from 236 lines to ~108 lines (55% reduction)
  - Support custom retry configurations per service via prefix-based env vars
* Apply suggestion from @coderabbitai[bot]
  Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
  Signed-off-by: Chandrasekharan M <[email protected]>
* Apply suggestion from @chandrasekharan-zipstack
  Signed-off-by: Chandrasekharan M <[email protected]>
* Apply suggestion from @chandrasekharan-zipstack
  Signed-off-by: Chandrasekharan M <[email protected]>
* UN-2793 [FEAT] Added retry decorator for prompt service calls
* UN-2793 Removed use of backoff lib and added own decorator for retries
* minor: Removed a default argument to make calls to decorator explicit
* misc: Raised err to validate envs for retry
* Update src/unstract/sdk/utils/retry_utils.py
  Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
  Signed-off-by: Chandrasekharan M <[email protected]>
* Apply suggestion from @coderabbitai[bot]
  Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
  Signed-off-by: Chandrasekharan M <[email protected]>
* Apply suggestion from @coderabbitai[bot]
  Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
  Signed-off-by: Chandrasekharan M <[email protected]>
* Apply suggestion from @coderabbitai[bot]
  Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
  Signed-off-by: Chandrasekharan M <[email protected]>
---------
Signed-off-by: Chandrasekharan M <[email protected]>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
1 parent 67971af commit 0681f03
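
Note on the mechanism: retry_utils.py itself is not shown in this excerpt, so the following is only a minimal sketch of what "exponential backoff with jitter" in the commit message could look like. The name retry_with_backoff, its parameters, the 25% jitter range and the injectable sleep argument are all assumptions made for illustration, not the SDK's actual API.

```python
import functools
import random
import time


def retry_with_backoff(max_retries=3, base_delay=1.0, multiplier=2.0,
                       jitter=True, exceptions=(ConnectionError,),
                       sleep=time.sleep):
    """Hypothetical decorator factory; not the SDK's retry_utils code."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # One initial attempt plus up to max_retries retries.
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == max_retries:
                        raise  # retries exhausted, surface the last error
                    # Delay grows geometrically: base, base*m, base*m^2, ...
                    delay = base_delay * (multiplier ** attempt)
                    if jitter:
                        # Assumed jitter: add up to 25% so that many callers
                        # do not retry in lockstep (thundering herd).
                        delay += random.uniform(0, delay * 0.25)
                    sleep(delay)

        return wrapper

    return decorator
```

With jitter disabled and the defaults above, a call that keeps failing would wait 1.0s, 2.0s and 4.0s before giving up. Passing sleep as a parameter is a testing convenience in this sketch only.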

6 files changed: +2375 -2034 lines changed

src/unstract/sdk/__init__.py

Lines changed: 2 additions & 1 deletion
@@ -1,4 +1,5 @@
-__version__ = "v0.77.3"
+__version__ = "v0.78.0"
+

 def get_sdk_version() -> str:
     """Returns the SDK version."""

src/unstract/sdk/adapter.py

Lines changed: 24 additions & 11 deletions
@@ -1,4 +1,5 @@
 import json
+import logging
 from typing import Any

 import requests
@@ -9,6 +10,9 @@
 from unstract.sdk.helper import SdkHelper
 from unstract.sdk.platform import PlatformBase
 from unstract.sdk.tool.base import BaseTool
+from unstract.sdk.utils.retry_utils import retry_platform_service_call
+
+logger = logging.getLogger(__name__)


 class ToolAdapter(PlatformBase):
@@ -24,10 +28,12 @@ def __init__(
         platform_host: str,
         platform_port: str,
     ) -> None:
-        """Args:
+        """Constructor for ToolAdapter.
+
+        Args:
             tool (AbstractTool): Instance of AbstractTool
             platform_host (str): Host of platform service
-            platform_port (str): Port of platform service
+            platform_port (str): Port of platform service.

         Notes:
             - PLATFORM_SERVICE_API_KEY environment variable is required.
@@ -38,14 +44,19 @@ def __init__(
             tool=tool, platform_host=platform_host, platform_port=platform_port
         )

+    @retry_platform_service_call
     def _get_adapter_configuration(
         self,
         adapter_instance_id: str,
     ) -> dict[str, Any]:
-        """Get Adapter
-        1. Get the adapter config from platform service
-        using the adapter_instance_id
+        """Get Adapter.

+        Get the adapter config from platform service
+        using the adapter_instance_id. This method automatically
+        retries on connection errors with exponential backoff.
+
+        Retry behavior is configurable via environment variables:
+        Check decorator for details
         Args:
             adapter_instance_id (str): Adapter instance ID

@@ -70,18 +81,14 @@ def _get_adapter_configuration(
                 f"'{adapter_type}', provider: '{provider}', name: '{adapter_name}'",
                 level=LogLevel.DEBUG,
             )
-        except ConnectionError:
-            raise SdkError(
-                "Unable to connect to platform service, please contact the admin."
-            )
         except HTTPError as e:
             default_err = (
                 "Error while calling the platform service, please contact the admin."
             )
             msg = AdapterUtils.get_msg_from_request_exc(
                 err=e, message_key="error", default_err=default_err
             )
-            raise SdkError(f"Error retrieving adapter. {msg}")
+            raise SdkError(f"Error retrieving adapter. {msg}") from e
         return adapter_data

     @staticmethod
@@ -121,4 +128,10 @@ def get_adapter_config(
             platform_host=platform_host,
             platform_port=platform_port,
         )
-        return tool_adapter._get_adapter_configuration(adapter_instance_id)
+
+        try:
+            return tool_adapter._get_adapter_configuration(adapter_instance_id)
+        except ConnectionError as e:
+            raise SdkError(
+                "Unable to connect to platform service, please contact the admin."
+            ) from e
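
Note on the moved handler: this diff takes the ConnectionError handler out of the decorated _get_adapter_configuration and puts it in the caller, get_adapter_config. The sketch below illustrates why that placement likely matters, using a hypothetical stand-in decorator (retry_twice) and a local SdkError placeholder; neither is the SDK's real code, and the stand-in does not sleep between attempts.

```python
import functools


def retry_twice(func):
    """Hypothetical stand-in for the SDK's retry decorator (no sleeping)."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        for attempt in range(3):  # 1 initial call + 2 retries
            try:
                return func(*args, **kwargs)
            except ConnectionError:
                if attempt == 2:
                    raise  # retries exhausted

    return wrapper


class SdkError(Exception):
    """Local placeholder for the SDK's error type."""


attempts = {"inner": 0, "outer": 0}


@retry_twice
def handler_inside():
    attempts["inner"] += 1
    try:
        raise ConnectionError("refused")
    except ConnectionError as e:
        # Converted before the decorator sees it, so no retry happens.
        raise SdkError("Unable to connect") from e


@retry_twice
def _fetch():
    attempts["outer"] += 1
    raise ConnectionError("refused")


def handler_outside():
    try:
        return _fetch()
    except ConnectionError as e:
        # Converted only after the decorator has exhausted its retries.
        raise SdkError("Unable to connect") from e
```

In this sketch handler_inside runs its body once, while handler_outside lets the body run three times before the caller sees SdkError. Both keep the original exception attached via "from e".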

src/unstract/sdk/platform.py

Lines changed: 17 additions & 4 deletions
@@ -1,15 +1,20 @@
+import logging
 from typing import Any

 import requests
 from requests import ConnectionError, RequestException, Response
 from unstract.sdk.constants import (
+    LogLevel,
     MimeType,
     PromptStudioKeys,
     RequestHeader,
     ToolEnv,
 )
 from unstract.sdk.helper import SdkHelper
 from unstract.sdk.tool.base import BaseTool
+from unstract.sdk.utils.retry_utils import retry_platform_service_call
+
+logger = logging.getLogger(__name__)


 class PlatformBase:
@@ -86,6 +91,7 @@ def _get_headers(self, headers: dict[str, str] | None = None) -> dict[str, str]:
             request_headers.update(headers)
         return request_headers

+    @retry_platform_service_call
     def _call_service(
         self,
         url_path: str,
@@ -97,6 +103,10 @@ def _call_service(
         """Talks to platform-service to make GET / POST calls.

         Only GET calls are made to platform-service though functionality exists.
+        This method automatically retries on connection errors with exponential backoff.
+
+        Retry behavior is configurable via environment variables.
+        Check decorator for details

         Args:
             url_path (str): URL path to the service endpoint
@@ -130,9 +140,13 @@ def _call_service(

             response.raise_for_status()
         except ConnectionError as connect_err:
-            msg = "Unable to connect to platform service. Please contact admin."
-            msg += " \n" + str(connect_err)
-            self.tool.stream_error_and_exit(msg)
+            logger.exception("Connection error to platform service")
+            msg = (
+                "Unable to connect to platform service. Will retry with backoff, "
+                "please contact admin if retries ultimately fail."
+            )
+            self.tool.stream_log(msg, level=LogLevel.ERROR)
+            raise ConnectionError(msg) from connect_err
         except RequestException as e:
             # Extract error information from the response if available
             error_message = str(e)
@@ -200,4 +214,3 @@ def get_llm_profile(self, llm_profile_id: str) -> dict[str, Any]:
             headers=None,
             method="GET",
         )
-
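
Note on which errors get retried: the handler above now re-raises ConnectionError instead of exiting, which leaves the retry decision to the decorator. According to the commit message that decision covers ConnectionError and errno 111 (Connection refused). A minimal sketch of such a check follows; is_retryable is a hypothetical name, and it uses Python's built-in exception classes rather than the ones imported from requests in this file.

```python
import errno


def is_retryable(exc: BaseException) -> bool:
    """Hypothetical predicate; not the SDK's actual retry condition."""
    # Built-in ConnectionError covers refused, reset and aborted connections.
    if isinstance(exc, ConnectionError):
        return True
    # Fall back to the errno check for other OSError instances.
    # errno.ECONNREFUSED is 111 on Linux; the value differs on other platforms.
    return isinstance(exc, OSError) and exc.errno == errno.ECONNREFUSED
```

Comparing against errno.ECONNREFUSED instead of the literal 111 is a portability choice made in this sketch only.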

src/unstract/sdk/prompt.py

Lines changed: 10 additions & 0 deletions
@@ -15,6 +15,7 @@
 from unstract.sdk.platform import PlatformHelper
 from unstract.sdk.tool.base import BaseTool
 from unstract.sdk.utils.common_utils import log_elapsed
+from unstract.sdk.utils.retry_utils import retry_prompt_service_call

 logger = logging.getLogger(__name__)

@@ -185,6 +186,7 @@ def _get_headers(self, headers: dict[str, str] | None = None) -> dict[str, str]:
             request_headers.update(headers)
         return request_headers

+    @retry_prompt_service_call
     def _call_service(
         self,
         url_path: str,
@@ -196,6 +198,14 @@ def _call_service(
         """Communicates to prompt service to fetch response for the prompt.

         Only POST calls are made to prompt-service though functionality exists.
+        This method automatically retries on connection errors with exponential backoff.
+
+        Retry behavior is configurable via environment variables:
+        - PROMPT_SERVICE_MAX_RETRIES (default: 3)
+        - PROMPT_SERVICE_MAX_TIME (default: 60s)
+        - PROMPT_SERVICE_BASE_DELAY (default: 1.0s)
+        - PROMPT_SERVICE_MULTIPLIER (default: 2.0)
+        - PROMPT_SERVICE_JITTER (default: true)

         Args:
             url_path (str): URL path to the service endpoint