Description
Problem
In src/_bentoml_impl/server/proxy.py (line 67), there is a # TODO: support aiohttp client comment. When a user provides their own aiohttp.ClientSession on their service class via self.client, BentoML detects it but cannot use it correctly.
The root cause: BentoML builds request URLs as relative paths (e.g. /predict) and relies on base_url being set on the ClientSession. BentoML's own client sets base_url=http://localhost:{proxy_port}, so it works. But the user's client has no base_url — they cannot know BentoML's internal proxy_url at the time they create their client.
Additionally, the current proxy hardcodes localhost as the proxy host, which means it cannot forward requests to a remote or external ML server running on a different machine or network.
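The difference between the two clients can be illustrated without aiohttp. The sketch below uses only Python's standard library. `BENTO_BASE_URL` is an assumed stand-in for the `base_url` BentoML sets on its own client, and `urljoin` only mimics how a relative path is resolved against a base URL; it is not aiohttp's actual code path.

```python
from urllib.parse import urljoin, urlsplit

# Assumed stand-in for the base_url BentoML sets on its own client.
BENTO_BASE_URL = "http://localhost:8000"

# The kind of relative URL the proxy builds today.
relative = "/predict"

# On its own, the relative path names no scheme and no host,
# so a client without a base URL has nowhere to send the request.
parts = urlsplit(relative)
has_destination = bool(parts.scheme and parts.netloc)

# With a base URL available, the same path resolves to a full URL.
resolved = urljoin(BENTO_BASE_URL, relative)
```

Here `has_destination` is false for the bare path, while `resolved` carries both host and port. A user-created session is in the first situation.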
Use Cases This Enables
- Custom auth/SSL: User brings a pre-configured client with custom headers, SSL certificates, or retry policies.
- Remote ML server: User's custom ML server runs on a different machine or container (e.g., http://192.168.1.50:8000), not on localhost.
Current Behavior
# proxy.py line 61-68
proxy_port = service.config.get("http", {}).get("proxy_port", 8000)
proxy_url = f"http://localhost:{proxy_port}"  # hardcoded localhost
if instance_client is not None and isinstance(instance_client, aiohttp.ClientSession):
    # TODO: support aiohttp client
    client = instance_client  # used as-is; breaks because it has no base_url
Later in reverse_proxy:
url = yarl.URL.build(path=f"/{path}")  # relative path only, no host
client.request(url=url)  # crashes for user-provided client
Proposed Fix
1. Allow a configurable proxy host in the service config:
proxy_host = service.config.get("http", {}).get("proxy_host", "localhost")
proxy_port = service.config.get("http", {}).get("proxy_port", 8000)
proxy_url = f"http://{proxy_host}:{proxy_port}"
2. Always build the full absolute URL in reverse_proxy, instead of relying on base_url:
# In reverse_proxy, always construct the full URL
full_url = f"{proxy_url.rstrip('/')}/{path}"
client.request(url=full_url, ...)
3. Accept the user's client as-is. Because BentoML always passes full URLs, the client no longer needs a base_url.
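The steps above could be sketched roughly as follows. This is a minimal, standard-library-only illustration, not BentoML's actual code. `resolve_proxy_url` and `build_target_url` are hypothetical helper names, `proxy_host` is the config key proposed in this issue (it does not exist yet), and the plain dict stands in for `service.config`.

```python
from urllib.parse import urlsplit


def resolve_proxy_url(config: dict) -> str:
    """Build the proxy base URL from a service-config-like dict (hypothetical helper)."""
    http_config = config.get("http", {})
    proxy_host = http_config.get("proxy_host", "localhost")  # proposed key, assumed
    proxy_port = http_config.get("proxy_port", 8000)
    return f"http://{proxy_host}:{proxy_port}"


def build_target_url(proxy_url: str, path: str) -> str:
    """Join base URL and request path into an absolute URL (hypothetical helper)."""
    # Strip slashes on both sides so exactly one separates host and path.
    return f"{proxy_url.rstrip('/')}/{path.lstrip('/')}"


# Default config: falls back to localhost:8000.
local_url = build_target_url(resolve_proxy_url({}), "predict")

# Remote server: host and port come from the (proposed) config keys.
remote_url = build_target_url(
    resolve_proxy_url({"http": {"proxy_host": "192.168.1.50", "proxy_port": 8000}}),
    "/predict",
)

# Both results carry a scheme and host, so they do not depend on base_url.
assert urlsplit(local_url).netloc == "localhost:8000"
assert urlsplit(remote_url).netloc == "192.168.1.50:8000"
```

Stripping the leading slash from `path` is a small addition to the snippet in step 2. It avoids a double slash if a caller passes `/predict` rather than `predict`.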
Files Affected
src/_bentoml_impl/server/proxy.py
Notes
- When proxy_host is a remote host, should_start_process must be False, because BentoML cannot spawn a process on a remote machine.
- This allows users to bring pre-configured clients (auth, SSL, retries) while BentoML handles URL construction.
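One way to enforce the first note is to derive the flag from the configured host. The following is a minimal sketch under stated assumptions: `is_local_host` and `compute_should_start_process` are hypothetical helpers, and how `should_start_process` is actually set in BentoML is not shown in this issue. The check recognises only loopback names and addresses, so a machine's own LAN address would be treated as remote.

```python
import ipaddress


def is_local_host(host: str) -> bool:
    """Return True if host refers to the loopback interface (hypothetical helper)."""
    if host.lower() == "localhost":
        return True
    try:
        # Covers 127.0.0.0/8 and ::1.
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (e.g. a DNS name): treat it as remote.
        return False


def compute_should_start_process(proxy_host: str) -> bool:
    """Only spawn the user's server process when it would run on this machine."""
    return is_local_host(proxy_host)
```

With this rule, the default `localhost` keeps today's behaviour, while an address such as `192.168.1.50` disables process startup.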