
High latency & blocking behavior in RequestExecutor (connection pooling + time.sleep) #89

@eduards-vavere

Description

I am currently using okta-jwt-verifier in a FastAPI (asyncio) environment. We have observed a consistent latency baseline of ~250ms per token verification request, along with occasional event-loop starvation and timeouts under load.

Upon investigating the source code, I found three design patterns in RequestExecutor that seem to cause these performance issues in async environments.

1. Lack of Connection Pooling (Latency)

In request_executor.py, the fire_request method initializes the aiohttp session inside a context manager for every single request.

https://github.com/okta/okta-jwt-verifier-python/blob/master/src/okta_jwt_verifier/request_executor.py#L29

# src/okta_jwt_verifier/request_executor.py
async def fire_request(self, uri, **params):
    # New session created for every call -> No Keep-Alive
    async with AsyncCacheControl(cache=self.cache) as cached_sess:
        async with cached_sess.get(uri, **params) as resp:
            # ...

This forces a full TCP/TLS handshake (~200ms+) for every verification attempt, effectively disabling HTTP Keep-Alive.
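The cost difference is easy to demonstrate with a self-contained sketch. The class names below (StubSession, PerRequestExecutor, PooledExecutor) are illustrative inventions, and StubSession stands in for aiohttp.ClientSession so the example runs without aiohttp; the point is only that the per-request pattern pays the "handshake" on every call while a persisted session pays it once:

```python
import asyncio

class StubSession:
    """Stand-in for aiohttp.ClientSession: constructing one is where
    the TCP/TLS handshake cost would be paid."""
    handshakes = 0

    def __init__(self):
        StubSession.handshakes += 1  # simulated connection setup

    async def get(self, uri):
        return f"response for {uri}"

class PerRequestExecutor:
    # Current pattern: a fresh session (and handshake) on every call
    async def fire_request(self, uri):
        sess = StubSession()
        return await sess.get(uri)

class PooledExecutor:
    # Proposed pattern: create the session lazily once, reuse it afterwards
    def __init__(self):
        self._session = None

    async def fire_request(self, uri):
        if self._session is None:
            self._session = StubSession()
        return await self._session.get(uri)

async def main():
    StubSession.handshakes = 0
    per_request = PerRequestExecutor()
    for _ in range(10):
        await per_request.fire_request("/v1/keys")
    per_request_handshakes = StubSession.handshakes  # one per call

    StubSession.handshakes = 0
    pooled = PooledExecutor()
    for _ in range(10):
        await pooled.fire_request("/v1/keys")
    pooled_handshakes = StubSession.handshakes  # one total
    return per_request_handshakes, pooled_handshakes

per_request_handshakes, pooled_handshakes = asyncio.run(main())
print(per_request_handshakes, pooled_handshakes)  # 10 1
```

With a real aiohttp.ClientSession the pooled variant additionally needs a close/cleanup hook (e.g. an async context manager or an explicit close() on shutdown), which is part of why allowing an externally injected session is attractive.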

2. Strict Adherence to no-cache Headers (Latency)

The library uses AsyncCacheControl to respect upstream HTTP cache headers. We noticed that Okta's JWKS (keys) endpoint often returns headers such as Cache-Control: no-cache or must-revalidate.

Because the library strictly follows these headers, it ignores the local cache and triggers a network call (re-validation) for every verification request. Combined with issue #1 (no connection pooling), this results in significant latency per request.
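The opt-in TTL behavior proposed below (suggestion 3) could be as simple as a time-bounded cache keyed by URI that deliberately ignores upstream Cache-Control headers. This is a hypothetical sketch, not part of okta-jwt-verifier today, and TTLKeyCache is an invented name:

```python
import time

class TTLKeyCache:
    """Opt-in TTL cache for JWKS responses that deliberately ignores
    upstream Cache-Control headers (hypothetical sketch)."""

    def __init__(self, ttl_seconds=3600.0):
        self.ttl = ttl_seconds
        self._store = {}  # uri -> (expires_at, payload)

    def get(self, uri):
        entry = self._store.get(uri)
        if entry is None:
            return None
        expires_at, payload = entry
        if time.monotonic() >= expires_at:
            del self._store[uri]  # expired -> caller must refetch
            return None
        return payload

    def put(self, uri, payload):
        self._store[uri] = (time.monotonic() + self.ttl, payload)

# Short TTL just to demonstrate expiry in the example
cache = TTLKeyCache(ttl_seconds=0.05)
cache.put("https://example.okta.com/oauth2/v1/keys", {"keys": []})
fresh = cache.get("https://example.okta.com/oauth2/v1/keys")    # cache hit
time.sleep(0.06)
expired = cache.get("https://example.okta.com/oauth2/v1/keys")  # miss after TTL
```

The trade-off is explicit: callers accept eventual consistency (a rotated key is picked up only after the TTL lapses), which is why this should be opt-in rather than the default.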

3. Blocking time.sleep in Retry Logic (Starvation)

I also noticed that the get method handles rate-limiting wait times using time.sleep(0.1) inside an async method:

https://github.com/okta/okta-jwt-verifier-python/blob/master/src/okta_jwt_verifier/request_executor.py#L46

# src/okta_jwt_verifier/request_executor.py
while self.requests_count >= self.max_requests:
    time.sleep(0.1) # <--- Synchronous Sleep

In an asyncio environment (such as FastAPI or aiohttp), time.sleep blocks the entire event loop, freezing the server for all concurrent requests, not just the current task.
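The effect is straightforward to measure. In the self-contained sketch below, five concurrent tasks each wait 0.1s: with time.sleep they serialize (the loop is frozen, total ~0.5s), while with asyncio.sleep they overlap (total ~0.1s):

```python
import asyncio
import time

async def worker(delay_fn):
    # delay_fn simulates the rate-limit wait in the retry loop
    await delay_fn()

async def main():
    # Blocking variant: time.sleep freezes the event loop,
    # so the five "concurrent" tasks actually run one after another
    async def blocking():
        time.sleep(0.1)
    start = time.monotonic()
    await asyncio.gather(*(worker(blocking) for _ in range(5)))
    blocking_elapsed = time.monotonic() - start

    # Cooperative variant: asyncio.sleep yields control back to the loop,
    # so all five tasks wait in parallel
    async def cooperative():
        await asyncio.sleep(0.1)
    start = time.monotonic()
    await asyncio.gather(*(worker(cooperative) for _ in range(5)))
    async_elapsed = time.monotonic() - start
    return blocking_elapsed, async_elapsed

blocking_elapsed, async_elapsed = asyncio.run(main())
print(f"time.sleep:    {blocking_elapsed:.2f}s")  # ~0.5s (serialized)
print(f"asyncio.sleep: {async_elapsed:.2f}s")     # ~0.1s (concurrent)
```

The fix in the retry loop is a one-line change to `await asyncio.sleep(0.1)` (which also requires the enclosing method to remain a coroutine, as it already is).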

Suggestions / Questions

  1. Connection Pooling: Would it be possible to refactor RequestExecutor to persist self.session across calls, or allow injecting an external aiohttp.ClientSession?
  2. Async Sleep: Should the retry logic use await asyncio.sleep(0.1) instead of time.sleep(0.1) to avoid blocking the main loop?
  3. TTL Option: Would you be open to adding a cache_ttl option to allow opting-in to "Eventual Consistency" (e.g. caching keys for 1 hour) regardless of upstream no-cache headers?

Environment:

  • Python 3.11
  • FastAPI / Uvicorn
  • okta-jwt-verifier version: 0.3.0
