Merged

46 commits
051a85b
test: add `__init__` to make `tests/` a package
teocns Jan 10, 2025
2d97251
test: add llm_event_spy fixture for tests
teocns Jan 10, 2025
4a19dab
test: add VCR.py fixture for HTTP interaction recording
teocns Jan 10, 2025
51e2da2
deps: group integration-testing
teocns Jan 10, 2025
0180e2c
test: add fixture to mock package availability in tests
teocns Jan 10, 2025
73a0110
test: Add integration tests for OpenAI provider and features
teocns Jan 10, 2025
538bf98
test: add tests for concurrent API requests handling
teocns Jan 10, 2025
d679b93
Improve vcr.py configuration
teocns Jan 10, 2025
93744ce
ruff
teocns Jan 10, 2025
512c95d
chore(pyproject): update pytest options and loop scope
teocns Jan 10, 2025
e29d2b2
chore(tests): update vcr.py ignore_hosts and options
teocns Jan 11, 2025
8f02961
pyproject.toml
teocns Jan 11, 2025
a012a0f
centralize teardown in conftest.py (clear singletons, end all sessions)
teocns Jan 11, 2025
f51850e
change vcr_config scope to session
teocns Jan 11, 2025
e22513b
integration: auto start agentops session
teocns Jan 11, 2025
cb014b2
Move unit tests to dedicated folder (tests/unit)
teocns Jan 11, 2025
2c3b19d
Isolate vcr_config import into tests/integration
teocns Jan 12, 2025
6dbe54b
configure pytest to run only unit tests by default, and include integ…
teocns Jan 12, 2025
fb2be21
ci(python-tests): separate job between unit-integration tests
teocns Jan 12, 2025
caa08df
set python-tests timeout to 5 minutes
teocns Jan 12, 2025
120c455
ruff
teocns Jan 12, 2025
37edbc0
Implement jwt fixture, centralized reusable mock_req into conftest.py
teocns Jan 12, 2025
6be254a
ci(python-tests): simplify env management, remove cov from integratio…
teocns Jan 12, 2025
4d0d5a2
ruff
teocns Jan 12, 2025
2a860c8
fix: cassette for test_concurrent_api_requests
teocns Jan 12, 2025
e65f646
Cleanup vcr.py comments
teocns Jan 13, 2025
fa96e79
add a `TODO` for removing `vcrpy` git version after its release
dot-agi Jan 13, 2025
d83210b
refactor openai assistants response handling for easier testing
dot-agi Jan 13, 2025
751873b
add more keys for different llm providers
dot-agi Jan 13, 2025
558848a
add integration tests for other providers
dot-agi Jan 13, 2025
fa7325b
remove openai version limitation
dot-agi Jan 13, 2025
64ce1d0
add providers as deps
dot-agi Jan 13, 2025
9af78b9
chore: add mistralai to test dependencies
teocns Jan 13, 2025
3af0cd6
remove `mistral` from dependency since its incorrect
dot-agi Jan 14, 2025
6b94f37
ruff
dot-agi Jan 14, 2025
80c6a07
re-record cassettes
dot-agi Jan 14, 2025
bcac9b8
tests/fixtures/providers: fallback to `test-api-key` if no provider i…
teocns Jan 14, 2025
3eb9bc9
set keys for `litellm`
dot-agi Jan 14, 2025
7a6ac5a
Improve tests/integration/test_llm_providers.py openai assistants
teocns Jan 14, 2025
6ea858c
Make integration tests appropriately skip, regenerate x1 cassette
teocns Jan 14, 2025
8f1a958
explicit tests/integration/conftest finxtures import
teocns Jan 14, 2025
ce740c1
deps: improve dev packages versionings
teocns Jan 15, 2025
9b21b5e
Make integration tests run with python 3.12
teocns Jan 15, 2025
82e2105
add uv.lock
teocns Jan 15, 2025
98c325c
test concurrent api requests: remove matcher on method, possibly caus…
teocns Jan 15, 2025
dd3c402
Run static-analysis with python 3.12.2
teocns Jan 15, 2025
1 change: 1 addition & 0 deletions .gitattributes
@@ -0,0 +1 @@
uv.lock binary
57 changes: 48 additions & 9 deletions .github/workflows/python-tests.yaml
@@ -1,6 +1,19 @@
# :: Use nektos/act to run this locally
# :: Example:
# :: `act push -j python-tests --matrix python-version:3.10 --container-architecture linux/amd64`
# :: `act push -j unit-tests --matrix python-version:3.10 --container-architecture linux/amd64`
#
# This workflow runs two separate test suites:
# 1. Unit Tests (python-tests job):
# - Runs across Python 3.9 to 3.13
# - Located in tests/unit directory
# - Coverage report uploaded to Codecov for Python 3.11 only
#
# 2. Integration Tests (integration-tests job):
# - Runs only on Python 3.13
# - Located in tests/integration directory
# - Longer timeout (15 min vs 10 min for unit tests)
# - Separate cache for dependencies

name: Python Tests
on:
workflow_dispatch: {}
@@ -23,10 +36,12 @@ on:
- 'tests/**/*.ipynb'

jobs:
python-tests:
unit-tests:
runs-on: ubuntu-latest
env:
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
AGENTOPS_API_KEY: ${{ secrets.AGENTOPS_API_KEY }}
PYTHONUNBUFFERED: "1"

strategy:
matrix:
@@ -49,14 +64,10 @@ jobs:
run: |
uv sync --group test --group dev

- name: Run tests with coverage
timeout-minutes: 10
- name: Run unit tests with coverage
timeout-minutes: 5
run: |
uv run -m pytest tests/ -v --cov=agentops --cov-report=xml
env:
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
AGENTOPS_API_KEY: ${{ secrets.AGENTOPS_API_KEY }}
PYTHONUNBUFFERED: "1"
uv run -m pytest tests/unit -v --cov=agentops --cov-report=xml

# Only upload coverage report for python3.11
- name: Upload coverage to Codecov
@@ -68,3 +79,31 @@ jobs:
flags: unittests
name: codecov-umbrella
fail_ci_if_error: true # Should we?

  integration-tests:
    runs-on: ubuntu-latest
    env:
      OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
      AGENTOPS_API_KEY: ${{ secrets.AGENTOPS_API_KEY }}
      PYTHONUNBUFFERED: "1"

    steps:
      - uses: actions/checkout@v4

      - name: Setup UV
        uses: astral-sh/setup-uv@v5
        continue-on-error: true
        with:
          python-version: "3.12"
          enable-cache: true
          cache-suffix: uv-3.12-integration
          cache-dependency-glob: "**/pyproject.toml"

      - name: Install dependencies
        run: |
          uv sync --group test --group dev

      - name: Run integration tests
        timeout-minutes: 5
        run: |
          uv run pytest tests/integration
2 changes: 1 addition & 1 deletion .github/workflows/static-analysis.yaml
@@ -40,7 +40,7 @@ jobs:
with:
enable-cache: true
cache-dependency-glob: "**/pyproject.toml"
python-version: "3.11.10"
python-version: "3.12.2"

- name: Install packages
run: |
127 changes: 64 additions & 63 deletions agentops/llms/providers/openai.py
@@ -136,6 +136,69 @@

return response

    def handle_assistant_response(self, response, kwargs, init_timestamp, session: Optional[Session] = None) -> dict:
        """Handle response based on return type"""
        from openai.pagination import BasePage

        action_event = ActionEvent(init_timestamp=init_timestamp, params=kwargs)
        if session is not None:
            action_event.session_id = session.session_id

        try:
            # Set action type and returns
            action_event.action_type = (
                response.__class__.__name__.split("[")[1][:-1]
                if isinstance(response, BasePage)
                else response.__class__.__name__
            )
            action_event.returns = response.model_dump() if hasattr(response, "model_dump") else response
            action_event.end_timestamp = get_ISO_time()
            self._safe_record(session, action_event)

            # Create LLMEvent if usage data exists
            response_dict = response.model_dump() if hasattr(response, "model_dump") else {}

            if "id" in response_dict and response_dict.get("id").startswith("run"):
                if response_dict["id"] not in self.assistants_run_steps:
                    self.assistants_run_steps[response_dict.get("id")] = {"model": response_dict.get("model")}

            if "usage" in response_dict and response_dict["usage"] is not None:
                llm_event = LLMEvent(init_timestamp=init_timestamp, params=kwargs)
                if session is not None:
                    llm_event.session_id = session.session_id

                llm_event.model = response_dict.get("model")
                llm_event.prompt_tokens = response_dict["usage"]["prompt_tokens"]
                llm_event.completion_tokens = response_dict["usage"]["completion_tokens"]
                llm_event.end_timestamp = get_ISO_time()
                self._safe_record(session, llm_event)

            elif "data" in response_dict:
                for item in response_dict["data"]:
                    if "usage" in item and item["usage"] is not None:
                        llm_event = LLMEvent(init_timestamp=init_timestamp, params=kwargs)
                        if session is not None:
                            llm_event.session_id = session.session_id

                        llm_event.model = self.assistants_run_steps[item["run_id"]]["model"]
                        llm_event.prompt_tokens = item["usage"]["prompt_tokens"]
                        llm_event.completion_tokens = item["usage"]["completion_tokens"]
                        llm_event.end_timestamp = get_ISO_time()
                        self._safe_record(session, llm_event)

        except Exception as e:
            self._safe_record(session, ErrorEvent(trigger_event=action_event, exception=e))

            kwargs_str = pprint.pformat(kwargs)
            response = pprint.pformat(response)
            logger.warning(
                f"Unable to parse response for Assistants API. Skipping upload to AgentOps\n"
                f"response:\n {response}\n"
                f"kwargs:\n {kwargs_str}\n"
            )

        return response

    def override(self):
        self._override_openai_v1_completion()
        self._override_openai_v1_async_completion()
@@ -234,68 +297,6 @@
"""Override OpenAI Assistants API methods"""
from openai._legacy_response import LegacyAPIResponse
from openai.resources import beta
from openai.pagination import BasePage

def handle_response(response, kwargs, init_timestamp, session: Optional[Session] = None) -> dict:
"""Handle response based on return type"""
action_event = ActionEvent(init_timestamp=init_timestamp, params=kwargs)
if session is not None:
action_event.session_id = session.session_id

try:
# Set action type and returns
action_event.action_type = (
response.__class__.__name__.split("[")[1][:-1]
if isinstance(response, BasePage)
else response.__class__.__name__
)
action_event.returns = response.model_dump() if hasattr(response, "model_dump") else response
action_event.end_timestamp = get_ISO_time()
self._safe_record(session, action_event)

# Create LLMEvent if usage data exists
response_dict = response.model_dump() if hasattr(response, "model_dump") else {}

if "id" in response_dict and response_dict.get("id").startswith("run"):
if response_dict["id"] not in self.assistants_run_steps:
self.assistants_run_steps[response_dict.get("id")] = {"model": response_dict.get("model")}

if "usage" in response_dict and response_dict["usage"] is not None:
llm_event = LLMEvent(init_timestamp=init_timestamp, params=kwargs)
if session is not None:
llm_event.session_id = session.session_id

llm_event.model = response_dict.get("model")
llm_event.prompt_tokens = response_dict["usage"]["prompt_tokens"]
llm_event.completion_tokens = response_dict["usage"]["completion_tokens"]
llm_event.end_timestamp = get_ISO_time()
self._safe_record(session, llm_event)

elif "data" in response_dict:
for item in response_dict["data"]:
if "usage" in item and item["usage"] is not None:
llm_event = LLMEvent(init_timestamp=init_timestamp, params=kwargs)
if session is not None:
llm_event.session_id = session.session_id

llm_event.model = self.assistants_run_steps[item["run_id"]]["model"]
llm_event.prompt_tokens = item["usage"]["prompt_tokens"]
llm_event.completion_tokens = item["usage"]["completion_tokens"]
llm_event.end_timestamp = get_ISO_time()
self._safe_record(session, llm_event)

except Exception as e:
self._safe_record(session, ErrorEvent(trigger_event=action_event, exception=e))

kwargs_str = pprint.pformat(kwargs)
response = pprint.pformat(response)
logger.warning(
f"Unable to parse response for Assistants API. Skipping upload to AgentOps\n"
f"response:\n {response}\n"
f"kwargs:\n {kwargs_str}\n"
)

return response

def create_patched_function(original_func):
def patched_function(*args, **kwargs):
@@ -309,7 +310,7 @@
                if isinstance(response, LegacyAPIResponse):
                    return response

                return handle_response(response, kwargs, init_timestamp, session=session)
                return self.handle_assistant_response(response, kwargs, init_timestamp, session=session)

            return patched_function

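A note on the `action_type` parsing in `handle_assistant_response`: for paginated Assistants results the provider keeps only the element type from the page's class name. A minimal sketch of that string handling, assuming the page classes expose parametrized names such as `SyncCursorPage[Run]` (as openai's pydantic-based pagination models do):

# Illustration only: mirrors the split("[")[1][:-1] expression used above;
# the class-name strings are assumed examples, not taken from the diff.
def action_type_from(class_name: str, is_page: bool) -> str:
    # Paginated responses report the element type inside the brackets;
    # plain objects keep their own class name.
    return class_name.split("[")[1][:-1] if is_page else class_name


assert action_type_from("SyncCursorPage[Run]", is_page=True) == "Run"
assert action_type_from("Run", is_page=False) == "Run"
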
57 changes: 36 additions & 21 deletions pyproject.toml
@@ -41,30 +41,47 @@ dependencies = [

[dependency-groups]
test = [
"openai>=1.0.0,<2.0.0",
"langchain",
"openai>=1.0.0",
"anthropic",
"cohere",
"litellm",
"ai21>=3.0.0",
"groq",
"ollama",
"mistralai",
# ;;
# The below is a really hard dependency, that can be installed only between python >=3.10,<3.13.
# CI will fail because all tests will automatically pull this dependency group;
# we need a separate group specifically for integration tests which will run on pinned 3.1x
# ------------------------------------------------------------------------------------------------------------------------------------
# "crewai-tools @ git+https://github.com/crewAIInc/crewAI-tools.git@a14091abb24527c97ccfcc8539d529c8b4559a0f; python_version>='3.10'",
# ------------------------------------------------------------------------------------------------------------------------------------
# ;;
"autogen<0.4.0",
"pytest-cov",
"fastapi[standard]",
]

dev = [
# Testing essentials
"pytest>=7.4.0,<8.0.0", # Testing framework with good async support
"pytest-depends", # For testing complex agent workflows
"pytest-asyncio", # Async test support for testing concurrent agent operations
"pytest-mock", # Mocking capabilities for isolating agent components
"pyfakefs", # File system testing
"pytest-recording", # Alternative to pytest-vcr with better Python 3.x support
"vcrpy @ git+https://github.com/kevin1024/vcrpy.git@81978659f1b18bbb7040ceb324a19114e4a4f328",
"pytest>=8.0.0", # Testing framework with good async support
"pytest-depends", # For testing complex agent workflows
"pytest-asyncio", # Async test support for testing concurrent agent operations
"pytest-mock", # Mocking capabilities for isolating agent components
"pyfakefs", # File system testing
"pytest-recording", # Alternative to pytest-vcr with better Python 3.x support
# TODO: Use release version after vcrpy is released with this fix.
"vcrpy @ git+https://github.com/kevin1024/vcrpy.git@5f1b20c4ca4a18c1fc8cfe049d7df12ca0659c9b",
# Code quality and type checking
"ruff", # Fast Python linter for maintaining code quality
"mypy", # Static type checking for better reliability
"types-requests", # Type stubs for requests library

"ruff", # Fast Python linter for maintaining code quality
"mypy", # Static type checking for better reliability
"types-requests", # Type stubs for requests library
# HTTP mocking and environment
"requests_mock>=1.11.0", # Mock HTTP requests for testing agent external communications
"python-dotenv", # Environment management for secure testing

"python-dotenv", # Environment management for secure testing
# Agent integration testing
"pytest-sugar>=1.0.0",
"pdbpp>=0.10.3",
]

# CI dependencies
@@ -89,19 +106,17 @@ constraint-dependencies = [
# For Python ≥3.10 (where autogen-core might be present), use newer versions
"opentelemetry-api>=1.27.0; python_version>='3.10'",
"opentelemetry-sdk>=1.27.0; python_version>='3.10'",
"opentelemetry-exporter-otlp-proto-http>=1.27.0; python_version>='3.10'"
"opentelemetry-exporter-otlp-proto-http>=1.27.0; python_version>='3.10'",
]

[tool.autopep8]
max_line_length = 120

[tool.pytest.ini_options]
asyncio_mode = "auto"
asyncio_default_fixture_loop_scope = "function" # WARNING: Changing this may break tests. A `module`-scoped session might be faster, but also unstable.
test_paths = [
"tests",
]
addopts = "--tb=short -p no:warnings"
asyncio_default_fixture_loop_scope = "module" # WARNING: Changing this may break tests. A `module`-scoped session might be faster, but also unstable.
testpaths = ["tests/unit"] # Default to unit tests
addopts = "--tb=short -p no:warnings --import-mode=importlib --ignore=tests/integration" # Ignore integration by default
pythonpath = ["."]
faulthandler_timeout = 30 # Reduced from 60
timeout = 60 # Reduced from 300
Empty file added tests/__init__.py
Empty file.
32 changes: 32 additions & 0 deletions tests/fixtures/event.py
@@ -0,0 +1,32 @@
from collections import defaultdict
from typing import TYPE_CHECKING

import pytest

if TYPE_CHECKING:
    from pytest_mock import MockerFixture


@pytest.fixture(scope="function")
def llm_event_spy(agentops_client, mocker: "MockerFixture") -> dict[str, "MockerFixture"]:
    """
    Fixture that provides spies on both providers' response handling

    These fixtures are reset on each test run (function scope). To use it,
    simply pass it as an argument to the test function. Example:

    ```
    def test_my_test(llm_event_spy):
        # test code here
        llm_event_spy["litellm"].assert_called_once()
    ```
    """
    from agentops.llms.providers.anthropic import AnthropicProvider
    from agentops.llms.providers.litellm import LiteLLMProvider
    from agentops.llms.providers.openai import OpenAiProvider

    return {
        "litellm": mocker.spy(LiteLLMProvider(agentops_client), "handle_response"),
        "openai": mocker.spy(OpenAiProvider(agentops_client), "handle_response"),
        "anthropic": mocker.spy(AnthropicProvider(agentops_client), "handle_response"),
    }
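
As a rough usage sketch (not from the diff), a recorded integration test could combine this spy with the VCR cassette fixture; `@pytest.mark.vcr` comes from pytest-recording, while the test name, model choice, and prompt are assumptions:

import pytest
from openai import OpenAI


@pytest.mark.vcr()  # replay the recorded HTTP interaction instead of calling the live API
def test_openai_completion_is_instrumented(llm_event_spy):
    client = OpenAI()
    client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model; any chat-capable model would do
        messages=[{"role": "user", "content": "Say hello"}],
    )
    # The OpenAI provider's handle_response spy should have fired for the completion.
    llm_event_spy["openai"].assert_called_once()
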
26 changes: 26 additions & 0 deletions tests/fixtures/packaging.py
@@ -0,0 +1,26 @@
import builtins
import pytest


@pytest.fixture
def hide_available_pkg(monkeypatch):
    """
    Hide the availability of a package by mocking the __import__ function.

    Usage:
        @pytest.mark.usefixtures('hide_available_pkg')
        def test_message():
            with pytest.raises(ImportError, match='Install "pkg" to use test_function'):
                foo('test_function')

    Source:
        https://stackoverflow.com/questions/60227582/making-a-python-test-think-an-installed-package-is-not-available
    """
    import_orig = builtins.__import__

    def mocked_import(name, *args, **kwargs):
        if name == "pkg":
            raise ImportError()
        return import_orig(name, *args, **kwargs)

    monkeypatch.setattr(builtins, "__import__", mocked_import)
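
For instance (a hypothetical test, since `foo` in the docstring is only a placeholder), the fixture can be used to assert that importing the hidden package name fails:

import pytest


@pytest.mark.usefixtures("hide_available_pkg")
def test_import_of_hidden_package_fails():
    with pytest.raises(ImportError):
        import pkg  # noqa: F401 - the patched __import__ raises for the name "pkg"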