Skip to content

Conversation

@RobGeada
Copy link
Contributor

@RobGeada RobGeada commented Oct 24, 2025

This PR:

  1. Fixes the janky path differences between the built image and the local library for the HF detector- this means that the sys.path.insert(0, os.path.abspath("..")) is no longer needed in the HF detector code.
  2. Creates the InstrumentedDetector class, which can be inherited by other detectors to enable easy metric reporting
  3. Refactors all built-in detectors onto the InstrumentedDetector
  4. HFDetector now inherits the InstrumenteDetector and reports the same metrics as the built-in detectors

Summary by Sourcery

Modularize Prometheus metrics by introducing an InstrumentedDetector class, refactor built-in and HF detectors to use it, remove path hack in HF code, update deployment config, and add tests to validate metrics.

New Features:

  • Add InstrumentedDetector base class for standardized Prometheus metrics instrumentation
  • Enable HFDetector to inherit instrumentation and report detection, request, error, and runtime metrics

Bug Fixes:

  • Fix path discrepancies between built image and local library for HF detector, removing the need for manual sys.path insertion

Enhancements:

  • Refactor built-in detectors to extend InstrumentedDetector and consolidate metric logic
  • Remove redundant instrumentation code in built-in registry and streamline metric gauge descriptions

Deployment:

  • Update Kubernetes serving runtime port to 8000 and inject DETECTOR_NAME from pod metadata

Tests:

  • Add end-to-end tests for Prometheus metrics counts and runtime measurements for HF detector

Summary by CodeRabbit

  • New Features

    • Centralized detector instrumentation with per-detector labels, runtime measurement, and automatic attachment during detector lifecycle.
    • Detectors now expose identifiable names (DETECTOR_NAME) for metrics and registration.
  • Bug Fixes

    • Removed stray test debug output.
  • Chores

    • Adjusted runtime/runtime metrics wording and Prometheus port; reduced worker count in runtime startup.
    • Removed legacy load-testing script and added a dev dependency.
  • Tests

    • Added end-to-end metrics tests validating Prometheus counters and runtime accumulation.

@sourcery-ai
Copy link

sourcery-ai bot commented Oct 24, 2025

Reviewer's Guide

This PR extracts Prometheus instrumentation into a reusable base class, refactors built-in and HF detectors to inherit and leverage it for automatic metrics reporting, updates FastAPI app setup and deployment configs to wire in the new instrumentation, and adds end-to-end tests validating request, detection, error, and runtime counters.

Entity relationship diagram for Prometheus metrics counters

erDiagram
    INSTRUMENTS {
        detections int
        requests int
        errors int
        runtime float
    }
    DETECTOR {
        registry_name string
        function_name string
    }
    INSTRUMENTS ||--o| DETECTOR : "labels(detector_kind, detector_name)"
Loading

Class diagram for InstrumentedDetector and its integration

classDiagram
    class InstrumentedDetector {
        +registry_name: str
        +instruments: dict
        +__init__(registry_name: str = "default")
        +add_instruments(gauges)
        +increment_detector_instruments(function_name: str, is_detection: bool)
        +increment_error_instruments(function_name: str)
        +instrument_runtime(function_name: str)
    }
    class BaseDetectorRegistry {
        +__init__(registry_name)
        +handle_request(...)
        +get_registry()
        +throw_internal_detector_error(...)
        +get_detection_functions_from_params(params: dict)
    }
    class Detector {
        +risk_names: list
        +__init__()
        +initialize_model(model_files_path)
        +initialize_device()
        +process_sequence_classification(...)
        +run(input)
        +close()
    }
    InstrumentedDetector <|-- BaseDetectorRegistry
    InstrumentedDetector <|-- Detector
    BaseDetectorRegistry <|-- ABC
Loading

File-Level Changes

Change Details Files
Extracted Prometheus instrumentation into a reusable InstrumentedDetector base class
  • Created detectors/common/instrumented_detector.py with context managers and counter methods
  • Removed duplicate instrumentation methods and attributes from BaseDetectorRegistry
  • Updated BaseDetectorRegistry to subclass InstrumentedDetector and call super().init
detectors/common/instrumented_detector.py
detectors/built_in/base_detector_registry.py
Refactored HF Detector to use the new instrumentation abstraction
  • Changed Detector class to inherit InstrumentedDetector and invoke super().init
  • Wrapped run() logic in instrument_runtime context and called increment_detector_instruments
  • Set registry_name and function_name dynamically (model path or DETECTOR_NAME env var)
  • Removed legacy sys.path insertion hack
detectors/huggingface/detector.py
Wired metrics into FastAPI application lifecycle
  • Passed detector.model_name to app.set_detector and hooked in app.state.instruments
  • Updated detector_unary_handler to retrieve detectors list instead of single header ID
  • Refined metric counter descriptions in detectors/common/app.py
detectors/huggingface/app.py
detectors/common/app.py
Updated deployment configuration for Prometheus metrics exposure
  • Changed prometheus.io/port annotation from 8080 to 8000
  • Added DETECTOR_NAME env var sourced from pod metadata
detectors/huggingface/deploy/servingruntime.yaml
Added end-to-end tests validating Prometheus metrics
  • Created tests/detectors/huggingface/test_metrics.py with TestClient fixtures
  • Mocked model outputs to control detection and latency behavior
  • Asserted accuracy of request, detection, error, and runtime counters
tests/detectors/huggingface/test_metrics.py

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@coderabbitai
Copy link

coderabbitai bot commented Oct 24, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Refactors instrumentation into a new InstrumentedDetector, moves instrumentation logic out of BaseDetectorRegistry, updates HuggingFace Detector and app to register/attach instruments and use the first registered detector, adjusts Dockerfile copy targets and Prometheus port/env, adds metrics tests, and removes a Locust load-test.

Changes

Cohort / File(s) Summary
Instrumentation core
detectors/common/instrumented_detector.py, detectors/built_in/base_detector_registry.py
Add InstrumentedDetector with instrumentation API (add_instruments, instrument_runtime, increment_*). Refactor BaseDetectorRegistry to inherit from InstrumentedDetector and remove its prior instrumentation state/methods.
HuggingFace detector & app
detectors/huggingface/detector.py, detectors/huggingface/app.py
Detector now extends InstrumentedDetector, sets registry_name and function_name in initialize_model, wraps per-text processing with instrument_runtime, and calls increment_detector_instruments. app.py registers detector during lifespan, attaches instruments, and handler no longer requires a detector_id header (uses first registered detector).
Docker / deployment
detectors/Dockerfile.hf, detectors/huggingface/deploy/servingruntime.yaml
Adjust Docker COPY destinations to /app/detectors/..., create /common and copy log_conf.yaml there; expose same port but update Prometheus annotation from 8080→8000 and add DETECTOR_NAME env var sourced from pod metadata.
Common app metrics
detectors/common/app.py
Remove Gauge import and sys.path/Instrumentator usage; keep Counters and update metric descriptions to reference "detector function" instead of "built-in detector function".
Dev dependencies
detectors/common/requirements-dev.txt
Add protobuf==6.33.0.
Tests
tests/detectors/huggingface/test_metrics.py, tests/detectors/builtIn/test_filetype.py
Add metric-focused tests validating Prometheus-like counters and runtime aggregation; remove a stray debug print from an existing test.
Removed
detectors/huggingface/locustfile.py
Delete Locust-based load-testing script.
No-op / formatting
detectors/huggingface/requirements.txt
No functional change (trailing newline/edit artifact).

Sequence Diagram(s)

sequenceDiagram
    participant Lifespan
    participant App
    participant Handler
    participant Detector
    participant Instrumented
    participant Metrics

    note over Lifespan,App: Startup — instantiate & register detector
    Lifespan->>Detector: instantiate Detector(model_path)
    Detector->>Instrumented: super().__init__(registry_name)
    Lifespan->>App: app.set_detector(detector, name)
    Lifespan->>Detector: detector.add_instruments(app.state.instruments)

    note over Handler,App: Incoming request
    Handler->>App: get_all_detectors()
    App-->>Handler: detectors map
    Handler->>Detector: detectors[0].run(request)

    rect rgb(230,245,255)
    Detector->>Instrumented: instrument_runtime(function_name) (enter)
    Instrumented->>Metrics: record start
    Detector->>Detector: process texts
    Instrumented->>Metrics: record runtime increment
    end

    Detector->>Instrumented: increment_detector_instruments(function_name, is_detection)
    Instrumented->>Metrics: increment requests/detections/errors
    Handler-->>App: return ContentsAnalysisResponse
Loading

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Pay extra attention to:
    • Correct labeling/use of registry_name and function_name across Detector and InstrumentedDetector.
    • Lifespan registration and lifecycle/cleanup in detectors/huggingface/app.py.
    • Metrics initialization and label consistency between InstrumentedDetector and app.state.instruments.
    • Dockerfile path changes and the new /common copy step.
    • Tests in tests/detectors/huggingface/test_metrics.py for flakiness due to timing assumptions.

Poem

🐇 I hopped through code to sew a thread,

Counters ticking where detectors tread.
Names and runtimes snugly kept,
Logs tucked safe where copies slept.
Metrics hum — a tidy bed.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 34.48% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The pull request title "Feat: Modularize Prometheus metrics and add them to the HF runtime" accurately reflects the primary objectives of the changeset. The "modularize Prometheus metrics" component refers to the central refactoring where instrumentation logic is extracted from BaseDetectorRegistry into a new reusable InstrumentedDetector base class, which is then adopted by both built-in and HuggingFace detectors. The "add them to the HF runtime" component directly describes how HFDetector is updated to inherit from InstrumentedDetector and report Prometheus metrics. The title is specific, uses concrete terminology ("Prometheus metrics," "HF runtime"), avoids vague terms, and clearly communicates the main architectural change in a single concise sentence.
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 18e1d2b and 89087c2.

📒 Files selected for processing (1)
  • detectors/Dockerfile.hf (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • detectors/Dockerfile.hf
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Sourcery review
  • GitHub Check: build-and-push-ci

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes and they look great!

Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location> `detectors/common/instrumented_detector.py:9-10` </location>
<code_context>
         raise HTTPException(status_code=500, detail="Detection error, check detector logs")


-    @contextlib.contextmanager
-    def instrument_runtime(self, function_name: str):
-        try:
-            start_time = time.time()
</code_context>

<issue_to_address>
**issue:** Duplicate definition of instrument_runtime context manager.

Please remove the duplicate instrument_runtime method to avoid confusion and simplify maintenance.
</issue_to_address>

### Comment 2
<location> `detectors/huggingface/app.py:46-47` </location>
<code_context>
 ):
-    detector: Detector = app.get_detector()
-    if not detector:
+    detectors: List[Detector] = list(app.get_all_detectors().values())
+    if not len(detectors) and not detectors[0]:
         raise RuntimeError("Detector is not initialized")
-    return ContentsAnalysisResponse(root=detector.run(request))
</code_context>

<issue_to_address>
**issue (bug_risk):** Logical error in detector existence check.

The current condition can cause an IndexError when 'detectors' is empty. Use 'if not detectors or not detectors[0]' to prevent this.
</issue_to_address>

### Comment 3
<location> `tests/detectors/huggingface/test_metrics.py:79` </location>
<code_context>
+
+
+
+    def test_prometheus(self, client: TestClient):
+        for i in range(20):
+            send_request(client=client, detect=i%3==0)
</code_context>

<issue_to_address>
**suggestion (testing):** Consider adding tests for error metric increments.

Please add a test that triggers an error and checks that the error metric is incremented, such as by sending an unsupported model type or malformed input.
</issue_to_address>

### Comment 4
<location> `tests/detectors/huggingface/test_metrics.py:80-81` </location>
<code_context>

</code_context>

<issue_to_address>
**issue (code-quality):** Avoid loops in tests. ([`no-loop-in-tests`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/Python/Default-Rules/no-loop-in-tests))

<details><summary>Explanation</summary>Avoid complex code, like loops, in test functions.

Google's software engineering guidelines says:
"Clear tests are trivially correct upon inspection"
To reach that avoid complex code in tests:
* loops
* conditionals

Some ways to fix this:

* Use parametrized tests to get rid of the loop.
* Move the complex logic into helpers.
* Move the complex part into pytest fixtures.

> Complexity is most often introduced in the form of logic. Logic is defined via the imperative parts of programming languages such as operators, loops, and conditionals. When a piece of code contains logic, you need to do a bit of mental computation to determine its result instead of just reading it off of the screen. It doesn't take much logic to make a test more difficult to reason about.

Software Engineering at Google / [Don't Put Logic in Tests](https://abseil.io/resources/swe-book/html/ch12.html#donapostrophet_put_logic_in_tests)
</details>
</issue_to_address>

### Comment 5
<location> `tests/detectors/huggingface/test_metrics.py:91-93` </location>
<code_context>

</code_context>

<issue_to_address>
**issue (code-quality):** Avoid loops in tests. ([`no-loop-in-tests`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/Python/Default-Rules/no-loop-in-tests))

<details><summary>Explanation</summary>Avoid complex code, like loops, in test functions.

Google's software engineering guidelines says:
"Clear tests are trivially correct upon inspection"
To reach that avoid complex code in tests:
* loops
* conditionals

Some ways to fix this:

* Use parametrized tests to get rid of the loop.
* Move the complex logic into helpers.
* Move the complex part into pytest fixtures.

> Complexity is most often introduced in the form of logic. Logic is defined via the imperative parts of programming languages such as operators, loops, and conditionals. When a piece of code contains logic, you need to do a bit of mental computation to determine its result instead of just reading it off of the screen. It doesn't take much logic to make a test more difficult to reason about.

Software Engineering at Google / [Don't Put Logic in Tests](https://abseil.io/resources/swe-book/html/ch12.html#donapostrophet_put_logic_in_tests)
</details>
</issue_to_address>

### Comment 6
<location> `tests/detectors/huggingface/test_metrics.py:98-99` </location>
<code_context>

</code_context>

<issue_to_address>
**issue (code-quality):** Avoid loops in tests. ([`no-loop-in-tests`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/Python/Default-Rules/no-loop-in-tests))

<details><summary>Explanation</summary>Avoid complex code, like loops, in test functions.

Google's software engineering guidelines says:
"Clear tests are trivially correct upon inspection"
To reach that avoid complex code in tests:
* loops
* conditionals

Some ways to fix this:

* Use parametrized tests to get rid of the loop.
* Move the complex logic into helpers.
* Move the complex part into pytest fixtures.

> Complexity is most often introduced in the form of logic. Logic is defined via the imperative parts of programming languages such as operators, loops, and conditionals. When a piece of code contains logic, you need to do a bit of mental computation to determine its result instead of just reading it off of the screen. It doesn't take much logic to make a test more difficult to reason about.

Software Engineering at Google / [Don't Put Logic in Tests](https://abseil.io/resources/swe-book/html/ch12.html#donapostrophet_put_logic_in_tests)
</details>
</issue_to_address>

### Comment 7
<location> `detectors/common/instrumented_detector.py:37-40` </location>
<code_context>
    def increment_detector_instruments(self, function_name: str, is_detection: bool):
        """Increment the detection and request counters, automatically update rates"""
        if self.instruments.get("requests"):
            self.instruments["requests"].labels(self.registry_name, function_name).inc()

        # The labels() function will initialize the counters if not already created.
        # This prevents the counters not existing until they are first incremented
        # If the counters have already been created, this is just a cheap dict.get() call
        if self.instruments.get("errors"):
            _ = self.instruments["errors"].labels(self.registry_name, function_name)
        if self.instruments.get("runtime"):
            _ = self.instruments["runtime"].labels(self.registry_name, function_name)

        # create and/or increment the detection counter
        if self.instruments.get("detections"):
            detection_counter = self.instruments["detections"].labels(self.registry_name, function_name)
            if is_detection:
                detection_counter.inc()

</code_context>

<issue_to_address>
**suggestion (code-quality):** We've found these issues:

- Swap positions of nested conditionals ([`swap-nested-ifs`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/swap-nested-ifs/))
- Move assignments closer to their usage ([`move-assign`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/move-assign/))

```suggestion
        if is_detection:
            if self.instruments.get("detections"):
                detection_counter = self.instruments["detections"].labels(self.registry_name, function_name)
```
</issue_to_address>

### Comment 8
<location> `detectors/huggingface/app.py:47-49` </location>
<code_context>
@app.post(
    "/api/v1/text/contents",
    response_model=ContentsAnalysisResponse,
    description="""Detectors that work on content text, be it user prompt or generated text. \
                    Generally classification type detectors qualify for this. <br>""",
    responses={
        404: {"model": Error, "description": "Resource Not Found"},
        422: {"model": Error, "description": "Validation Error"},
    },
)
async def detector_unary_handler(
    request: ContentAnalysisHttpRequest,
):
    detectors: List[Detector] = list(app.get_all_detectors().values())
    if not len(detectors) and not detectors[0]:
        raise RuntimeError("Detector is not initialized")
    return ContentsAnalysisResponse(root=detectors[0].run(request))

</code_context>

<issue_to_address>
**issue (code-quality):** We've found these issues:

- Lift code into else after jump in control flow ([`reintroduce-else`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/reintroduce-else/))
- Swap if/else branches ([`swap-if-else-branches`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/swap-if-else-branches/))
- Split conditional into multiple branches ([`split-or-ifs`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/split-or-ifs/))
- Merge duplicate blocks in conditional ([`merge-duplicate-blocks`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/merge-duplicate-blocks/))
- Remove redundant conditional ([`remove-redundant-if`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/remove-redundant-if/))
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
detectors/common/app.py (1)

157-159: Fix logging format bug.

logger.info("config:", var) triggers formatting errors. Use a placeholder or f-string.

-    logger.info("config:", os.getenv("CONFIG_FILE_PATH"))
+    logger.info("config: %s", os.getenv("CONFIG_FILE_PATH"))
detectors/built_in/base_detector_registry.py (1)

22-29: Improve error path: count, log with stack, raise proper HTTP 500.

Use exc_info (or logger.exception) for trace, and raise FastAPI’s HTTPException.

     def throw_internal_detector_error(self, function_name: str, logger: logging.Logger, exception: Exception, increment_requests: bool):
         """consistent handling of internal errors within a detection function"""
         if increment_requests and self.instruments.get("requests"):
             self.instruments["requests"].labels(self.registry_name, function_name).inc()
         self.increment_error_instruments(function_name)
-        logger.error(exception)
-        raise HTTPException(status_code=500, detail="Detection error, check detector logs")
+        logger.exception("Detection error in %s.%s", self.registry_name, function_name)
+        raise HTTPException(status_code=500, detail="Detection error, check detector logs")
detectors/huggingface/detector.py (1)

85-93: Generalize CausalLM detection beyond Granite.

The current code hardcodes detection to only GraniteForCausalLM while using the generic AutoModelForCausalLM loader, which supports any model with a *ForCausalLM architecture. This is inconsistent with the parallel checks for ForTokenClassification (line 80) and ForSequenceClassification (line 88), which use generic patterns. Changing the check to ForCausalLM aligns the detector with Hugging Face conventions and enables support for Llama, Mistral, and other CausalLM variants without additional code changes.

-        elif any("GraniteForCausalLM" in arch for arch in config.architectures):
+        elif any("ForCausalLM" in arch for arch in config.architectures):
             self.is_causal_lm = True
             self.model = AutoModelForCausalLM.from_pretrained(model_files_path)
detectors/huggingface/app.py (1)

16-26: Detector cleanup never runs: wrong retrieval key and variable shadowing.

You store the detector under detector.model_name but later call app.get_detector() with the default key, so close() is skipped.

Simplest: close all registered detectors and avoid shadowing the earlier detector variable.

-    # Clean up the ML models and release the resources
-    detector: Detector = app.get_detector()
-    if detector and hasattr(detector, 'close'):
-        detector.close()
-    app.cleanup_detector()
+    # Clean up the ML models and release the resources
+    for d in app.get_all_detectors().values():
+        if hasattr(d, "close"):
+            d.close()
+    app.cleanup_detector()
🧹 Nitpick comments (7)
detectors/Dockerfile.hf (1)

99-101: Make detectors a proper package to avoid namespace gotchas.

Touching __init__.py files avoids relying on PEP 420 across environments and linters.

 COPY ./common /app/detectors/common
 COPY ./huggingface/detector.py /app/detectors/huggingface/
 RUN mkdir /common; cp /app/detectors/common/log_conf.yaml /common/
+RUN mkdir -p /app/detectors/huggingface /app/detectors/common && \
+    touch /app/detectors/__init__.py /app/detectors/huggingface/__init__.py /app/detectors/common/__init__.py
tests/detectors/huggingface/test_metrics.py (2)

1-7: Tidy imports and args to quiet linters.

Remove unused imports and rename *args to *_args.

-import random
 import os
 import time
-from collections import namedtuple
 from unittest import mock
-from unittest.mock import Mock, MagicMock
+from unittest.mock import Mock
-        def detection_fn(*args, **kwargs):
+        def detection_fn(*_args, **kwargs):
             output = Mock()

36-39: Make metric parsing resilient to extra spaces.

Split once to avoid issues with multiple spaces.

-            key, value = m.split(" ")
+            key, value = m.split(" ", 1)
detectors/huggingface/detector.py (1)

108-116: Stabilize function_name to avoid time‑series churn.

Trim common KServe/Deployment suffixes in addition to -predictor. Example:

+import re
@@
         if os.environ.get("DETECTOR_NAME"):
             pod_name = os.environ.get("DETECTOR_NAME")
             if "-predictor" in pod_name:
                 pod_name = pod_name.split("-predictor")[0]
-            self.function_name = pod_name
+            # Drop common replica/deployment hashes, e.g. "-deployment-xxxxx" or "-xxxxx"
+            self.function_name = re.sub(r'(-deployment)?-[a-z0-9]{5,}$', '', pod_name)
         else:
             self.function_name = os.path.basename(model_files_path)

Place the import re near the other imports.

detectors/common/instrumented_detector.py (2)

9-17: Record runtime on success and failure; use a monotonic clock.

Move the increment into finally and use time.perf_counter() to avoid wall‑clock jumps.

-    @contextlib.contextmanager
-    def instrument_runtime(self, function_name: str):
-        try:
-            start_time = time.time()
-            yield
-            if self.instruments.get("runtime"):
-                self.instruments["runtime"].labels(self.registry_name, function_name).inc(time.time() - start_time)
-        finally:
-            pass
+    @contextlib.contextmanager
+    def instrument_runtime(self, function_name: str):
+        start_time = time.perf_counter()
+        try:
+            yield
+        finally:
+            runtime = time.perf_counter() - start_time
+            if self.instruments.get("runtime"):
+                self.instruments["runtime"].labels(self.registry_name, function_name).inc(runtime)

20-22: Rename parameter for clarity (it isn’t only gauges).

The mapping holds counters and more; align the name to avoid confusion.

-    def add_instruments(self, gauges):
-        self.instruments = gauges
+    def add_instruments(self, instruments):
+        self.instruments = instruments
detectors/huggingface/app.py (1)

2-2: Remove unused typing import.

Annotated isn’t used.

-from typing import Annotated, List
+from typing import List
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cfe7f83 and 6c77d54.

📒 Files selected for processing (12)
  • detectors/Dockerfile.hf (1 hunks)
  • detectors/built_in/base_detector_registry.py (1 hunks)
  • detectors/common/app.py (2 hunks)
  • detectors/common/instrumented_detector.py (1 hunks)
  • detectors/common/requirements-dev.txt (1 hunks)
  • detectors/huggingface/app.py (3 hunks)
  • detectors/huggingface/deploy/servingruntime.yaml (2 hunks)
  • detectors/huggingface/detector.py (7 hunks)
  • detectors/huggingface/locustfile.py (0 hunks)
  • detectors/huggingface/requirements.txt (1 hunks)
  • tests/detectors/builtIn/test_filetype.py (0 hunks)
  • tests/detectors/huggingface/test_metrics.py (1 hunks)
💤 Files with no reviewable changes (2)
  • detectors/huggingface/locustfile.py
  • tests/detectors/builtIn/test_filetype.py
🧰 Additional context used
🧬 Code graph analysis (4)
tests/detectors/huggingface/test_metrics.py (3)
detectors/huggingface/detector.py (1)
  • Detector (38-317)
detectors/common/app.py (1)
  • set_detector (122-124)
detectors/common/instrumented_detector.py (1)
  • add_instruments (20-21)
detectors/huggingface/app.py (4)
detectors/common/app.py (3)
  • DetectorBaseAPI (35-136)
  • set_detector (122-124)
  • get_all_detectors (130-132)
detectors/huggingface/detector.py (2)
  • Detector (38-317)
  • run (278-302)
detectors/common/instrumented_detector.py (1)
  • add_instruments (20-21)
detectors/common/scheme.py (1)
  • ContentsAnalysisResponse (140-143)
detectors/huggingface/detector.py (1)
detectors/common/instrumented_detector.py (4)
  • InstrumentedDetector (4-57)
  • instrument_runtime (10-17)
  • instrument_runtime (50-57)
  • increment_detector_instruments (24-41)
detectors/built_in/base_detector_registry.py (2)
detectors/common/instrumented_detector.py (1)
  • InstrumentedDetector (4-57)
detectors/common/scheme.py (1)
  • ContentAnalysisResponse (126-137)
🪛 Ruff (0.14.1)
tests/detectors/huggingface/test_metrics.py

53-53: Unused function argument: args

(ARG001)

detectors/common/instrumented_detector.py

50-50: Redefinition of unused instrument_runtime from line 10

(F811)

detectors/huggingface/app.py

48-48: Avoid specifying long messages outside the exception class

(TRY003)

detectors/huggingface/detector.py

298-298: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Sourcery review
🔇 Additional comments (5)
detectors/huggingface/deploy/servingruntime.yaml (1)

12-14: Port alignment looks good; confirm scrape settings.

Port/path (8000, /metrics) match the container config. Ensure your cluster scrape config (annotations or ServiceMonitor) is in place; if relying on annotations, verify prometheus.io/scrape: "true" is present on the Pod/Service.

Also applies to: 27-31, 42-44

detectors/common/requirements-dev.txt (1)

6-6: Verify protobuf pin compatibility.

protobuf==6.33.0 is fine for dev, but check for conflicts with any TF/gRPC tools in this repo.

Would you like a quick script to run pip check against your dev env to surface conflicts?

detectors/huggingface/requirements.txt (1)

1-1: No functional change.

transformers==4.50.0 unchanged; OK.

detectors/huggingface/detector.py (1)

288-302: Instrumentation placement looks right.

Runtime wraps the whole request; counters increment once per request with a detection flag. Matches tests.

detectors/common/instrumented_detector.py (1)

29-36: Good defensive label pre‑initialization.

Pre-warming the labeled series avoids first-use surprises and is cheap. LGTM.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (4)
detectors/built_in/base_detector_registry.py (1)

1-3: Import HTTPException from FastAPI, not http.client.

The current import of HTTPException from http.client does not support status_code and detail keyword arguments used on Line 28. This will cause a runtime error when throw_internal_detector_error is called.

Apply this diff to fix the import:

 import logging
 from abc import ABC, abstractmethod
-from http.client import HTTPException
+from fastapi import HTTPException
 from typing import List
detectors/huggingface/deploy/servingruntime.yaml (1)

38-41: High-cardinality risk: metadata.name includes pod-specific suffixes.

Using metadata.name as the DETECTOR_NAME creates a unique label value for each pod (including replica hashes or random suffixes), which causes Prometheus cardinality explosion and metric churn on pod restarts or scaling events.

While detector.py (lines 113-115) strips the -predictor suffix, other pod-specific identifiers remain and will still create high cardinality.

Prefer using a stable identifier from KServe labels:

         - name: DETECTOR_NAME
           valueFrom:
             fieldRef:
-              fieldPath: metadata.name
+              fieldPath: metadata.labels['serving.kserve.io/inferenceservice']

If the KServe label is unavailable, consider setting a static value and handling it in code as a fallback.

detectors/huggingface/app.py (2)

45-47: Use HTTPException instead of RuntimeError for HTTP errors.

Raising RuntimeError for a service initialization failure is not appropriate for a REST API. Use HTTPException with status code 503 to properly communicate service unavailability to clients.

Apply this diff:

+from fastapi import HTTPException
+
 async def detector_unary_handler(
         request: ContentAnalysisHttpRequest,
 ):
     detectors: List[Detector] = list(app.get_all_detectors().values())
     if not len(detectors) or not detectors[0]:
-        raise RuntimeError("Detector is not initialized")
+        raise HTTPException(status_code=503, detail="Detector not initialized")
     return ContentsAnalysisResponse(root=detectors[0].run(request))

48-48: Blocking call: run detector inference in a threadpool.

detectors[0].run(request) performs synchronous, CPU/GPU-bound model inference directly in the async handler, which blocks FastAPI's event loop and prevents concurrent request handling.

Wrap the call in a threadpool to preserve async concurrency:

+from starlette.concurrency import run_in_threadpool
+
 async def detector_unary_handler(
         request: ContentAnalysisHttpRequest,
 ):
     detectors: List[Detector] = list(app.get_all_detectors().values())
     if not len(detectors) or not detectors[0]:
         raise HTTPException(status_code=503, detail="Detector not initialized")
-    return ContentsAnalysisResponse(root=detectors[0].run(request))
+    result = await run_in_threadpool(detectors[0].run, request)
+    return ContentsAnalysisResponse(root=result)
🧹 Nitpick comments (1)
tests/detectors/huggingface/test_metrics.py (1)

53-62: Consider removing or prefixing the unused args parameter.

The detection_fn doesn't use the args parameter. You can either remove it or prefix it with an underscore to indicate it's intentionally unused.

Apply this diff:

-        def detection_fn(*args, **kwargs):
+        def detection_fn(**kwargs):
             output = Mock()
             if kwargs["input_ids"].shape[-1] > 10:
                 output.logits = torch.tensor([[0.0, 1.0]])
             else:
                 output.logits = torch.tensor([[1.0, 0.0]])
 
             if kwargs["input_ids"].shape[-1] > 100:
                 time.sleep(.25)
             return output
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6c77d54 and 8c75c56.

📒 Files selected for processing (12)
  • detectors/Dockerfile.hf (1 hunks)
  • detectors/built_in/base_detector_registry.py (1 hunks)
  • detectors/common/app.py (2 hunks)
  • detectors/common/instrumented_detector.py (1 hunks)
  • detectors/common/requirements-dev.txt (1 hunks)
  • detectors/huggingface/app.py (3 hunks)
  • detectors/huggingface/deploy/servingruntime.yaml (2 hunks)
  • detectors/huggingface/detector.py (10 hunks)
  • detectors/huggingface/locustfile.py (0 hunks)
  • detectors/huggingface/requirements.txt (1 hunks)
  • tests/detectors/builtIn/test_filetype.py (0 hunks)
  • tests/detectors/huggingface/test_metrics.py (1 hunks)
💤 Files with no reviewable changes (2)
  • detectors/huggingface/locustfile.py
  • tests/detectors/builtIn/test_filetype.py
🚧 Files skipped from review as they are similar to previous changes (5)
  • detectors/common/app.py
  • detectors/Dockerfile.hf
  • detectors/huggingface/requirements.txt
  • detectors/common/instrumented_detector.py
  • detectors/common/requirements-dev.txt
🧰 Additional context used
🧬 Code graph analysis (4)
detectors/huggingface/detector.py (2)
detectors/common/instrumented_detector.py (3)
  • InstrumentedDetector (5-45)
  • instrument_runtime (11-18)
  • increment_detector_instruments (23-40)
detectors/common/scheme.py (3)
  • ContentAnalysisHttpRequest (112-123)
  • ContentAnalysisResponse (126-137)
  • ContentsAnalysisResponse (140-143)
detectors/built_in/base_detector_registry.py (2)
detectors/common/instrumented_detector.py (1)
  • InstrumentedDetector (5-45)
detectors/common/scheme.py (1)
  • ContentAnalysisResponse (126-137)
detectors/huggingface/app.py (4)
detectors/common/app.py (3)
  • DetectorBaseAPI (35-136)
  • set_detector (122-124)
  • get_all_detectors (130-132)
detectors/huggingface/detector.py (2)
  • Detector (40-320)
  • run (281-305)
detectors/common/instrumented_detector.py (1)
  • add_instruments (20-21)
detectors/common/scheme.py (2)
  • ContentAnalysisHttpRequest (112-123)
  • ContentsAnalysisResponse (140-143)
tests/detectors/huggingface/test_metrics.py (3)
detectors/huggingface/detector.py (1)
  • Detector (40-320)
detectors/common/app.py (1)
  • set_detector (122-124)
detectors/common/instrumented_detector.py (1)
  • add_instruments (20-21)
🪛 Ruff (0.14.1)
detectors/huggingface/detector.py

301-301: Avoid specifying long messages outside the exception class

(TRY003)

detectors/huggingface/app.py

47-47: Avoid specifying long messages outside the exception class

(TRY003)

tests/detectors/huggingface/test_metrics.py

53-53: Unused function argument: args

(ARG001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Sourcery review
🔇 Additional comments (4)
detectors/huggingface/detector.py (2)

108-118: Good implementation of registry and function name initialization.

The code correctly derives registry_name from the model type and handles function_name by stripping the -predictor suffix from Kubernetes pod names. The fallback to basename(model_files_path) ensures resilience when the environment variable is absent.


281-305: Instrumentation integration looks correct.

The run method properly wraps the processing loop with instrument_runtime and calls increment_detector_instruments with the computed detection flag. This aligns well with the InstrumentedDetector base class pattern.

tests/detectors/huggingface/test_metrics.py (1)

79-104: Well-structured metrics validation tests.

Both test methods effectively validate the Prometheus metrics:

  • test_prometheus correctly verifies detection counts (7 out of 20 requests) and request/error totals
  • test_runtime_metrics validates runtime accumulation with appropriate tolerance (±0.2s for 2.0s expected)

The deterministic mocking approach ensures stable, repeatable test results.

detectors/huggingface/app.py (1)

15-25: Lifespan initialization and cleanup look correct.

The detector is properly initialized once at startup, registered with the app, and instrumented. The cleanup logic safely handles resource release with appropriate guards.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
detectors/huggingface/app.py (1)

20-25: Fix detector cleanup; current code may skip closing.

You register under detector.model_name but retrieve with default key, so close() might not be called. Iterate and close all detectors.

-    detector: Detector = app.get_detector()
-    if detector and hasattr(detector, 'close'):
-        detector.close()
-    app.cleanup_detector()
+    for d in app.get_all_detectors().values():
+        if d and hasattr(d, "close"):
+            d.close()
+    app.cleanup_detector()
♻️ Duplicate comments (2)
detectors/huggingface/deploy/servingruntime.yaml (1)

38-41: DETECTOR_NAME source may be per‑pod (repeat of prior note).

Using metadata.name yields pod‑scoped values. If you later need aggregation by service/model, prefer the KServe InferenceService label. If pod granularity is intentional, ignore.

-        - name: DETECTOR_NAME
-          valueFrom:
-            fieldRef:
-              fieldPath: metadata.name
+        - name: DETECTOR_NAME
+          valueFrom:
+            fieldRef:
+              fieldPath: metadata.labels['serving.kserve.io/inferenceservice']
detectors/huggingface/app.py (1)

45-49: Return proper HTTP error for uninitialized detector.

Use FastAPI’s HTTPException(503, ...) instead of RuntimeError to signal readiness.

-    if not len(detectors) or not detectors[0]:
-        raise RuntimeError("Detector is not initialized")
+    if not detectors or not detectors[0]:
+        from fastapi import HTTPException
+        raise HTTPException(status_code=503, detail="Detector not initialized")
🧹 Nitpick comments (8)
detectors/huggingface/deploy/servingruntime.yaml (1)

11-14: Add scrape toggle for portability (optional).

Consider adding prometheus.io/scrape: "true" alongside port/path to work with generic Prometheus scrapers that honor this triplet.

   annotations:
+    prometheus.io/scrape: 'true'
     prometheus.io/port: '8000'
     prometheus.io/path: '/metrics'
detectors/Dockerfile.hf (1)

83-91: Clarify layered requirements installation.

You overwrite requirements.txt and run pip install -r requirements.txt twice. It works, but is opaque and can mask conflicts. Prefer explicit paths or a combined constraints flow.

-COPY ./common/requirements.txt .
-RUN pip install --no-cache-dir -r requirements.txt
-
-COPY ./huggingface/requirements.txt .
+COPY ./common/requirements.txt /tmp/requirements-common.txt
+RUN pip install --no-cache-dir -r /tmp/requirements-common.txt
+
+COPY ./huggingface/requirements.txt /tmp/requirements-hf.txt
 RUN if [ "$TARGETARCH" = "ppc64le" ] || [ "$TARGETARCH" = "s390x" ]; then \
        source /opt/rh/gcc-toolset-13/enable                            ; \
     fi                                                                && \
-    pip install --no-cache-dir -r requirements.txt
+    pip install --no-cache-dir -r /tmp/requirements-hf.txt
tests/detectors/huggingface/test_metrics.py (2)

36-41: Make metrics parsing robust to spaces.

Use rsplit(" ", 1) so label values with spaces won’t break parsing.

-        if "trustyai" in m and "{" in m:
-            key, value = m.split(" ")
-            metric_dict[key] = float(value)
+        if "trustyai" in m and "{" in m:
+            try:
+                key, value = m.rsplit(" ", 1)
+                metric_dict[key] = float(value)
+            except ValueError:
+                pass

53-71: Silence Ruff warning and simplify call signature.

detection_fn doesn’t use positional args; drop them and adjust the call.

-        def detection_fn(*args, **kwargs):
+        def detection_fn(**kwargs):
             output = Mock()
             if kwargs["input_ids"].shape[-1] > 10:
                 output.logits = torch.tensor([[0.0, 1.0]])
             else:
                 output.logits = torch.tensor([[1.0, 0.0]])
             if kwargs["input_ids"].shape[-1] > 100:
                 time.sleep(.25)
             return output
@@
-            def __call__(self, *args, **kwargs):
-                return detection_fn(*args, **kwargs)
+            def __call__(self, **kwargs):
+                return detection_fn(**kwargs)

If other call sites pass positionals, this change will surface quickly in tests; expected: no positional args are used.

detectors/built_in/base_detector_registry.py (1)

22-29: Include traceback in error logs for diagnostics.

Capture the stack trace when logging internal detector errors.

-        logger.error(exception)
+        logger.error("Internal detector error", exc_info=exception)

Or:

-        logger.error(exception)
+        logger.exception("Internal detector error")
detectors/huggingface/detector.py (3)

55-55: Consider improving initialization order.

The super().__init__() call uses the default registry_name value, which is then overwritten in initialize_model on line 108. While functional, the initialization flow would be clearer if registry_name were set before or passed to the parent constructor.

Also applies to: 108-108


292-304: Good instrumentation integration, consider error handling.

The instrumentation is correctly implemented with runtime measurement and counter updates. The is_detection logic properly checks whether any text triggered detections.

Consider adding a try-catch block to call self.increment_error_instruments(self.function_name) when exceptions occur, ensuring error metrics are tracked. If error handling is managed at a higher level, this can be deferred.


301-301: Minor: Consider exception message style.

Static analysis suggests keeping exception messages concise or defining custom exception classes. The current message is clear, so this is a low-priority style consideration.

Based on learnings.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8c75c56 and 18e1d2b.

📒 Files selected for processing (12)
  • detectors/Dockerfile.hf (1 hunks)
  • detectors/built_in/base_detector_registry.py (1 hunks)
  • detectors/common/app.py (2 hunks)
  • detectors/common/instrumented_detector.py (1 hunks)
  • detectors/common/requirements-dev.txt (1 hunks)
  • detectors/huggingface/app.py (3 hunks)
  • detectors/huggingface/deploy/servingruntime.yaml (2 hunks)
  • detectors/huggingface/detector.py (10 hunks)
  • detectors/huggingface/locustfile.py (0 hunks)
  • detectors/huggingface/requirements.txt (1 hunks)
  • tests/detectors/builtIn/test_filetype.py (0 hunks)
  • tests/detectors/huggingface/test_metrics.py (1 hunks)
💤 Files with no reviewable changes (2)
  • tests/detectors/builtIn/test_filetype.py
  • detectors/huggingface/locustfile.py
✅ Files skipped from review due to trivial changes (1)
  • detectors/huggingface/requirements.txt
🚧 Files skipped from review as they are similar to previous changes (2)
  • detectors/common/instrumented_detector.py
  • detectors/common/app.py
🧰 Additional context used
🧬 Code graph analysis (4)
detectors/huggingface/app.py (4)
detectors/common/app.py (3)
  • DetectorBaseAPI (32-133)
  • set_detector (119-121)
  • get_all_detectors (127-129)
detectors/huggingface/detector.py (2)
  • Detector (40-320)
  • run (281-305)
detectors/common/instrumented_detector.py (1)
  • add_instruments (20-21)
detectors/common/scheme.py (2)
  • ContentAnalysisHttpRequest (112-123)
  • ContentsAnalysisResponse (140-143)
detectors/huggingface/detector.py (2)
detectors/common/instrumented_detector.py (3)
  • InstrumentedDetector (5-45)
  • instrument_runtime (11-18)
  • increment_detector_instruments (23-40)
detectors/common/scheme.py (3)
  • ContentAnalysisHttpRequest (112-123)
  • ContentAnalysisResponse (126-137)
  • ContentsAnalysisResponse (140-143)
detectors/built_in/base_detector_registry.py (2)
detectors/common/instrumented_detector.py (1)
  • InstrumentedDetector (5-45)
detectors/common/scheme.py (1)
  • ContentAnalysisResponse (126-137)
tests/detectors/huggingface/test_metrics.py (3)
detectors/huggingface/detector.py (1)
  • Detector (40-320)
detectors/common/app.py (1)
  • set_detector (119-121)
detectors/common/instrumented_detector.py (1)
  • add_instruments (20-21)
🪛 Ruff (0.14.1)
detectors/huggingface/app.py

47-47: Avoid specifying long messages outside the exception class

(TRY003)

detectors/huggingface/detector.py

301-301: Avoid specifying long messages outside the exception class

(TRY003)

tests/detectors/huggingface/test_metrics.py

53-53: Unused function argument: args

(ARG001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-and-push-ci
  • GitHub Check: Sourcery review
🔇 Additional comments (4)
detectors/common/requirements-dev.txt (1)

6-6: protobuf==6.33.0 exists and has no detected conflicts.

The version is a valid release on PyPI with no dependencies. Transformers 4.50.0 lists protobuf as optional (under extras) with no explicit version constraint, so no conflict exists. The specific pin is appropriate for a dev requirements file.

detectors/huggingface/detector.py (3)

108-118: LGTM: Well-designed runtime identification logic.

The function_name initialization properly handles Kubernetes pod naming conventions and provides a sensible fallback. The registry_name assignment is clear and consistent.


244-244: LGTM: Improved readability.

The set union operation for combining safe labels is cleaner, and the multi-line condition formatting enhances readability without changing the logic.

Also applies to: 263-265


307-320: LGTM: Clean resource management.

The cleanup logic properly moves the model to CPU before deletion, clears references, runs garbage collection, and empties the CUDA cache. This is thorough resource management.

Comment on lines +45 to +76
def client(self):
current_dir = os.path.dirname(__file__)
parent_dir = os.path.dirname(os.path.dirname(current_dir))
os.environ["MODEL_DIR"] = os.path.join(parent_dir, "dummy_models", "bert/BertForSequenceClassification")

detector = Detector()

# patch the model to allow for control over detections - long messages will flag
def detection_fn(*args, **kwargs):
output = Mock()
if kwargs["input_ids"].shape[-1] > 10:
output.logits = torch.tensor([[0.0, 1.0]])
else:
output.logits = torch.tensor([[1.0, 0.0]])

if kwargs["input_ids"].shape[-1] > 100:
time.sleep(.25)
return output

class ModelMock:
def __init__(self):
self.config = Mock()
self.config.id2label = detector.model.config.id2label
self.config.problem_type = detector.model.config.problem_type
def __call__(self, *args, **kwargs):
return detection_fn(*args, **kwargs)

detector.model = ModelMock()
app.set_detector(detector, detector.registry_name)
detector.add_instruments(app.state.instruments)
return TestClient(app)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Ensure test fixture isn’t overridden by app lifespan.

Creating TestClient(app) may trigger the app lifespan that constructs a new Detector, potentially replacing your patched one. Please confirm your Starlette/FastAPI versions’ TestClient behavior and, if needed, patch after startup or disable lifespan for this test.

You can verify at runtime:

If it overrides, set a no-op lifespan in the test:

from contextlib import asynccontextmanager
@asynccontextmanager
async def noop(app): yield
app.router.lifespan_context = noop

🏁 Script executed:

#!/bin/bash
python - <<'PY'
from detectors.huggingface.app import app
from starlette.testclient import TestClient
print("before:", app.get_all_detectors())
tc = TestClient(app)
print("after:", app.get_all_detectors())
PY

Length of output: 288


🏁 Script executed:

#!/bin/bash
# Find lifespan definitions in app.py
rg -n "lifespan" --type=py

Length of output: 489


🏁 Script executed:

#!/bin/bash
# Inspect the app.py file to understand its structure
cat detectors/huggingface/app.py

Length of output: 1698


🏁 Script executed:

#!/bin/bash
# Check the full test file context around the client fixture
cat tests/detectors/huggingface/test_metrics.py

Length of output: 3985


🏁 Script executed:

#!/bin/bash
# Check if there's a setup.py or requirements file to understand dependencies
fd -e "requirements.txt|setup.py|pyproject.toml" . | head -5

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Look for any lifespan-related patterns in test files
rg -n "lifespan|set_detector|add_instruments" tests/ --type=py -B2 -A2

Length of output: 2184


🏁 Script executed:

#!/bin/bash
# Check what get_all_detectors does
rg -n "get_all_detectors\|set_detector" --type=py -B3 -A5

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Look for FastAPI app initialization and startup/shutdown hooks
rg -n "@app\.(on_event|startup|shutdown)\|app\.add_event_handler" --type=py -B2 -A3

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Check FastAPI/Starlette version in dependencies
rg -n "fastapi|starlette" --type=txt --type=py | grep -E "version|==" | head -10

Length of output: 52


🏁 Script executed:

#!/bin/bash
# Look for app initialization and detector setup logic
rg -n "app\s*=\s*FastAPI|app\s*=\s*Starlette" detectors/ --type=py -B2 -A10

Length of output: 3216


🏁 Script executed:

#!/bin/bash
# Check if there's any existing test that verifies lifespan behavior or detector state
rg -n "client\s*=\s*TestClient\|@pytest.fixture" tests/ --type=py -B5 -A5

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Look for Detector() instantiation patterns to understand where it's initialized
rg -n "class\s+Detector|Detector()" detectors/huggingface/ --type=py -B2 -A3

Length of output: 3489


🏁 Script executed:

#!/bin/bash
# Check app.state.instruments and how it's initialized
rg -n "app\.state\|state\.instruments\|registry_name" detectors/huggingface/ --type=py -B2 -A2

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Look for @app.get or @app.post to understand if detector is used in endpoints
rg -n "@app\.|get_detector\|set_detector" detectors/huggingface/app.py --type=py -B1 -A5

Length of output: 304


🏁 Script executed:

#!/bin/bash
# Check if app has a lifespan context manager defined
ast-grep --pattern '@asynccontextmanager' detectors/huggingface/app.py 2>/dev/null || echo "No asynccontextmanager found via ast-grep"

Length of output: 52


🏁 Script executed:

#!/bin/bash
# Find all files in the detectors package related to initialization
fd -e ".py" detectors/huggingface/ | xargs grep -l "__init__\|lifespan" | head -5

Length of output: 393


🏁 Script executed:

#!/bin/bash
# Check if there's an __init__.py that might initialize the app
cat detectors/huggingface/__init__.py 2>/dev/null || echo "File not found"

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Look for app creation and any lifespan event handlers
rg -n "app\s*=|@app\.on_event|lifespan" detectors/huggingface/app.py --type=py

Length of output: 89


🏁 Script executed:

#!/bin/bash
# Check what Detector class does during __init__
rg -n "class Detector" detectors/ --type=py -B2 -A20

Length of output: 2873


🏁 Script executed:

#!/bin/bash
# Look at the test's import statements and setup
head -50 tests/detectors/huggingface/test_metrics.py

Length of output: 1527


🏁 Script executed:

#!/bin/bash
# Check if registry_name is defined in Detector
rg -n "registry_name" detectors/ --type=py -B3 -A3

Length of output: 6736


🏁 Script executed:

#!/bin/bash
# Search for how app.get_all_detectors is implemented
rg -n "def get_all_detectors\|detectors\s*=" detectors/huggingface/app.py --type=py -B2 -A5

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Check for FastAPI app definition line by line
rg -n "from fastapi import\|FastAPI()" detectors/huggingface/ --type=py

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Look for any app event handlers or middleware setup
rg -n "@app\.\|app\.add" detectors/huggingface/app.py --type=py

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Check for app attribute initialization patterns
rg -n "app\\.state|Detector" tests/detectors/huggingface/test_metrics.py --type=py -B2 -A2

Length of output: 608


🏁 Script executed:

#!/bin/bash
# Look for import patterns and fixture initialization
rg -n "@pytest|import|from" tests/detectors/huggingface/test_metrics.py --type=py | head -20

Length of output: 475


🏁 Script executed:

#!/bin/bash
# Check the actual line 45-76 snippet from the test file
sed -n '45,76p' tests/detectors/huggingface/test_metrics.py

Length of output: 1269


🏁 Script executed:

#!/bin/bash
# Check if there's any lifespan defined as an on_event handler
sed -n '1,100p' detectors/huggingface/app.py

Length of output: 1698


🏁 Script executed:

#!/bin/bash
# Check Starlette documentation format by looking for async startup patterns
rg -n "startup\|shutdown\|lifespan" detectors/huggingface/ --type=py

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Look for app.set_detector and app.state initialization anywhere
rg -n "app\\.set_detector|app\\.state\\s*=" --type=py

Length of output: 804


🏁 Script executed:

#!/bin/bash
# Check conftest or any shared test utilities
fd -e "conftest.py" tests/

Length of output: 329


🏁 Script executed:

#!/bin/bash
# Look for shared test fixtures
rg -n "@pytest.fixture" tests/ --type=py -B2 -A10

Length of output: 29087


🏁 Script executed:

#!/bin/bash
# List files in test directory structure
find tests/detectors/huggingface/ -type f -name "*.py"

Length of output: 463


🏁 Script executed:

#!/bin/bash
# Check full context of the app.py file
wc -l detectors/huggingface/app.py && echo "---" && cat detectors/huggingface/app.py

Length of output: 1734


🏁 Script executed:

#!/bin/bash
# Check if there's a fixture conftest in the huggingface test dir
ls -la tests/detectors/huggingface/

Length of output: 743


🏁 Script executed:

#!/bin/bash
# Check for any pytest markers or setup that might control lifespan
rg -n "pytest|setup|teardown" tests/detectors/huggingface/test_metrics.py --type=py

Length of output: 38


🏁 Script executed:

#!/bin/bash
# Check imports in test_metrics.py to understand what's being tested
head -100 tests/detectors/huggingface/test_metrics.py | grep -E "^import|^from"

Length of output: 303


🏁 Script executed:

#!/bin/bash
# Check the client fixture more carefully - look at broader context
sed -n '40,85p' tests/detectors/huggingface/test_metrics.py

Length of output: 1775


🏁 Script executed:

#!/bin/bash
# Find all fixture definitions across test suite
rg -n "def client|def app" tests/detectors/huggingface/ --type=py

Length of output: 68


🏁 Script executed:

#!/bin/bash
# Look for app initialization and detector management functions
rg -n "def set_detector|def get_all_detectors|def get_detector" detectors/ --type=py -A5

Length of output: 932


🏁 Script executed:

#!/bin/bash
# Check if app has any built-in lifespan events
rg -n "on_event\|@app\\." detectors/huggingface/app.py --type=py

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Look for where Instrumentator is used and its impact
rg -n "Instrumentator\|add_instruments" detectors/huggingface/ --type=py -B3 -A5

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Check if there's a state object initialization anywhere
rg -n "state\\.instruments\|app\\.state" detectors/huggingface/app.py --type=py -B5 -A5

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Final check: Look at the complete app.py and test structure
echo "=== APP.PY ===" && cat detectors/huggingface/app.py && echo -e "\n=== TEST FIXTURE ===" && sed -n '40,80p' tests/detectors/huggingface/test_metrics.py

Length of output: 3150


🏁 Script executed:

#!/bin/bash
# Verify FastAPI version requirement
find . -name "requirements*.txt" -o -name "pyproject.toml" -o -name "setup.py" | xargs cat 2>/dev/null | grep -i "fastapi\|starlette"

Length of output: 59


🏁 Script executed:

#!/bin/bash
# Check complete test file structure and any other fixtures
cat tests/detectors/huggingface/test_metrics.py | head -200

Length of output: 3985


🏁 Script executed:

#!/bin/bash
# Look at detector initialization and detector state management
grep -rn "class Detector\|def registry_name\|registry_name =" detectors/huggingface/ --include="*.py" -B2 -A10

Length of output: 1724


🏁 Script executed:

#!/bin/bash
# Check if there's global state management for detectors
grep -rn "detectors\s*=\|_detectors\s*=\|DETECTORS\s*=" detectors/huggingface/ --include="*.py" -B2 -A5

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Look for app lifecycle management in app.py
cat detectors/huggingface/app.py

Length of output: 1698


🏁 Script executed:

#!/bin/bash
# Check if Detector() call during app initialization is the issue
grep -n "Detector()" detectors/huggingface/app.py tests/detectors/huggingface/test_metrics.py

Length of output: 134


🏁 Script executed:

#!/bin/bash
# Check TestClient usage patterns in other tests
find tests/ -name "*.py" -exec grep -l "TestClient" {} \;

Length of output: 201


🏁 Script executed:

#!/bin/bash
# Look for lifespan or startup event patterns across the whole codebase
grep -rn "lifespan\|on_event.*startup" . --include="*.py" | head -20

Length of output: 503


Patch test fixture to prevent lifespan from overriding mocked detector.

The test fixture's mocked detector is replaced when TestClient(app) triggers the app lifespan. Both the fixture's detector and the lifespan's detector use the same registry key (based on model_name), so the lifespan's newly-created detector overwrites the mocked one.

Move app.set_detector() and add_instruments() calls to after TestClient creation:

@pytest.fixture
def client(self):
    current_dir = os.path.dirname(__file__)
    parent_dir = os.path.dirname(os.path.dirname(current_dir))
    os.environ["MODEL_DIR"] = os.path.join(parent_dir, "dummy_models", "bert/BertForSequenceClassification")

    detector = Detector()

    # patch the model to allow for control over detections - long messages will flag
    def detection_fn(*args, **kwargs):
        output = Mock()
        if kwargs["input_ids"].shape[-1] > 10:
            output.logits = torch.tensor([[0.0, 1.0]])
        else:
            output.logits = torch.tensor([[1.0, 0.0]])

        if kwargs["input_ids"].shape[-1] > 100:
            time.sleep(.25)
        return output

    class ModelMock:
        def __init__(self):
            self.config = Mock()
            self.config.id2label = detector.model.config.id2label
            self.config.problem_type = detector.model.config.problem_type
        def __call__(self, *args, **kwargs):
            return detection_fn(*args, **kwargs)

    detector.model = ModelMock()
    test_client = TestClient(app)
    app.set_detector(detector, detector.registry_name)
    detector.add_instruments(app.state.instruments)
    return test_client
🧰 Tools
🪛 Ruff (0.14.1)

53-53: Unused function argument: args

(ARG001)

🤖 Prompt for AI Agents
In tests/detectors/huggingface/test_metrics.py around lines 45-76 the fixture
sets app.set_detector(...) and detector.add_instruments(...) before creating
TestClient(app), which causes the app lifespan to recreate/overwrite the mocked
detector; create the TestClient first (e.g., test_client = TestClient(app)),
then call app.set_detector(detector, detector.registry_name) and
detector.add_instruments(app.state.instruments), and finally return the
TestClient instance so the mocked detector remains in place after the app
lifespan runs.

@github-actions
Copy link

github-actions bot commented Oct 24, 2025

PR image build completed successfully!

📦 Huggingface PR image: quay.io/trustyai/guardrails-detector-huggingface-runtime-ci:$PR_HEAD_SHA
📦 Built-in PR image: quay.io/trustyai/guardrails-detector-built-in-ci:$PR_HEAD_SHA
📦 LLM Judge PR image: quay.io/trustyai/guardrails-detector-llm-judge-ci:$PR_HEAD_SHA

@RobGeada RobGeada merged commit 6b267b5 into trustyai-explainability:main Oct 27, 2025
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants