-
Couldn't load subscription status.
- Fork 18
Feat: Add custom detectors to the built in detector #51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feat: Add custom detectors to the built in detector #51
Conversation
Reviewer's GuideThis PR extends the existing built-in detector service by integrating a custom detectors plugin: it introduces a CustomDetectorRegistry that discovers and wraps user-supplied functions, registers it in the FastAPI lifecycle, adjusts request handling to only invoke explicitly requested detectors, and adds logging, error handling, supporting modules, and tests. Sequence diagram for content detection with custom detectorssequenceDiagram
participant Client
participant FastAPI
participant CustomDetectorRegistry
participant custom_func_wrapper
Client->>FastAPI: POST /api/v1/text/contents (with detector_params)
FastAPI->>CustomDetectorRegistry: handle_request(content, detector_params)
CustomDetectorRegistry->>custom_func_wrapper: custom_func_wrapper(func, func_name, content)
custom_func_wrapper-->>CustomDetectorRegistry: ContentAnalysisResponse or None
CustomDetectorRegistry-->>FastAPI: List[ContentAnalysisResponse]
FastAPI-->>Client: ContentsAnalysisResponse
Class diagram for CustomDetectorRegistry and custom detector integrationclassDiagram
class BaseDetectorRegistry
class CustomDetectorRegistry {
+registry: dict
+__init__()
+handle_request(content: str, detector_params: dict): List[ContentAnalysisResponse]
}
class ContentAnalysisResponse
class custom_func_wrapper {
+custom_func_wrapper(func: Callable, func_name: str, s: str): Optional[ContentAnalysisResponse]
}
BaseDetectorRegistry <|-- CustomDetectorRegistry
CustomDetectorRegistry o-- custom_func_wrapper
CustomDetectorRegistry o-- ContentAnalysisResponse
custom_func_wrapper o-- ContentAnalysisResponse
class "custom_detectors.custom_detectors" {
+over_100_characters(text: str): bool
+contains_word(text: str): bool
}
CustomDetectorRegistry o-- "custom_detectors.custom_detectors"
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there - I've reviewed your changes and they look great!
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location> `detectors/built_in/app.py:33` </location>
<code_context>
-
@app.post("/api/v1/text/contents", response_model=ContentsAnalysisResponse)
def detect_content(request: ContentAnalysisHttpRequest):
+ logging.info(f"Request for {request.detector_params}")
+
detections = []
</code_context>
<issue_to_address>
**suggestion:** Consider using the module-level logger instead of the root logger for consistency.
Use 'logger.info' instead of 'logging.info' to ensure consistent logging and easier configuration management.
```suggestion
logger.info(f"Request for {request.detector_params}")
```
</issue_to_address>
### Comment 2
<location> `detectors/built_in/custom_detectors_wrapper.py:54` </location>
<code_context>
+
+ def handle_request(self, content: str, detector_params: dict) -> List[ContentAnalysisResponse]:
+ detections = []
+ if "custom" in detector_params and isinstance(detector_params["custom"], (list, str)):
+ custom_functions = detector_params["custom"]
+ custom_functions = [custom_functions] if isinstance(custom_functions, str) else custom_functions
</code_context>
<issue_to_address>
**suggestion:** Type checking for 'custom' parameter may miss edge cases with unexpected types.
Unexpected types for 'custom' are ignored without notice. Logging or handling these cases would improve debuggability.
</issue_to_address>
### Comment 3
<location> `tests/detectors/builtIn/test_custom.py:14-22` </location>
<code_context>
+
+
+
+ def test_custom_detectors(self, client):
+ payload = {
+ "contents": ["What is an apple?"],
+ "detector_params": {"custom": ["contains_word"]}
+ }
+ resp = client.post("/api/v1/text/contents", json=payload)
+ assert resp.status_code == 200
+ texts = [d["text"] for d in resp.json()[0]]
+ assert "What is an apple?" in texts
+
</code_context>
<issue_to_address>
**suggestion (testing):** Missing negative and error case tests for custom detectors.
Add tests for scenarios where the custom detector does not match any content and for invalid detector names to ensure proper error handling and coverage of negative cases.
```suggestion
def test_custom_detectors(self, client):
payload = {
"contents": ["What is an apple?"],
"detector_params": {"custom": ["contains_word"]}
}
resp = client.post("/api/v1/text/contents", json=payload)
assert resp.status_code == 200
texts = [d["text"] for d in resp.json()[0]]
assert "What is an apple?" in texts
def test_custom_detector_no_match(self, client):
payload = {
"contents": ["Bananas are yellow."],
"detector_params": {"custom": ["contains_word"]}
}
resp = client.post("/api/v1/text/contents", json=payload)
assert resp.status_code == 200
texts = [d["text"] for d in resp.json()[0]]
assert "Bananas are yellow." not in texts
def test_custom_detector_invalid_name(self, client):
payload = {
"contents": ["What is an apple?"],
"detector_params": {"custom": ["non_existent_detector"]}
}
resp = client.post("/api/v1/text/contents", json=payload)
assert resp.status_code == 400 or resp.status_code == 422
# Optionally check for error message in response
# assert "invalid detector" in resp.text.lower()
```
</issue_to_address>
### Comment 4
<location> `tests/detectors/builtIn/test_custom.py:4` </location>
<code_context>
+import pytest
+from fastapi.testclient import TestClient
+
+class TestRegexDetectors:
+ @pytest.fixture
+ def client(self):
</code_context>
<issue_to_address>
**nitpick:** Test class name is misleading.
Since this class only tests custom detectors, renaming it to 'TestCustomDetectors' would improve clarity.
</issue_to_address>
### Comment 5
<location> `detectors/built_in/custom_detectors_wrapper.py:58-67` </location>
<code_context>
def handle_request(self, content: str, detector_params: dict) -> List[ContentAnalysisResponse]:
detections = []
if "custom" in detector_params and isinstance(detector_params["custom"], (list, str)):
custom_functions = detector_params["custom"]
custom_functions = [custom_functions] if isinstance(custom_functions, str) else custom_functions
for custom_function in custom_functions:
if self.registry.get(custom_function):
try:
result = custom_func_wrapper(self.registry[custom_function], custom_function, content)
if result is not None:
detections.append(result)
except Exception as e:
logger.error(e)
raise HTTPException(status_code=400, detail="Detection error, check detector logs")
else:
raise HTTPException(status_code=400, detail=f"Unrecognized custom function: {custom_function}")
return detections
</code_context>
<issue_to_address>
**issue (code-quality):** We've found these issues:
- Swap if/else branches ([`swap-if-else-branches`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/swap-if-else-branches/))
- Remove unnecessary else after guard condition ([`remove-unnecessary-else`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/remove-unnecessary-else/))
- Explicitly raise from a previous error ([`raise-from-previous-error`](https://docs.sourcery.ai/Reference/Default-Rules/suggestions/raise-from-previous-error/))
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
|
|
||
| def handle_request(self, content: str, detector_params: dict) -> List[ContentAnalysisResponse]: | ||
| detections = [] | ||
| if "custom" in detector_params and isinstance(detector_params["custom"], (list, str)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: Type checking for 'custom' parameter may miss edge cases with unexpected types.
Unexpected types for 'custom' are ignored without notice. Logging or handling these cases would improve debuggability.
| def test_custom_detectors(self, client): | ||
| payload = { | ||
| "contents": ["What is an apple?"], | ||
| "detector_params": {"custom": ["contains_word"]} | ||
| } | ||
| resp = client.post("/api/v1/text/contents", json=payload) | ||
| assert resp.status_code == 200 | ||
| texts = [d["text"] for d in resp.json()[0]] | ||
| assert "What is an apple?" in texts |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (testing): Missing negative and error case tests for custom detectors.
Add tests for scenarios where the custom detector does not match any content and for invalid detector names to ensure proper error handling and coverage of negative cases.
| def test_custom_detectors(self, client): | |
| payload = { | |
| "contents": ["What is an apple?"], | |
| "detector_params": {"custom": ["contains_word"]} | |
| } | |
| resp = client.post("/api/v1/text/contents", json=payload) | |
| assert resp.status_code == 200 | |
| texts = [d["text"] for d in resp.json()[0]] | |
| assert "What is an apple?" in texts | |
| def test_custom_detectors(self, client): | |
| payload = { | |
| "contents": ["What is an apple?"], | |
| "detector_params": {"custom": ["contains_word"]} | |
| } | |
| resp = client.post("/api/v1/text/contents", json=payload) | |
| assert resp.status_code == 200 | |
| texts = [d["text"] for d in resp.json()[0]] | |
| assert "What is an apple?" in texts | |
| def test_custom_detector_no_match(self, client): | |
| payload = { | |
| "contents": ["Bananas are yellow."], | |
| "detector_params": {"custom": ["contains_word"]} | |
| } | |
| resp = client.post("/api/v1/text/contents", json=payload) | |
| assert resp.status_code == 200 | |
| texts = [d["text"] for d in resp.json()[0]] | |
| assert "Bananas are yellow." not in texts | |
| def test_custom_detector_invalid_name(self, client): | |
| payload = { | |
| "contents": ["What is an apple?"], | |
| "detector_params": {"custom": ["non_existent_detector"]} | |
| } | |
| resp = client.post("/api/v1/text/contents", json=payload) | |
| assert resp.status_code == 400 or resp.status_code == 422 | |
| # Optionally check for error message in response | |
| # assert "invalid detector" in resp.text.lower() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @RobGeada , overall LGTM with one comment and 1 nit (+1 for sourcery's logger instead of logging suggestion)
650065c to
a86f66a
Compare
|
PR image build completed successfully! 📦 Huggingface PR image: |
Summary by Sourcery
Add support for custom detectors by introducing a new CustomDetectorRegistry to load and invoke user-defined functions, update the API to register and process only the detectors requested with improved error handling and logging, and include example detector functions along with an integration test.
New Features:
Enhancements:
Tests:
Chores: