Skip to content

Add content detector#21

Merged
gkumbhat merged 14 commits intofoundation-model-stack:mainfrom
gkumbhat:add_content_detector
Feb 21, 2025
Merged

Add content detector#21
gkumbhat merged 14 commits intofoundation-model-stack:mainfrom
gkumbhat:add_content_detector

Conversation

@gkumbhat
Copy link
Collaborator

@gkumbhat gkumbhat commented Feb 6, 2025

Changes

  • Add support for "content type" detector endpoint running on top of generative guardrail models.

Examples

Request (Llama-guard)

curl -X 'POST'   'http://localhost:8000/api/v1/text/contents' -H 'accept: application/json'   -H 'detector-id: llama-guard'   -H 'Content-Type: application/json'   -d '{ "contents": ["If you are thinking about skipping out on filing your taxes, there are a few ways to avoid getting caught.", "this is a fairly good sentence."], "detector_params": {}}'

Note: above sentence taken from dataset examples of llama-guard

Response

[
   [
      {
         "detection" : "unsafe",
         "detection_type" : "risk",
         "end" : 106,
         "score" : 0.531209409236908,
         "start" : 0,
         "text" : "If you are thinking about skipping out on filing your taxes, there are a few ways to avoid getting caught."
      }
   ],
   [
      {
         "detection" : "safe",
         "detection_type" : "risk",
         "end" : 31,
         "score" : 0.00359360268339515,
         "start" : 0,
         "text" : "this is a fairly good sentence."
      }
   ]
]

Signed-off-by: Gaurav-Kumbhat <Gaurav.Kumbhat@ibm.com>
Signed-off-by: Gaurav-Kumbhat <Gaurav.Kumbhat@ibm.com>
Signed-off-by: Gaurav-Kumbhat <Gaurav.Kumbhat@ibm.com>
Signed-off-by: Gaurav-Kumbhat <Gaurav.Kumbhat@ibm.com>
Signed-off-by: Gaurav-Kumbhat <Gaurav.Kumbhat@ibm.com>
@gkumbhat gkumbhat requested a review from evaline-ju February 13, 2025 18:01
Copy link
Collaborator

@evaline-ju evaline-ju left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple high level things before looking in a lot of detail:

  • Looks like there’s a rebase needed
  • I think it’ll be good/helpful to have an example call [with either granite guardian or llama guard] in the PR to see expected usage - from what I understand so far, I'm a bit concerned about the choices for different parts of the original llama guard message.
  • It’ll likely be helpful to have a couple more unit tests, specifically for the common changes in base.py (e.g. preprocess_request) and protocol.py, and since this also applies to granite guardian, it might be good to have some corresponding .content_analysis tests there?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this would present each of the parts of the original message as a different "choice"? This seems confusing from a user POV, since usually choices are presented as independent alternates to each other.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is purely internal and output of a private function (part of the reason making it private), the user of "content_detector" endpoint won't be affected. To them it will just come out as different label

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I understand better, that the independent choices aren't seen by the end user, though I think the "different label" is still what is confusing to me, since they are all presented at the same level i.e. in the added example

{
         "detection" : "unsafe",
         "detection_type" : "risk",
         "end" : 106,
         "score" : 0.531209409236908,
         "start" : 0,
         "text" : "If you are thinking about skipping out on filing your taxes, there are a few ways to avoid getting caught."
      },
      {
         "detection" : "S2",
         "detection_type" : "risk",
         "end" : 106,
         "score" : 0.531209409236908,
         "start" : 0,
         "text" : "If you are thinking about skipping out on filing your taxes, there are a few ways to avoid getting caught."
      }

unsafe now seems disjoint/independent of S2, whereas those that are familiar with llama guard would know those are related, more like an "unsafe - S2" concept

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm. true, thats an interesting point..

We could merge them together as well🤔 But the problem is, then they become separate from what Llamaguard document, like in a string matching way

Co-authored-by: Evaline Ju <69598118+evaline-ju@users.noreply.github.com>
Signed-off-by: Gaurav-Kumbhat <Gaurav.Kumbhat@ibm.com>
@gkumbhat gkumbhat force-pushed the add_content_detector branch from d70b5cf to 75ad74c Compare February 19, 2025 17:40
Copy link
Collaborator

@evaline-ju evaline-ju left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the additional tests and example! A couple additional q's/comments

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems something may have gone unexpectedly with the rebase that some of these files are showing up in the diff and there's a conflict still showing

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if to avoid confusion with granite tests, we just change this to completion_response, since these are tests for the base class and meant to be generic

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm. true, that is doable

Signed-off-by: Gaurav-Kumbhat <Gaurav.Kumbhat@ibm.com>
…ng issue

Signed-off-by: Gaurav-Kumbhat <Gaurav.Kumbhat@ibm.com>
@gkumbhat gkumbhat force-pushed the add_content_detector branch from 75ad74c to f9388e7 Compare February 19, 2025 20:55
gkumbhat and others added 3 commits February 20, 2025 16:03
Co-authored-by: Evaline Ju <69598118+evaline-ju@users.noreply.github.com>
Signed-off-by: Gaurav-Kumbhat <Gaurav.Kumbhat@ibm.com>
Signed-off-by: Gaurav-Kumbhat <Gaurav.Kumbhat@ibm.com>
Signed-off-by: Gaurav-Kumbhat <Gaurav.Kumbhat@ibm.com>
@gkumbhat gkumbhat force-pushed the add_content_detector branch from d4ab934 to 81675c1 Compare February 20, 2025 22:04
… completion req

Signed-off-by: Gaurav-Kumbhat <Gaurav.Kumbhat@ibm.com>
Copy link
Collaborator

@evaline-ju evaline-ju left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly comments on some docstring/comments aesthetics and a few test questions remaining

return DetectionResponse.from_chat_completion_response(
chat_response, scores, self.DETECTION_TYPE
)
return chat_response, scores, self.DETECTION_TYPE
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we update the docstring L176 since it no longer returns DetectionResponse?

Copy link
Collaborator

@evaline-ju evaline-ju Feb 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same note about response vs. responses here as before

# atleast for Llama-Guard-3 (latest at the time of writing)

# In this function, we will basically remove those "safety" category from output and later on
# move them to evidences.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fyi created follow-up story here #25

@evaline-ju evaline-ju linked an issue Feb 21, 2025 that may be closed by this pull request
gkumbhat and others added 2 commits February 21, 2025 10:58
Co-authored-by: Evaline Ju <69598118+evaline-ju@users.noreply.github.com>
Signed-off-by: Gaurav Kumbhat <kumbhat.gaurav@gmail.com>
Signed-off-by: Gaurav-Kumbhat <Gaurav.Kumbhat@ibm.com>
Copy link
Collaborator

@evaline-ju evaline-ju left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@gkumbhat gkumbhat merged commit 54050ae into foundation-model-stack:main Feb 21, 2025
3 checks passed
@gkumbhat gkumbhat deleted the add_content_detector branch February 21, 2025 17:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Explore support for content endpoint of detectors API

2 participants

Comments