How can I add a layer of guardrails to a local LLM’s output? #1360
NierWinter asked this question in Q&A
I want to use NVIDIA NeMo Guardrails to constrain the inference output of a llamacpp instance. Currently I'm running `nemoguardrails server` with a config that points at the local LLM service. My goal is to have the model self-check its output while streaming and return a blocked/error response if it violates any rules. Right now, however, streaming through the server isn't working properly. Could anyone share some ideas or examples? Thanks!
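For reference, this is roughly the kind of streaming call I'm trying to end up with, sketched with the Python `LLMRails.stream_async` API rather than the HTTP server (the config path and prompt here are placeholders, not my actual setup):

```python
import asyncio

from nemoguardrails import LLMRails, RailsConfig


async def main():
    # Load the guardrails config directory (config.yml + prompts.yml); the path is a placeholder.
    config = RailsConfig.from_path("./config")
    rails = LLMRails(config)

    # stream_async yields response chunks as they are generated; if an output rail
    # blocks the response, the stream should end with the refusal/error text instead
    # of the raw model output.
    async for chunk in rails.stream_async(
        messages=[{"role": "user", "content": "Hello, can you help me?"}]
    ):
        print(chunk, end="", flush=True)


asyncio.run(main())
```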
config:
prompts:
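For illustration, the shape I'm aiming for is roughly the following sketch (not my exact files; the engine, model name, `base_url`, and prompt wording are placeholders, and I'm assuming the llamacpp instance exposes an OpenAI-compatible endpoint):

```yaml
# config.yml -- sketch only; engine, model, and URL are placeholders
models:
  - type: main
    engine: openai                 # assuming llamacpp serves an OpenAI-compatible API
    model: local-llama             # placeholder model name
    parameters:
      base_url: http://localhost:8080/v1   # placeholder local server URL

rails:
  output:
    flows:
      - self check output          # built-in output self-check flow

streaming: True
```

```yaml
# prompts.yml -- sketch only; the policy wording is a placeholder
prompts:
  - task: self_check_output
    content: |
      Your task is to check if the bot message below complies with the policy.
      Policy: the bot must not produce harmful, abusive, or otherwise disallowed content.

      Bot message: "{{ bot_response }}"

      Should the message be blocked (Yes or No)?
      Answer:
```

From what I understand, `streaming: True` enables token streaming and the `self check output` flow runs the `self_check_output` prompt against the response, but I'm not sure how the two are supposed to interact when serving through `nemoguardrails server`, which is where things break for me.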