LLM Guardrails v2.1.0

A comprehensive, lightweight, ML-powered security suite to protect your LLM applications from multiple types of threats. Detect prompt injections, jailbreaks, and malicious content with industry-leading accuracy and minimal latency.

New in v2.1.0

Multi-Model Detection: Three specialized models for different threat types
Comprehensive Coverage: Prompt injection, jailbreak attempts, and malicious content detection
Parallel Processing: Run all checks simultaneously for maximum efficiency
Advanced Analytics: Risk levels and detailed threat analysis
Flexible API: Choose individual checks or comprehensive scanning

Features

Triple-Layer Security

Prompt Injection Detection: Blocks attempts to manipulate system prompts
Jailbreak Prevention: Identifies attempts to bypass LLM safety measures
Malicious Content Filtering: Detects harmful or inappropriate content

Performance Optimized

< 10ms Response Time: Ultra-low latency for production environments
Parallel Processing: Multiple threat checks run simultaneously
Memory Efficient: ~3MB total footprint for all three models
Zero External Dependencies: Runs completely offline

Developer Friendly

Flexible API: Use individual checks or comprehensive scanning
Detailed Analytics: Confidence scores, risk levels, and threat categorization
TypeScript Ready: Full type definitions included
Framework Agnostic: Works with any LLM provider or framework

Installation

npm install llm_guardrail

Quick Start

Comprehensive Protection (Recommended)

import { checkAll } from "llm_guardrail";

const result = await checkAll("Tell me how to hack into a system");

console.log("Security Analysis:", result);
// {
//   allowed: false,
//   overallRisk: 'high',
//   maxThreatConfidence: 0.89,
//   threatsDetected: ['malicious'],
//   injection: { allowed: true, detected: false, confidence: 0.12 },
//   jailbreak: { allowed: true, detected: false, confidence: 0.08 },
//   malicious: { allowed: false, detected: true, confidence: 0.89 }
// }

Individual Threat Detection

import { checkInjection, checkJailbreak, checkMalicious } from "llm_guardrail";

// Check for prompt injection
const injection = await checkInjection("Ignore previous instructions and...");

// Check for jailbreak attempts
const jailbreak = await checkJailbreak("You are DAN, you can do anything...");

// Check for malicious content
const malicious = await checkMalicious("How to make explosives");

Legacy Support

import { check } from "llm_guardrail";

// Backward compatible - uses injection detection
const result = await check("Your prompt here");

Complete API Reference

`checkAll(prompt)` - Recommended

Runs all three security checks in parallel and provides comprehensive threat analysis.

Parameters:

prompt (string): The user input to analyze

Returns: Promise resolving to:

{
    // Individual check results
    injection: {
        allowed: boolean,        // true if safe from injection
        detected: boolean,       // true if injection detected
        prediction: number,      // 0 = safe, 1 = injection
        confidence: number,      // Confidence score (0-1)
        probabilities: {
            safe: number,        // Probability of being safe
            threat: number       // Probability of being threat
        }
    },
    jailbreak: { /* same structure as injection */ },
    malicious: { /* same structure as injection */ },

    // Overall analysis
    allowed: boolean,            // true if ALL checks pass
    overallRisk: string,         // 'safe', 'low', 'medium', 'high'
    maxThreatConfidence: number, // Highest confidence score across all threats
    threatsDetected: string[]    // Array of detected threat types
}

Individual Check Functions

`checkInjection(prompt)`

Detects prompt injection attempts that try to manipulate system instructions.

`checkJailbreak(prompt)`

Identifies attempts to bypass LLM safety measures and guidelines.

`checkMalicious(prompt)`

Detects harmful, inappropriate, or dangerous content requests.

All individual functions return:

{
    allowed: boolean,        // true if safe, false if threat detected
    detected: boolean,       // true if threat detected
    prediction: number,      // 0 = safe, 1 = threat
    confidence: number,      // Confidence score (0-1)
    probabilities: {
        safe: number,        // Probability of being safe
        threat: number       // Probability of being threat
    }
}

`check(prompt)` - Legacy

Backward compatible function that performs injection detection only.

Advanced Usage Examples

Production-Ready Security Gateway

import { checkAll } from "llm_guardrail";

async function securityGateway(userMessage, options = {}) {
  const {
    strictMode = false,
    logThreats = true,
    customThreshold = null,
  } = options;

  try {
    const analysis = await checkAll(userMessage);

    // Custom risk assessment
    const riskThreshold = customThreshold || (strictMode ? 0.3 : 0.7);
    const highRisk = analysis.maxThreatConfidence > riskThreshold;

    if (logThreats && analysis.threatsDetected.length > 0) {
      console.warn("SECURITY ALERT:", {
        threats: analysis.threatsDetected,
        confidence: analysis.maxThreatConfidence,
        risk: analysis.overallRisk,
        message: userMessage.substring(0, 100) + "...",
      });
    }

    return {
      allowed: analysis.allowed && !highRisk,
      analysis,
      action: highRisk ? "block" : "allow",
      reason: highRisk ? `${analysis.overallRisk} risk detected` : "safe",
    };
  } catch (error) {
    console.error("Security gateway error:", error);
    return { allowed: false, action: "block", reason: "security check failed" };
  }
}

// Usage
const result = await securityGateway(userInput, { strictMode: true });
if (result.allowed) {
  // Proceed with LLM call
  console.log("Message approved for processing");
} else {
  console.log(`BLOCKED: ${result.reason}`);
}

Targeted Threat Detection

import { checkInjection, checkJailbreak, checkMalicious } from "llm_guardrail";

// Educational content filter
async function moderateEducationalContent(content) {
  const [injection, malicious] = await Promise.all([
    checkInjection(content),
    checkMalicious(content),
  ]);

  if (injection.detected) {
    return { approved: false, reason: "potential system manipulation" };
  }

  if (malicious.detected && malicious.confidence > 0.6) {
    return { approved: false, reason: "inappropriate content" };
  }

  return { approved: true, reason: "content approved" };
}

// Customer service filter
async function moderateCustomerService(message) {
  // Allow slightly higher tolerance for jailbreak attempts in customer service
  const [injection, jailbreak, malicious] = await Promise.all([
    checkInjection(message),
    checkJailbreak(message),
    checkMalicious(message),
  ]);

  const threats = [];
  if (injection.confidence > 0.8) threats.push("injection");
  if (jailbreak.confidence > 0.9) threats.push("jailbreak"); // Higher threshold
  if (malicious.confidence > 0.7) threats.push("malicious");

  return {
    escalate: threats.length > 0,
    threats,
    confidence: Math.max(
      injection.confidence,
      jailbreak.confidence,
      malicious.confidence,
    ),
  };
}

Real-time Chat Protection

import { checkAll } from "llm_guardrail";

class ChatModerator {
  constructor(options = {}) {
    this.strictMode = options.strictMode || false;
    this.rateLimiter = new Map(); // Simple rate limiting
  }

  async moderateMessage(userId, message) {
    // Rate limiting check
    const now = Date.now();
    const userHistory = this.rateLimiter.get(userId) || [];
    const recentRequests = userHistory.filter((time) => now - time < 60000);

    if (recentRequests.length > 10) {
      return { allowed: false, reason: "rate limit exceeded" };
    }

    // Update rate limiter
    recentRequests.push(now);
    this.rateLimiter.set(userId, recentRequests);

    // Security check
    const analysis = await checkAll(message);

    // Special handling for different threat types
    if (analysis.injection.detected) {
      return {
        allowed: false,
        reason: "prompt injection detected",
        action: "warn_admin",
        analysis,
      };
    }

    if (analysis.jailbreak.detected && analysis.jailbreak.confidence > 0.8) {
      return {
        allowed: false,
        reason: "jailbreak attempt detected",
        action: "temporary_restriction",
        analysis,
      };
    }

    if (analysis.malicious.detected) {
      return {
        allowed: false,
        reason: "inappropriate content",
        action: "content_filter",
        analysis,
      };
    }

    return { allowed: true, analysis };
  }
}

// Usage
const moderator = new ChatModerator({ strictMode: true });
const result = await moderator.moderateMessage("user123", userMessage);

Multi-Language Enterprise Setup

import { checkAll } from "llm_guardrail";

class EnterpriseSecurityLayer {
  constructor(config = {}) {
    this.config = {
      enableAuditLog: config.enableAuditLog || true,
      alertWebhook: config.alertWebhook || null,
      bypassUsers: config.bypassUsers || [],
      ...config,
    };
    this.auditLog = [];
  }

  async validateRequest(userId, prompt, metadata = {}) {
    const timestamp = new Date().toISOString();

    // Bypass check for admin users
    if (this.config.bypassUsers.includes(userId)) {
      return { allowed: true, reason: "admin bypass" };
    }

    const analysis = await checkAll(prompt);

    // Audit logging
    if (this.config.enableAuditLog) {
      this.auditLog.push({
        timestamp,
        userId,
        promptLength: prompt.length,
        analysis,
        metadata,
        allowed: analysis.allowed,
      });
    }

    // Alert on high-risk threats
    if (analysis.overallRisk === "high" && this.config.alertWebhook) {
      await this.sendAlert({
        level: "HIGH",
        userId,
        threats: analysis.threatsDetected,
        confidence: analysis.maxThreatConfidence,
        timestamp,
      });
    }

    return {
      allowed: analysis.allowed,
      riskLevel: analysis.overallRisk,
      threats: analysis.threatsDetected,
      confidence: analysis.maxThreatConfidence,
      requestId: `${userId}-${Date.now()}`,
    };
  }

  async sendAlert(alertData) {
    try {
      // Implementation depends on your alerting system
      console.warn("SECURITY ALERT:", alertData);
    } catch (error) {
      console.error("Failed to send security alert:", error);
    }
  }

  getAuditReport(timeRange = "24h") {
    const now = Date.now();
    const cutoff = now - (timeRange === "24h" ? 86400000 : 3600000);

    return this.auditLog
      .filter((entry) => new Date(entry.timestamp).getTime() > cutoff)
      .reduce(
        (report, entry) => {
          report.total++;
          if (!entry.allowed) report.blocked++;
          entry.analysis.threatsDetected.forEach((threat) => {
            report.threatCounts[threat] =
              (report.threatCounts[threat] || 0) + 1;
          });
          return report;
        },
        { total: 0, blocked: 0, threatCounts: {} },
      );
  }
}

Error Handling & Fallbacks

import { checkAll, checkInjection } from "llm_guardrail";

async function robustSecurityCheck(prompt, fallbackStrategy = "block") {
  try {
    // Primary check with timeout
    const timeoutPromise = new Promise((_, reject) =>
      setTimeout(() => reject(new Error("Security check timeout")), 5000),
    );

    const result = await Promise.race([checkAll(prompt), timeoutPromise]);

    return result;
  } catch (error) {
    console.error("Security check failed:", error.message);

    // Fallback strategies
    switch (fallbackStrategy) {
      case "allow":
        console.warn("WARNING: Security check failed - allowing by default");
        return { allowed: true, fallback: true, error: error.message };

      case "basic":
        try {
          // Fallback to basic injection check only
          const basicResult = await checkInjection(prompt);
          return { ...basicResult, fallback: true, fallbackType: "basic" };
        } catch (fallbackError) {
          return {
            allowed: false,
            fallback: true,
            error: fallbackError.message,
          };
        }

      case "block":
      default:
        console.warn("SECURITY CHECK FAILED - blocking by default");
        return { allowed: false, fallback: true, error: error.message };
    }
  }
}

Document and Website Parser Integration

LLM Guardrails seamlessly integrates with document parsing and web scraping workflows to provide comprehensive content security before processing with LLMs. This section covers common integration patterns for protecting your application from malicious content embedded in documents or scraped from websites.

Document Parser Integration

PDF Document Processing

import { checkAll } from "llm_guardrail";
import pdf from "pdf-parse";
import fs from "fs";

async function securelyProcessPDF(filePath, options = {}) {
  const {
    chunkSize = 1000,
    skipSecurityCheck = false,
    strictMode = false,
  } = options;

  try {
    // Parse PDF content
    const dataBuffer = fs.readFileSync(filePath);
    const pdfData = await pdf(dataBuffer);
    const fullText = pdfData.text;

    if (skipSecurityCheck) {
      return { content: fullText, security: null };
    }

    // Security check on full document
    const documentAnalysis = await checkAll(fullText);

    if (!documentAnalysis.allowed) {
      return {
        allowed: false,
        reason: `Document contains ${documentAnalysis.overallRisk} risk content`,
        threats: documentAnalysis.threatsDetected,
        analysis: documentAnalysis,
      };
    }

    // For large documents, also check chunks for more granular analysis
    const chunks = splitIntoChunks(fullText, chunkSize);
    const chunkAnalyses = await Promise.all(
      chunks.map(async (chunk, index) => {
        const analysis = await checkAll(chunk);
        return {
          index,
          content: chunk,
          analysis,
          risky: !analysis.allowed,
        };
      }),
    );

    const riskyChunks = chunkAnalyses.filter((chunk) => chunk.risky);

    return {
      allowed: riskyChunks.length === 0,
      content: fullText,
      security: {
        document: documentAnalysis,
        chunks: chunkAnalyses,
        riskyChunks: riskyChunks.length,
        totalChunks: chunks.length,
      },
    };
  } catch (error) {
    console.error("PDF processing error:", error);
    return {
      allowed: strictMode ? false : true,
      error: error.message,
      fallback: true,
    };
  }
}

function splitIntoChunks(text, chunkSize) {
  const chunks = [];
  for (let i = 0; i < text.length; i += chunkSize) {
    chunks.push(text.substring(i, i + chunkSize));
  }
  return chunks;
}

// Usage
const result = await securelyProcessPDF("./document.pdf", {
  chunkSize: 500,
  strictMode: true,
});

if (result.allowed) {
  // Safe to process with LLM
  const llmResponse = await processWithLLM(result.content);
} else {
  console.warn("Document blocked:", result.reason);
}

Word Document Processing

import { checkAll } from "llm_guardrail";
import mammoth from "mammoth";

async function securelyProcessWordDoc(filePath, options = {}) {
  const { extractImages = false, securityLevel = "standard" } = options;

  try {
    // Extract text from Word document
    const result = await mammoth.extractRawText({ path: filePath });
    const text = result.value;

    // Extract and check image alt text if requested
    let imageAnalysis = null;
    if (extractImages) {
      const imageResult = await mammoth.extractRawText({
        path: filePath,
        convertImage: mammoth.images.imgElement(function (image) {
          return { alt: image.altText || "No description" };
        }),
      });

      // Check image descriptions for malicious content
      imageAnalysis = await checkAll(imageResult.value);
    }

    // Security analysis
    const textAnalysis = await checkAll(text);
    const overallAllowed =
      textAnalysis.allowed && (!imageAnalysis || imageAnalysis.allowed);

    // Determine security thresholds based on level
    const threshold =
      securityLevel === "strict"
        ? 0.3
        : securityLevel === "standard"
          ? 0.7
          : 0.9;

    const meetsThreshold =
      textAnalysis.maxThreatConfidence < threshold &&
      (!imageAnalysis || imageAnalysis.maxThreatConfidence < threshold);

    return {
      allowed: overallAllowed && meetsThreshold,
      content: {
        text: text,
        images: extractImages ? imageResult.value : null,
      },
      security: {
        text: textAnalysis,
        images: imageAnalysis,
        overallRisk: Math.max(
          textAnalysis.maxThreatConfidence,
          imageAnalysis ? imageAnalysis.maxThreatConfidence : 0,
        ),
      },
      metadata: {
        wordCount: text.split(/\s+/).length,
        hasImages: extractImages,
        securityLevel,
      },
    };
  } catch (error) {
    console.error("Word document processing error:", error);
    throw new Error(`Failed to process Word document: ${error.message}`);
  }
}

Excel/CSV Data Processing

import { checkAll, checkMalicious } from "llm_guardrail";
import xlsx from "xlsx";
import csv from "csv-parser";
import fs from "fs";

class SecureDataProcessor {
  constructor(options = {}) {
    this.maxCellsToCheck = options.maxCellsToCheck || 1000;
    this.checkHeaders = options.checkHeaders !== false;
    this.aggregateCheck = options.aggregateCheck !== false;
  }

  async processExcel(filePath) {
    const workbook = xlsx.readFile(filePath);
    const results = {};

    for (const sheetName of workbook.SheetNames) {
      const sheet = workbook.Sheets[sheetName];
      const data = xlsx.utils.sheet_to_json(sheet, { header: 1 });

      results[sheetName] = await this.analyzeSheetData(data, sheetName);
    }

    return {
      allowed: Object.values(results).every((sheet) => sheet.allowed),
      sheets: results,
      summary: this.createSummary(results),
    };
  }

  async processCSV(filePath) {
    return new Promise((resolve, reject) => {
      const data = [];

      fs.createReadStream(filePath)
        .pipe(csv())
        .on("data", (row) => data.push(row))
        .on("end", async () => {
          try {
            const analysis = await this.analyzeRowData(data);
            resolve(analysis);
          } catch (error) {
            reject(error);
          }
        })
        .on("error", reject);
    });
  }

  async analyzeSheetData(data, sheetName) {
    const flatData = data
      .flat()
      .filter(
        (cell) => cell && typeof cell === "string" && cell.trim().length > 0,
      );

    // Sample data if too large
    const sampled =
      flatData.length > this.maxCellsToCheck
        ? this.sampleArray(flatData, this.maxCellsToCheck)
        : flatData;

    // Check headers separately if requested
    let headerAnalysis = null;
    if (this.checkHeaders && data.length > 0) {
      const headers = data[0].filter((h) => h && typeof h === "string");
      headerAnalysis =
        headers.length > 0 ? await checkAll(headers.join(" ")) : null;
    }

    // Aggregate content check
    let contentAnalysis = null;
    if (this.aggregateCheck && sampled.length > 0) {
      const aggregateText = sampled.join(" ").substring(0, 10000); // Limit size
      contentAnalysis = await checkMalicious(aggregateText);
    }

    // Individual cell checks for high-risk content
    const cellChecks = await Promise.all(
      sampled.slice(0, 100).map(async (cell, index) => {
        if (cell.length > 50) {
          // Only check substantial content
          const analysis = await checkAll(cell);
          return {
            index,
            content: cell.substring(0, 100),
            analysis,
            risky: !analysis.allowed,
          };
        }
        return null;
      }),
    );

    const validChecks = cellChecks.filter((check) => check !== null);
    const riskyCells = validChecks.filter((check) => check.risky);

    return {
      allowed:
        (!headerAnalysis || headerAnalysis.allowed) &&
        (!contentAnalysis || contentAnalysis.allowed) &&
        riskyCells.length === 0,
      sheet: sheetName,
      analysis: {
        headers: headerAnalysis,
        content: contentAnalysis,
        cells: validChecks,
        riskyCells: riskyCells.length,
        totalCells: sampled.length,
      },
    };
  }

  async analyzeRowData(data) {
    // Similar logic to analyzeSheetData but for CSV rows
    const allText = data.map((row) => Object.values(row).join(" ")).join("\n");
    const analysis = await checkAll(allText.substring(0, 10000));

    return {
      allowed: analysis.allowed,
      rowCount: data.length,
      analysis,
      riskLevel: analysis.overallRisk,
    };
  }

  sampleArray(array, size) {
    const step = Math.floor(array.length / size);
    return array.filter((_, index) => index % step === 0).slice(0, size);
  }

  createSummary(results) {
    const sheets = Object.keys(results);
    const allowed = sheets.filter((name) => results[name].allowed).length;
    const risks = sheets.map(
      (name) => results[name].analysis.content?.overallRisk || "safe",
    );

    return {
      totalSheets: sheets.length,
      allowedSheets: allowed,
      blockedSheets: sheets.length - allowed,
      highestRisk: risks.includes("high")
        ? "high"
        : risks.includes("medium")
          ? "medium"
          : risks.includes("low")
            ? "low"
            : "safe",
    };
  }
}

// Usage
const processor = new SecureDataProcessor({
  maxCellsToCheck: 500,
  checkHeaders: true,
});

const excelResult = await processor.processExcel("./data.xlsx");
const csvResult = await processor.processCSV("./data.csv");

Website Content Scraping Integration

Basic Web Scraping Security

import { checkAll } from "llm_guardrail";
import puppeteer from "puppeteer";
import cheerio from "cheerio";
import axios from "axios";

async function securelyScrapePage(url, options = {}) {
  const {
    timeout = 30000,
    securityLevel = "standard",
    extractImages = false,
    maxContentLength = 50000,
  } = options;

  try {
    // Fetch page content
    const response = await axios.get(url, {
      timeout,
      headers: {
        "User-Agent": "Mozilla/5.0 (compatible; SecureBot/1.0)",
      },
    });

    const $ = cheerio.load(response.data);

    // Extract various content types
    const content = {
      title: $("title").text(),
      headings: $("h1, h2, h3, h4, h5, h6")
        .map((_, el) => $(el).text())
        .get(),
      paragraphs: $("p")
        .map((_, el) => $(el).text())
        .get(),
      links: $("a")
        .map((_, el) => $(el).text())
        .get()
        .filter((text) => text.trim()),
      meta: $('meta[name="description"]').attr("content") || "",
    };

    if (extractImages) {
      content.imageAlts = $("img[alt]")
        .map((_, el) => $(el).attr("alt"))
        .get();
    }

    // Combine all text content
    const allText = [
      content.title,
      ...content.headings,
      ...content.paragraphs.slice(0, 20), // Limit paragraphs
      content.meta,
    ]
      .join("\n")
      .substring(0, maxContentLength);

    // Security analysis
    const analysis = await checkAll(allText);

    // Check specific elements that might contain malicious content
    const linkAnalysis =
      content.links.length > 0
        ? await checkAll(content.links.join(" ").substring(0, 5000))
        : null;

    const imageAnalysis =
      extractImages && content.imageAlts.length > 0
        ? await checkAll(content.imageAlts.join(" "))
        : null;

    // Determine overall safety
    const analyses = [analysis, linkAnalysis, imageAnalysis].filter(Boolean);
    const overallAllowed = analyses.every((a) => a.allowed);
    const maxRisk = Math.max(...analyses.map((a) => a.maxThreatConfidence));

    return {
      url,
      allowed: overallAllowed,
      content,
      security: {
        overall: analysis,
        links: linkAnalysis,
        images: imageAnalysis,
        maxRisk,
        riskLevel:
          maxRisk > 0.7
            ? "high"
            : maxRisk > 0.4
              ? "medium"
              : maxRisk > 0
                ? "low"
                : "safe",
      },
      metadata: {
        contentLength: allText.length,
        elementsChecked: {
          paragraphs: content.paragraphs.length,
          headings: content.headings.length,
          links: content.links.length,
          images: content.imageAlts?.length || 0,
        },
      },
    };
  } catch (error) {
    console.error(`Error scraping ${url}:`, error.message);
    return {
      url,
      allowed: false,
      error: error.message,
      fallback: true,
    };
  }
}

Advanced Web Scraping with Puppeteer

async function securelyScrapeSPA(url, options = {}) {
  const {
    waitForSelector = "body",
    securityLevel = "standard",
    captureJavaScriptContent = false,
    blockResources = ["image", "stylesheet", "font"],
  } = options;

  const browser = await puppeteer.launch({
    headless: true,
    args: ["--no-sandbox", "--disable-setuid-sandbox"],
  });

  try {
    const page = await browser.newPage();

    // Block unnecessary resources for faster loading
    if (blockResources.length > 0) {
      await page.setRequestInterception(true);
      page.on("request", (req) => {
        if (blockResources.includes(req.resourceType())) {
          req.abort();
        } else {
          req.continue();
        }
      });
    }

    // Navigate and wait for content
    await page.goto(url, { waitUntil: "networkidle0", timeout: 30000 });
    await page.waitForSelector(waitForSelector);

    // Extract content after JavaScript execution
    const content = await page.evaluate(() => {
      return {
        title: document.title,
        text: document.body.innerText,
        headings: Array.from(
          document.querySelectorAll("h1,h2,h3,h4,h5,h6"),
        ).map((h) => h.innerText),
        links: Array.from(document.querySelectorAll("a"))
          .map((a) => a.innerText)
          .filter((text) => text.trim()),
        forms: Array.from(document.querySelectorAll("form")).map(
          (form) => form.innerText,
        ),
        dynamicContent: Array.from(
          document.querySelectorAll("[data-dynamic]"),
        ).map((el) => el.innerText),
      };
    });

    // Security analysis on extracted content
    const mainText = [content.title, content.text]
      .join("\n")
      .substring(0, 20000);
    const mainAnalysis = await checkAll(mainText);

    // Check dynamic and form content separately
    const dynamicText = [...content.forms, ...content.dynamicContent].join(" ");
    const dynamicAnalysis =
      dynamicText.length > 0
        ? await checkAll(dynamicText.substring(0, 5000))
        : null;

    const analyses = [mainAnalysis, dynamicAnalysis].filter(Boolean);
    const overallAllowed = analyses.every((a) => a.allowed);

    return {
      url,
      allowed: overallAllowed,
      content,
      security: {
        main: mainAnalysis,
        dynamic: dynamicAnalysis,
        overallRisk: Math.max(...analyses.map((a) => a.maxThreatConfidence)),
      },
    };
  } finally {
    await browser.close();
  }
}

Multi-Page Scraping Pipeline

class SecureWebScraper {
  constructor(options = {}) {
    this.concurrency = options.concurrency || 3;
    this.delay = options.delay || 1000;
    this.maxRetries = options.maxRetries || 2;
    this.securityLevel = options.securityLevel || "standard";
  }

  async scrapeMultipleUrls(urls, options = {}) {
    const results = [];
    const batches = this.createBatches(urls, this.concurrency);

    for (const batch of batches) {
      const batchPromises = batch.map((url) =>
        this.scrapeWithRetry(url, options),
      );

      const batchResults = await Promise.allSettled(batchPromises);
      results.push(...batchResults);

      // Delay between batches to be respectful
      if (batches.indexOf(batch) < batches.length - 1) {
        await this.sleep(this.delay);
      }
    }

    // Process results
    const successful = results
      .filter((r) => r.status === "fulfilled")
      .map((r) => r.value);

    const failed = results
      .filter((r) => r.status === "rejected")
      .map((r) => ({ error: r.reason.message }));

    // Security summary
    const allowedPages = successful.filter((page) => page.allowed);
    const blockedPages = successful.filter((page) => !page.allowed);
    const riskyPages = successful.filter(
      (page) => page.security && page.security.overallRisk === "high",
    );

    return {
      successful: successful.length,
      failed: failed.length,
      allowed: allowedPages.length,
      blocked: blockedPages.length,
      highRisk: riskyPages.length,
      pages: successful,
      errors: failed,
      summary: {
        totalProcessed: urls.length,
        successRate: ((successful.length / urls.length) * 100).toFixed(1) + "%",
        securityRate:
          ((allowedPages.length / successful.length) * 100).toFixed(1) + "%",
      },
    };
  }

  async scrapeWithRetry(url, options) {
    for (let attempt = 1; attempt <= this.maxRetries + 1; attempt++) {
      try {
        return await securelyScrapePage(url, {
          ...options,
          securityLevel: this.securityLevel,
        });
      } catch (error) {
        if (attempt <= this.maxRetries) {
          console.warn(`Attempt ${attempt} failed for ${url}, retrying...`);
          await this.sleep(1000 * attempt); // Exponential backoff
        } else {
          throw error;
        }
      }
    }
  }

  createBatches(array, batchSize) {
    const batches = [];
    for (let i = 0; i < array.length; i += batchSize) {
      batches.push(array.slice(i, i + batchSize));
    }
    return batches;
  }

  sleep(ms) {
    return new Promise((resolve) => setTimeout(resolve, ms));
  }
}

// Usage
const scraper = new SecureWebScraper({
  concurrency: 2,
  delay: 2000,
  securityLevel: "strict",
});

const urls = [
  "https://example.com/page1",
  "https://example.com/page2",
  "https://example.com/page3",
];

const results = await scraper.scrapeMultipleUrls(urls, {
  extractImages: true,
  maxContentLength: 30000,
});

console.log(
  `Processed ${results.successful} pages, ${results.allowed} allowed`,
);

Real-World Integration Examples

Content Management System Integration

import { checkAll } from "llm_guardrail";

class SecureContentIngestion {
  constructor(options = {}) {
    this.quarantineFolder = options.quarantineFolder || "./quarantine";
    this.approvedFolder = options.approvedFolder || "./approved";
    this.strictMode = options.strictMode || false;
  }

  async processUploadedDocument(filePath, fileType) {
    try {
      let content;

      // Route to appropriate parser
      switch (fileType.toLowerCase()) {
        case "pdf":
          content = await securelyProcessPDF(filePath);
          break;
        case "docx":
        case "doc":
          content = await securelyProcessWordDoc(filePath);
          break;
        case "xlsx":
        case "xls":
          content = await new SecureDataProcessor().processExcel(filePath);
          break;
        default:
          throw new Error(`Unsupported file type: ${fileType}`);
      }

      // Make ingestion decision
      if (content.allowed) {
        await this.moveToApproved(filePath, content);
        return {
          status: "approved",
          content,
          action: "ready_for_llm_processing",
        };
      } else {
        await this.quarantineFile(filePath, content);
        return {
          status: "quarantined",
          reason: content.reason || "Security check failed",
          threats: content.threats || [],
          action: "manual_review_required",
        };
      }
    } catch (error) {
      console.error("Content ingestion error:", error);
      return {
        status: "error",
        error: error.message,
        action: "retry_or_manual_review",
      };
    }
  }

  async moveToApproved(filePath, content) {
    // Implementation depends on your file system setup
    console.log(`Moving ${filePath} to approved folder`);
    // Add metadata about security analysis
  }

  async quarantineFile(filePath, analysis) {
    // Implementation depends on your file system setup
    console.log(`Quarantining ${filePath} due to security concerns`);
    // Log security analysis for review
  }
}

// Usage
const ingestion = new SecureContentIngestion({
  strictMode: true,
});

const result = await ingestion.processUploadedDocument(
  "./uploads/document.pdf",
  "pdf",
);

if (result.status === "approved") {
  // Process with your LLM
  const llmResponse = await processWithLLM(result.content.content);
}

This comprehensive integration guide shows how to use LLM Guardrails with document parsing and website scraping to ensure security before LLM processing. The examples cover various file formats, web scraping scenarios, and real-world implementation patterns for production systems.

Technical Architecture

Multi-Model Security System

Specialized Models: Three dedicated models trained on different threat datasets
- prompt_injection_model.json - Detects system prompt manipulation
- jailbreak_model.json - Identifies safety bypass attempts
- malicious_model.json - Filters harmful content requests

Core Components

TF-IDF Vectorization: Advanced text feature extraction with n-gram support
Logistic Regression: Optimized binary classification for each threat type
Parallel Processing: Concurrent model execution for maximum throughput
Smart Caching: Models loaded once and reused across requests

Performance Benchmarks

Metric	Value
Response Time	< 5ms (all three models)
Memory Usage	~15MB (total footprint)
Accuracy	>95% across all threat types
Throughput	10,000+ checks/second
Cold Start	~50ms (first request)

Security Models

Prompt Injection Detection

Trained on datasets containing:

System prompt manipulation attempts
Instruction override patterns
Context confusion attacks
Role hijacking attempts

Jailbreak Prevention

Specialized for detecting:

"DAN" and similar personas
Ethical guideline bypass attempts
Roleplay-based circumvention
Authority figure impersonation

Malicious Content Filtering

Identifies requests for:

Harmful instructions
Illegal activities
Violence and threats
Privacy violations

Error Handling Best Practices

import { checkAll } from "llm_guardrail";

// Production-ready error handling
async function safeSecurityCheck(prompt, options = {}) {
  const { timeout = 5000, retries = 2, fallbackStrategy = "block" } = options;

  for (let attempt = 1; attempt <= retries + 1; attempt++) {
    try {
      const timeoutPromise = new Promise((_, reject) =>
        setTimeout(() => reject(new Error("Timeout")), timeout),
      );

      const result = await Promise.race([checkAll(prompt), timeoutPromise]);

      return { success: true, ...result };
    } catch (error) {
      if (attempt <= retries) {
        console.warn(`Security check attempt ${attempt} failed, retrying...`);
        continue;
      }

      // All retries failed - implement fallback
      console.error("All security check attempts failed:", error.message);

      return {
        success: false,
        error: error.message,
        allowed: fallbackStrategy === "allow",
        fallback: true,
      };
    }
  }
}

Migration Guide

From v1.x to v2.1.0

Breaking Changes

Model file renamed: model_data.json → prompt_injection_model.json
Return object structure updated for consistency

Migration Steps

// OLD (v1.x)
import { check } from "llm_guardrail";
const result = await check(prompt);
// result.injective, result.probabilities.injection

// NEW (v2.1.0) - Backward Compatible
import { check } from "llm_guardrail";
const result = await check(prompt);
// result.detected, result.probabilities.threat

// RECOMMENDED (v2.1.0) - New API
import { checkAll } from "llm_guardrail";
const result = await checkAll(prompt);
// result.injection.detected, result.overallRisk

Feature Additions

// New comprehensive checking
const analysis = await checkAll(prompt);
console.log("Risk Level:", analysis.overallRisk);
console.log("Threats Found:", analysis.threatsDetected);

// Individual threat checking
const injection = await checkInjection(prompt);
const jailbreak = await checkJailbreak(prompt);
const malicious = await checkMalicious(prompt);

Configuration Options

Custom Risk Thresholds

// Define your own risk assessment logic
function customRiskAssessment(analysis, context = {}) {
  const { userTrust = 0, contentType = "general" } = context;

  // Adjust thresholds based on context
  const baseThreshold = contentType === "education" ? 0.8 : 0.5;
  const adjustedThreshold = Math.max(0.1, baseThreshold - userTrust);

  return {
    allowed: analysis.maxThreatConfidence < adjustedThreshold,
    risk: analysis.overallRisk,
    customScore: analysis.maxThreatConfidence / adjustedThreshold,
  };
}

Integration Patterns

Express.js Middleware

import express from "express";
import { checkAll } from "llm_guardrail";

const app = express();

const securityMiddleware = async (req, res, next) => {
  try {
    const { message } = req.body;
    const analysis = await checkAll(message);

    if (!analysis.allowed) {
      return res.status(400).json({
        error: "Content blocked by security filters",
        reason: `${analysis.overallRisk} risk detected`,
        threats: analysis.threatsDetected,
      });
    }

    req.securityAnalysis = analysis;
    next();
  } catch (error) {
    console.error("Security middleware error:", error);
    res.status(500).json({ error: "Security check failed" });
  }
};

app.post("/chat", securityMiddleware, async (req, res) => {
  // Process secure message
  const response = await processMessage(req.body.message);
  res.json({ response, security: req.securityAnalysis });
});

WebSocket Security

import WebSocket from "ws";
import { checkAll } from "llm_guardrail";

const wss = new WebSocket.Server({ port: 8080 });

wss.on("connection", (ws) => {
  ws.on("message", async (data) => {
    try {
      const message = JSON.parse(data);
      const analysis = await checkAll(message.text);

      if (analysis.allowed) {
        // Process and broadcast safe message
        wss.clients.forEach((client) => {
          if (client.readyState === WebSocket.OPEN) {
            client.send(
              JSON.stringify({
                type: "message",
                text: message.text,
                user: message.user,
              }),
            );
          }
        });
      } else {
        // Notify sender of blocked content
        ws.send(
          JSON.stringify({
            type: "error",
            message: "Message blocked by security filters",
            threats: analysis.threatsDetected,
          }),
        );
      }
    } catch (error) {
      ws.send(
        JSON.stringify({
          type: "error",
          message: "Failed to process message",
        }),
      );
    }
  });
});

Monitoring & Analytics

Security Metrics Collection

import { checkAll } from "llm_guardrail";

class SecurityMetrics {
  constructor() {
    this.metrics = {
      totalChecks: 0,
      threatsBlocked: 0,
      threatTypes: {},
      averageResponseTime: 0,
      falsePositives: 0,
    };
  }

  async checkWithMetrics(prompt, metadata = {}) {
    const startTime = Date.now();

    try {
      const result = await checkAll(prompt);
      const responseTime = Date.now() - startTime;

      // Update metrics
      this.metrics.totalChecks++;
      this.metrics.averageResponseTime =
        (this.metrics.averageResponseTime * (this.metrics.totalChecks - 1) +
          responseTime) /
        this.metrics.totalChecks;

      if (!result.allowed) {
        this.metrics.threatsBlocked++;
        result.threatsDetected.forEach((threat) => {
          this.metrics.threatTypes[threat] =
            (this.metrics.threatTypes[threat] || 0) + 1;
        });
      }

      return {
        ...result,
        responseTime,
        metrics: this.getSnapshot(),
      };
    } catch (error) {
      console.error("Security check with metrics failed:", error);
      throw error;
    }
  }

  getSnapshot() {
    return {
      ...this.metrics,
      blockRate:
        (
          (this.metrics.threatsBlocked / this.metrics.totalChecks) *
          100
        ).toFixed(2) + "%",
      topThreats: Object.entries(this.metrics.threatTypes)
        .sort(([, a], [, b]) => b - a)
        .slice(0, 3),
    };
  }
}

Community & Support

Discord Community: Join our active community
- Get help with implementation
- Share use cases and feedback
- Early access to new features
- Direct developer support
GitHub Issues: Report bugs and request features
Documentation: Full API documentation
Enterprise Support: Available for high-volume deployments

Roadmap v2.2+

Planned Features

Custom Model Training: Train models on your specific data
Real-time Model Updates: Download updated models automatically
Multi-language Support: Models for non-English content
Severity Scoring: Granular threat severity levels
Content Categories: Detailed classification beyond binary detection
Performance Dashboard: Built-in metrics visualization
Cloud Integration: Optional cloud-based model updates

Integration Roadmap

LangChain Plugin: Native LangChain integration
OpenAI Wrapper: Direct OpenAI API proxy with built-in protection
Anthropic Integration: Claude-specific optimizations
Azure OpenAI: Enterprise Azure integration
AWS Bedrock: Native AWS Bedrock support

Performance Tips

Production Optimization

// Model preloading for better cold start performance
import { checkInjection } from "llm_guardrail";

// Preload models during application startup
async function warmupModels() {
  console.log("Warming up security models...");
  await Promise.all([
    checkInjection("test"),
    checkJailbreak("test"),
    checkMalicious("test"),
  ]);
  console.log("Models ready");
}

// Call during app initialization
await warmupModels();

Batch Processing

// For high-throughput scenarios
async function batchSecurityCheck(prompts) {
  const results = await Promise.allSettled(
    prompts.map((prompt) => checkAll(prompt)),
  );

  return results.map((result, index) => ({
    prompt: prompts[index],
    success: result.status === "fulfilled",
    analysis: result.status === "fulfilled" ? result.value : null,
    error: result.status === "rejected" ? result.reason : null,
  }));
}

License & Legal

License: ISC License - see LICENSE
Model Usage: Models trained on public datasets with appropriate licenses
Privacy: All processing happens locally - no data transmitted externally
Compliance: GDPR and CCPA compliant (no data collection)

Contributing

We welcome contributions from the community! Here's how you can help:

Ways to Contribute

Bug Reports: Help us identify and fix issues
Feature Requests: Suggest new capabilities
Documentation: Improve examples and guides
Testing: Test edge cases and report findings
Code: Submit pull requests for new features

Development Setup

git clone https://github.com/Frank2006x/llm_Guardrails.git
cd llm_Guardrails
npm install
npm test

Community Guidelines

Be respectful and constructive
Follow our code of conduct
Test your changes thoroughly
Document new features clearly

⚠️ Important Security Notice

LLM Guardrails provides robust protection but should be part of a comprehensive security strategy. Always:

Implement multiple layers of security
Monitor and log security events
Keep models updated
Validate inputs at multiple levels
Have incident response procedures

Remember: No single security measure is 100% effective. Defense in depth is key.

Logistic Regression: ML model trained on prompt injection datasets
Local Processing: No external API calls or data transmission
ES Module Support: Modern JavaScript module system

Performance

Latency: < 10ms typical response time
Memory: ~5MB model footprint
CPU: Minimal overhead suitable for production

Security Model

The guardrail uses a machine learning approach trained to detect:

Jailbreak attempts
System prompt leaks
Role confusion attacks
Instruction injection
Context manipulation

Error Handling Best Practices

import { check } from "llm_guardrail";

async function safeCheck(prompt) {
  try {
    return await check(prompt);
  } catch (error) {
    console.error("Guardrail error:", error.message);

    // Fail securely - when in doubt, block
    return {
      allowed: false,
      error: error.message,
      fallback: true,
    };
  }
}

Community & Support

Discord: Join our community at https://discord.gg/xV8e3TFrFU
GitHub Issues: Report bugs and request features
GitHub Repository: Source code and documentation

Roadmap v2.2+

Multi-language support
Custom model training utilities
Real-time model updates
Performance analytics dashboard
Integration examples for popular frameworks

License & Legal

This project is licensed under the ISC License - see the package.json for details.

Contributing

We welcome contributions! Please feel free to submit pull requests, report bugs, or suggest features through our GitHub repository or Discord community.

⚠️ Security Notice: This guardrail provides an additional layer of security but should be part of a comprehensive security strategy. Always validate and sanitize inputs at multiple levels.

FilesExpand file tree

README.md

Latest commit

History

README.md

File metadata and controls

LLM Guardrails v2.1.0

New in v2.1.0

Features

Triple-Layer Security

Performance Optimized

Developer Friendly

Installation

Quick Start

Comprehensive Protection (Recommended)

Individual Threat Detection

Legacy Support

Complete API Reference

checkAll(prompt) - Recommended

Individual Check Functions

checkInjection(prompt)

checkJailbreak(prompt)

checkMalicious(prompt)

check(prompt) - Legacy

Advanced Usage Examples

Production-Ready Security Gateway

Targeted Threat Detection

Real-time Chat Protection

Multi-Language Enterprise Setup

Error Handling & Fallbacks

Document and Website Parser Integration

Document Parser Integration

PDF Document Processing

Word Document Processing

Excel/CSV Data Processing

Website Content Scraping Integration

Basic Web Scraping Security

Advanced Web Scraping with Puppeteer

Multi-Page Scraping Pipeline

Real-World Integration Examples

Content Management System Integration

Technical Architecture

Multi-Model Security System

Core Components

Performance Benchmarks

Security Models

Prompt Injection Detection

Jailbreak Prevention

Malicious Content Filtering

Error Handling Best Practices

Migration Guide

From v1.x to v2.1.0

Breaking Changes

Migration Steps

Feature Additions

Configuration Options

Custom Risk Thresholds

Integration Patterns

Express.js Middleware

WebSocket Security

Monitoring & Analytics

Security Metrics Collection

Community & Support

Roadmap v2.2+

Planned Features

Integration Roadmap

Performance Tips

Production Optimization

Batch Processing

License & Legal

Contributing

Ways to Contribute

Development Setup

Community Guidelines

Performance

Security Model

Error Handling Best Practices

Community & Support

Roadmap v2.2+

License & Legal

`checkAll(prompt)` - Recommended

`checkInjection(prompt)`

`checkJailbreak(prompt)`

`checkMalicious(prompt)`

`check(prompt)` - Legacy