API Reference

Full API reference for terlik.js. For a quick overview, see the README.

Installation
Quick Start
Constructor & Options
Per-Language Imports
Detection Methods
Runtime Dictionary Management
Static Methods
Detection Modes
Per-Call Options
Normalizer (Standalone)
Advanced: TerlikCore
Exported Types
Performance Notes

Installation

npm install terlik.js
# or
pnpm add terlik.js
# or
yarn add terlik.js

Quick Start

import { Terlik } from "terlik.js";

const terlik = new Terlik();
terlik.containsProfanity("siktir git");  // true
terlik.clean("siktir git burdan");       // "****** git burdan"
terlik.getMatches("siktir git");         // [{ word: "siktir", root: "sik", ... }]

Constructor & Options

import { Terlik } from "terlik.js";

const terlik = new Terlik(options?: TerlikOptions);

`TerlikOptions`

Option	Type	Default	Description
`language`	`string`	`"tr"`	Language code. Built-in: `"tr"`, `"en"`, `"es"`, `"de"`.
`mode`	`Mode`	`"balanced"`	Detection mode: `"strict"`, `"balanced"`, or `"loose"`.
`maskStyle`	`MaskStyle`	`"stars"`	Masking style for `clean()`: `"stars"`, `"partial"`, or `"replace"`.
`replaceMask`	`string`	`"[***]"`	Replacement text when `maskStyle` is `"replace"`.
`customList`	`string[]`	`undefined`	Additional words to detect (added as high-severity exact entries).
`whitelist`	`string[]`	`undefined`	Additional words to exclude from detection.
`enableFuzzy`	`boolean`	`false`	Enable fuzzy matching (Levenshtein or Dice similarity).
`fuzzyThreshold`	`number`	`0.8`	Similarity threshold for fuzzy matching (0–1). `0.8` ≈ 1 typo per 5 chars.
`fuzzyAlgorithm`	`FuzzyAlgorithm`	`"levenshtein"`	`"levenshtein"` or `"dice"`.
`maxLength`	`number`	`10000`	Truncate input beyond this length. Must be ≥ 1.
`backgroundWarmup`	`boolean`	`false`	Compile patterns in background via `setTimeout`. Do not use in serverless.
`extendDictionary`	`DictionaryData`	`undefined`	External dictionary to merge with the built-in one. See Runtime Dictionary Management.
`disableLeetDecode`	`boolean`	`false`	Disable leet-speak decoding and number expansion. Safety layers (NFKD, diacritics, Cyrillic) remain active.
`disableCompound`	`boolean`	`false`	Disable CamelCase decompounding (3rd detection pass). Explicit compound variants in the dictionary are unaffected.
`minSeverity`	`Severity`	`undefined`	Minimum severity threshold. Matches below this level are excluded.
`excludeCategories`	`Category[]`	`undefined`	Categories to exclude from results.

Per-Language Imports

If you only need one language, use sub-path imports to reduce bundle size:

import { Terlik } from "terlik.js/tr";  // Turkish only (~10 KB gzip)
import { Terlik } from "terlik.js/en";  // English only (~10 KB gzip)
import { Terlik } from "terlik.js/es";  // Spanish only (~9 KB gzip)
import { Terlik } from "terlik.js/de";  // German only (~9 KB gzip)

Bundle Size Comparison

Import	Gzip size	Includes
`terlik.js`	~14 KB	All 4 languages (TR, EN, ES, DE)
`terlik.js/tr`	~10 KB	Turkish only
`terlik.js/en`	~10 KB	English only
`terlik.js/es`	~9 KB	Spanish only
`terlik.js/de`	~9 KB	German only

Sub-Path Exports

Each per-language entry point (terlik.js/tr, /en, /es, /de) exports:

Export	Description
`Terlik`	Pre-configured class for that language. No `language` option needed.
`createTerlik(options?)`	Factory function returning a `Terlik` instance.
`languageConfig`	Pre-resolved `LanguageConfig` object for advanced use with `TerlikCore`.
`TerlikCore`	Low-level engine class. See Advanced: TerlikCore.
All types	`TerlikOptions`, `DetectOptions`, `CleanOptions`, `MatchResult`, etc.

Usage

// Basic — just like the main entry, but single-language
import { Terlik } from "terlik.js/tr";
const t = new Terlik();
t.containsProfanity("siktir"); // true

// Factory
import { createTerlik } from "terlik.js/en";
const en = createTerlik({ mode: "strict" });
en.containsProfanity("fuck"); // true

// Advanced — use languageConfig with TerlikCore
import { TerlikCore, languageConfig } from "terlik.js/tr";
const custom = new TerlikCore(languageConfig, { mode: "loose", enableFuzzy: true });

Per-Language `Terlik` Constructor

The per-language Terlik class omits the language option since the language is fixed:

// Per-language: language is fixed, cannot be overridden
import { Terlik } from "terlik.js/tr";
const t = new Terlik({ mode: "strict" }); // OK
// const t = new Terlik({ language: "en" }); // TypeScript error

Detection Methods

`containsProfanity(text, options?): boolean`

Returns true if the text contains any profanity match.

Parameters:

Param	Type	Description
`text`	`string`	Input text to check.
`options`	`DetectOptions`	Optional per-call overrides. See Per-Call Options.

Returns: boolean

terlik.containsProfanity("siktir git");     // true
terlik.containsProfanity("merhaba dünya");  // false

// With per-call options
terlik.containsProfanity("damn it", { minSeverity: "high" }); // false (damn is medium)

`getMatches(text, options?): MatchResult[]`

Returns all profanity matches with full details.

Parameters:

Param	Type	Description
`text`	`string`	Input text to check.
`options`	`DetectOptions`	Optional per-call overrides.

Returns: MatchResult[]

const matches = terlik.getMatches("siktir git orospucocugu");
// [
//   { word: "siktir", root: "sik", index: 0, severity: "high", category: "sexual", method: "pattern" },
//   { word: "orospucocugu", root: "orospu", index: 11, severity: "high", category: "insult", method: "pattern" }
// ]

`MatchResult` Interface

interface MatchResult {
  /** The matched text from the original input. */
  word: string;
  /** The dictionary root word. */
  root: string;
  /** Character index in the original input. */
  index: number;
  /** Severity of the matched word. */
  severity: "high" | "medium" | "low";
  /** Content category (undefined for custom words added via customList/addWords). */
  category?: "sexual" | "insult" | "slur" | "general";
  /** How the match was detected. */
  method: "exact" | "pattern" | "fuzzy";
}

`clean(text, options?): string`

Returns the input text with profanity masked.

Parameters:

Param	Type	Description
`text`	`string`	Input text to clean.
`options`	`CleanOptions`	Optional per-call overrides (includes mask options).

Returns: string

Mask Styles

// "stars" (default) — full word replaced with asterisks
terlik.clean("siktir git");
// "****** git"

// "partial" — first and last characters preserved
terlik.clean("siktir git", { maskStyle: "partial" });
// "s****r git"

// "replace" — replaced with fixed text
terlik.clean("siktir git", { maskStyle: "replace" });
// "[***] git"

// Custom replacement text
terlik.clean("siktir git", { maskStyle: "replace", replaceMask: "🤬" });
// "🤬 git"

`language: string` (read-only)

Returns the language code of the instance.

const t = new Terlik({ language: "en" });
t.language; // "en"

Runtime Dictionary Management

`addWords(words: string[]): void`

Add custom words to the detection list at runtime. Patterns are recompiled automatically.

terlik.addWords(["customSlang", "anotherWord"]);
terlik.containsProfanity("customSlang"); // true

`removeWords(words: string[]): void`

Remove words from the detection list. Removes both built-in and custom entries by root. Patterns are recompiled automatically.

terlik.removeWords(["damn"]);
terlik.containsProfanity("damn"); // false

`extendDictionary` Option

Merge an external dictionary with the built-in one at construction time. Useful for teams managing custom word lists:

const terlik = new Terlik({
  extendDictionary: {
    version: 1,
    suffixes: ["ci", "cu"],
    entries: [
      {
        root: "customword",
        variants: ["cust0mword"],
        severity: "high",
        category: "general",
        suffixable: true,
      },
    ],
    whitelist: ["safeterm"],
  },
});

terlik.containsProfanity("customword");   // true
terlik.containsProfanity("customwordci"); // true (suffix match)
terlik.containsProfanity("safeterm");     // false (whitelisted)
terlik.containsProfanity("siktir");       // true (built-in still works)

Merge behavior:

Duplicate roots are skipped (built-in takes precedence).
Suffixes are merged (union of both lists).
Whitelist entries are merged.
Pattern cache is disabled for extended instances.

Static Methods

`Terlik.warmup(languages?, options?): Map<string, Terlik>`

Creates and JIT-warms instances for multiple languages at once. Useful for server deployments to eliminate cold-start latency.

Parameters:

Param	Type	Default	Description
`languages`	`string[]`	All supported	Language codes to warm up.
`options`	`Omit<TerlikOptions, "language">`	`undefined`	Shared options applied to all instances.

Returns: Map<string, Terlik>

// Warm up all languages
const cache = Terlik.warmup();

// Warm up specific languages
const cache = Terlik.warmup(["tr", "en"]);

// With shared options
const cache = Terlik.warmup(["tr", "en"], { mode: "strict" });

// Usage in server
app.post("/chat", (req, res) => {
  const lang = req.body.language;
  const cleaned = cache.get(lang)!.clean(req.body.message); // <1ms
});

Note: Only available from the main terlik.js entry point. Per-language entries do not export warmup.

`getSupportedLanguages(): string[]`

Returns all available built-in language codes.

import { getSupportedLanguages } from "terlik.js";
getSupportedLanguages(); // ["tr", "en", "es", "de"]

Detection Modes

Mode	Behavior	Best for
`strict`	Normalize + exact match only. Highest precision, lowest recall.	Minimum false positives.
`balanced`	Normalize + pattern matching with separator/leet tolerance. Best F1 overall.	General use (default).
`loose`	Pattern + fuzzy matching (Levenshtein or Dice). Highest recall, slightly higher FPR.	Maximum coverage, typo tolerance.

Detailed Comparison

Strict mode:

Only matches after normalization (char folding, leet decode, separator removal, repeat collapse).
No fuzzy matching. No char-class pattern expansion.
Best when you have controlled input and want zero false positives.

Balanced mode (recommended):

Full normalization pipeline + regex patterns with char-class expansion.
Catches leet speak, separators, char repetition, CamelCase compounds.
Near-zero FPR with high recall. Best for production chat systems.

Loose mode:

Everything in balanced + fuzzy matching.
Catches typos and creative misspellings within the similarity threshold.
~18x slower than balanced due to O(n*m) similarity computation.
Use only when typo tolerance is critical.

// Override mode per call
terlik.containsProfanity("text", { mode: "strict" });
terlik.clean("text", { mode: "loose", enableFuzzy: true, fuzzyThreshold: 0.75 });

Per-Call Options

Both containsProfanity and getMatches accept DetectOptions. The clean method accepts CleanOptions which extends DetectOptions with mask options.

`DetectOptions`

Option	Type	Default	Description
`mode`	`Mode`	Instance default	Override detection mode for this call.
`enableFuzzy`	`boolean`	Instance default	Enable/disable fuzzy matching.
`fuzzyThreshold`	`number`	Instance default	Similarity threshold (0–1).
`fuzzyAlgorithm`	`FuzzyAlgorithm`	Instance default	`"levenshtein"` or `"dice"`.
`disableLeetDecode`	`boolean`	Instance default	Skip leet-speak decoding and number expansion. Safety layers remain active.
`disableCompound`	`boolean`	Instance default	Skip CamelCase decompounding pass.
`minSeverity`	`Severity`	Instance default	Exclude matches below this severity.
`excludeCategories`	`Category[]`	Instance default	Exclude matches in these categories.

`CleanOptions`

Extends DetectOptions with:

Option	Type	Default	Description
`maskStyle`	`MaskStyle`	Instance default	`"stars"`, `"partial"`, or `"replace"`.
`replaceMask`	`string`	Instance default	Replacement text for `"replace"` style.

Examples

// Only detect high-severity words
terlik.getMatches("text", { minSeverity: "high" });

// Exclude slurs category
terlik.getMatches("text", { excludeCategories: ["slur"] });

// Disable leet decode for controlled input (e.g. validated usernames)
terlik.containsProfanity("user123", { disableLeetDecode: true });

// Clean with different mask per call
terlik.clean("text", { maskStyle: "partial" });
terlik.clean("text", { maskStyle: "replace", replaceMask: "[CENSORED]" });

Normalizer (Standalone)

The normalization pipeline is available as a standalone export for use outside of profanity detection.

`normalize(text): string`

Normalizes text using the default Turkish locale pipeline.

import { normalize } from "terlik.js";

normalize("S.İ.K.T.İ.R");  // "siktir"
normalize("$1kt1r");        // "siktir"
normalize("Scheiße");        // "scheisse"

`createNormalizer(config): (text: string) => string`

Creates a language-specific normalizer function. The returned function applies a 10-stage pipeline:

Strip invisible chars (ZWSP, ZWNJ, soft hyphen, etc.)
NFKD decompose (fullwidth to ASCII, precomposed to base + combining)
Strip combining marks (removes accents/diacritics)
Locale-aware lowercase
Cyrillic confusable to Latin
Language-specific char folding
Number expansion (if configured)
Leet decode
Punctuation removal (between letters)
Repeat collapse + whitespace trim

import { createNormalizer } from "terlik.js";

const deNormalize = createNormalizer({
  locale: "de",
  charMap: { ä: "a", ö: "o", ü: "u", ß: "ss" },
  leetMap: { "0": "o", "3": "e" },
});

deNormalize("Scheiße");  // "scheisse"
deNormalize("fück");     // "fuck"
deNormalize("ｆｕｃｋ");   // "fuck" (fullwidth chars)

`NormalizerConfig`

interface NormalizerConfig {
  locale: string;
  charMap: Record<string, string>;
  leetMap: Record<string, string>;
  numberExpansions?: [string, string][];
}

Advanced: TerlikCore

TerlikCore is the low-level engine class that powers all Terlik instances. It accepts a pre-resolved LanguageConfig object instead of a language string, bypassing the language registry entirely.

When to Use

Custom language configs: Build your own LanguageConfig for a language not yet supported.
Advanced tree-shaking: Per-language entry points use TerlikCore internally. Using it directly gives you full control.
Runtime language loading: Load language configs dynamically (e.g., from a database or API).

Usage

import { TerlikCore } from "terlik.js";
import type { LanguageConfig } from "terlik.js";

// Build a custom language config
const myLangConfig: LanguageConfig = {
  locale: "fr",
  charMap: { é: "e", è: "e", ê: "e", ë: "e", ç: "c" },
  leetMap: { "0": "o", "1": "i", "3": "e", "@": "a", "$": "s" },
  charClasses: {
    a: "[a4@àáâãäå]",
    e: "[e3èéêë]",
    // ...
  },
  dictionary: {
    version: 1,
    suffixes: [],
    entries: [
      { root: "merde", variants: [], severity: "high", category: "general", suffixable: false },
    ],
    whitelist: [],
  },
};

const fr = new TerlikCore(myLangConfig, { mode: "balanced" });
fr.containsProfanity("merde"); // true

Using with Per-Language Config

Each per-language entry exports its languageConfig:

import { TerlikCore, languageConfig } from "terlik.js/tr";

// Use the Turkish config with custom options
const custom = new TerlikCore(languageConfig, {
  mode: "loose",
  enableFuzzy: true,
  fuzzyThreshold: 0.75,
});

Constructor

new TerlikCore(langConfig: LanguageConfig, options?: TerlikOptions)

TerlikCore has the same instance API as Terlik (containsProfanity, getMatches, clean, addWords, removeWords, getPatterns, language), but does not have the static warmup method.

`getPatterns(): Map<string, RegExp>`

Returns the compiled regex patterns (root → regex). Useful for debugging or custom matching logic.

const patterns = terlik.getPatterns();
for (const [root, regex] of patterns) {
  console.log(`${root}: ${regex.source}`);
}

Exported Types

All types below are exported from the main entry (terlik.js). Per-language entries (terlik.js/tr, etc.) export all types except WordEntry and NormalizerConfig — use the main entry if you need those.

Core Types

type Severity = "high" | "medium" | "low";
type Category = "sexual" | "insult" | "slur" | "general";
type Mode = "strict" | "balanced" | "loose";
type MaskStyle = "stars" | "partial" | "replace";
type FuzzyAlgorithm = "levenshtein" | "dice";
type MatchMethod = "exact" | "pattern" | "fuzzy";

Interfaces

interface TerlikOptions {
  language?: string;
  mode?: Mode;
  maskStyle?: MaskStyle;
  replaceMask?: string;
  customList?: string[];
  whitelist?: string[];
  enableFuzzy?: boolean;
  fuzzyThreshold?: number;
  fuzzyAlgorithm?: FuzzyAlgorithm;
  maxLength?: number;
  backgroundWarmup?: boolean;
  extendDictionary?: DictionaryData;
  disableLeetDecode?: boolean;
  disableCompound?: boolean;
  minSeverity?: Severity;
  excludeCategories?: Category[];
}

interface DetectOptions {
  mode?: Mode;
  enableFuzzy?: boolean;
  fuzzyThreshold?: number;
  fuzzyAlgorithm?: FuzzyAlgorithm;
  disableLeetDecode?: boolean;
  disableCompound?: boolean;
  minSeverity?: Severity;
  excludeCategories?: Category[];
}

interface CleanOptions extends DetectOptions {
  maskStyle?: MaskStyle;
  replaceMask?: string;
}

interface MatchResult {
  word: string;
  root: string;
  index: number;
  severity: Severity;
  category?: Category;
  method: MatchMethod;
}

interface WordEntry {
  root: string;
  variants: string[];
  severity: Severity;
  category?: string;
  suffixable?: boolean;
}

interface LanguageConfig {
  locale: string;
  charMap: Record<string, string>;
  leetMap: Record<string, string>;
  charClasses: Record<string, string>;
  numberExpansions?: [string, string][];
  dictionary: DictionaryData;
}

// Used by extendDictionary option and LanguageConfig.dictionary.
// Not exported from the public API — use this shape inline.
interface DictionaryData {
  version: number;           // must be 1
  suffixes: string[];        // e.g. ["ing", "ed", "er", "s"]
  entries: Array<{
    root: string;            // canonical root form
    variants: string[];      // alternative spellings
    severity: string;        // "high" | "medium" | "low"
    category: string;        // "sexual" | "insult" | "slur" | "general"
    suffixable: boolean;     // whether suffix engine applies
  }>;
  whitelist: string[];       // words to exclude from detection
}

interface NormalizerConfig {
  locale: string;
  charMap: Record<string, string>;
  leetMap: Record<string, string>;
  numberExpansions?: [string, string][];
}

Performance Notes

Lazy Compilation

new Terlik() is near-instant (~1.5ms). Regex patterns compile on the first detection call, not at construction.

Phase	Cost	When
`new Terlik()`	~1.5ms	Construction (lookup tables only)
First `detect()`	~200-700ms	Lazy regex compilation + V8 JIT warmup
Subsequent calls	<1ms	Patterns cached, JIT optimized

Warmup Strategies

Strategy	When	Example
`backgroundWarmup: true`	Long-running servers (Express, Fastify)	`new Terlik({ backgroundWarmup: true })`
Explicit warmup	Serverless (Lambda, Vercel, Workers)	`const t = new Terlik(); t.containsProfanity("warmup");`
`Terlik.warmup()`	Multi-language servers	`Terlik.warmup(["tr", "en"])`
Lazy (default)	Scripts, CLIs, low-traffic	`new Terlik()`

Serverless warning: Do NOT use backgroundWarmup in serverless runtimes. The setTimeout callback may never fire because the process freezes between invocations.

Per-Language Bundle Sizes

Using per-language imports eliminates unused language dictionaries from your bundle:

Import	Raw	Gzip
`terlik.js` (all languages)	~67 KB	~14 KB
`terlik.js/tr`	~42 KB	~10 KB
`terlik.js/en`	~43 KB	~10 KB
`terlik.js/es`	~38 KB	~9 KB
`terlik.js/de`	~37 KB	~9 KB

The core engine (~32 KB raw) is shared across all entry points via code splitting. The difference comes from dictionary size per language.

FilesExpand file tree

api.md

Latest commit

History

api.md

File metadata and controls

API Reference

Table of Contents

Installation

Quick Start

Constructor & Options

TerlikOptions

Per-Language Imports

Bundle Size Comparison

Sub-Path Exports

Usage

Per-Language Terlik Constructor

Detection Methods

containsProfanity(text, options?): boolean

getMatches(text, options?): MatchResult[]

MatchResult Interface

clean(text, options?): string

Mask Styles

language: string (read-only)

Runtime Dictionary Management

addWords(words: string[]): void

removeWords(words: string[]): void

extendDictionary Option

Static Methods

Terlik.warmup(languages?, options?): Map<string, Terlik>

getSupportedLanguages(): string[]

Detection Modes

Detailed Comparison

Per-Call Options

DetectOptions

CleanOptions

Examples

Normalizer (Standalone)

normalize(text): string

createNormalizer(config): (text: string) => string

NormalizerConfig

Advanced: TerlikCore

When to Use

Usage

Using with Per-Language Config

Constructor

getPatterns(): Map<string, RegExp>

Exported Types

Core Types

Interfaces

Performance Notes

Lazy Compilation

Warmup Strategies

Per-Language Bundle Sizes

`TerlikOptions`

Per-Language `Terlik` Constructor

`containsProfanity(text, options?): boolean`

`getMatches(text, options?): MatchResult[]`

`MatchResult` Interface

`clean(text, options?): string`

`language: string` (read-only)

`addWords(words: string[]): void`

`removeWords(words: string[]): void`

`extendDictionary` Option

`Terlik.warmup(languages?, options?): Map<string, Terlik>`

`getSupportedLanguages(): string[]`

`DetectOptions`

`CleanOptions`

`normalize(text): string`

`createNormalizer(config): (text: string) => string`

`NormalizerConfig`

`getPatterns(): Map<string, RegExp>`