Full API reference for terlik.js. For a quick overview, see the README.
- Installation
- Quick Start
- Constructor & Options
- Per-Language Imports
- Detection Methods
- Runtime Dictionary Management
- Static Methods
- Detection Modes
- Per-Call Options
- Normalizer (Standalone)
- Advanced: TerlikCore
- Exported Types
- Performance Notes
npm install terlik.js
# or
pnpm add terlik.js
# or
yarn add terlik.js

import { Terlik } from "terlik.js";
const terlik = new Terlik();
terlik.containsProfanity("siktir git"); // true
terlik.clean("siktir git burdan"); // "****** git burdan"
terlik.getMatches("siktir git"); // [{ word: "siktir", root: "sik", ... }]

import { Terlik } from "terlik.js";
const terlik = new Terlik(options?: TerlikOptions);

| Option | Type | Default | Description |
|---|---|---|---|
| `language` | `string` | `"tr"` | Language code. Built-in: "tr", "en", "es", "de". |
| `mode` | `Mode` | `"balanced"` | Detection mode: "strict", "balanced", or "loose". |
| `maskStyle` | `MaskStyle` | `"stars"` | Masking style for clean(): "stars", "partial", or "replace". |
| `replaceMask` | `string` | `"[***]"` | Replacement text when maskStyle is "replace". |
| `customList` | `string[]` | `undefined` | Additional words to detect (added as high-severity exact entries). |
| `whitelist` | `string[]` | `undefined` | Additional words to exclude from detection. |
| `enableFuzzy` | `boolean` | `false` | Enable fuzzy matching (Levenshtein or Dice similarity). |
| `fuzzyThreshold` | `number` | `0.8` | Similarity threshold for fuzzy matching (0–1). 0.8 ≈ 1 typo per 5 chars. |
| `fuzzyAlgorithm` | `FuzzyAlgorithm` | `"levenshtein"` | "levenshtein" or "dice". |
| `maxLength` | `number` | `10000` | Truncate input beyond this length. Must be ≥ 1. |
| `backgroundWarmup` | `boolean` | `false` | Compile patterns in background via setTimeout. Do not use in serverless. |
| `extendDictionary` | `DictionaryData` | `undefined` | External dictionary to merge with the built-in one. See Runtime Dictionary Management. |
| `disableLeetDecode` | `boolean` | `false` | Disable leet-speak decoding and number expansion. Safety layers (NFKD, diacritics, Cyrillic) remain active. |
| `disableCompound` | `boolean` | `false` | Disable CamelCase decompounding (3rd detection pass). Explicit compound variants in the dictionary are unaffected. |
| `minSeverity` | `Severity` | `undefined` | Minimum severity threshold. Matches below this level are excluded. |
| `excludeCategories` | `Category[]` | `undefined` | Categories to exclude from results. |
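
The `disableCompound` option refers to a CamelCase decompounding pass. The exact splitting rules are not described in this reference, so the following is only a rough sketch of the general idea. `splitCamelCase` is a hypothetical helper, not part of terlik.js, and this version assumes ASCII letters only.

```typescript
// Hypothetical helper (not a terlik.js export): splits a CamelCase token at each
// lowercase-to-uppercase boundary so that every part can be checked on its own.
// Assumption: ASCII letters only; locale-specific letters would need a wider class.
function splitCamelCase(input: string): string[] {
  return input
    .replace(/([a-z])([A-Z])/g, "$1 $2") // mark each boundary with a space
    .split(" ")
    .filter((part) => part.length > 0);
}

const parts = splitCamelCase("SiktirGit"); // ["Siktir", "Git"]
console.log(parts);
```

Each part could then be passed through the normal detection path, which is presumably what the third pass does.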
If you only need one language, use sub-path imports to reduce bundle size:
import { Terlik } from "terlik.js/tr"; // Turkish only (~10 KB gzip)
import { Terlik } from "terlik.js/en"; // English only (~10 KB gzip)
import { Terlik } from "terlik.js/es"; // Spanish only (~9 KB gzip)
import { Terlik } from "terlik.js/de"; // German only (~9 KB gzip)

| Import | Gzip size | Includes |
|---|---|---|
| `terlik.js` | ~14 KB | All 4 languages (TR, EN, ES, DE) |
| `terlik.js/tr` | ~10 KB | Turkish only |
| `terlik.js/en` | ~10 KB | English only |
| `terlik.js/es` | ~9 KB | Spanish only |
| `terlik.js/de` | ~9 KB | German only |
Each per-language entry point (terlik.js/tr, /en, /es, /de) exports:
| Export | Description |
|---|---|
| `Terlik` | Pre-configured class for that language. No language option needed. |
| `createTerlik(options?)` | Factory function returning a Terlik instance. |
| `languageConfig` | Pre-resolved LanguageConfig object for advanced use with TerlikCore. |
| `TerlikCore` | Low-level engine class. See Advanced: TerlikCore. |
| All types | TerlikOptions, DetectOptions, CleanOptions, MatchResult, etc. |
// Basic — just like the main entry, but single-language
import { Terlik } from "terlik.js/tr";
const t = new Terlik();
t.containsProfanity("siktir"); // true
// Factory
import { createTerlik } from "terlik.js/en";
const en = createTerlik({ mode: "strict" });
en.containsProfanity("fuck"); // true
// Advanced — use languageConfig with TerlikCore
import { TerlikCore, languageConfig } from "terlik.js/tr";
const custom = new TerlikCore(languageConfig, { mode: "loose", enableFuzzy: true });

The per-language Terlik class omits the language option since the language is fixed:
// Per-language: language is fixed, cannot be overridden
import { Terlik } from "terlik.js/tr";
const t = new Terlik({ mode: "strict" }); // OK
// const t = new Terlik({ language: "en" }); // TypeScript error

`containsProfanity` returns true if the text contains any profanity match.

Parameters:

| Param | Type | Description |
|---|---|---|
| `text` | `string` | Input text to check. |
| `options` | `DetectOptions` | Optional per-call overrides. See Per-Call Options. |
Returns: boolean
terlik.containsProfanity("siktir git"); // true
terlik.containsProfanity("merhaba dünya"); // false
// With per-call options
terlik.containsProfanity("damn it", { minSeverity: "high" }); // false (damn is medium)

`getMatches` returns all profanity matches with full details.

Parameters:

| Param | Type | Description |
|---|---|---|
| `text` | `string` | Input text to check. |
| `options` | `DetectOptions` | Optional per-call overrides. |
Returns: MatchResult[]
const matches = terlik.getMatches("siktir git orospucocugu");
// [
// { word: "siktir", root: "sik", index: 0, severity: "high", category: "sexual", method: "pattern" },
// { word: "orospucocugu", root: "orospu", index: 11, severity: "high", category: "insult", method: "pattern" }
// ]

interface MatchResult {
/** The matched text from the original input. */
word: string;
/** The dictionary root word. */
root: string;
/** Character index in the original input. */
index: number;
/** Severity of the matched word. */
severity: "high" | "medium" | "low";
/** Content category (undefined for custom words added via customList/addWords). */
category?: "sexual" | "insult" | "slur" | "general";
/** How the match was detected. */
method: "exact" | "pattern" | "fuzzy";
}

`clean` returns the input text with profanity masked.

Parameters:

| Param | Type | Description |
|---|---|---|
| `text` | `string` | Input text to clean. |
| `options` | `CleanOptions` | Optional per-call overrides (includes mask options). |
Returns: string
// "stars" (default) — full word replaced with asterisks
terlik.clean("siktir git");
// "****** git"
// "partial" — first and last characters preserved
terlik.clean("siktir git", { maskStyle: "partial" });
// "s****r git"
// "replace" — replaced with fixed text
terlik.clean("siktir git", { maskStyle: "replace" });
// "[***] git"
// Custom replacement text
terlik.clean("siktir git", { maskStyle: "replace", replaceMask: "🤬" });
// "🤬 git"

The `language` property returns the language code of the instance.
const t = new Terlik({ language: "en" });
t.language; // "en"

`addWords` adds custom words to the detection list at runtime. Patterns are recompiled automatically.
terlik.addWords(["customSlang", "anotherWord"]);
terlik.containsProfanity("customSlang"); // true

`removeWords` removes words from the detection list. It removes both built-in and custom entries by root. Patterns are recompiled automatically.
terlik.removeWords(["damn"]);
terlik.containsProfanity("damn"); // false

Merge an external dictionary with the built-in one at construction time. Useful for teams managing custom word lists:
const terlik = new Terlik({
extendDictionary: {
version: 1,
suffixes: ["ci", "cu"],
entries: [
{
root: "customword",
variants: ["cust0mword"],
severity: "high",
category: "general",
suffixable: true,
},
],
whitelist: ["safeterm"],
},
});
terlik.containsProfanity("customword"); // true
terlik.containsProfanity("customwordci"); // true (suffix match)
terlik.containsProfanity("safeterm"); // false (whitelisted)
terlik.containsProfanity("siktir"); // true (built-in still works)

Merge behavior:
- Duplicate roots are skipped (built-in takes precedence).
- Suffixes are merged (union of both lists).
- Whitelist entries are merged.
- Pattern cache is disabled for extended instances.
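
The merge rules above can be illustrated with a small standalone function. This is a sketch of the documented behavior only, not the library's code: `mergeDictionaries` and `DictionaryEntry` are hypothetical names, and the `DictionaryData` shape is copied from this reference.

```typescript
interface DictionaryEntry {
  root: string;
  variants: string[];
  severity: string;
  category: string;
  suffixable: boolean;
}

interface DictionaryData {
  version: number;
  suffixes: string[];
  entries: DictionaryEntry[];
  whitelist: string[];
}

// Hypothetical helper that mirrors the documented merge rules.
function mergeDictionaries(base: DictionaryData, ext: DictionaryData): DictionaryData {
  const entries = base.entries.slice();
  const seenRoots = new Set(entries.map((e) => e.root));
  for (const entry of ext.entries) {
    // Duplicate roots are skipped, so the built-in entry takes precedence.
    if (!seenRoots.has(entry.root)) {
      seenRoots.add(entry.root);
      entries.push(entry);
    }
  }
  return {
    version: base.version,
    // Suffixes and whitelist entries are merged as a union of both lists.
    suffixes: Array.from(new Set(base.suffixes.concat(ext.suffixes))),
    entries,
    whitelist: Array.from(new Set(base.whitelist.concat(ext.whitelist))),
  };
}
```

Building the result as a new object leaves the built-in dictionary untouched, which keeps other instances unaffected by one instance's extension.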
`Terlik.warmup` creates and JIT-warms instances for multiple languages at once. Useful for server deployments to eliminate cold-start latency.

Parameters:

| Param | Type | Default | Description |
|---|---|---|---|
| `languages` | `string[]` | All supported | Language codes to warm up. |
| `options` | `Omit<TerlikOptions, "language">` | `undefined` | Shared options applied to all instances. |
Returns: Map<string, Terlik>
// Warm up all languages
const cache = Terlik.warmup();
// Warm up specific languages
const cache = Terlik.warmup(["tr", "en"]);
// With shared options
const cache = Terlik.warmup(["tr", "en"], { mode: "strict" });
// Usage in server
app.post("/chat", (req, res) => {
const lang = req.body.language;
const cleaned = cache.get(lang)!.clean(req.body.message); // <1ms
});

Note: Only available from the main `terlik.js` entry point. Per-language entries do not export `warmup`.
`getSupportedLanguages` returns all available built-in language codes.
import { getSupportedLanguages } from "terlik.js";
getSupportedLanguages(); // ["tr", "en", "es", "de"]

| Mode | Behavior | Best for |
|---|---|---|
| `strict` | Normalize + exact match only. Highest precision, lowest recall. | Minimum false positives. |
| `balanced` | Normalize + pattern matching with separator/leet tolerance. Best F1 overall. | General use (default). |
| `loose` | Pattern + fuzzy matching (Levenshtein or Dice). Highest recall, slightly higher FPR. | Maximum coverage, typo tolerance. |
Strict mode:
- Only matches after normalization (char folding, leet decode, separator removal, repeat collapse).
- No fuzzy matching. No char-class pattern expansion.
- Best when you have controlled input and want zero false positives.
Balanced mode (recommended):
- Full normalization pipeline + regex patterns with char-class expansion.
- Catches leet speak, separators, char repetition, CamelCase compounds.
- Near-zero FPR with high recall. Best for production chat systems.
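
To make "regex patterns with char-class expansion" more concrete, here is a minimal sketch of how a root word could be turned into a tolerant pattern. It is an assumption-based illustration: `buildPattern`, `escapeRegex`, and the separator set are hypothetical, and the patterns terlik.js actually compiles may differ (for example in word-boundary handling, which this sketch omits). The character classes are taken from the `charClasses` example in this reference.

```typescript
// Escape a literal character so it is safe inside a regular expression.
function escapeRegex(ch: string): string {
  return ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical builder: expands each character of the root into its character
// class, allows repeats of that character, and tolerates separators in between.
function buildPattern(root: string, charClasses: Record<string, string>): RegExp {
  const separators = "[\\s._-]*"; // assumed separator set
  const parts = root.split("").map((ch) => {
    const cls = charClasses[ch] ?? escapeRegex(ch);
    return cls + "+"; // "+" tolerates repeated characters
  });
  return new RegExp(parts.join(separators), "i");
}

const frClasses: Record<string, string> = { a: "[a4@àáâãäå]", e: "[e3èéêë]" };
const merdePattern = buildPattern("merde", frClasses);

console.log(merdePattern.test("m3rd3"));     // true (leet digits)
console.log(merdePattern.test("m.e.r.d.e")); // true (separators)
console.log(merdePattern.test("meeerde"));   // true (repeated characters)
console.log(merdePattern.test("mer"));       // false
```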
Loose mode:
- Everything in balanced + fuzzy matching.
- Catches typos and creative misspellings within the similarity threshold.
- ~18x slower than balanced due to O(n*m) similarity computation.
- Use only when typo tolerance is critical.
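
The O(n*m) cost mentioned above comes from the edit-distance table. The sketch below shows a standard Levenshtein distance and one common way to turn it into a 0–1 similarity score. The normalization by the longer string's length and the `>=` comparison are assumptions chosen to agree with the "0.8 ≈ 1 typo per 5 chars" note; the library's own formula is not documented here. `levenshtein`, `similarity`, and `isFuzzyMatch` are hypothetical names.

```typescript
// Classic Levenshtein distance using a single rolling row: O(n*m) time, O(m) space.
function levenshtein(a: string, b: string): number {
  // row[j] holds the distance between the processed prefix of a and b.slice(0, j).
  const row: number[] = [];
  for (let j = 0; j <= b.length; j++) row.push(j);
  for (let i = 1; i <= a.length; i++) {
    let diagonal = row[0]!; // distance for (i - 1, j - 1)
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j]!; // distance for (i - 1, j)
      const cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1]! + 1, diagonal + cost);
      diagonal = above;
    }
  }
  return row[b.length]!;
}

// Assumed normalization: share of the longer string that did not need editing.
function similarity(a: string, b: string): number {
  const longest = Math.max(a.length, b.length);
  if (longest === 0) return 1;
  return (longest - levenshtein(a, b)) / longest;
}

function isFuzzyMatch(candidate: string, root: string, threshold: number): boolean {
  return similarity(candidate, root) >= threshold;
}

console.log(similarity("hello", "hallo"));        // 0.8  (1 edit in 5 chars)
console.log(isFuzzyMatch("hello", "hallo", 0.8)); // true
console.log(isFuzzyMatch("fuxk", "fuck", 0.8));   // false (1 edit in 4 chars = 0.75)
```

Under this assumed formula, a threshold of 0.8 rejects a single typo in a four-letter word, which is one reason a lower per-call `fuzzyThreshold` such as 0.75 may be needed for short roots.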
// Override mode per call
terlik.containsProfanity("text", { mode: "strict" });
terlik.clean("text", { mode: "loose", enableFuzzy: true, fuzzyThreshold: 0.75 });

Both containsProfanity and getMatches accept DetectOptions. The clean method accepts CleanOptions which extends DetectOptions with mask options.

| Option | Type | Default | Description |
|---|---|---|---|
| `mode` | `Mode` | Instance default | Override detection mode for this call. |
| `enableFuzzy` | `boolean` | Instance default | Enable/disable fuzzy matching. |
| `fuzzyThreshold` | `number` | Instance default | Similarity threshold (0–1). |
| `fuzzyAlgorithm` | `FuzzyAlgorithm` | Instance default | "levenshtein" or "dice". |
| `disableLeetDecode` | `boolean` | Instance default | Skip leet-speak decoding and number expansion. Safety layers remain active. |
| `disableCompound` | `boolean` | Instance default | Skip CamelCase decompounding pass. |
| `minSeverity` | `Severity` | Instance default | Exclude matches below this severity. |
| `excludeCategories` | `Category[]` | Instance default | Exclude matches in these categories. |
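
As an illustration of how `minSeverity` and `excludeCategories` narrow a result set, the sketch below filters a list of match-like objects. It assumes the ordering low < medium < high and assumes that matches without a category (custom words) are never dropped by a category filter; neither detail is confirmed by this reference. `filterMatches` and `SEVERITY_RANK` are hypothetical names.

```typescript
type Severity = "high" | "medium" | "low";
type Category = "sexual" | "insult" | "slur" | "general";

interface MatchLike {
  word: string;
  severity: Severity;
  category?: Category;
}

// Assumed ordering of severity levels.
const SEVERITY_RANK: Record<Severity, number> = { low: 0, medium: 1, high: 2 };

function filterMatches(
  matches: MatchLike[],
  opts: { minSeverity?: Severity; excludeCategories?: Category[] },
): MatchLike[] {
  const floor = opts.minSeverity ? SEVERITY_RANK[opts.minSeverity] : 0;
  const excluded = opts.excludeCategories ?? [];
  return matches.filter((m) => {
    if (SEVERITY_RANK[m.severity] < floor) return false;
    // Assumption: uncategorized matches are not affected by category exclusion.
    if (m.category !== undefined && excluded.indexOf(m.category) !== -1) return false;
    return true;
  });
}

const sample: MatchLike[] = [
  { word: "damn", severity: "medium", category: "general" },
  { word: "siktir", severity: "high", category: "sexual" },
  { word: "customSlang", severity: "high" },
];

console.log(filterMatches(sample, { minSeverity: "high" }).map((m) => m.word));
// ["siktir", "customSlang"]
```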
`CleanOptions` extends DetectOptions with:

| Option | Type | Default | Description |
|---|---|---|---|
| `maskStyle` | `MaskStyle` | Instance default | "stars", "partial", or "replace". |
| `replaceMask` | `string` | Instance default | Replacement text for "replace" style. |
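
The three mask styles can be sketched as a pure function over match positions. This is an illustration that reproduces the documented outputs ("******", "s****r", "[***]"), not the library's implementation. `maskWord`, `applyMask`, and `Span` are hypothetical names, and the handling of words of one or two characters under "partial" is an assumption.

```typescript
type MaskStyle = "stars" | "partial" | "replace";

function maskWord(word: string, style: MaskStyle, replaceMask: string = "[***]"): string {
  if (style === "replace") return replaceMask;
  if (style === "partial" && word.length > 2) {
    // Keep the first and last characters, star out the middle.
    return word.charAt(0) + "*".repeat(word.length - 2) + word.charAt(word.length - 1);
  }
  // "stars", or "partial" on a very short word (assumed fallback).
  return "*".repeat(word.length);
}

interface Span {
  index: number; // start offset in the original text
  word: string;  // matched text
}

function applyMask(text: string, spans: Span[], style: MaskStyle, replaceMask?: string): string {
  // Replace from right to left so earlier offsets stay valid.
  const ordered = spans.slice().sort((a, b) => b.index - a.index);
  let out = text;
  for (const s of ordered) {
    out = out.slice(0, s.index) + maskWord(s.word, style, replaceMask) + out.slice(s.index + s.word.length);
  }
  return out;
}

const spans: Span[] = [{ index: 0, word: "siktir" }];
console.log(applyMask("siktir git", spans, "stars"));   // "****** git"
console.log(applyMask("siktir git", spans, "partial")); // "s****r git"
console.log(applyMask("siktir git", spans, "replace")); // "[***] git"
```

Working right to left matters for the "replace" style, where the replacement length usually differs from the matched word.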
// Only detect high-severity words
terlik.getMatches("text", { minSeverity: "high" });
// Exclude slurs category
terlik.getMatches("text", { excludeCategories: ["slur"] });
// Disable leet decode for controlled input (e.g. validated usernames)
terlik.containsProfanity("user123", { disableLeetDecode: true });
// Clean with different mask per call
terlik.clean("text", { maskStyle: "partial" });
terlik.clean("text", { maskStyle: "replace", replaceMask: "[CENSORED]" });

The normalization pipeline is available as a standalone export for use outside of profanity detection.

`normalize` applies the default Turkish locale pipeline to the input text.
import { normalize } from "terlik.js";
normalize("S.İ.K.T.İ.R"); // "siktir"
normalize("$1kt1r"); // "siktir"
normalize("Scheiße"); // "scheisse"

`createNormalizer` creates a language-specific normalizer function. The returned function applies a 10-stage pipeline:
- Strip invisible chars (ZWSP, ZWNJ, soft hyphen, etc.)
- NFKD decompose (fullwidth to ASCII, precomposed to base + combining)
- Strip combining marks (removes accents/diacritics)
- Locale-aware lowercase
- Cyrillic confusable to Latin
- Language-specific char folding
- Number expansion (if configured)
- Leet decode
- Punctuation removal (between letters)
- Repeat collapse + whitespace trim
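
The stages listed above can be approximated with standard string operations. The following is a simplified sketch, not the function returned by `createNormalizer`: `sketchNormalize`, `mapChars`, and the tiny Cyrillic table are hypothetical, the punctuation stage is reduced to ASCII letters, and the repeat-collapse rule (runs of three or more become one character) is an assumption.

```typescript
interface SketchConfig {
  locale: string;
  charMap: Record<string, string>;
  leetMap: Record<string, string>;
  numberExpansions?: [string, string][];
}

// Tiny illustrative subset of Cyrillic look-alikes (а, е, о, с, р).
const CYRILLIC_TO_LATIN: Record<string, string> = {
  "\u0430": "a", "\u0435": "e", "\u043E": "o", "\u0441": "c", "\u0440": "p",
};

function mapChars(s: string, table: Record<string, string>): string {
  return s.split("").map((ch) => table[ch] ?? ch).join("");
}

function sketchNormalize(input: string, cfg: SketchConfig): string {
  let s = input.replace(/[\u200B-\u200D\u00AD\uFEFF]/g, ""); // 1. strip invisible chars
  s = s.normalize("NFKD");                                    // 2. NFKD decompose
  s = s.replace(/[\u0300-\u036f]/g, "");                      // 3. strip combining marks
  s = s.toLocaleLowerCase(cfg.locale);                        // 4. locale-aware lowercase
  s = mapChars(s, CYRILLIC_TO_LATIN);                         // 5. Cyrillic confusables
  s = mapChars(s, cfg.charMap);                               // 6. language char folding
  for (const [from, to] of cfg.numberExpansions ?? []) {      // 7. number expansion
    s = s.split(from).join(to);
  }
  s = mapChars(s, cfg.leetMap);                               // 8. leet decode
  s = s.replace(/([a-z])[^a-z0-9\s]+(?=[a-z])/g, "$1");       // 9. punctuation between letters
  s = s.replace(/(.)\1{2,}/g, "$1");                          // 10a. collapse long repeats (assumed rule)
  return s.replace(/\s+/g, " ").trim();                       // 10b. whitespace trim
}

const deSketch: SketchConfig = {
  locale: "de",
  charMap: { ä: "a", ö: "o", ü: "u", ß: "ss" },
  leetMap: { "0": "o", "3": "e" },
};

console.log(sketchNormalize("Scheiße", deSketch)); // "scheisse"
console.log(sketchNormalize("f.u.c.k", deSketch)); // "fuck"
```

The order matters: because combining marks are stripped in stage 3, a precomposed "ü" has already become "u" before the language-specific `charMap` runs, while "ß" has no decomposition and is handled by the map.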
import { createNormalizer } from "terlik.js";
const deNormalize = createNormalizer({
locale: "de",
charMap: { ä: "a", ö: "o", ü: "u", ß: "ss" },
leetMap: { "0": "o", "3": "e" },
});
deNormalize("Scheiße"); // "scheisse"
deNormalize("fück"); // "fuck"
deNormalize("ｆｕｃｋ"); // "fuck" (fullwidth chars)

interface NormalizerConfig {
locale: string;
charMap: Record<string, string>;
leetMap: Record<string, string>;
numberExpansions?: [string, string][];
}

TerlikCore is the low-level engine class that powers all Terlik instances. It accepts a pre-resolved LanguageConfig object instead of a language string, bypassing the language registry entirely.

- Custom language configs: Build your own `LanguageConfig` for a language not yet supported.
- Advanced tree-shaking: Per-language entry points use `TerlikCore` internally. Using it directly gives you full control.
- Runtime language loading: Load language configs dynamically (e.g., from a database or API).
import { TerlikCore } from "terlik.js";
import type { LanguageConfig } from "terlik.js";
// Build a custom language config
const myLangConfig: LanguageConfig = {
locale: "fr",
charMap: { é: "e", è: "e", ê: "e", ë: "e", ç: "c" },
leetMap: { "0": "o", "1": "i", "3": "e", "@": "a", "$": "s" },
charClasses: {
a: "[a4@àáâãäå]",
e: "[e3èéêë]",
// ...
},
dictionary: {
version: 1,
suffixes: [],
entries: [
{ root: "merde", variants: [], severity: "high", category: "general", suffixable: false },
],
whitelist: [],
},
};
const fr = new TerlikCore(myLangConfig, { mode: "balanced" });
fr.containsProfanity("merde"); // true

Each per-language entry exports its languageConfig:
import { TerlikCore, languageConfig } from "terlik.js/tr";
// Use the Turkish config with custom options
const custom = new TerlikCore(languageConfig, {
mode: "loose",
enableFuzzy: true,
fuzzyThreshold: 0.75,
});

new TerlikCore(langConfig: LanguageConfig, options?: TerlikOptions)

TerlikCore has the same instance API as Terlik (containsProfanity, getMatches, clean, addWords, removeWords, getPatterns, language), but does not have the static warmup method.

`getPatterns` returns the compiled regex patterns (root → regex). Useful for debugging or custom matching logic.
const patterns = terlik.getPatterns();
for (const [root, regex] of patterns) {
console.log(`${root}: ${regex.source}`);
}

All types below are exported from the main entry (terlik.js). Per-language entries (terlik.js/tr, etc.) export all types except WordEntry and NormalizerConfig; use the main entry if you need those.
type Severity = "high" | "medium" | "low";
type Category = "sexual" | "insult" | "slur" | "general";
type Mode = "strict" | "balanced" | "loose";
type MaskStyle = "stars" | "partial" | "replace";
type FuzzyAlgorithm = "levenshtein" | "dice";
type MatchMethod = "exact" | "pattern" | "fuzzy";

interface TerlikOptions {
language?: string;
mode?: Mode;
maskStyle?: MaskStyle;
replaceMask?: string;
customList?: string[];
whitelist?: string[];
enableFuzzy?: boolean;
fuzzyThreshold?: number;
fuzzyAlgorithm?: FuzzyAlgorithm;
maxLength?: number;
backgroundWarmup?: boolean;
extendDictionary?: DictionaryData;
disableLeetDecode?: boolean;
disableCompound?: boolean;
minSeverity?: Severity;
excludeCategories?: Category[];
}
interface DetectOptions {
mode?: Mode;
enableFuzzy?: boolean;
fuzzyThreshold?: number;
fuzzyAlgorithm?: FuzzyAlgorithm;
disableLeetDecode?: boolean;
disableCompound?: boolean;
minSeverity?: Severity;
excludeCategories?: Category[];
}
interface CleanOptions extends DetectOptions {
maskStyle?: MaskStyle;
replaceMask?: string;
}
interface MatchResult {
word: string;
root: string;
index: number;
severity: Severity;
category?: Category;
method: MatchMethod;
}
interface WordEntry {
root: string;
variants: string[];
severity: Severity;
category?: string;
suffixable?: boolean;
}
interface LanguageConfig {
locale: string;
charMap: Record<string, string>;
leetMap: Record<string, string>;
charClasses: Record<string, string>;
numberExpansions?: [string, string][];
dictionary: DictionaryData;
}
// Used by extendDictionary option and LanguageConfig.dictionary.
// Not exported from the public API — use this shape inline.
interface DictionaryData {
version: number; // must be 1
suffixes: string[]; // e.g. ["ing", "ed", "er", "s"]
entries: Array<{
root: string; // canonical root form
variants: string[]; // alternative spellings
severity: string; // "high" | "medium" | "low"
category: string; // "sexual" | "insult" | "slur" | "general"
suffixable: boolean; // whether suffix engine applies
}>;
whitelist: string[]; // words to exclude from detection
}
interface NormalizerConfig {
locale: string;
charMap: Record<string, string>;
leetMap: Record<string, string>;
numberExpansions?: [string, string][];
}

new Terlik() is near-instant (~1.5ms). Regex patterns compile on the first detection call, not at construction.

| Phase | Cost | When |
|---|---|---|
| `new Terlik()` | ~1.5ms | Construction (lookup tables only) |
| First `detect()` | ~200-700ms | Lazy regex compilation + V8 JIT warmup |
| Subsequent calls | <1ms | Patterns cached, JIT optimized |
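
The cost profile in this table follows from compiling lazily and caching the result. A minimal sketch of that pattern is shown below, assuming plain-word roots with no regex metacharacters. `LazyMatcher` is a hypothetical class for illustration and does not reflect how terlik.js is structured internally.

```typescript
// Hypothetical illustration of lazy compilation with caching.
class LazyMatcher {
  private compiled: RegExp[] | null = null;
  private readonly roots: string[];
  public compileCount = 0; // exposed only so the behavior can be observed

  constructor(roots: string[]) {
    this.roots = roots; // construction stores data only, nothing is compiled
  }

  private patterns(): RegExp[] {
    if (this.compiled === null) {
      // First detection call pays the compilation cost once.
      this.compileCount++;
      this.compiled = this.roots.map((root) => new RegExp(root, "i"));
    }
    return this.compiled; // later calls reuse the cached patterns
  }

  contains(text: string): boolean {
    return this.patterns().some((re) => re.test(text));
  }
}

const matcher = new LazyMatcher(["merde"]);
console.log(matcher.compileCount);             // 0 (nothing compiled yet)
console.log(matcher.contains("quelle merde")); // true
console.log(matcher.contains("bonjour"));      // false
console.log(matcher.compileCount);             // 1 (compiled once, then cached)
```

An explicit warmup call is the same idea applied deliberately: one throwaway detection at startup moves the compilation cost out of the first real request.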
| Strategy | When | Example |
|---|---|---|
| `backgroundWarmup: true` | Long-running servers (Express, Fastify) | `new Terlik({ backgroundWarmup: true })` |
| Explicit warmup | Serverless (Lambda, Vercel, Workers) | `const t = new Terlik(); t.containsProfanity("warmup");` |
| `Terlik.warmup()` | Multi-language servers | `Terlik.warmup(["tr", "en"])` |
| Lazy (default) | Scripts, CLIs, low-traffic | `new Terlik()` |

Serverless warning: Do NOT use `backgroundWarmup` in serverless runtimes. The `setTimeout` callback may never fire because the process freezes between invocations.
Using per-language imports eliminates unused language dictionaries from your bundle:
| Import | Raw | Gzip |
|---|---|---|
| `terlik.js` (all languages) | ~67 KB | ~14 KB |
| `terlik.js/tr` | ~42 KB | ~10 KB |
| `terlik.js/en` | ~43 KB | ~10 KB |
| `terlik.js/es` | ~38 KB | ~9 KB |
| `terlik.js/de` | ~37 KB | ~9 KB |
The core engine (~32 KB raw) is shared across all entry points via code splitting. The difference comes from dictionary size per language.