A literate, typed JavaScript regex toolkit — powered by TypeScript.
Write regex like a human. Ship it like a machine.
Regular expressions are absurdly powerful — a tiny automaton you can carry in your pocket.
But once a regex grows beyond “a few tokens and a prayer”, it becomes:
- hard to read
- easy to break
- painful to maintain
This library exists to keep regexes literate.
Write a multi-line, commented, PCRE-style regex source (with # notes), then normalize it into a compact JavaScript RegExp.source while preserving the normalized source as a TypeScript string literal type.
That gives you two superpowers:
- Human-friendly editing (readable formatting + comments)
- Machine-friendly safety (typed, normalized sources that flow through your codebase)
In short: fewer regex jump-scares, more confidence.
const re = /^(?:\s*\/\*\*\s+|\s+\*?\s+)(?:(?=@(...))|...)/gm;

const RE_SOURCE = `
/^ # start
(?: ... ) # jsdoc start
... # more notes
/gm` as const;
const re = compilePCREStyleRegExpLiteral(RE_SOURCE);

- PCRE-ish style regex source:
  - multi-line formatting
  - `# ...` line comments
  - `\#` escape for literal `#`
- Type-level normalization:
  - derive normalized JS `RegExp.source` as a string literal type
- Optional global augmentation:
  - opt-in only (`import "literate-regex/global"`)
- Designed to reduce TypeScript instantiation pain:
  - line-oriented normalization (helps avoid `ts(2589)` compared to naive full-string scanning)
npm i literate-regex
# or
pnpm add literate-regex
# or
yarn add literate-regex

import { PCREStyleToJsRegExpSource } from "literate-regex";
// Only if you want the global augmentation
import "literate-regex/global";

- `#` starts a line comment (unless escaped)
- `\#` is kept as a literal `#`
- whitespace characters are stripped during normalization
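The three rules above can be sketched at runtime as a single pass over the source. This is a minimal illustration, not the library's implementation: `normalizeSketch` is a hypothetical name, and two details are assumptions made here (any escape pair other than `\#` is copied through unchanged, and a `#` inside a character class is still treated as a comment start).

```typescript
// Hypothetical helper illustrating the normalization rules; not part of literate-regex.
function normalizeSketch(src: string): string {
  let out = "";
  let i = 0;
  while (i < src.length) {
    const ch = src.charAt(i);
    const next = src.charAt(i + 1);
    if (ch === "\\" && next === "#") {
      out += "#"; // `\#` is kept as a literal `#`
      i += 2;
    } else if (ch === "\\" && next !== "") {
      out += ch + next; // assumption: other escape pairs (`\s`, `\\`, ...) pass through
      i += 2;
    } else if (ch === "#") {
      // unescaped `#`: drop everything up to the end of the line
      while (i < src.length && !/[\n\r\u2028\u2029]/.test(src.charAt(i))) i++;
    } else if (/\s/.test(ch)) {
      i++; // whitespace is stripped
    } else {
      out += ch;
      i++;
    }
  }
  return out;
}

const SAMPLE = `
^ # start
\\#\\w+ # literal
`;

console.log(normalizeSketch(SAMPLE)); // prints: ^#\w+
```

Scanning character by character keeps the escape handling explicit, at the cost of ignoring regex context such as character classes.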
import type { PCREStyleToJsRegExpSource } from "literate-regex";
// sample 1
const RE_SOURCE = `
^ # start
(?:\\#\\w+) # literal "#"
\\s+ # whitespace
` as const;
// type JsSource = "^(?:#\\w+)\\s+"
type JsSource = PCREStyleToJsRegExpSource<typeof RE_SOURCE>;

Tip: You must use `as const` to preserve the source as a string literal type.
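To give a feel for how a line-oriented, type-level normalization could work, here is a small sketch built only on TypeScript template literal types. All type names (`StripLineComment`, `RemoveAll`, `NormalizeLine`, `NormalizeSketch`) are hypothetical and are not the library's types. The sketch strips only spaces and tabs rather than the full whitespace set, and it does not handle an escaped backslash directly before `#`.

```typescript
// Drop everything after the first unescaped `#`; turn `\#` into a literal `#`.
type StripLineComment<L extends string> =
  L extends `${infer Head}#${infer Rest}`
    ? Head extends `${infer Kept}\\`
      ? `${Kept}#${StripLineComment<Rest>}` // `\#`: keep `#` and keep scanning
      : Head                                // bare `#`: the rest is a comment
    : L;

// Remove every occurrence of one literal character.
type RemoveAll<S extends string, Ch extends string> =
  S extends `${infer A}${Ch}${infer B}` ? RemoveAll<`${A}${B}`, Ch> : S;

type NormalizeLine<L extends string> =
  RemoveAll<RemoveAll<StripLineComment<L>, " ">, "\t">;

// Process one line at a time and accumulate the result.
type NormalizeSketch<S extends string, Acc extends string = ""> =
  S extends `${infer Line}\n${infer Rest}`
    ? NormalizeSketch<Rest, `${Acc}${NormalizeLine<Line>}`>
    : `${Acc}${NormalizeLine<S>}`;

const SRC = `
^ # start
\\#\\w+ # literal
` as const;

type Result = NormalizeSketch<typeof SRC>;

// Compiles only if Result is the literal type "^#\\w+".
const result: Result = "^#\\w+";
```

Working line by line keeps each recursive step small, which is the general idea behind avoiding deep instantiation on long sources.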
PCREStyleToJsRegExpSource<...> is purely type-level.
If you also normalize at runtime, mirror the same rules:
import { normalizePCREStyleSource } from "literate-regex";
// import type { PCREStyleToJsRegExpSource } from "literate-regex";
// sample 2
const src = `
^ # start
\\#\\w+ # literal
` as const;
// '^#\\w+'
// const normalized: "^#\\w+"
const normalized = normalizePCREStyleSource(src);

import {
TypedRegExp,
// normalizePCREStyleSource,
compilePCREStyleRegExpLiteral,
} from "literate-regex";
import type {
RegExpLiteralParts,
PCREStyleToJsRegExpSource,
RegExpExecArrayFixedPretty,
ReplacerFunctionSignature,
} from "literate-regex";
//
// sample of compilePCREStyleRegExpLiteral
//
const pcreStyledRegex = `/
(\\(\\?\\#[\\s\\S]*?(?<!\\\\)\\)(?=\\s*$|.)) # multi line comment
|
(?:^(?:\\s+|))?(?<![\\\\])(\\#(?:\\s|[\\s\\S])*?$) # single line comment
|
(?<regexFragment>
(?:^\\s+)?(?:[^\\s]+)
)+ # regex fragment
|
([\\r|\\r\\n|\\n]+|[\\x20\\t]+(?=$)?) # whitespaces
/gm`;
const jsRegex = compilePCREStyleRegExpLiteral(pcreStyledRegex);
type TPcreStyledRegex = typeof pcreStyledRegex;
type TJsRegexSource = PCREStyleToJsRegExpSource<TPcreStyledRegex>;
type TJsRegexLiteralParts = RegExpLiteralParts<TJsRegexSource>;
type TJsRegexExecArray = RegExpExecArrayFixedPretty<
TypedRegExp<TJsRegexLiteralParts["pattern"]>
>;
type TJsRegexStringReplacer = ReplacerFunctionSignature<
TypedRegExp<TJsRegexLiteralParts["pattern"]>
>;
let m = jsRegex.exec(pcreStyledRegex);
type Test0 = TJsRegexExecArray extends typeof m ? true : false;
type Test1 = typeof m extends TJsRegexExecArray ? true : false;
const replacer: TJsRegexStringReplacer = (...args) => "";
pcreStyledRegex.replace(jsRegex, replacer);
pcreStyledRegex.replace(jsRegex, "");

This package provides an optional global augmentation entry:

import "literate-regex/global";

This is intentionally opt-in to avoid unexpected type pollution across projects.
- This is not a full PCRE parser. It focuses on:
  - line comments (`# ...`)
  - escaping `\#`
  - whitespace stripping
- Very large type-level inputs may still hit TS limits depending on your environment. If that happens, split your regex source into smaller pieces.
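One way to split a source into smaller pieces is to keep each fragment as its own `as const` string and join them with a template literal. The sketch below uses plain TypeScript only. The fragment names are made up for illustration, and how the library's own types compose over such pieces is not shown here.

```typescript
// Each fragment is small, so its literal type stays cheap to compute.
const YEAR_MONTH_DAY = `(\\d{4})-(\\d{2})-(\\d{2})` as const;
const HOUR_MINUTE = `(\\d{2}):(\\d{2})` as const;

// Joining `as const` fragments still yields a string literal type.
const TIMESTAMP = `^${YEAR_MONTH_DAY}T${HOUR_MINUTE}$` as const;

const timestampRe = new RegExp(TIMESTAMP);
const match = timestampRe.exec("2024-01-31T09:30");
```

Each piece can then be normalized or type-checked separately before the final join.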
This library’s whitespace set is based on the ECMAScript definition used by RegExp \s
(WhiteSpace ∪ LineTerminator).
- ECMA-262: White Space (Table 33) https://tc39.es/ecma262/#sec-white-space
- ECMA-262: Line Terminators (Table 34) https://tc39.es/ecma262/#sec-line-terminators
- MDN: RegExp character classes (`\s` equivalence) https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_expressions/Character_classes
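The set described above can be checked directly against the engine. The following sketch lists the code points from the ECMA-262 WhiteSpace and LineTerminator productions (the Unicode space separators are written out explicitly) and verifies that each one matches `\s`. The constant names are illustrative and are not exports of this library.

```typescript
// ECMA-262 WhiteSpace: TAB, VT, FF, ZWNBSP, plus Unicode space separators (Zs).
const WHITE_SPACE: readonly number[] = [
  0x0009, 0x000b, 0x000c, 0xfeff,
  0x0020, 0x00a0, 0x1680,
  0x2000, 0x2001, 0x2002, 0x2003, 0x2004, 0x2005,
  0x2006, 0x2007, 0x2008, 0x2009, 0x200a,
  0x202f, 0x205f, 0x3000,
];

// ECMA-262 LineTerminator: LF, CR, LS, PS.
const LINE_TERMINATOR: readonly number[] = [0x000a, 0x000d, 0x2028, 0x2029];

const ALL_WHITESPACE: readonly number[] = [...WHITE_SPACE, ...LINE_TERMINATOR];

// Every listed code point should be matched by `\s`.
const allMatch = ALL_WHITESPACE.every((cp) =>
  /^\s$/.test(String.fromCharCode(cp)),
);
```

A character such as U+200B (zero width space) is not in either production, so `\s` does not match it.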
Released under the Apache-2.0 License.
See LICENSE for details.