@bitovi/n8n-nodes-semantic-text-splitter

This is an n8n community node. It lets you use semantic text splitting in your n8n workflows.

Semantic text splitting is an advanced text processing technique that splits text based on semantic similarity rather than simple character or token counts. This node implements the methodology described in 5 Levels Of Text Splitting, which uses embedding models to identify natural breakpoints in text where meaning shifts significantly.

n8n is a fair-code licensed workflow automation platform.

Installation
Operations
Compatibility
Usage
Resources
Version history

Installation

Follow the installation guide in the n8n community nodes documentation.

Quick Installation:

Make sure to allow community nodes with N8N_COMMUNITY_PACKAGES_ENABLED=true
Once logged in to your N8N web UI, go to /settings/community-nodes and type @bitovi/n8n-nodes-semantic-text-splitter

Operations

The Semantic Text Splitter node provides intelligent text splitting capabilities:

Semantic Text Splitting: Splits text into semantically coherent chunks using embedding-based similarity analysis
Configurable Parameters: Control splitting behavior with customizable thresholds and window sizes
Integration with Embedding Models: Works with any n8n-compatible embedding model for similarity calculations

Compatibility

Minimum n8n version: Compatible with n8n v1.0+
Node.js version: Requires Node.js 18.10 or higher
Dependencies: Requires @langchain/core and @langchain/textsplitters

This node is tested with the latest stable version of n8n and should work with most n8n installations that support community nodes.

Usage

Node Configuration

The Semantic Text Splitter node requires an embedding model connection and supports the following parameters:

Percentile Threshold for Breakpoints (default: 0.95): Controls how selective the algorithm is when identifying breakpoints. Higher values result in fewer, larger chunks.
Number of Sentences to Consider in Sliding Window (default: 3): The size of the sliding window used for embedding comparison.
Minimum Chunk Size (default: 100): Minimum character length for generated chunks.
Sentence Delimiters (default: ".!?"): Characters used to identify sentence boundaries.

How It Works

Sentence Splitting: Text is first split into sentences using the configured delimiters
Sliding Windows: Creates overlapping windows of sentences based on the window size
Embedding Generation: Each window is converted to embeddings using the connected embedding model
Distance Calculation: Calculates cosine distances between sequential window embeddings
Breakpoint Detection: Identifies breakpoints where semantic distance exceeds the threshold
Chunk Creation: Creates final text chunks based on identified breakpoints

Example Workflow

Connect a text input (e.g., from a document loader)
Connect an embedding model (e.g., OpenAI Embeddings)
Configure the splitting parameters based on your needs
Connect the output to downstream processing nodes (e.g., vector store, document processing)

This approach creates more coherent chunks compared to simple character or token-based splitting, making it ideal for RAG (Retrieval Augmented Generation) applications and document processing workflows.

Resources

Version history

v0.1.2 (Current): Latest stable release with full semantic splitting functionality
v0.1.x: Initial releases with core semantic text splitting features

Need help or have questions?

Need guidance on leveraging AI agents or N8N for your business? Our AI Agents workshop will equip you with the knowledge and tools necessary to implement successful and valuable agentic workflows.

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
.github/workflows		.github/workflows
.vscode		.vscode
nodes/SemanticTextSplitter		nodes/SemanticTextSplitter
.editorconfig		.editorconfig
.eslintrc.js		.eslintrc.js
.eslintrc.prepublish.js		.eslintrc.prepublish.js
.gitignore		.gitignore
.npmignore		.npmignore
.prettierrc.js		.prettierrc.js
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
LICENSE.md		LICENSE.md
README.md		README.md
README_TEMPLATE.md		README_TEMPLATE.md
gulpfile.js		gulpfile.js
index.js		index.js
package.json		package.json
pnpm-lock.yaml		pnpm-lock.yaml
tsconfig.json		tsconfig.json
tslint.json		tslint.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

@bitovi/n8n-nodes-semantic-text-splitter

Installation

Operations

Compatibility

Usage

Node Configuration

How It Works

Example Workflow

Resources

Version history

Need help or have questions?

License

About

Uh oh!

Releases 2

Packages

Uh oh!

Contributors 4

Uh oh!

Languages

License

bitovi/n8n-nodes-semantic-text-splitter

Folders and files

Latest commit

History

Repository files navigation

@bitovi/n8n-nodes-semantic-text-splitter

Installation

Operations

Compatibility

Usage

Node Configuration

How It Works

Example Workflow

Resources

Version history

Need help or have questions?

License

About

Resources

License

Code of conduct

Uh oh!

Stars

Watchers

Forks

Releases 2

Packages 0

Uh oh!

Contributors 4

Uh oh!

Languages

Packages