A powerful semantic search plugin for Fess, the open-source enterprise search server. This plugin extends Fess's search capabilities by integrating neural search using OpenSearch's machine learning features and vector similarity search.
- Neural Search Integration: Leverages OpenSearch ML Commons plugin for semantic vector search
- Automatic Query Rewriting: Converts traditional text queries to neural queries when appropriate
- Rank Fusion Processing: Combines traditional and semantic search results for improved relevance
- Content Chunking: Processes long documents in chunks for better semantic matching
- Configurable Models: Supports multiple pre-trained transformer models from HuggingFace
- Seamless Integration: Works as a drop-in plugin for existing Fess installations
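To make the content chunking feature concrete, here is a minimal sketch of splitting a long document into overlapping word windows before embedding. The function name `chunk_text`, the word-based splitting, and the `max_words` and `overlap` parameters are assumptions for illustration; the plugin's actual chunking strategy is not described here and may differ.

```python
from __future__ import annotations


def chunk_text(text: str, max_words: int = 100, overlap: int = 20) -> list[str]:
    """Split text into word windows of at most max_words, sharing `overlap` words.

    Hypothetical helper: illustrates the general idea of chunking only.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must satisfy 0 <= overlap < max_words")

    words = text.split()
    step = max_words - overlap  # how far each window advances
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        # Stop once a window has reached the end of the document.
        if start + max_words >= len(words):
            break
    return chunks
```

Overlapping windows are one common way to avoid cutting a sentence's context in half at a chunk boundary; a sentence-aware splitter would be a reasonable alternative.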
- Fess 15.0+ (Full-text Enterprise Search Server)
- OpenSearch 2.x with ML Commons plugin enabled
- Docker and Docker Compose (recommended for setup)
git clone https://github.com/codelibs/docker-fess.git
cd docker-fess/compose
Add the following line to your compose.yaml:
environment:
- "FESS_PLUGINS=fess-webapp-semantic-search:15.1.0"
docker compose -f compose.yaml -f compose-opensearch2.yaml up -d
Download and run the setup script:
curl -o setup.sh https://raw.githubusercontent.com/codelibs/fess-webapp-semantic-search/main/tools/setup.sh
chmod +x setup.sh
./setup.sh localhost:9200
The setup script will:
- Display available pre-trained models
- Register your selected model in OpenSearch
- Create the neural search pipeline
- Provide the configuration settings
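As a rough illustration of the "create the neural search pipeline" step, the sketch below builds the JSON body of an OpenSearch ingest pipeline that uses the `text_embedding` processor to write embeddings into a vector field. The helper name `build_pipeline_body` and the source field name `content` are assumptions; the requests that setup.sh actually sends are not shown in this document and may include other processors.

```python
import json


def build_pipeline_body(model_id: str,
                        source_field: str = "content",
                        vector_field: str = "content_vector") -> dict:
    """Return an ingest pipeline body that embeds source_field into vector_field.

    Hypothetical helper; field names are illustrative defaults.
    """
    return {
        "description": "Neural search pipeline (illustrative sketch)",
        "processors": [
            {
                "text_embedding": {
                    "model_id": model_id,
                    "field_map": {source_field: vector_field},
                }
            }
        ],
    }


if __name__ == "__main__":
    # A body like this would be sent with PUT /_ingest/pipeline/neural_pipeline.
    print(json.dumps(build_pipeline_body("<your-model-id>"), indent=2))
```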
In the Fess Admin Panel (Admin > General > System Properties), add the configuration provided by the setup script:
fess.semantic_search.pipeline=neural_pipeline
fess.semantic_search.content.field=content_vector
fess.semantic_search.content.dimension=384
fess.semantic_search.content.method=hnsw
fess.semantic_search.content.engine=lucene
fess.semantic_search.content.model_id=<your-model-id>
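Before pasting these properties into Fess, it can help to confirm that no placeholder such as `<your-model-id>` was left unreplaced. The following is a small sketch of such a check; `parse_properties` and `unresolved_placeholders` are hypothetical helpers, not part of Fess or the plugin.

```python
def parse_properties(text: str) -> dict:
    """Parse simple key=value lines, skipping blanks and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key=value line
        props[key.strip()] = value.strip()
    return props


def unresolved_placeholders(props: dict) -> list:
    """Return the keys whose values still look like <placeholder> text."""
    return sorted(k for k, v in props.items()
                  if v.startswith("<") and v.endswith(">"))


if __name__ == "__main__":
    sample = """
    fess.semantic_search.pipeline=neural_pipeline
    fess.semantic_search.content.dimension=384
    fess.semantic_search.content.model_id=<your-model-id>
    """
    print(unresolved_placeholders(parse_properties(sample)))
```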
- Go to Admin > Maintenance and start reindexing
- Create your crawling configuration
- Start the crawler
- Begin semantic searching!
The plugin supports various pre-trained transformer models:
Model | Dimension | Description |
---|---|---|
all-MiniLM-L6-v2 | 384 | Fast and efficient, good for general use |
all-mpnet-base-v2 | 768 | Higher quality, slower performance |
all-distilroberta-v1 | 768 | RoBERTa-based, good performance |
msmarco-distilbert-base-tas-b | 768 | Optimized for passage retrieval |
multi-qa-MiniLM-L6-cos-v1 | 384 | Specialized for question answering |
paraphrase-multilingual-MiniLM-L12-v2 | 384 | Multilingual support |
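The configured `fess.semantic_search.content.dimension` has to agree with the output size of the chosen model. A minimal sketch of that consistency check, using the dimensions from the table above, might look like this. The helper names are hypothetical, and the lookup accepts either the short model name or a path-prefixed name by taking the final path segment.

```python
# Dimensions copied from the model table above.
MODEL_DIMENSIONS = {
    "all-MiniLM-L6-v2": 384,
    "all-mpnet-base-v2": 768,
    "all-distilroberta-v1": 768,
    "msmarco-distilbert-base-tas-b": 768,
    "multi-qa-MiniLM-L6-cos-v1": 384,
    "paraphrase-multilingual-MiniLM-L12-v2": 384,
}


def expected_dimension(model_name: str) -> int:
    """Look up the embedding dimension for a model listed in the table."""
    short_name = model_name.rsplit("/", 1)[-1]  # drop any path prefix
    if short_name not in MODEL_DIMENSIONS:
        raise ValueError(f"unknown model: {model_name}")
    return MODEL_DIMENSIONS[short_name]


def dimension_matches(model_name: str, configured_dimension) -> bool:
    """True when the configured dimension equals the model's output size."""
    return expected_dimension(model_name) == int(configured_dimension)
```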
Property | Description | Default |
---|---|---|
fess.semantic_search.pipeline | Neural search pipeline name | - |
fess.semantic_search.content.model_id | ML model ID in OpenSearch | - |
fess.semantic_search.content.field | Vector field name | - |
fess.semantic_search.content.dimension | Vector dimension size | - |
Property | Description | Default |
---|---|---|
fess.semantic_search.content.method | Vector search method | hnsw |
fess.semantic_search.content.engine | Vector search engine | lucene |
fess.semantic_search.content.space_type | Distance calculation method | cosinesimil |
fess.semantic_search.min_score | Minimum similarity score | - |
fess.semantic_search.min_content_length | Minimum content length for processing | - |
fess.semantic_search.content.chunk_size | Number of chunks to return | 1 |
Property | Description | Default |
---|---|---|
fess.semantic_search.content.param.m | HNSW M parameter | 16 |
fess.semantic_search.content.param.ef_construction | HNSW ef_construction parameter | 100 |
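To show how these settings relate to the index, the sketch below turns a properties dictionary into an OpenSearch `knn_vector` field mapping, falling back to the defaults listed in the tables. This is an illustration under assumptions: `build_vector_field_mapping` is a hypothetical helper, and the mapping the plugin actually generates is not shown in this document.

```python
PREFIX = "fess.semantic_search.content."

# Defaults taken from the property tables above.
DEFAULTS = {
    "method": "hnsw",
    "engine": "lucene",
    "space_type": "cosinesimil",
    "param.m": "16",
    "param.ef_construction": "100",
}


def build_vector_field_mapping(props: dict) -> dict:
    """Build a knn_vector mapping for the configured vector field."""
    def setting(name: str):
        return props.get(PREFIX + name, DEFAULTS.get(name))

    field = setting("field")
    dimension = setting("dimension")
    if field is None or dimension is None:
        raise ValueError("field and dimension must be configured")

    return {
        field: {
            "type": "knn_vector",
            "dimension": int(dimension),
            "method": {
                "name": setting("method"),
                "engine": setting("engine"),
                "space_type": setting("space_type"),
                "parameters": {
                    "m": int(setting("param.m")),
                    "ef_construction": int(setting("param.ef_construction")),
                },
            },
        }
    }
```

Larger `m` and `ef_construction` values generally improve recall at the cost of index size and build time, so the defaults are a reasonable starting point for most installations.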
- SemanticSearchHelper: Central component managing neural search configuration and model interactions
- NeuralQueryBuilder: Custom OpenSearch query builder for neural/vector search queries
- SemanticPhraseQueryCommand: Converts phrase queries to neural queries when appropriate
- SemanticTermQueryCommand: Handles term-based semantic search queries
- SemanticSearcher: Extends Fess's DefaultSearcher for rank fusion processing
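For orientation, the kind of request these components ultimately produce can be sketched as an OpenSearch `neural` query body. The Python helper below is illustrative only: `build_neural_query` is a hypothetical name, the plugin itself builds its query in Java through NeuralQueryBuilder, and applying `min_score` at the top level of the search body is an assumption about how a minimum similarity score could be enforced.

```python
from typing import Optional


def build_neural_query(query_text: str,
                       model_id: str,
                       field: str = "content_vector",
                       k: int = 10,
                       min_score: Optional[float] = None) -> dict:
    """Return a search body containing an OpenSearch neural query."""
    body = {
        "query": {
            "neural": {
                field: {
                    "query_text": query_text,  # text to embed at query time
                    "model_id": model_id,      # model registered in ML Commons
                    "k": k,                    # number of nearest neighbours
                }
            }
        }
    }
    if min_score is not None:
        # Assumption: filter out low-similarity hits via the search-level min_score.
        body["min_score"] = min_score
    return body
```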
- Query Processing: Integrates with Fess's QueryParser to rewrite queries for semantic search
- Document Processing: Adds rewrite rules for OpenSearch mapping and settings to support vector fields
- Rank Fusion: Registers as a searcher in Fess's rank fusion processor
- DI Container: Uses LastaDi for dependency injection
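Rank fusion merges the result lists of several searchers into one ranking. The sketch below uses reciprocal rank fusion (RRF), a common technique for this, purely as an illustration; the exact formula and constants used by Fess's rank fusion processor are not stated in this document, and the function name and the constant `k=60` are assumptions.

```python
def reciprocal_rank_fusion(rankings, k: int = 60) -> list:
    """Fuse ranked lists of document IDs; each hit contributes 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    keyword_hits = ["a", "b", "c"]   # e.g. from the traditional searcher
    semantic_hits = ["b", "d", "a"]  # e.g. from the semantic searcher
    print(reciprocal_rank_fusion([keyword_hits, semantic_hits]))
```

Because RRF only looks at rank positions, it does not require keyword scores and vector similarity scores to be on a comparable scale, which is one reason it is often chosen for hybrid search.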
git clone https://github.com/codelibs/fess-webapp-semantic-search.git
cd fess-webapp-semantic-search
mvn clean package
mvn test
mvn clean compile javadoc:javadoc
The plugin is available from Maven Central:
<dependency>
<groupId>org.codelibs.fess</groupId>
<artifactId>fess-webapp-semantic-search</artifactId>
<version>15.1.0</version>
</dependency>
- Download the JAR from Maven Repository
- Place it in your Fess webapp/WEB-INF/lib/ directory
- Restart Fess
See the Fess Plugin Guide for detailed installation instructions.
We welcome contributions!
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Make your changes
- Add tests for new functionality
- Run the test suite (mvn test)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project uses:
- Maven for build management
- JUnit for testing
- CheckStyle for code formatting
- JavaDoc for documentation
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
Plugin Version | Fess Version | OpenSearch Version |
---|---|---|
15.0.x | 15.0+ | 2.x |
14.9.x | 14.9+ | 2.x |
- Documentation: Fess Documentation
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Community: Fess Community
- CodeLibs for developing and maintaining Fess
- HuggingFace for providing pre-trained transformer models
- OpenSearch team for ML Commons plugin
- All contributors who have helped improve this plugin