Fess Multimodal Search Plugin

A multimodal search plugin for Fess that enables semantic search across text and images using CLIP (Contrastive Language-Image Pre-training) embeddings and vector similarity search.

🌟 Features

Multimodal Search: Search across text and images using natural language queries
CLIP Integration: Leverages OpenAI's CLIP model for generating high-quality embeddings
Vector Similarity: Uses OpenSearch/Elasticsearch KNN capabilities for fast vector search
Seamless Integration: Easy installation as a Fess plugin
Scalable Architecture: Built for enterprise-scale search deployments
Open Source: Apache 2.0 licensed with full source code availability

🏗️ Architecture

The plugin extends Fess with the following components:

CasClient: Communicates with CLIP-as-a-Service for embedding generation
MultiModalSearchHelper: Configures vector field mappings and query rewriting
KNNQueryBuilder: Builds k-nearest neighbor queries for vector similarity search
CasExtractor: Extracts and processes image content during crawling
EmbeddingIngester: Handles vector embedding storage and indexing
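
To make the role of the query-side components more concrete, here is a minimal Python sketch of the kind of request body an OpenSearch k-NN search accepts. It is not the plugin's own code: the function name `build_knn_query` and its parameters are hypothetical, and the field name and score threshold simply reuse the defaults listed in this README.

```python
import json


def build_knn_query(vector, field="content_vector", k=10, min_score=0.5):
    """Return an OpenSearch-style k-NN search body for one query embedding.

    Hypothetical helper for illustration; the plugin's KNNQueryBuilder may
    build its query differently.
    """
    if not vector:
        raise ValueError("query vector must not be empty")
    return {
        "size": k,               # return at most k hits
        "min_score": min_score,  # drop hits scoring below the threshold
        "query": {"knn": {field: {"vector": list(vector), "k": k}}},
    }


if __name__ == "__main__":
    # A real CLIP embedding would have 512 values; three are used for brevity.
    print(json.dumps(build_knn_query([0.1, 0.2, 0.3], k=3), indent=2))
```

In this sketch the query embedding would come from the CLIP service, and the search engine would compare it against the vectors stored at indexing time.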

📋 Requirements

Fess: Version 15.0 or higher
Java: OpenJDK 11 or higher (Fess itself may require a newer JDK; check the requirements for your Fess version)
OpenSearch/Elasticsearch: With KNN plugin support
Docker: For running the CLIP service
GPU (optional): For faster embedding generation

🚀 Quick Start

1. Installation

Download the plugin JAR from Maven Central and install it via the Fess administration console.

Alternatively, add the dependency to your project:

<dependency>
    <groupId>org.codelibs.fess</groupId>
    <artifactId>fess-webapp-multimodal</artifactId>
    <version>15.1.0</version>
</dependency>

2. Start CLIP Service

Clone the repository and start the CLIP API server:

git clone https://github.com/codelibs/fess-webapp-multimodal.git
cd fess-webapp-multimodal/docker
docker compose up -d

The CLIP API will be available at http://localhost:51000.
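
Once the service is up, a quick way to see what it returns is to request an embedding directly. The sketch below uses only the Python standard library. The `/post` path, the `execEndpoint` field, and the `embedding` key in the response follow the HTTP interface described in the CLIP-as-service documentation as I understand it; treat them as assumptions and verify them against the server version you run. The function names are hypothetical.

```python
import json
import urllib.request

# Assumed endpoint: HTTP gateway of the CLIP server started above.
CLIP_URL = "http://localhost:51000/post"


def build_encode_payload(texts=(), image_uris=()):
    """Build the JSON payload for a mixed batch of texts and image URIs."""
    docs = [{"text": t} for t in texts] + [{"uri": u} for u in image_uris]
    if not docs:
        raise ValueError("provide at least one text or image URI")
    return {"data": docs, "execEndpoint": "/"}


def encode(texts=(), image_uris=(), url=CLIP_URL, timeout=30):
    """Send the batch to the CLIP server and return one embedding per input."""
    body = json.dumps(build_encode_payload(texts, image_uris)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        result = json.load(response)
    # Assumed response shape: {"data": [{"embedding": [...]}, ...]}
    return [doc.get("embedding") for doc in result.get("data", [])]


if __name__ == "__main__":
    vectors = encode(texts=["sunset over mountains"])
    print(len(vectors[0]) if vectors and vectors[0] else "no embedding returned")
```

If the call succeeds, the length of the returned vector should match the dimension configured for the vector field in Fess.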

3. Configure Fess

Add the following system properties in the Fess administration console:

fess.multimodal.content.field=content_vector
fess.multimodal.content.dimension=512
fess.multimodal.content.method=hnsw
fess.multimodal.content.engine=lucene
fess.multimodal.content.space_type=cosinesimil
fess.multimodal.min_score=0.5
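
These properties map naturally onto an OpenSearch `knn_vector` field definition. The Python sketch below shows that correspondence; it is an illustration under the assumption that the plugin creates a mapping of this general shape, and the helper names `parse_properties` and `knn_field_mapping` are hypothetical.

```python
EXAMPLE = """
fess.multimodal.content.field=content_vector
fess.multimodal.content.dimension=512
fess.multimodal.content.method=hnsw
fess.multimodal.content.engine=lucene
fess.multimodal.content.space_type=cosinesimil
fess.multimodal.min_score=0.5
"""


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def knn_field_mapping(props, prefix="fess.multimodal.content."):
    """Translate the properties into an OpenSearch knn_vector field mapping."""
    return {
        props[prefix + "field"]: {
            "type": "knn_vector",
            "dimension": int(props[prefix + "dimension"]),
            "method": {
                "name": props[prefix + "method"],
                "engine": props[prefix + "engine"],
                "space_type": props[prefix + "space_type"],
            },
        }
    }


if __name__ == "__main__":
    print(knn_field_mapping(parse_properties(EXAMPLE)))
```

The dimension has to match the output size of the CLIP model in use, which is why it is exposed as a setting rather than hard-coded.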

4. Apply Configuration

1. Navigate to Scheduler → Execute Config Reloader
2. Navigate to Maintenance → Execute Re-indexing

5. Start Crawling

Configure and start crawling directories containing images and documents. The plugin will automatically:

Extract text and image content
Generate CLIP embeddings
Store vectors in the search index

🔍 Usage Examples

Text-to-Image Search

Search for images using natural language descriptions:

"red sports car on highway"
"sunset over mountains"
"person playing guitar"

Cross-Modal Search

Find related content across different media types:

"beach vacation" → Returns both text documents and beach images
"cooking recipe" → Returns recipe text and food images

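The examples above work because text and images are embedded into the same vector space, so results can be ranked by vector similarity regardless of media type. The toy Python sketch below illustrates that ranking with hand-made 3-dimensional vectors; real CLIP embeddings are much longer, the document names are invented, and the search engine may rescale cosine similarity before applying the minimum score, so the threshold here is only illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(query_vec, docs, min_score=0.5):
    """Return (name, score) pairs at or above min_score, best match first."""
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in docs.items()]
    kept = [(name, score) for name, score in scored if score >= min_score]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Invented vectors standing in for embeddings of mixed-media documents.
    docs = {
        "beach_photo.jpg": [0.9, 0.1, 0.0],
        "beach_guide.txt": [0.8, 0.3, 0.1],
        "tax_form.pdf": [0.0, 0.1, 0.9],
    }
    query = [1.0, 0.2, 0.0]  # stand-in for the embedding of "beach vacation"
    for name, score in rank(query, docs):
        print(f"{name}: {score:.3f}")
```

Both beach documents score close to 1 and are returned together, while the unrelated document falls below the threshold and is dropped.
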
⚙️ Configuration

System Properties

Property	Description	Default	Example
`fess.multimodal.content.field`	Name of the vector field in the index	`content_vector`	`image_vector`
`fess.multimodal.content.dimension`	Vector dimension (must match the CLIP model output)	`512`	`768`
`fess.multimodal.content.method`	k-NN method (algorithm)	`hnsw`	`ivf`
`fess.multimodal.content.engine`	k-NN engine (library)	`lucene`	`nmslib`
`fess.multimodal.content.space_type`	Distance metric	`cosinesimil`	`l2`
`fess.multimodal.min_score`	Minimum similarity score for a hit to be returned	`0.5`	`0.7`
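
Not every method and engine combination is valid, so it can help to sanity-check a configuration before re-indexing. The following Python sketch encodes the method support per engine as I understand it from the OpenSearch k-NN documentation; that table is an assumption to verify against your OpenSearch version, and `validate_vector_settings` is a hypothetical helper, not part of the plugin.

```python
# Assumed method support per k-NN engine; check your OpenSearch version.
SUPPORTED_METHODS = {
    "lucene": {"hnsw"},
    "nmslib": {"hnsw"},
    "faiss": {"hnsw", "ivf"},
}


def validate_vector_settings(dimension, method, engine):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if dimension <= 0:
        problems.append(f"dimension must be positive, got {dimension}")
    methods = SUPPORTED_METHODS.get(engine)
    if methods is None:
        problems.append(f"unknown engine: {engine}")
    elif method not in methods:
        problems.append(f"engine '{engine}' does not support method '{method}'")
    return problems


if __name__ == "__main__":
    print(validate_vector_settings(512, "hnsw", "lucene"))
    print(validate_vector_settings(768, "ivf", "nmslib"))
```

Under this table the values in the Example column are independent illustrations: `ivf` combined with `nmslib`, for instance, would be reported as an unsupported pair.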

CLIP Service Configuration

The CLIP service can be customized by modifying docker/clip_config.yaml:

jtype: Flow
version: '1'
with:
  port: 51000
  protocol: http
  cors: true
executors:
  - name: clip_t
    uses:
      jtype: CLIPEncoder
      metas:
        py_modules:
          - clip_server.executors.clip_torch

🧪 Testing

Run the test suite:

mvn clean test

For integration testing with sample data:

# Install test data using FiftyOne
pip install fiftyone
fiftyone zoo datasets load open-images-v7 --splits validation --kwargs max_samples=1000 -d ./test-images

# Configure Fess to crawl the test-images directory

📊 Performance

Embedding Generation: ~50ms per image (with GPU), ~200ms (CPU only)
Search Latency: <100ms for vector similarity queries
Throughput: 1000+ documents/minute during indexing
Index Size: ~2KB additional storage per document for vectors
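
The storage and throughput figures can be cross-checked with simple arithmetic. The Python sketch below assumes each vector component is stored as a 4-byte float and that embeddings are generated one at a time with no batching or parallelism; both are assumptions, and the calculation ignores any overhead from the k-NN graph itself.

```python
def vector_bytes(dimension, bytes_per_value=4):
    """Raw size of one stored vector, assuming 4-byte floats."""
    return dimension * bytes_per_value


def docs_per_minute(ms_per_embedding):
    """Sequential throughput if embedding generation is the only cost."""
    return 60_000 / ms_per_embedding


if __name__ == "__main__":
    print(vector_bytes(512))     # 2048 bytes, matching the ~2KB figure
    print(docs_per_minute(50))   # 1200.0 documents/minute at ~50ms each
    print(docs_per_minute(200))  # 300.0 documents/minute at ~200ms each
```

Under these assumptions the 1000+ documents/minute figure is consistent with the GPU latency, while CPU-only embedding would need batching or parallel workers to reach it.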

🛠️ Development

Building from Source

git clone https://github.com/codelibs/fess-webapp-multimodal.git
cd fess-webapp-multimodal
mvn clean package

Project Structure

src/main/java/org/codelibs/fess/multimodal/
├── client/          # CLIP service client
├── crawler/         # Content extraction
├── helper/          # Search configuration
├── index/           # Query builders
├── query/           # Query processing
├── rank/            # Result ranking
└── util/            # Utilities

Contributing

1. Fork the repository
2. Create a feature branch (git checkout -b feature/amazing-feature)
3. Commit your changes (git commit -m 'Add amazing feature')
4. Push to the branch (git push origin feature/amazing-feature)
5. Open a Pull Request

📚 Documentation

🐛 Troubleshooting

Common Issues

CLIP Service Connection Failed

# Check if CLIP service is running
curl http://localhost:51000/health

# Check Docker logs
docker logs clip_server
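
If `curl` is not available, or the health path differs in your server version, a plain TCP check at least tells you whether anything is listening on the port. This Python sketch uses only the standard library; `is_port_open` is a hypothetical helper, and a successful connection does not prove that the CLIP model has finished loading.

```python
import socket


def is_port_open(host="localhost", port=51000, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    state = "reachable" if is_port_open() else "not reachable"
    print(f"CLIP service port 51000 is {state}")
```

If the port is not reachable, check that the container is running and that port 51000 is published to the host.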

Vector Search Not Working

Ensure KNN plugin is installed in OpenSearch/Elasticsearch
Verify vector field mapping in index settings
Check minimum score threshold configuration
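
The mapping check can be automated once you have the JSON returned by the index mapping API (`GET <index>/_mapping`). The Python sketch below inspects such a response; the index name `fess.search` used in the demo and the helper `check_vector_mapping` are assumptions for illustration, so substitute the index name of your own installation.

```python
def check_vector_mapping(mapping_response, index, field="content_vector", dimension=512):
    """Return a list of problems found in a GET <index>/_mapping response."""
    properties = (
        mapping_response.get(index, {}).get("mappings", {}).get("properties", {})
    )
    spec = properties.get(field)
    if spec is None:
        return [f"field '{field}' is missing from index '{index}'"]
    problems = []
    if spec.get("type") != "knn_vector":
        problems.append(f"field '{field}' has type {spec.get('type')!r}, expected 'knn_vector'")
    if spec.get("dimension") != dimension:
        problems.append(f"field '{field}' has dimension {spec.get('dimension')!r}, expected {dimension}")
    return problems


if __name__ == "__main__":
    # Hand-written stand-in for a real mapping response.
    response = {
        "fess.search": {
            "mappings": {
                "properties": {
                    "content_vector": {"type": "knn_vector", "dimension": 512}
                }
            }
        }
    }
    print(check_vector_mapping(response, "fess.search") or "mapping looks fine")
```

A missing field usually means the configuration was not reloaded or re-indexing has not been run since the properties were set.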

Performance Issues

Enable GPU support for CLIP service
Increase JVM heap size for Fess
Optimize KNN index parameters

📄 License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

🙏 Acknowledgments

OpenAI CLIP for the foundational multimodal model
Jina AI for the CLIP server implementation
CodeLibs for the Fess search platform
All contributors who have helped improve this project

📞 Support

Issues: GitHub Issues
Documentation: Fess Official Docs
