Commit 837710a

docs: photo search example (#924)
1 parent 71704b0 commit 837710a

4 files changed

+36
-15
lines changed

docs/docs/examples/examples/photo_search.md

Lines changed: 36 additions & 15 deletions
@@ -10,34 +10,47 @@ sidebar_custom_props:
1010
tags: [vector-index, multi-modal]
1111
---
1212

13-
import { GitHubButton, YouTubeButton } from '../../../src/components/GitHubButton';
13+
import { GitHubButton, YouTubeButton, DocumentationButton } from '../../../src/components/GitHubButton';
1414

1515
<GitHubButton url="https://github.com/cocoindex-io/cocoindex/tree/main/examples/face_recognition"/>
1616

17-
18-
## What We Will Achieve
17+
## Overview
18+
We’ll walk through a comprehensive example of building a scalable face recognition pipeline. We’ll:
1919
- Detect all faces in the image and extract their bounding boxes
2020
- Crop and encode each face image into a 128-dimensional face embedding
2121
- Store metadata and vectors in a structured index to support queries like:
2222
“Find all similar faces to this one” or “Search images that include this person”
2323

24+
With this, you can build your own photo search app with face detection and search.
2425

25-
## Indexing Flow
26+
## Flow Overview
27+
![Flow Overview](/img/examples/photo_search/flow.png)
2628

27-
1. We ingest a list of images.
28-
2. For each image, we:
29+
1. Ingest the images.
30+
2. For each image:
2931
- Extract faces from the image.
3032
- Compute embeddings for each face.
31-
3. We export the following fields to a table in Postgres with PGVector:
33+
3. Export the following fields to a table in Postgres with PGVector:
3234
- Filename, rect, embedding for each face.
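
The exported fields can be pictured as one row per detected face. The sketch below is only an illustration of that shape: the class names `FaceRect` and `FaceRow`, the rectangle field names, and the sample file path are assumptions made for this example, not the types used in the project. The 128-dimension check follows the embedding size stated in the overview.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FaceRect:
    # Hypothetical bounding-box layout in pixel coordinates; the real
    # field names in the example project may differ.
    left: int
    top: int
    right: int
    bottom: int


@dataclass(frozen=True)
class FaceRow:
    filename: str           # source image the face was found in
    rect: FaceRect          # where the face sits inside that image
    embedding: list[float]  # 128-dimensional face embedding

    def __post_init__(self) -> None:
        # Reject rows whose vector does not match the expected size.
        if len(self.embedding) != 128:
            raise ValueError(
                f"expected 128 dimensions, got {len(self.embedding)}"
            )


# Made-up sample row with a placeholder all-zero embedding.
row = FaceRow("images/party.jpg", FaceRect(40, 32, 168, 160), [0.0] * 128)
```

An image with several faces produces several such rows that share the same filename and differ in `rect` and `embedding`.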
3335

36+
## Setup
37+
- [Install Postgres](https://cocoindex.io/docs/getting_started/installation#-install-postgres) if you don't have one.
38+
39+
- Install and start Qdrant with Docker:
40+
```sh
41+
docker run -d -p 6334:6334 -p 6333:6333 qdrant/qdrant
42+
```
43+
44+
- Install dependencies:
45+
```sh
46+
pip install -e .
47+
```
3448

35-
## Image Ingestion
49+
## Add source
3650

3751
We monitor an `images/` directory using the built-in `LocalFile` source. All newly added files are automatically processed and indexed.
52+
3853
```python
39-
python
40-
CopyEdit
4154
@cocoindex.flow_def(name="FaceRecognition")
4255
def face_recognition_flow(flow_builder, data_scope):
4356
data_scope["images"] = flow_builder.add_source(
@@ -56,7 +69,6 @@ or [Azure Blob store](https://cocoindex.io/docs/ops/sources#azureblob).
5669

5770
We use the `face_recognition` library under the hood, powered by dlib’s CNN-based face detector. Since the model is slow on large images, we downscale wide images before detection.
5871

59-
6072
```python
6173
@cocoindex.op.function(
6274
cache=True,
@@ -109,9 +121,9 @@ with data_scope["images"].row() as image:
109121
```
110122
111123
After this step, each image has a list of detected faces and bounding boxes.
112-
113124
Each detected face is cropped from the original image and stored as a PNG.
114125
126+
![Extracted Faces](/img/examples/photo_search/extraction.png)
115127
116128
## Compute Face Embeddings
117129
@@ -154,7 +166,6 @@ face_embeddings.collect(
154166
rect=face["rect"],
155167
embedding=face["embedding"],
156168
)
157-
158169
```
159170
160171
We export to a Qdrant collection:
@@ -171,12 +182,12 @@ face_embeddings.export(
171182
172183
Now you can run cosine similarity queries over facial vectors.
173184
174-
CocoIndex supports 1-line switch with other vector databases like [Postgres](https://cocoindex.io/docs/ops/targets#postgres).
185+
CocoIndex supports switching to other vector databases with a one-line change.
186+
<DocumentationButton url="https://cocoindex.io/docs/ops/targets#postgres" text="Postgres" />
175187
176188
## Query the Index
177189
178190
You can now build facial search apps or dashboards. For example:
179-
180191
- Given a new face embedding, find the most similar faces
181192
- Find all face images that appear in a set of photos
182193
- Cluster embeddings to group visually similar people
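
As a rough illustration of the first item, the sketch below ranks stored faces by cosine similarity to a query embedding. It runs entirely in memory with plain Python and does not use the Qdrant or CocoIndex query APIs; the helper names `cosine_similarity` and `most_similar_faces`, the file names, and the toy 3-dimensional vectors are all assumptions made for this example.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as matching nothing
    return dot / (norm_a * norm_b)


def most_similar_faces(query, rows, k=3):
    """Return the k (score, filename) pairs closest to `query`, best first."""
    scored = ((cosine_similarity(query, emb), name) for name, emb in rows)
    return heapq.nlargest(k, scored)


# Toy 3-dimensional vectors stand in for real 128-dimensional embeddings.
rows = [
    ("alice_1.jpg", [0.9, 0.1, 0.0]),
    ("bob_1.jpg", [0.0, 1.0, 0.0]),
    ("alice_2.jpg", [1.0, 0.0, 0.1]),
]
top = most_similar_faces([1.0, 0.0, 0.0], rows, k=2)
print([name for _, name in top])
```

In a real deployment the vector database performs this ranking itself; a brute-force scan like this is mainly useful for sanity-checking a handful of embeddings.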
@@ -187,3 +198,13 @@ For querying embeddings, check out [Image Search project](https://cocoindex.io/b
187198
If you’d like to see a full example of the query path with image matching, let us know in
188199
[our group](https://discord.com/invite/zpA9S2DR7s).
189200
201+
## CocoInsight
202+
CocoInsight is a tool to help you understand your data pipeline and data index. It can now visualize identified sections of an image based on their bounding boxes, which makes it easier to understand and evaluate AI extractions by showing computed features in the context of the unstructured visual data.
203+
204+
You can walk through the project step by step in [CocoInsight](https://www.youtube.com/watch?v=MMrpUfUcZPk) to see exactly how each field is constructed and what happens behind the scenes.
205+
206+
```sh
207+
cocoindex server -ci main.py
208+
```
209+
210+
Then open `https://cocoindex.io/cocoinsight`. It connects to your local CocoIndex server, with zero pipeline data retention.