feat: support local ai embedding, local ai search, local document content search #7839

Merged
appflowy merged 35 commits into main from local_embedding
May 2, 2025

Conversation

@appflowy appflowy commented Apr 27, 2025

• Remove the obsolete Supabase test that’s no longer used.
• Add support for generating and using local embeddings.
• Introduce local AI-powered search functionality.
• Enable full-text search over document content (previously we only indexed files that existed on disk).
• Refactor folder-view search so that each workspace maintains its own Tantivy index directory.
• Reindex a document automatically whenever its content hash changes.
• Add end-to-end integration tests to verify search functionality.
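
The hash-based reindexing mentioned in the list above could look roughly like the following. This is a minimal sketch, not the code from this PR: `ReindexTracker`, `content_hash`, and `should_reindex` are hypothetical names, and the standard library's `DefaultHasher` stands in for whatever hash function the PR actually uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical helper (not a type from the PR): remembers the hash of the
/// content that was last indexed for each document.
#[derive(Default)]
pub struct ReindexTracker {
    // document id -> hash of the content that was last indexed
    last_indexed: HashMap<String, u64>,
}

/// Hashes document content. `DefaultHasher` is an assumption made to keep
/// the sketch dependency-free.
fn content_hash(content: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    content.hash(&mut hasher);
    hasher.finish()
}

impl ReindexTracker {
    /// Returns true when the document is new or its content hash changed,
    /// and records the new hash so the next identical call returns false.
    pub fn should_reindex(&mut self, doc_id: &str, content: &str) -> bool {
        let new_hash = content_hash(content);
        match self.last_indexed.get(doc_id) {
            Some(old) if *old == new_hash => false,
            _ => {
                self.last_indexed.insert(doc_id.to_string(), new_hash);
                true
            }
        }
    }
}
```

Comparing hashes instead of full content keeps the per-document state small, at the cost of a negligible chance of a collision hiding a real change.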

sourcery-ai bot commented Apr 27, 2025

Reviewer's Guide by Sourcery

This pull request introduces local embedding generation and storage using Ollama and a SQLite-based vector database. The implementation includes setting up a new vector database, integrating Ollama for embedding generation, creating an asynchronous scheduler to process content for embedding, and updating the core application flow to trigger indexing for supported collaboration objects like documents.
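
To illustrate the upsert-then-search flow that a vector database provides, here is a minimal in-memory sketch. It is an assumption-laden stand-in, not the SQLite-based store from this PR: `InMemoryVectorStore`, `upsert`, `search`, and `cosine_similarity` are hypothetical names, the embeddings are hand-written vectors rather than Ollama output, and ranking by cosine similarity is an illustrative choice.

```rust
/// Hypothetical in-memory stand-in for a vector store (not the PR's type).
#[derive(Default)]
pub struct InMemoryVectorStore {
    // (object id, embedding) pairs
    rows: Vec<(String, Vec<f32>)>,
}

/// Cosine similarity of two vectors; returns 0.0 for mismatched lengths
/// or zero-length vectors.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

impl InMemoryVectorStore {
    /// Inserts the embedding, or replaces it if the id already exists.
    pub fn upsert(&mut self, id: &str, embedding: Vec<f32>) {
        if let Some(row) = self.rows.iter_mut().find(|(rid, _)| rid.as_str() == id) {
            row.1 = embedding;
        } else {
            self.rows.push((id.to_string(), embedding));
        }
    }

    /// Returns up to `top_k` ids with their scores, most similar first.
    pub fn search(&self, query: &[f32], top_k: usize) -> Vec<(String, f32)> {
        let mut scored: Vec<(String, f32)> = self
            .rows
            .iter()
            .map(|(id, emb)| (id.clone(), cosine_similarity(query, emb)))
            .collect();
        scored.sort_by(|a, b| b.1.total_cmp(&a.1));
        scored.truncate(top_k);
        scored
    }
}
```

A linear scan like this is only reasonable for small collections; a real vector database would persist the rows and index them.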

No diagrams generated as the changes look simple and do not need a visual representation.

File-Level Changes

Change Details Files
Introduce the flowy-sqlite-vec crate and integrate vector database functionality.
  • Add a new crate flowy-sqlite-vec for vector database interactions.
  • Define new schema tables collab_table and collab_embeddings_table in flowy-sqlite.
  • Add SQL migration files for the new database tables.
  • Implement VectorSqliteDB for handling SQLite-vec operations like upserting and searching embeddings.
  • Initialize the vector database path within AppFlowyCore initialization.
frontend/rust-lib/flowy-core/Cargo.toml
frontend/rust-lib/flowy-sqlite/src/schema.rs
frontend/rust-lib/flowy-sqlite/migrations/2025-04-25-071459_local_ai_model/up.sql
frontend/rust-lib/flowy-sqlite/migrations/2025-04-25-071459_local_ai_model/down.sql
frontend/rust-lib/flowy-sqlite-vec/Cargo.toml
frontend/rust-lib/flowy-sqlite-vec/src/db.rs
frontend/rust-lib/flowy-sqlite-vec/src/lib.rs
frontend/rust-lib/flowy-sqlite-vec/src/migration.rs
frontend/rust-lib/flowy-sqlite-vec/migrations/001-init/up.sql
frontend/rust-lib/flowy-sqlite-vec/tests/main.rs
frontend/rust-lib/flowy-core/src/lib.rs
Implement embedding generation, scheduling, and indexing logic.
  • Add a new embeddings module to flowy-ai.
  • Create EmbedContext to manage shared state for embeddings (Ollama client, Vector DB).
  • Implement EmbeddingScheduler with background tasks for generating and writing embeddings.
  • Define Embedder trait and OllamaEmbedder for interacting with Ollama.
  • Define Indexer trait and DocumentIndexer for splitting text into chunks and creating embedded chunks.
  • Introduce UnindexedCollab and EmbeddedChunk entities in flowy-ai-pub.
  • Add PeriodicallyEmbeddingWrite and PeriodicallyWriter trait in collab-integrate to handle asynchronous writing of embeddings.
  • Implement PeriodicallyWriterImpl in flowy-core to integrate with the embedding scheduler.
  • Add persistence logic for CollabTable in flowy-ai-pub.
frontend/rust-lib/flowy-ai/src/embeddings/scheduler.rs
frontend/rust-lib/flowy-ai/src/embeddings/document_indexer.rs
frontend/rust-lib/flowy-ai/src/embeddings/embedder.rs
frontend/rust-lib/flowy-ai/src/embeddings/indexer.rs
frontend/rust-lib/flowy-ai/src/embeddings/mod.rs
frontend/rust-lib/flowy-ai/src/lib.rs
frontend/rust-lib/flowy-ai-pub/src/entities.rs
frontend/rust-lib/flowy-ai-pub/src/lib.rs
frontend/rust-lib/collab-integrate/src/period_write.rs
frontend/rust-lib/collab-integrate/src/lib.rs
frontend/rust-lib/flowy-core/src/deps_resolve/collab_deps.rs
frontend/rust-lib/flowy-ai-pub/src/persistence/collab_sql.rs
Integrate local AI and embedding features into the application lifecycle and data flow.
  • Add reload_ollama_client method to LocalAIController and call it on relevant application events.
  • Update AppFlowyCoreConfig and related initialization logic.
  • Modify AIManager to reload the Ollama client when settings change.
  • Add LocalEmbeddingNotReady error code.
  • Integrate PeriodicallyWriterImpl into WorkspaceCollabIntegrateImpl.
  • Modify AppFlowyCollabBuilder to optionally include an embeddings_writer.
frontend/rust-lib/flowy-ai/src/local_ai/controller.rs
frontend/rust-lib/flowy-core/src/config.rs
frontend/rust-lib/flowy-core/src/user_state_callback.rs
frontend/rust-lib/flowy-core/src/server_layer.rs
frontend/rust-lib/flowy-ai/src/ai_manager.rs
frontend/rust-lib/flowy-error/src/code.rs
frontend/rust-lib/flowy-core/src/deps_resolve/collab_deps.rs
frontend/rust-lib/collab-integrate/src/collab_builder.rs
Update build configuration and dependencies.
  • Update Rust toolchain to 1.84.0.
  • Add new crate dependencies (flowy-sqlite-vec, text-splitter, twox-hash).
  • Update libsqlite3-sys dependency.
  • Add flowy-sqlite-vec to the workspace members.
  • Include Rust version in the print-env make task.
  • Remove the unused macros.rs file from flowy-sqlite.
rust-toolchain.toml
frontend/rust-lib/rust-toolchain.toml
frontend/Makefile.toml
frontend/rust-lib/Cargo.toml
frontend/rust-lib/flowy-ai/Cargo.toml
frontend/rust-lib/collab-integrate/Cargo.toml
frontend/rust-lib/flowy-core/Cargo.toml
frontend/rust-lib/flowy-sqlite/Cargo.toml
frontend/rust-lib/flowy-sqlite/src/lib.rs
frontend/rust-lib/flowy-sqlite/src/macros.rs
Minor updates and cleanup in other modules.
  • Update log prefixes from [AI Plugin] to [Local AI].
  • Minor adjustments in database filter and sort logic.
  • Minor update in ChunkedBytes calculation.
  • Change runtime field visibility in lib-dispatch.
  • Minor updates in AST symbol comparison.
  • Minor cleanup in document manager and search manager.
frontend/rust-lib/flowy-ai/src/local_ai/controller.rs
frontend/rust-lib/flowy-database2/tests/database/filter_test/script.rs
frontend/rust-lib/flowy-database2/src/services/group/entities.rs
frontend/rust-lib/flowy-database2/src/services/database_view/view_editor.rs
frontend/rust-lib/flowy-database2/src/services/sort/controller.rs
frontend/rust-lib/flowy-database2/src/services/filter/entities.rs
frontend/rust-lib/flowy-document/src/manager.rs
frontend/rust-lib/flowy-search/src/services/manager.rs
frontend/rust-lib/flowy-storage-pub/src/chunked_byte.rs
frontend/rust-lib/lib-dispatch/src/runtime.rs
frontend/rust-lib/build-tool/flowy-ast/src/symbol.rs
frontend/rust-lib/flowy-document/tests/document/util.rs
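
The guide above mentions splitting document text into chunks before embedding. The sketch below shows one simple way to do that, under stated assumptions: `split_into_chunks` is a hypothetical function, the fixed character window with overlap is an illustrative strategy, and it does not reproduce the behaviour of the `text-splitter` crate that the PR adds as a dependency.

```rust
/// Hypothetical chunker (not the PR's DocumentIndexer): splits `text` into
/// chunks of at most `max_chars` characters, where consecutive chunks share
/// `overlap` characters so context is not lost at chunk boundaries.
pub fn split_into_chunks(text: &str, max_chars: usize, overlap: usize) -> Vec<String> {
    assert!(max_chars > 0, "max_chars must be positive");
    assert!(overlap < max_chars, "overlap must be smaller than max_chars");

    // Work on chars rather than bytes so multi-byte characters are never split.
    let chars: Vec<char> = text.chars().collect();
    let mut chunks = Vec::new();
    let mut start = 0;

    while start < chars.len() {
        let end = (start + max_chars).min(chars.len());
        chunks.push(chars[start..end].iter().collect::<String>());
        if end == chars.len() {
            break;
        }
        // Step back by `overlap` so the next chunk repeats the tail of this one.
        start = end - overlap;
    }
    chunks
}
```

Each resulting chunk would then be embedded separately and stored alongside the id of the document it came from.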


@appflowy appflowy marked this pull request as ready for review April 28, 2025 05:20
@sourcery-ai sourcery-ai bot left a comment

Hey @appflowy - I've reviewed your changes - here's some feedback:

Overall Comments:

  • The migration down script migrations/.../down.sql appears to contain unresolved merge conflict markers.
  • Consider moving unrelated refactorings (e.g., filter logic, sorting signatures, AST changes) to separate PRs to focus this one on embeddings.
  • The extensive conditional compilation for the desktop-only embedding feature significantly increases complexity across multiple crates.

Here's what I looked at during the review:
  • 🟡 General issues: 1 issue found
  • 🟢 Security: all looks good
  • 🟡 Testing: 2 issues found
  • 🟡 Complexity: 2 issues found
  • 🟢 Documentation: all looks good

@appflowy appflowy changed the title from "Local embedding" to "feat: support local ai embedding, local ai search" Apr 28, 2025
@appflowy appflowy changed the title from "feat: support local ai embedding, local ai search" to "feat: support local ai embedding, local ai search, local document content search" Apr 29, 2025
@appflowy appflowy merged commit 612d652 into main May 2, 2025
19 of 20 checks passed
@appflowy appflowy deleted the local_embedding branch May 2, 2025 00:21