@shatfield4
Collaborator

Pull Request Type

  • ✨ feat
  • 🐛 fix
  • ♻️ refactor
  • 💄 style
  • 🔨 chore
  • 📝 docs

Relevant Issues

resolves #4687

What is in this change?

  • Refactors all vector database providers to use classes instead of the plain-object structure used previously
  • Creates a VectorDatabase base class that every vector database provider extends, ensuring all required functions are declared
  • Refactors the Zilliz Cloud provider to extend the Milvus class, which reduces code complexity because both providers use identical functions for interacting with the vector database
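
The pattern described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the required method list, the `connectionConfig` hook, the addresses, and the token placeholder are all assumptions made up for the example.

```javascript
// Hypothetical sketch only: method names and settings are illustrative assumptions.
const REQUIRED_METHODS = ["connect", "similaritySearch"];

class VectorDatabase {
  constructor() {
    // A method still identical to the base stub has not been overridden.
    const missing = REQUIRED_METHODS.filter(
      (method) => this[method] === VectorDatabase.prototype[method]
    );
    if (missing.length > 0) {
      throw new Error(
        `${this.constructor.name} must implement: ${missing.join(", ")}`
      );
    }
  }

  // Provider name derived from the concrete class.
  get name() {
    return this.constructor.name;
  }

  async connect() {
    throw new Error("connect() is not implemented");
  }

  async similaritySearch(namespace, queryVector, topN) {
    throw new Error("similaritySearch() is not implemented");
  }
}

class Milvus extends VectorDatabase {
  // Overridable hook so subclasses only change connection settings (assumed design).
  connectionConfig() {
    return { address: "localhost:19530" };
  }

  async connect() {
    // Placeholder: a real provider would create an SDK client here.
    return { provider: this.name, ...this.connectionConfig() };
  }

  async similaritySearch(namespace, queryVector, topN = 4) {
    // Placeholder: a real provider would query the collection here.
    return [];
  }
}

// Zilliz Cloud reuses every Milvus method and swaps only the connection settings.
class Zilliz extends Milvus {
  connectionConfig() {
    return { address: "https://example.zillizcloud.com", token: "<api-token>" };
  }
}

new Zilliz().connect().then((info) => console.log(info.provider)); // logs "Zilliz"
```

With a check like this in the base constructor, a provider that forgets a required method fails at instantiation rather than at the first call.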

Additional Information

Developer Validations

  • I ran yarn lint from the root of the repo & committed changes
  • Relevant documentation has been updated
  • I have tested my code functionality
  • Docker build succeeds locally

shatfield4 and others added 18 commits December 9, 2025 09:54
migrate astra to class
migrate lancedb to class
migrate pinecone to class
migrate zilliz to class
migrate weaviate to class
migrate qdrant to class
migrate milvus to class
migrate chroma to class
* migrate chroma to class

* migrate chroma cloud to class

* move limits to class field

---------

Co-authored-by: Timothy Carambat <[email protected]>
* migrate pgvector to class

* patch pgvector test

* convert connectionString, tableName, and validateConnection to static methods

* move instance properties to class fields

---------

Co-authored-by: Timothy Carambat <[email protected]>
simplify zilliz implementation by using milvus as base class

Co-authored-by: Timothy Carambat <[email protected]>
create generic VectorDatabase base class

Co-authored-by: Timothy Carambat <[email protected]>
extend VectorDatabase base class to all providers
@JOduMonT

This comment was marked as off-topic.

@timothycarambat timothycarambat merged commit 5039045 into master Jan 13, 2026
2 checks passed
@timothycarambat timothycarambat deleted the vectordb-class-migration branch January 13, 2026 23:24
timothycarambat added a commit that referenced this pull request Jan 22, 2026
* Migrate to `bcryptjs` (#4767)

* Replace bcrypt with bcryptjs across multiple files

* dev build

---------

Co-authored-by: Timothy Carambat <[email protected]>

* Refactor frontend legacy JSON.parse with safeJsonParse (#4759)

* replace all frontend legacy JSON.parse with safeJsonParse

* default collapsed sidebar menu on failed parse

* remove extra check on conditional render

* undo singular json parse

* add guard clause and return null for `userFromStorage`

* patch domainList

---------

Co-authored-by: Timothy Carambat <[email protected]>

* Fix pagination bug in paperless-ngx data connector (#4757)

* iterate over all pages in paperless-ngx data connector

* add error handling and data validation

* refactor to handle edge cases and null values

* catch edge case to prevent infinite loop

---------

Co-authored-by: Timothy Carambat <[email protected]>

* Fix Stale User Session with Proper `fetch` Error Handling (#4770)

* add refresh user functionality

* prettier

* add eslint disable comment for exhaustive-deps warning in AuthContext to stop nagging about navigate func

* remove unused imports and fix typo

* handle unsafe parse of undefined for in-session user deleted

* Refactor refreshUser function to handle errors and return structured response. Update AuthProvider to manage user data based on success status.

* Remove console error logging from promise catch in System model for cleaner error handling.

* change status from 404 to 400 and valid to success

* Refactor error handling in AuthProvider's refreshUser logic to remove redundant catch block and streamline user session management on failure.

* prettier

* reorder clauses - return errors

* refactor
account for all user modes
dev build

---------

Co-authored-by: Timothy Carambat <[email protected]>

* Add Auth Token to Ollama Embedding Client (#4766)

* Enhance OllamaEmbedder to support authentication by adding an authorization token in headers for client initialization.

* Add optional Auth Token input for Ollama embedding options

* move info elements

---------

Co-authored-by: Timothy Carambat <[email protected]>

* Upgrade to Multer 2.0.0 (#4768)

* upgrade to multer 2.0.0

* bump dev

---------

Co-authored-by: Timothy Carambat <[email protected]>

* Implement Global Error Boundary (#4765)

* Implement global error boundary

* add 404 page for generic path catching

* devbuild

---------

Co-authored-by: Timothy Carambat <[email protected]>

* Feat/cohere agent implementation (#4703)

* implement cohere agent support

* run yarn lint

* modernize Cohere
add supported langchain method
redo streaming since it was not working
looping of agent calls was not functioning

* change default model to real model tag
add case statement for model tag

* remove debug

* update default

* only whitelist known labels

---------

Co-authored-by: Timothy Carambat <[email protected]>

* Upgrade MCP SDK to Latest (1.24.3) (#4773)

* upgrade mcp sdk to latest (1.24.3)

* Upgrade MCP version floor in package.json to 1.24.3

* fix(devcontainer): forward ports 3000/3001 (#4779)

* 4601 log model on response (#4781)

* add model tag to chatCompletion

* add modelTag `model` to async streaming
keeps default arguments for prompt token calculation where applied via explicit arg

* fix HF default arg

* render all performance metrics as available for backward compatibility
add `timestamp` to both sync/async chat methods

* extract metrics string to function

* Update Google Search Option Description To Reference Documentation For Rate Limits (#4789)

* Update Google Search description to reference documentation for rate limits

* remove

---------

Co-authored-by: Timothy Carambat <[email protected]>

* Refactor `LLMPerformanceMonitor.measureStream()` to Use Options Object Pattern (#4786)

* Refactor LLMPerformanceMonitor to use options object for measureStream parameters

* Refactor invocations of `measureStream` to use options arguments

* Change invocation of `measureStream` in anthropic provider to use options argument

---------

Co-authored-by: Timothy Carambat <[email protected]>

* hanging lint

* fix unnecessary scrollbar in workspace general appearance settings tab (#4791)

* fixed SuggestedChatMessages width styling

* ran yarn lint

---------

Co-authored-by: Timothy Carambat <[email protected]>

* Add Eslint Config in `/frontend` (#4785)

* Add local ESLint configuration and disable rules to allow for errorless state

* Remove unnecessary ESLint disable comments in AuthContext and usePromptInputStorage for cleaner code.

* Update eslint-plugin-react-hooks

* Configure prettier to work with eslint

* Removed trailing commas from eslint config

* Prettier to source code

* add a v2 lint script

* put back eslint-disable comments

* fix eslinter and prettier application
always apply --fix since we --write prettier, otherwise it fails

* precaution dev build

---------

Co-authored-by: Timothy Carambat <[email protected]>

* Refactor localWhisper to use custom FFMPEGWrapper class (#4775)

* refactor localWhisper to use new custom FFMPEGWrapper class

* stub tests in github actions

* add back wavefile conversion to 16khz 32f to fix docker builds

* use afterEach for cleanup in ffmpeg tests

* remove unused FFMPEG_PATH env check

* use spawnSync for ffmpeg to capture and log output

* lint

* revert removal of try/catch around validateAudioFile for more helpful error msgs

* use readFileSync instead of createReadStream for less overhead

* change import to require for fix-path and stub import in tests

* refactor to singleton to preserve ffmpeg path
dev build

---------

Co-authored-by: Timothy Carambat <[email protected]>

* Refactor Managed Services in "Data Handling & Privacy" Onboarding Step to Use Their Privacy Policy URL (#4790)

* Refactor non-local LLM Provider, Vector Database, and Embedding Engine privacy information to use their policy URLs instead of descriptions

* Update LLM Provider, Embedding Engine, and Vector Database sections to include privacy policy links

* fix broken links, lint

* Update AstraDB privacy policy URL in onboarding flow

* Refactor AnythingLLM Privacy & Data page to show managed provider privacy policy URLs

* Update Mistral privacy policy URLs in onboarding flow for consistency

* Abstract privacy policies of providers into a reusable component | Refactor Privacy & Data Handling Step of onboarding flow to focus on solely rendering that step | Move provider privacy policy maps into constants.js

* Remove commented-out code for third-party provider privacy policies in Privacy and Data Handling component

* Update privacy policy descriptions for consistency by adding periods at the end of sentences in ProviderPrivacy component and constants.js

* rescope constants for providers

* extract default to external function, add loading state

---------

Co-authored-by: Timothy Carambat <[email protected]>

* patch ESM import issue (#4819)

* Upgrade YT Scraper (#4820)

* Merge commit from fork

* Update Sponsors README

* fix: validate chat message input (#4811)

* fix: validate chat message input

* fix: align message validation for thread stream-chat endpoint

---------

Co-authored-by: Timothy Carambat <[email protected]>

* patch AWS credential issue in docker context (#4842)

patch AWS credential issue in docker context

* support AWS bedrock agents with streaming (#4850)

* support AWS bedrock agents with streaming

* Add back error handlers from previous fix

* VectorDB class migration (#4787)

* Migrate Astra to class (#4722)

migrate astra to class

* Migrate LanceDB to class (#4721)

migrate lancedb to class

* Migrate Pinecone to class (#4726)

migrate pinecone to class

* Migrate Zilliz to class (#4729)

migrate zilliz to class

* Migrate Weaviate to class (#4728)

migrate weaviate to class

* Migrate Qdrant to class (#4727)

migrate qdrant to class

* Migrate Milvus to class (#4725)

migrate milvus to class

* Migrate Chroma to class (#4723)

migrate chroma to class

* Migrate Chroma Cloud to class (#4724)

* migrate chroma to class

* migrate chroma cloud to class

* move limits to class field

---------

Co-authored-by: Timothy Carambat <[email protected]>

* Migrate PGVector to class (#4730)

* migrate pgvector to class

* patch pgvector test

* convert connectionString, tableName, and validateConnection to static methods

* move instance properties to class fields

---------

Co-authored-by: Timothy Carambat <[email protected]>

* Refactor Zilliz Cloud vector DB provider (#4749)

simplify zilliz implementation by using milvus as base class

Co-authored-by: Timothy Carambat <[email protected]>

* VectorDatabase base class (#4738)

create generic VectorDatabase base class

Co-authored-by: Timothy Carambat <[email protected]>

* Extend VectorDatabase base class to all providers (#4755)

extend VectorDatabase base class to all providers

* patch lancedb import

* breakout name and add generic logger

* dev tag build

---------

Co-authored-by: Timothy Carambat <[email protected]>

* Make XLSX spreadsheets visible in chat by combining sheets (#4847)

* fix bug with xlsx files not being added as context

* lint

* fix console logs/warn/error

* abstract sheet processing to function + normalize error handling

* fix jsdoc

* patch xlsx filename to prevent orphaned doc

* reduce tokens

* correct pluralization

---------

Co-authored-by: Timothy Carambat <[email protected]>

* Remove Workspace Creation Onboarding Page (#4823)

* remove create workspace step for onboarding

* remove unused image

* workspace creation into dedicated useEffect + use translated workspace name

* dev tag

---------

Co-authored-by: Timothy Carambat <[email protected]>

* Improved DMR support (#4863)

* Improve DMR support
- Autodetect models installed
- Grab all models from hub.docker to show available
- UI to handle render, search, install, and management of models
- Support functionality for chat, stream, and agentic calls

* forgot files

* fix loader circle being too large
fix tooltip width command
adjust location of docker installer open for web platform

* adjust imports

* AnythingLLM Mobile live (#4864)

* remove new labels on landing

* minor DMR UI changes + dynamic tooltip for context management

* Adjust fix path to use ESM import (#4867)

* Adjust fix path to use ESM import

* normalize fix-path imports and usage across the app

* extract path fix logic to utils for server and collector

* add helpers

* repin strip-ansi in collector

* fix log for localWhisper
lint

* Add postsettled callers to updateENV

* minor refactor for context window finder

* Extract Model Table to component (#4871)

* Extract Model Table to component
Add provider icons to header rows and installed models
Light mode supported
Mapping for model name id hints to provider
Update DMR to filter chat models by ability since not available via hub API

* linting + dev

* fix incorrect import

* remove race condition regression for FoundryLocal provider

* remove duplicated stream method on cohere handler

* feat(i18n): add Czech (cs) language translation to AnythingLLM (#4874)

Co-authored-by: Timothy Carambat <[email protected]>

* Docker model runner download from UI (#4884)

* Enable downloads of DMR models from UI

* add utils + dev build

* linting

* add fallback key to mono model provider

* update announcements for 1.10.0

* bump versions to 1.10.0

---------

Co-authored-by: Marcello Fitton <[email protected]>
Co-authored-by: Sean Hatfield <[email protected]>
Co-authored-by: Colin Perry <[email protected]>
Co-authored-by: Irene Wang <[email protected]>
Co-authored-by: timothycarambat <[email protected]>
Co-authored-by: Ocheretovich <[email protected]>
Co-authored-by: Vladimir Vlach <[email protected]>

Development

Successfully merging this pull request may close these issues.

[FEAT]: Refactor vector db providers

4 participants