chore(dal): enhance indexing across various schemas for improved query performance#9765
chore(dal): enhance indexing across various schemas for improved query performance#9765
Conversation
✅ Deploy preview added
To edit notification comments on pull requests, go to your Netlify project configuration. |
|
Hey there and thank you for opening this pull request! 👋 We require pull request titles to follow specific formatting rules and it looks like your proposed title needs to be adjusted. Your PR title is: Requirements:
Expected format: Details: PR title must end with 'fixes TICKET-ID' (e.g., 'fixes NOV-123') or include ticket ID in branch name |
WalkthroughThis pull request modifies MongoDB/Mongoose schema index definitions across multiple repository files in the data access layer. Changes include adding new single-field and compound indexes to various schemas (Environment, Integration, Layout, Member, Message Template, Notification, Organization, Subscriber, Tenant, Topic Subscribers, and User) while removing a per-field index specification from the Change schema and consolidating it to global index definitions. No modifications were made to exported entity signatures or public API declarations. Pre-merge checks✅ Passed checks (3 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Fix all issues with AI Agents 🤖
In @libs/dal/src/repositories/environment/environment.schema.ts:
- Around line 112-115: Remove the redundant compound index that starts with _id
by deleting the environmentSchema.index({ _id: 1, _organizationId: 1 })
declaration; keep the standalone environmentSchema.index({ _organizationId: 1 })
index already defined (the one around the existing { _organizationId: 1 }
declaration) so queries filtering only by _organizationId remain covered and
avoid violating MongoDB best practices against compound indexes beginning with
_id.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
libs/dal/src/repositories/member/member.schema.ts (1)
48-50: Remove this incorrect index definition—the field path does not exist in the schema.The index on line 48-50 references
'organizationId._userId._id', but the schema only defines_organizationIdand_userIdas direct fields (noorganizationIdfield exists). This index path does not match any document structure and is dead code. A proper compound index already exists at lines 52-56 (_organizationId,_userId,_id), which correctly covers the actual query patterns used throughout the codebase.libs/dal/src/repositories/subscriber/subscriber.schema.ts (1)
206-209: Critical: Duplicate index definition.This index is identical to the one defined at lines 179-182, including the same name (
unique_subscriber_per_environment), unique constraint, and partial filter expression. MongoDB will fail when attempting to create two indexes with the same name, which will cause schema initialization errors.Note: This appears to be a pre-existing issue in the codebase, not introduced by this PR.
🔎 Proposed fix: Remove the duplicate index
-subscriberSchema.index( - { _environmentId: 1, subscriberId: 1 }, - { name: 'unique_subscriber_per_environment', unique: true, partialFilterExpression: { deleted: false } } -); -
📜 Review details
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
libs/dal/src/repositories/change/change.schema.tslibs/dal/src/repositories/environment/environment.schema.tslibs/dal/src/repositories/integration/integration.schema.tslibs/dal/src/repositories/layout/layout.schema.tslibs/dal/src/repositories/member/member.schema.tslibs/dal/src/repositories/message-template/message-template.schema.tslibs/dal/src/repositories/notification/notification.schema.tslibs/dal/src/repositories/organization/organization.schema.tslibs/dal/src/repositories/subscriber/subscriber.schema.tslibs/dal/src/repositories/tenant/tenant.schema.tslibs/dal/src/repositories/topic/topic-subscribers.schema.tslibs/dal/src/repositories/user/user.schema.ts
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.{ts,tsx}: Write concise, technical TypeScript code with accurate examples
Use descriptive variable names with auxiliary verbs (isLoading, hasError)
Add blank lines before return statements
Import motion components from 'motion/react' instead of 'motion-react'
**/*.{ts,tsx}: Write concise, technical TypeScript code with accurate examples
Use functional and declarative programming patterns; avoid classes
Prefer iteration and modularization over code duplication; minimize code duplication as possible
Use descriptive variable names with auxiliary verbs (e.g., isLoading, hasError)
Structure files: exported component, subcomponents, helpers, static content, types
Don't leave comments in code, unless they explain something complex and not trivial
Don't use nested ternaries
Use TypeScript for all code; prefer interfaces over types in backend code, but use types over interfaces in front end code
Use the "function" keyword for pure functions
Avoid unnecessary curly braces in conditionals; use concise syntax for simple statements
Add blank lines before return statements
When importing "motion-react" package, import it from "motion/react"
Files:
libs/dal/src/repositories/notification/notification.schema.tslibs/dal/src/repositories/organization/organization.schema.tslibs/dal/src/repositories/user/user.schema.tslibs/dal/src/repositories/environment/environment.schema.tslibs/dal/src/repositories/tenant/tenant.schema.tslibs/dal/src/repositories/message-template/message-template.schema.tslibs/dal/src/repositories/subscriber/subscriber.schema.tslibs/dal/src/repositories/layout/layout.schema.tslibs/dal/src/repositories/member/member.schema.tslibs/dal/src/repositories/topic/topic-subscribers.schema.tslibs/dal/src/repositories/change/change.schema.tslibs/dal/src/repositories/integration/integration.schema.ts
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (CLAUDE.md)
Use functional and declarative programming patterns; avoid classes
Files:
libs/dal/src/repositories/notification/notification.schema.tslibs/dal/src/repositories/organization/organization.schema.tslibs/dal/src/repositories/user/user.schema.tslibs/dal/src/repositories/environment/environment.schema.tslibs/dal/src/repositories/tenant/tenant.schema.tslibs/dal/src/repositories/message-template/message-template.schema.tslibs/dal/src/repositories/subscriber/subscriber.schema.tslibs/dal/src/repositories/layout/layout.schema.tslibs/dal/src/repositories/member/member.schema.tslibs/dal/src/repositories/topic/topic-subscribers.schema.tslibs/dal/src/repositories/change/change.schema.tslibs/dal/src/repositories/integration/integration.schema.ts
**/*.{tsx,ts}
📄 CodeRabbit inference engine (CLAUDE.md)
Favor named exports for components
Files:
libs/dal/src/repositories/notification/notification.schema.tslibs/dal/src/repositories/organization/organization.schema.tslibs/dal/src/repositories/user/user.schema.tslibs/dal/src/repositories/environment/environment.schema.tslibs/dal/src/repositories/tenant/tenant.schema.tslibs/dal/src/repositories/message-template/message-template.schema.tslibs/dal/src/repositories/subscriber/subscriber.schema.tslibs/dal/src/repositories/layout/layout.schema.tslibs/dal/src/repositories/member/member.schema.tslibs/dal/src/repositories/topic/topic-subscribers.schema.tslibs/dal/src/repositories/change/change.schema.tslibs/dal/src/repositories/integration/integration.schema.ts
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
- GitHub Check: E2E test API / Test E2E
- GitHub Check: Unit test backend services (@novu/worker)
- GitHub Check: Unit test backend services (@novu/inbound-mail)
- GitHub Check: Unit test backend services (@novu/webhook)
- GitHub Check: Unit test backend services (@novu/ws)
- GitHub Check: Unit test @novu internal packages
- GitHub Check: Unit test backend services (@novu/api-service)
- GitHub Check: Validate OpenAPI
- GitHub Check: Unit test @novu public NPM packages (except providers)
- GitHub Check: Redirect rules - dashboard-v2-novu-staging
- GitHub Check: Header rules - dashboard-v2-novu-staging
- GitHub Check: Pages changed - dashboard-v2-novu-staging
- GitHub Check: Analyze (typescript)
- GitHub Check: Analyze (javascript)
🔇 Additional comments (15)
libs/dal/src/repositories/change/change.schema.ts (1)
24-24: Good refactoring to centralize index definitions.Removing inline
index: truedeclarations and relying on explicitchangeSchema.index()calls is a cleaner approach that centralizes all index definitions, improves visibility, and makes index management more maintainable. The explicit single-field indexes on_environmentId,_creatorId, and_entityIdwill support individual field queries appropriately.Consider verifying that the indexes are created correctly in MongoDB and optionally reviewing query patterns to determine if compound indexes (e.g.,
{ _environmentId: 1, _entityId: 1 }) would provide better performance for common multi-field queries.libs/dal/src/repositories/member/member.schema.ts (1)
52-56: LGTM! Compound index enhances query performance.The compound index on
_organizationId,_userId, and_idis correctly defined and will optimize queries filtering by these fields in combination. Since this index exists in production, it's proven to be valuable.libs/dal/src/repositories/layout/layout.schema.ts (1)
84-86: LGTM! Organization index improves lookup performance.The single-field index on
_organizationIdis correctly defined and will optimize organization-scoped layout queries. This aligns with the pattern of bringing production indexes into the codebase.libs/dal/src/repositories/tenant/tenant.schema.ts (1)
46-49: LGTM! Environment-identifier index enables efficient tenant lookups.The compound index on
_environmentIdandidentifiercomplements the existing environment-createdAt index by supporting direct lookups by environment and identifier. Both indexes serve distinct query patterns without redundancy.libs/dal/src/repositories/notification/notification.schema.ts (1)
205-208: LGTM! Subscriber-first index complements environment-first indexes.The compound index starting with
_subscriberIdis not redundant with the existing index at lines 176-180 (_environmentId,_subscriberId,createdAt). The new index efficiently supports queries that filter by subscriber first or by subscriber+environment, while the existing index serves environment-first queries. Different field orderings serve different query patterns.libs/dal/src/repositories/integration/integration.schema.ts (1)
125-137: LGTM! Compound indexes support distinct query patterns.Both new indexes share the same three-field prefix (
_environmentId,channel,active) but diverge in the fourth field:
- The first index (lines 125-130) adds
createdAtfor time-ordered queries- The second index (lines 132-137) adds
providerIdfor provider-specific lookupsThis design suggests production queries commonly filter by environment, channel, and active status, then either sort by creation time or filter by provider. Since these indexes exist in production, they're proven to serve distinct, valuable query patterns.
libs/dal/src/repositories/message-template/message-template.schema.ts (1)
98-106: LGTM: Standard index additions for query optimization.The new indexes align with common query patterns:
_environmentIdindex supports environment-scoped queries- The compound
_organizationId, deleted, _parentIdindex optimizes filtered queries by organization and parent, while accounting for soft-delete status from the mongoose-delete pluginlibs/dal/src/repositories/subscriber/subscriber.schema.ts (1)
211-215: LGTM: Covering index for optimized pagination queries.This compound index extends the existing
subscriberId, _environmentIdindex by including_id, creating a covering index that can serve queries filtering by subscriber and environment while also returning the_idfield without an additional document lookup. This is a common pattern for cursor-based pagination.libs/dal/src/repositories/user/user.schema.ts (2)
56-58: LGTM: Standard index for external ID lookups.The
externalIdindex is a straightforward optimization for queries that filter or join by external system identifiers.
60-66: Remove or justify the compound index starting with_id.This index is problematic for two reasons:
Redundant with MongoDB's default
_idindex: Queries filtering by_idalone will always use MongoDB's built-in primary key index, not this compound index.Violates covering index design: If this was intended as a covering index to project
firstName,lastName,profilePicturetogether, the index should start with the query filter field—not_id. An index starting with_idcannot optimize queries on other fields.No queries in the codebase were found that explicitly project all five fields together or would benefit from this structure. Remove this index unless there is a specific query pattern from production monitoring that justifies its overhead.
libs/dal/src/repositories/topic/topic-subscribers.schema.ts (1)
70-85: LGTM: Compound indexes for optimized multi-field queries.These three compound indexes enhance query performance for common access patterns:
_environmentId, topicKey, externalSubscriberId- for external subscriber lookups by topic within an environment_environmentId, externalSubscriberId- for direct external subscriber lookups within an environment_environmentId, _topicId, identifier- extends the existing unique index to support additional query variationsAll indexes appropriately start with
_environmentIdto support tenant-scoped queries.libs/dal/src/repositories/organization/organization.schema.ts (3)
87-89: LGTM: Standard index for external system integration.The
externalIdindex supports queries that filter by external system identifiers.
91-93: LGTM: Nested field index for partner configuration queries.This index on
partnerConfigurations.teamIdoptimizes queries that filter by partner team identifiers within the subdocument array.
82-85: These compound indexes are justified—they support covering index queries. At line 73 ofget-api-rate-limit-maximum.usecase.ts, the code executesfindById(_organizationId, '_id apiServiceLevel'), which the index{ _id: 1, apiServiceLevel: 1 }optimizes as a covering index. MongoDB can satisfy this query entirely from the index without accessing the underlying document. Similar patterns exist forapiServiceLevelupdates throughout the codebase.Likely an incorrect or invalid review comment.
libs/dal/src/repositories/environment/environment.schema.ts (1)
108-110: LGTM: Index on _parentId improves parent-child environment queries.This index will optimize queries that filter environments by their parent, which is useful for hierarchical environment relationships.
| environmentSchema.index({ | ||
| _id: 1, | ||
| _organizationId: 1, | ||
| }); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, let's examine the environment schema file
cat -n libs/dal/src/repositories/environment/environment.schema.ts | head -150Repository: novuhq/novu
Length of output: 4313
🏁 Script executed:
# Search for queries that use both _id and _organizationId on Environment documents
rg -nP --type=ts '\.(find|findOne|aggregate|updateOne|updateMany|deleteOne|deleteMany)\(' libs/dal/src/repositories/environment/ -A 10 | grep -E '(_id|_organizationId)'Repository: novuhq/novu
Length of output: 1100
🏁 Script executed:
# Also search more broadly for Environment repository methods that might use this index pattern
rg -nP --type=ts 'environmentSchema|Environment' libs/dal/src/repositories/environment/ | head -30Repository: novuhq/novu
Length of output: 3513
🌐 Web query:
MongoDB composite index starting with unique _id field covering index use cases best practices
💡 Result:
Short answer
- Don’t make a compound index that starts with _id unless your queries always filter by _id. Because _id is unique and MongoDB uses the leftmost-prefix rule, putting _id first prevents the compound index from helping queries that don’t include _id and usually makes the compound unique/redundant. [1][2][3]
Why
- Leftmost-prefix rule: a compound index only supports queries that include the leftmost prefix fields; if _id is first, queries must supply _id to use later keys. [2]
- _id is already indexed uniquely by default; a compound index starting with _id rarely adds value for uniqueness and can be redundant. [3]
- Covered queries: to be covered, the index must contain every queried and returned field — including _id if you return it — so explicitly adding _id to an index can enable covered queries only when you actually need _id in the index. Otherwise leave the default _id index and build other indexes that cover fields you query/project. [1]
Best practices / recommendations
- Use the leftmost-prefix / ESR guideline: put equality (high-selectivity) fields first, then sort fields, then range fields. Reserve _id-first only if you need to support queries that always include _id plus other fields. [2]
- Rely on MongoDB’s default single-field _id index for _id lookups; create compound indexes without _id to support common query patterns and sorts. [2][3]
- For covered queries, include in the index every field the query filters on and returns (or exclude _id in projection if index doesn’t include it). Use explain() to confirm index-covered operations. [1]
- Avoid redundant indexes (e.g., { _id:1, a:1 } plus {_id:1}): remove duplicates unless unique/sparse constraints require them. [2][3]
Quick examples
- Good (support queries on userId and status, and cover projection): createIndex({ userId: 1, status: 1, lastSeen: 1 })
- Only use _id-first if your queries always are like { _id: , status: "x" } and you want the compound indexed ordering for projection/sort.
References
- MongoDB Query Optimization / Covered Queries. [1]
- MongoDB Compound Indexes (leftmost-prefix rule, index-prefix behavior). [2]
- MongoDB Unique Indexes / default _id index and uniqueness details. [3]
Would you like a tailored index suggestion for a specific query workload/schema?
Remove the composite _id + _organizationId index; it violates MongoDB best practices.
The composite index { _id: 1, _organizationId: 1 } starts with _id, which prevents queries filtering only by _organizationId from using it due to the leftmost-prefix rule. Queries on lines 21 and 34 of environment.repository.ts that filter only by _organizationId fall back to the separate { _organizationId: 1 } index (line 84–86), making the composite index redundant.
MongoDB best practices recommend against compound indexes starting with _id (the default unique index) unless queries always include _id. The existing { _organizationId: 1 } index already covers the common query patterns in this codebase.
🤖 Prompt for AI Agents
In @libs/dal/src/repositories/environment/environment.schema.ts around lines
112-115, Remove the redundant compound index that starts with _id by deleting
the environmentSchema.index({ _id: 1, _organizationId: 1 }) declaration; keep
the standalone environmentSchema.index({ _organizationId: 1 }) index already
defined (the one around the existing { _organizationId: 1 } declaration) so
queries filtering only by _organizationId remain covered and avoid violating
MongoDB best practices against compound indexes beginning with _id.
There was a problem hiding this comment.
its been used 173784871 times since 17 Dec
What changed? Why was the change needed?
These indexes are not present in code but available in production database. So It make sense to be available in other databases as well.
Screenshots
Expand for optional sections
Related enterprise PR
Special notes for your reviewer