feat(compass-collection): Schema Analysis Redux Integration for Collection Plugin CLOUDP-333846 #7177

Open
wants to merge 3 commits into
base: main
from

Conversation

ncarbon
Collaborator

@ncarbon ncarbon commented Aug 7, 2025

Description

This pull request introduces schema analysis functionality to the Compass Collection Tab, enabling the application to automatically analyze a collection's schema when it is loaded (unless the collection is read-only or time-series). The changes include new Redux state and actions for schema analysis, and a thunk to perform the analysis.

Motivation

Schema Analysis Feature:

  • Added a new SchemaAnalysis state to CollectionState, with status tracking (INITIAL, ANALYZING, COMPLETED, ERROR), schema data, sample document, schema metadata, and error information.
  • Implemented a thunk action analyzeCollectionSchema that samples documents, analyzes the schema, calculates metadata (like max nesting depth and validation rules), and updates the Redux state accordingly.
  • Updated the reducer to handle new schema analysis actions, updating the state based on progress and results.
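
The flow described in the bullets above (sample, analyze, then report progress through Redux actions) can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `analyzeSchemaSketch`, the action names, the `AnalysisServices` interface, and the simplified schema shape are all assumptions, and the metadata calculation and abort handling are left out.

```typescript
// Hypothetical stand-ins; the real Compass services and action names differ.
type Doc = { [key: string]: unknown };
type SimpleSchema = { [field: string]: string };

type AnalysisAction =
  | { type: 'schema-analysis-started' }
  | { type: 'schema-analysis-finished'; schema: SimpleSchema; sampleDocument: Doc | null }
  | { type: 'schema-analysis-failed'; error: Error };

interface AnalysisServices {
  sampleDocuments(): Promise<Doc[]>;
  analyzeDocuments(docs: Doc[]): Promise<SimpleSchema>;
}

async function analyzeSchemaSketch(
  dispatch: (action: AnalysisAction) => void,
  services: AnalysisServices
): Promise<void> {
  // Signal that analysis is in progress before any async work starts.
  dispatch({ type: 'schema-analysis-started' });
  try {
    const docs = await services.sampleDocuments();
    const schema = await services.analyzeDocuments(docs);
    // Report only the results; the reducer decides what the state becomes.
    dispatch({
      type: 'schema-analysis-finished',
      schema,
      sampleDocument: docs[0] ?? null,
    });
  } catch (err) {
    dispatch({
      type: 'schema-analysis-failed',
      error: err instanceof Error ? err : new Error(String(err)),
    });
  }
}
```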

Store Initialization and Integration:

  • Modified the store initialization to include the new schemaAnalysis state and inject required dependencies (logger, preferences, abort controller) for schema analysis.
  • On collection metadata load, the store now automatically dispatches schema analysis (unless the collection is read-only or time-series).
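
As a rough sketch of the eligibility rule above, assuming the collection metadata exposes boolean `isReadonly` and `isTimeSeries` flags (the `CollectionMetadataLike` interface and both function names are hypothetical):

```typescript
// Hypothetical subset of the collection metadata; field names are assumed.
interface CollectionMetadataLike {
  isReadonly: boolean;
  isTimeSeries: boolean;
}

// Schema analysis is skipped for read-only and time-series collections.
function shouldAnalyzeSchema(metadata: CollectionMetadataLike): boolean {
  return !metadata.isReadonly && !metadata.isTimeSeries;
}

// Called once metadata is available; returns whether analysis was started.
function onMetadataLoaded(
  metadata: CollectionMetadataLike,
  startAnalysis: () => void
): boolean {
  if (!shouldAnalyzeSchema(metadata)) {
    return false;
  }
  startAnalysis();
  return true;
}
```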

Testing Enhancements:

  • Updated and extended tests to mock and verify schema analysis behavior, ensuring it only runs for eligible collections and is skipped for read-only or time-series collections.

Checklist

  • New tests and/or benchmarks are included
  • Documentation is changed or added
  • If this change updates the UI, screenshots/videos are added and a design review is requested
  • I have signed the MongoDB Contributor License Agreement (https://www.mongodb.com/legal/contributor-agreement)

Motivation and Context

To extract schema information (types and validation rules) that will be used in the upcoming mock data generator feature.

  • Bugfix
  • New feature
  • Dependency update
  • Misc

Open Questions

Dependents

Types of changes

  • Backport Needed
  • Patch (non-breaking change which fixes an issue)
  • Minor (non-breaking change which adds functionality)
  • Major (fix or feature that would cause existing functionality to change)

@github-actions github-actions bot added the feat label Aug 7, 2025
@ncarbon ncarbon added no release notes Fix or feature not for release notes and removed feat labels Aug 7, 2025
@github-actions github-actions bot added the feat label Aug 7, 2025
@ncarbon ncarbon marked this pull request as ready for review August 7, 2025 21:51
@Copilot Copilot AI review requested due to automatic review settings August 7, 2025 21:51
@ncarbon ncarbon requested a review from a team as a code owner August 7, 2025 21:51
@ncarbon ncarbon requested review from gribnoysup and jcobis August 7, 2025 21:51
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR introduces schema analysis functionality to the Compass Collection Tab, enabling automatic analysis of a collection's schema when it loads (unless the collection is read-only or time-series). The implementation includes Redux integration for state management and automatic schema analysis on collection metadata fetch.

  • Adds schema analysis Redux state and actions with support for analyzing, completed, and error states
  • Implements automatic schema analysis when collection metadata is loaded for eligible collections
  • Extends test coverage to verify schema analysis behavior for different collection types

Reviewed Changes

Copilot reviewed 4 out of 5 changed files in this pull request and generated 4 comments.

File | Description
packages/compass-collection/src/stores/collection-tab.ts | Adds schema analysis state initialization and triggers analysis on metadata load
packages/compass-collection/src/stores/collection-tab.spec.ts | Adds test coverage for schema analysis behavior with different collection types
packages/compass-collection/src/modules/collection-tab.ts | Implements schema analysis Redux actions, reducer, and thunk logic
packages/compass-collection/package.json | Adds required dependencies for schema analysis functionality


interface SchemaAnalysisFinishedAction {
type: CollectionActions.SchemaAnalysisFinished;
schemaAnalysis: SchemaAnalysis;
Collaborator

Actions shouldn't be controlling the state; the reducer should. Passing the whole state in the action and just assigning it in the reducer is an antipattern that should be avoided. Clearly separate what is the action payload from the state that the reducer derives based on the action.
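
One way to read this suggestion, as a minimal sketch with hypothetical action names and a simplified state shape (not the code from this PR): each action carries only what happened, and the reducer alone decides how status and the other fields change.

```typescript
// Hypothetical state and action shapes for illustration only.
type SchemaAnalysisState = {
  status: 'initial' | 'analyzing' | 'completed' | 'error';
  schema: { [field: string]: string } | null;
  error: string | null;
};

type SchemaAnalysisAction =
  | { type: 'started' }
  | { type: 'finished'; schema: { [field: string]: string } } // payload: the result only
  | { type: 'failed'; message: string }; // payload: the error text only

const initialState: SchemaAnalysisState = {
  status: 'initial',
  schema: null,
  error: null,
};

function schemaAnalysisReducer(
  state: SchemaAnalysisState,
  action: SchemaAnalysisAction
): SchemaAnalysisState {
  switch (action.type) {
    case 'started':
      // The reducer, not the action, decides that status becomes "analyzing".
      return { status: 'analyzing', schema: null, error: null };
    case 'finished':
      return { status: 'completed', schema: action.schema, error: null };
    case 'failed':
      return { status: 'error', schema: null, error: action.message };
    default:
      return state;
  }
}
```

With this split, no dispatching code can put the state into a combination the reducer does not produce, such as a completed status with an error set.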

Collaborator Author

I updated this. Wondering if it's worth it to move this to its own reducer. Thoughts?

@ncarbon ncarbon requested a review from gribnoysup August 11, 2025 15:59
@ncarbon
Collaborator Author

ncarbon commented Aug 11, 2025

@jcobis @gribnoysup Heads up: I'm adding compass-schema as a dependency of compass-collection in order to use calculateSchemaMetadata and get the schema depth, but this introduces a cyclical dependency.

In order to remove the cyclical dependency, I think we'd have to remove the compass-collection dependency from compass-schema and compass-schema-validation since they're both importing CollectionTabPluginMetadata.

Since we only need schema_depth from this function, I'm considering extracting/duplicating just part of this calculation in compass-collection.

},
A
>;

export enum SchemaAnalysisStatus {

@kpamaran kpamaran Aug 11, 2025

[nit] fyi there's another thread that advocates the Compass repo's preference for unions over enums in #7181 (comment). I personally prefer them over enums because they enable type algebra like creating intersection/union types, and enums add runtime artifacts that union types do not.
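
To illustrate the point about type algebra, here is a small sketch. The lowercase status values other than 'error' and the derived type names are assumptions made for the example.

```typescript
// String-literal union: exists only at the type level, so nothing is emitted at runtime.
type SchemaAnalysisStatus = 'initial' | 'analyzing' | 'completed' | 'error';

// Type algebra: derive narrower unions from the full one.
type TerminalStatus = Extract<SchemaAnalysisStatus, 'completed' | 'error'>;
type NonTerminalStatus = Exclude<SchemaAnalysisStatus, TerminalStatus>;

// Type guard that narrows a status to the terminal subset.
function isTerminal(status: SchemaAnalysisStatus): status is TerminalStatus {
  return status === 'completed' || status === 'error';
}

// Only non-terminal statuses are accepted here; passing 'completed' is a compile error.
function describeProgress(status: NonTerminalStatus): string {
  return status === 'initial' ? 'not started' : 'in progress';
}
```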

ERROR = 'error',
}

type SchemaAnalysis = {

@kpamaran kpamaran Aug 11, 2025

| null in various fields is a sign that we can apply discriminated fields to get type narrowing benefits.

You can break this into a type for each Status's state

type SchemaAnalysis = SchemaAnalysisError | SchemaAnalysisInitial | SchemaAnalysisAnalyzing | SchemaAnalysisCompleted;

// example. notice there's no `error` field or `| null`
type SchemaAnalysisCompleted = {
   schema: Schema;
   status: SchemaAnalysisStatus.COMPLETED; // a single literal value, so it can act as the discriminant
   sampleDocument: Document;
   schemaMetaData: { ... };
}

// etc

Each type shares the status field, and the value of that field helps the type checker see which fields exist for a given status.

The TypeScript handbook has an example here

This should simplify the action + reducer code as well by removing entries for nullable fields.
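
A fuller sketch of how narrowing on the status field could look for a consumer of this state. It uses string literals instead of the enum, and the `Schema` and `Doc` shapes plus the `summarize` helper are hypothetical.

```typescript
// Hypothetical simplified shapes, for illustration only.
type Schema = { fields: string[] };
type Doc = { [key: string]: unknown };

type SchemaAnalysisInitial = { status: 'initial' };
type SchemaAnalysisAnalyzing = { status: 'analyzing' };
type SchemaAnalysisCompleted = { status: 'completed'; schema: Schema; sampleDocument: Doc };
type SchemaAnalysisError = { status: 'error'; error: Error };

type SchemaAnalysis =
  | SchemaAnalysisInitial
  | SchemaAnalysisAnalyzing
  | SchemaAnalysisCompleted
  | SchemaAnalysisError;

function summarize(analysis: SchemaAnalysis): string {
  switch (analysis.status) {
    case 'initial':
      return 'not started';
    case 'analyzing':
      return 'in progress';
    case 'completed':
      // `schema` is known to exist here; no null check is needed.
      return `${analysis.schema.fields.length} field(s)`;
    case 'error':
      // `error` exists only on this variant.
      return `failed: ${analysis.error.message}`;
    default: {
      // Exhaustiveness check: adding a new variant makes this fail to compile.
      const unreachable: never = analysis;
      return unreachable;
    }
  }
}
```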

return;
}

try {

I presume it's possible to go to ANALYZING directly from COMPLETED or ERROR as well?

sampleDocument: sampleDocuments[0] ?? null,
schemaMetadata,
});
} catch (err: any) {

@kpamaran kpamaran Aug 11, 2025

Are there documented errors that can be caught here? That would help devs looking at stack traces narrow down the source of the issue earlier, especially because there are multiple await calls.
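
If no documented error types are available, one possible approach is to label each awaited step so that a failure names the step it came from. This is only a sketch: `withStep` is a hypothetical helper, not an existing Compass utility, and rethrowing a new error like this drops the original stack unless it is attached separately.

```typescript
// Runs one async step and rethrows any failure with the step name in the message.
async function withStep<T>(step: string, run: () => Promise<T>): Promise<T> {
  try {
    return await run();
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    throw new Error(`schema analysis failed during "${step}": ${message}`);
  }
}

// Example usage with two hypothetical steps passed in by the caller.
async function sampleThenAnalyze(
  sample: () => Promise<unknown[]>,
  analyze: (docs: unknown[]) => Promise<unknown>
): Promise<unknown> {
  const docs = await withStep('sample', sample);
  return withStep('analyze', () => analyze(docs));
}
```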

}

let schemaMetadata = null;
if (schema !== null) {

@kpamaran kpamaran Aug 11, 2025

What should happen if schema stays null? Seems like an edge case

@gribnoysup
Collaborator

> In order to remove the cyclical dependency, I think we'd have to remove the compass-collection dependency from compass-schema and compass-schema-validation since they're both importing CollectionTabPluginMetadata.

@ncarbon I don't think you can easily remove the dependency here. compass-collection is the place that keeps the source of truth for this type, and you can't really move it somewhere else without creating an awkward intermediate place for it, which I'd prefer we avoid 🙂

I think it's fine if you just want to implement the depth counting somewhere near the code that does the check (in your case you don't even need to count the full depth, right? Only that it doesn't hit a relatively small limit). You should also consider moving this code to the mongodb-schema package, which is already the one place where we accumulate all the shared code related to schema analysis.
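
A bounded depth check along these lines might look like the sketch below. It walks plain nested objects rather than the structure that mongodb-schema actually returns, and `exceedsDepth` is a hypothetical name, so treat it as an illustration of the early-exit idea only.

```typescript
type PlainObject = { [key: string]: unknown };

// Returns true as soon as nesting goes deeper than `limit`,
// without computing the full depth of the value.
function exceedsDepth(value: unknown, limit: number, depth = 0): boolean {
  if (Array.isArray(value)) {
    // Arrays do not add a level in this sketch; only their elements are checked.
    return value.some((item) => exceedsDepth(item, limit, depth));
  }
  if (value !== null && typeof value === 'object') {
    const nextDepth = depth + 1;
    if (nextDepth > limit) {
      return true; // early exit: the limit is already exceeded
    }
    const obj = value as PlainObject;
    return Object.keys(obj).some((key) => exceedsDepth(obj[key], limit, nextDepth));
  }
  return false; // primitives never add depth
}
```

For example, with a limit of 2, `{ a: { b: 1 } }` passes the check while `{ a: { b: { c: 1 } } }` does not.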

Labels
feat no release notes Fix or feature not for release notes

3 participants