feat: internal to Expanded JSON Schema conversion COMPASS-8702 #220

Merged

paula-stacho
Contributor

@paula-stacho paula-stacho commented Feb 5, 2025

This is the last piece of the puzzle: generating the Expanded JSON Schema.
Differences from the standard JSON Schema:

  • it includes x-bsonType alongside type (using the same map as we have for $jsonSchema)
  • it includes x-metaData and x-sampleValues (both taken from the internal schema)
  • because of these additional properties, mixed types no longer use the array notation (so the 'plainTypes' logic is gone)

Notes: the metadata (count, probability, hasDuplicates) is present at both the type level and the field level in the internal schema, and I kept both. Sample values exist only at the type level and are not present for objects/arrays in the internal schema, so again we just take them where they're available.
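
For readers skimming the thread, here is a rough sketch of the conversion described above. It is not the code from this PR. The interfaces, the `TYPE_MAP` subset, and the function names are hypothetical simplifications, and the choice to always wrap a field's types in `anyOf` (so that field-level and type-level metadata never collide) is an assumption made for illustration.

```typescript
// Hypothetical, simplified shapes; the real internal schema carries more fields.
type Metadata = { count: number; probability: number; hasDuplicates?: boolean };

interface InternalType extends Metadata {
  bsonType: string; // e.g. "String", "Int32", "Document", "Array"
  values?: unknown[]; // sample values; absent for documents and arrays
  fields?: InternalField[]; // only for documents
  types?: InternalType[]; // only for arrays (element types)
}

interface InternalField extends Metadata {
  name: string;
  types: InternalType[];
}

interface ExpandedSchema {
  type?: string;
  'x-bsonType'?: string;
  'x-metaData'?: Metadata;
  'x-sampleValues'?: unknown[];
  properties?: Record<string, ExpandedSchema>;
  items?: ExpandedSchema;
  anyOf?: ExpandedSchema[];
}

// Assumed subset of the internal-type to (JSON type, bsonType) map.
const TYPE_MAP: Record<string, { type: string; bsonType: string }> = {
  String: { type: 'string', bsonType: 'string' },
  Int32: { type: 'integer', bsonType: 'int' },
  Double: { type: 'number', bsonType: 'double' },
  Boolean: { type: 'boolean', bsonType: 'bool' },
  Document: { type: 'object', bsonType: 'object' },
  Array: { type: 'array', bsonType: 'array' },
};

// Copy only the three metadata keys named in the PR description.
function pickMetadata(m: Metadata): Metadata {
  const out: Metadata = { count: m.count, probability: m.probability };
  if (m.hasDuplicates !== undefined) out.hasDuplicates = m.hasDuplicates;
  return out;
}

function convertType(t: InternalType): ExpandedSchema {
  const mapped = TYPE_MAP[t.bsonType];
  const schema: ExpandedSchema = {
    // Fall back to the internal name when the type is not in the assumed map.
    'x-bsonType': mapped ? mapped.bsonType : t.bsonType,
    'x-metaData': pickMetadata(t),
  };
  if (mapped) schema.type = mapped.type;
  // Sample values are taken only where the internal schema provides them.
  if (t.values) schema['x-sampleValues'] = t.values;
  if (t.fields) {
    schema.properties = {};
    for (const f of t.fields) schema.properties[f.name] = convertField(f);
  }
  if (t.types) schema.items = { anyOf: t.types.map(convertType) };
  return schema;
}

// Mixed types become separate anyOf entries instead of `type: [...]`,
// because each entry needs its own x-bsonType, metadata and samples.
function convertField(f: InternalField): ExpandedSchema {
  return { 'x-metaData': pickMetadata(f), anyOf: f.types.map(convertType) };
}
```

With this shape, a field seen as both a string and an int32 ends up with one field-level `x-metaData` plus two `anyOf` entries, each carrying its own `x-bsonType`, `x-metaData`, and `x-sampleValues`.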

@paula-stacho paula-stacho marked this pull request as ready for review February 5, 2025 14:32
@paula-stacho paula-stacho requested a review from Anemy February 5, 2025 14:32
Member

@Anemy Anemy left a comment

lgtm!

One thing we could maybe do around sample values is add some tests for outliers that would exercise performance: really long strings, really long arrays, tons of fields, that sort of thing. Probably nothing to do here, since it's more related to how we build the internal schema, but it's worth thinking about in the future, I reckon.

@Anemy
Member

Anemy commented Feb 5, 2025

😂 you're already thinking about what I just left a comment for re #221

@paula-stacho
Contributor Author

😂 you're already thinking about what I just left a comment for re #221

Yes, we already had cropping for strings in the analysis, so I'm just adding some extra cropping in the same place. Big arrays and complex documents will be covered by https://jira.mongodb.org/browse/COMPASS-8905
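
As a rough illustration of the kind of string cropping mentioned here, a minimal sketch might look like the following. The function name and the default limit are hypothetical; the thread does not state the cutoff or the helper the analysis actually uses.

```typescript
// Hypothetical limit; the real cutoff used during analysis is not given in this thread.
const MAX_SAMPLE_STRING_LENGTH = 10000;

// Crop overly long string samples; leave every other value untouched.
function cropSampleValue(
  value: unknown,
  maxLength: number = MAX_SAMPLE_STRING_LENGTH
): unknown {
  if (typeof value === 'string' && value.length > maxLength) {
    return value.slice(0, maxLength);
  }
  return value;
}
```

Applying something like this while collecting samples, rather than at export time, would keep the internal schema small no matter which output format is generated later.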

@paula-stacho paula-stacho merged commit bb683be into COMPASS-6862-schema-export-multiple-formats Feb 6, 2025
18 checks passed
@paula-stacho paula-stacho deleted the COMPASS-8702-2 branch February 6, 2025 08:36
paula-stacho added a commit that referenced this pull request Feb 6, 2025
…S-8702 COMPASS-8709 (#222)

* feat: add analyzeDocuments + SchemaAccessor COMPASS-8799 (#216)



---------

Co-authored-by: Anna Henningsen <[email protected]>

* feat: internal to MongoDB $jsonSchema conversion COMPASS-8701 (#218)


---------

Co-authored-by: Anna Henningsen <[email protected]>

* feat: internal to Standard JSON Schema conversion COMPASS-8700 (#219)

* feat: internal to Expanded JSON Schema conversion COMPASS-8702  (#220)

---------

Co-authored-by: Anna Henningsen <[email protected]>