feat(schema, data-modeling): use iterable cursor in schema analysis COMPASS-9150 COMPASS-9315 #6894
  };
- const schemaAccessor = await analyzeDocuments(docs, schemaParseOptions);
+ const schemaAccessor = await analyzeDocuments(
+   sampleCursor,
This could impact performance. I have not tested that. Should we?
Any specific concerns or predictions for where the biggest impact might be? I'd definitely defer to another team member to give you an answer here, but as a newbie, what would it entail to validate at least the biggest bottlenecks we think may come up?
We now check whether the signal is aborted between parsing each document. I don't think that adds any meaningful overhead. What could change is that we no longer call toArray and pass all of the documents in synchronously, so the schema analysis will wait longer if there are any round trips. I don't think that would really slow things down either, but I wanted to raise it to make sure it's on our minds.
Sorry I didn't get back sooner. I agree: we've shifted the cost, but in a way that should leave the app a bit more responsive, since it acts on one document at a time. Plus, this takes advantage of the driver's lazy deserialization. In the end the time spent should be equivalent, just broken up over more event loop cycles.
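For anyone following the thread, here is a rough sketch of the pattern being discussed: consuming documents one at a time from an async iterable and checking for an abort between documents, instead of buffering everything first. It is not the code in this PR. `fakeCursor`, `countFields`, `Doc`, and `SignalLike` are hypothetical names made up for illustration, and the field counting is only a stand-in for the real schema analysis.

```typescript
// Hypothetical shapes for illustration only; not Compass or driver types.
type Doc = Record<string, unknown>;
interface SignalLike {
  readonly aborted: boolean; // a real AbortSignal also has this property
}

// Stand-in for a driver cursor: yields documents one at a time, with an
// await between them to mimic a possible network round trip.
async function* fakeCursor(docs: Doc[]): AsyncGenerator<Doc> {
  for (const doc of docs) {
    await Promise.resolve();
    yield doc;
  }
}

// Counts how often each top-level field appears. The signal is checked
// for every document, so an abort takes effect mid-stream rather than
// only after all documents have been fetched.
async function countFields(
  source: AsyncIterable<Doc>,
  signal: SignalLike
): Promise<Map<string, number>> {
  const counts = new Map<string, number>();
  for await (const doc of source) {
    if (signal.aborted) {
      throw new Error('Analysis aborted');
    }
    for (const key of Object.keys(doc)) {
      counts.set(key, (counts.get(key) ?? 0) + 1);
    }
  }
  return counts;
}
```

Each `for await` step yields back to the event loop, which is the "same work, more event loop cycles" trade-off described above.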
COMPASS-9150 COMPASS-9315
Marked for release notes, as this could impact how folks analyze their schema: they should be able to abort more predictably now that we check for signal abort between each document being analyzed.
A bit of additional context: we have COMPASS-8925 for moving the rest of Compass to pass the abort signal to the driver's methods and get rid of some of our session workarounds in the data-service. Passing an abort signal to the driver's aggregate method with the `Abortable` type, as we do in this PR, is the first time we're doing it in Compass.