This is a somewhat detailed roadmap for the development of this crate, as I'd like to be transparent about what changes users can expect. I will update it roughly once a month with all the main milestones.
This plan is not set in stone, and some other features could be implemented outside of these milestones (e.g., moving some dependencies behind features).
❤️ If you'd like to help with the development, consider sponsoring me (like Sentry does)
Rework Compiler
Right now, Registry owns all documents, which means user-provided documents have to be moved or cloned into it, and it relies on some unsafe code for self-references. However, if Registry is split into storage and a resolution context, user-provided documents can be borrowed (so the registry stores only fetched ones), and the unsafe code is gone.
Additionally, this will help with ValidationOptions, which would no longer need to hold an owned registry or resources. They are only needed during compilation, so there is no need to own them (unless they are fetched via a retriever).
Schema clones during compilation should be gone by this point.
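To make the split more concrete, here is a minimal sketch of the idea, not the actual design. All names (Storage, ResolutionContext, add_borrowed, resolve) are hypothetical, and a String stands in for a real JSON document such as serde_json::Value. The storage owns only fetched documents, while the context borrows user-provided documents for the duration of compilation.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a JSON document (the crate itself works with serde_json::Value).
type Document = String;

/// Owns only the documents that were fetched via a retriever.
#[derive(Default)]
struct Storage {
    fetched: HashMap<String, Document>,
}

/// Borrows user-provided documents and the storage; lives only during compilation.
struct ResolutionContext<'a> {
    user: HashMap<&'a str, &'a Document>,
    storage: &'a Storage,
}

impl<'a> ResolutionContext<'a> {
    fn new(storage: &'a Storage) -> Self {
        Self { user: HashMap::new(), storage }
    }

    /// Registers a user-provided document without moving or cloning it.
    fn add_borrowed(&mut self, uri: &'a str, document: &'a Document) {
        self.user.insert(uri, document);
    }

    /// Looks up borrowed documents first, then falls back to fetched ones.
    fn resolve(&self, uri: &str) -> Option<&'a Document> {
        if let Some(document) = self.user.get(uri) {
            return Some(*document);
        }
        self.storage.fetched.get(uri)
    }
}
```

Because the context only holds references, the caller keeps ownership of its documents, and no self-referential structure is required.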
Generic JSON
Reworking the compiler will simplify direct integration of PyO3 data structures into the compilation process without the need to convert them into serde_json::Value (it is much easier to just borrow things for the duration of compilation). This area still needs some refining, but generic JSON input would be easier to implement after this step.
In the Python bindings, the overhead of data (de)serialization when compiling a validator or validating an instance is sometimes up to 80%, so generic input should speed things up considerably.
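As a rough illustration of what generic JSON input could look like, the sketch below defines a trait that validation code is written against, so a foreign value type can be validated in place instead of being converted first. The trait Json, the Kind enum, and the PyLike type are all hypothetical; PyLike merely imitates a foreign data structure and is not a PyO3 type.

```rust
/// JSON value kinds that validation code needs to distinguish.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind {
    Null,
    Bool,
    Number,
    String,
    Array,
    Object,
}

/// Hypothetical abstraction over any JSON-like value.
trait Json {
    fn kind(&self) -> Kind;
    /// Returns the string content if the value is a string.
    fn as_str(&self) -> Option<&str>;
}

/// Stand-in for a foreign value type (for example, objects coming from Python).
#[allow(dead_code)]
enum PyLike {
    None,
    Str(String),
    Int(i64),
    List(Vec<PyLike>),
}

impl Json for PyLike {
    fn kind(&self) -> Kind {
        match self {
            PyLike::None => Kind::Null,
            PyLike::Str(_) => Kind::String,
            PyLike::Int(_) => Kind::Number,
            PyLike::List(_) => Kind::Array,
        }
    }

    fn as_str(&self) -> Option<&str> {
        match self {
            PyLike::Str(s) => Some(s.as_str()),
            _ => None,
        }
    }
}

/// A "type"-like check written once for every `Json` implementation.
fn has_kind<J: Json>(instance: &J, expected: Kind) -> bool {
    instance.kind() == expected
}

/// A "minLength"-like check: non-strings pass, strings are measured in code points.
fn min_length_ok<J: Json>(instance: &J, min: usize) -> bool {
    match instance.as_str() {
        Some(s) => s.chars().count() >= min,
        None => true,
    }
}
```

With this shape, supporting another input type means implementing the trait for it; the checks themselves stay unchanged and no intermediate value tree is built.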
Website
As I did with css-inline.org, the crate could already be integrated into a webpage via WASM. A custom retriever built on web-sys should enable even more flexibility. Some JS bindings will be needed at this point for a nice integration.
Performance, performance, performance
I have tons of ideas about it. My main use case is speeding up the generation of instances that match a schema (in hypothesis-jsonschema & schemathesis), where the exact errors don't matter, so I am going to focus on is_valid performance.
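The reason a boolean check can be cheaper than error reporting is shown in the simplified sketch below. It is not the crate's implementation: the Check enum and the free functions is_valid and errors are hypothetical and cover only two numeric keywords. The boolean path stops at the first failure and allocates nothing, while the error path visits every check and builds messages.

```rust
/// Hypothetical, heavily simplified numeric keywords.
enum Check {
    Minimum(f64),
    Maximum(f64),
}

impl Check {
    fn passes(&self, value: f64) -> bool {
        match *self {
            Check::Minimum(min) => value >= min,
            Check::Maximum(max) => value <= max,
        }
    }

    /// Builds a human-readable message; only needed on the error-reporting path.
    fn describe(&self, value: f64) -> String {
        match *self {
            Check::Minimum(min) => format!("{} is less than the minimum of {}", value, min),
            Check::Maximum(max) => format!("{} is greater than the maximum of {}", value, max),
        }
    }
}

/// Fast path: short-circuits on the first failing check, no allocation.
fn is_valid(checks: &[Check], value: f64) -> bool {
    checks.iter().all(|check| check.passes(value))
}

/// Slow path: evaluates every check and allocates a message per failure.
fn errors(checks: &[Check], value: f64) -> Vec<String> {
    checks
        .iter()
        .filter(|check| !check.passes(value))
        .map(|check| check.describe(value))
        .collect()
}
```

When generating instances, only the result of the fast path is needed, which is why optimizing it pays off the most for that use case.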