Skip to content

Conversation

@freshtonic
Copy link
Contributor

@freshtonic freshtonic commented Apr 22, 2025

Includes a version bump to the latest sqltk version which vendors

sqlparser as sqltk-parser and includes an extra parameter in
Transform::transform to allow contextual transformations (instead of
solely by type).

The following refactorings were done as part of this PR:

  1. Break sole Transform impl into modular rules using the
    TransformationRule trait.

  2. Removed EqlFunctionTracker - this existed to support Expr
    transformations in ORDER BY clauses. With contextual transformation
    it was no longer required.

  3. Removed TypeCell - it did not carry its weight and made debugging
    really hard. The TypeRegistry now associates nodes to Arc
    using an intermediary TypeVar.

An extra method was added to the post type checking step in the
EqlMapper to resolve all Value nodes which have no concrete type
after the inference step to NativeValue.

Additionally, I went through many levels of Hell trying to figure out how I
broke some existing tests so this PR also includes:

feat: tracing of InferType impls & Unifier::unify

Debugging type checking code in VSCode is suboptimal - we really want a
graphical representation of inference at every step (but also a debugger
that doesn't lock up at inconvenient times would be great).

To make instrumentation consistent I implemented a #[trace_infer]
attribute macro which can be used on InferType impls.

The Unifier has a single #[tracing::instrument!(..)] call and there
is also code in EqlMapper to dump nodes types and type substitutions
at the end of the type checking process which is useful when debugging
failing inference tests.

To get a trace from a test, uncomment the init_tracing(); line.

The output is very verbose but very useful.

Acknowledgment

By submitting this pull request, I confirm that CipherStash can use, modify, copy, and redistribute this contribution, under the terms of CipherStash's choice.

@freshtonic freshtonic requested a review from tobyhede April 22, 2025 06:33
@freshtonic freshtonic force-pushed the feat/mapper/group-by branch from f6c0e22 to 2e23096 Compare April 22, 2025 12:13
@tobyhede
Copy link
Contributor

Is it possible to make the tracing optional? Maybe use a crate feature?
I assume it will add to compile time, will it add much overhead to release build?

@freshtonic
Copy link
Contributor Author

Is it possible to make the tracing optional? Maybe use a crate feature? I assume it will add to compile time, will it add much overhead to release build?

Disabling tracing of inference at compile time is definitely possible.

It doesn't seem to significantly affect compilation times FWIW. As for runtime overhead, my understanding is that it's practically a no-op when there's no subscriber attached.

@freshtonic freshtonic force-pushed the feat/mapper/group-by branch 2 times, most recently from 5a46c38 to 0212074 Compare April 24, 2025 01:48
Includes a version bump to the latest `sqltk` version which vendors
`sqlparser` as `sqltk-parser` and includes an extra parameter in
`Transform::transform` to allow contextual transformations (instead of
solely by type).

The following refactorings were done as part of this PR:

1. Break sole `Transform` impl into modular rules using the
   `TransformationRule` trait.

2. Removed `EqlFunctionTracker` - this existed to support `Expr`
   transformations in `ORDER BY` clauses. With contextual transformation
   it was no longer required.

3. Removed `TypeCell` - it did not carry its weight and made debugging
   really hard. The `TypeRegistry` now associates nodes to Arc<Type>
   using an intermediary `TypeVar`.

An extra method was added to the post type checking step in the
`EqlMapper` to resolve all `Value` nodes which have no concrete type
after the inference step to `NativeValue`.
Debugging type checking code in VSCode is suboptimal - we really want a
graphical representation of inference at every step (but also a debugger
that doesn't lock up at inconvenient times would be great).

To make instrumentation consistent I implemented a `#[trace_infer]`
attribute macro which can be used on `InferType` impls.

The `Unifier` has a single `#[tracing::instrument!(..)]` call and there
is also code in `EqlMapper` to dump nodes types and type substitutions
at the end of the type checking process which is useful when debugging
failing inference tests.

To get a trace from a test, uncomment the `init_tracing();` line.

The output is *very* verbose but very useful.
New mise command: `mise rust:version`

It is run during CI and can also be invoked locally.

It will output something like this:

```
$ mise run rust:version
[rust:version] $ echo "rustc --version         = " $(rustc --version)
rustc --version         =  rustc 1.86.0 (05f9846f8 2025-03-31)
cargo --version         =  cargo 1.86.0 (adf9b6ad1 2025-02-28)
cargo fmt --version     =  rustfmt 1.8.0-stable (05f9846f89 2025-03-31)
cargo clippy --version  =  clippy 0.1.86 (05f9846f89 2025-03-31)
```

Very handy to verify that your local toolchain is the same as CI when
rustfmt is failing the build but is fine locally.
Statement transformation should be performed when EQL columns are used,
not solely on whethere there are EQL literals that require encryption.

Because using EQL columns can reqire insertion of EQL helper functions.
The expr type could not be resolved because the transformation rule was
using the wrong AST node to build a `NodeKey`.
@freshtonic freshtonic force-pushed the feat/mapper/group-by branch from 0212074 to 5301649 Compare April 26, 2025 15:08
The Proxy must be able to reliably detect whether a statement requires
transformation because if a statement does not require transformation
then the potentially expensive AST rebuilding step can be skipped.

The result of type-checking is insufficient in general to tell whether a
statement requires transformation unless the `TransformationRule` logic
is duplicated in the Proxy - which we don't want of course.

This commit extends the `TransformationRule` trait with a
`would_edit` method which answers the question "would this rule change
the AST if it was applied?".

Additionally, a new `TranformationRule` impl `DryRunnable` wraps another
rule in such a way that it can "pretend" to be performing a
`Transform` (as far as `sqltk` is concerned) when really its doing a
dry-run after which it will tell us if it *would* change the AST.

`TypedCheckedStatement::requires_transform` is a new method that wraps
up the dry-run logic and tells the called whether a statement must be
transformed
@freshtonic freshtonic force-pushed the feat/mapper/group-by branch from 5301649 to 9e4d145 Compare April 27, 2025 03:38
@freshtonic freshtonic merged commit 7f95c7a into main Apr 28, 2025
2 checks passed
@freshtonic freshtonic deleted the feat/mapper/group-by branch April 28, 2025 23:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants