Skip to content

Conversation

wprzytula
Copy link
Collaborator

@wprzytula wprzytula commented Jun 29, 2025

Generated with GPT-4o and manually (heavily) redacted

Stacked on #341 (first 3 commits).

Problem Overview

The Rust wrapper diverges from the CPP Driver in several areas of configuration, including:

  1. default settings,
  2. request timeouts,
  3. execution profile behavior, and
  4. load balancing policies.

These inconsistencies can lead to confusion for users migrating from the CPP Driver or relying on its documented behavior. Additionally, certain code patterns in the Rust wrapper could benefit from refactoring for improved clarity and maintainability.

Solution

All changes are made to align the Scylla Rust wrapper with the CPP Driver's behavior while improving code organization and defensive programming practices.

The commits do the following:

  1. Style and typo fixes:

    • Fixes minor style and typographical issues in exec_profile.rs.
  2. Paste CPP Driver defaults in a comment:

    • Adds comments listing the CPP Driver's default settings and configuration code to ensure no defaults are accidentally omitted in the Rust wrapper.
  3. Use MonotonicTimestampGenerator by default:

    • Ensures the default timestamp generator matches the CPP Driver's behavior, as well as most of the drivers'.
  4. Enable TCP keepalive by default:

    • Aligns with the CPP Driver, but sets the keepalive interval to 2 seconds instead of 0 seconds due to libuv's rejection of 0-second configurations (and Rust driver's complaints that 1 sec is too short, whatever).
  5. Refactor build_session_builder as a method:

    • Moves build_session_builder to be a method of CassCluster, as it perfectly makes sense.
  6. Set default policies explicitly:

    • Explicitly sets default retry and speculative execution policies in the default execution profile builder to prevent future changes from breaking functionality.
  7. Describe load balancing policy divergence:

    • Adds comments explaining the divergence in load balancing policy behaviour between the CPP Driver and the Rust wrapper. The Rust wrapper defaults to RoundRobin, requiring explicit local DC configuration for DC-aware policies.
  8. Use cluster defaults for unset settings in execution profiles:

    • Modifies CassExecProfile to use the cluster's default profile for settings not explicitly set in the execution profile, ensuring consistency with the CPP Driver.
  9. Allow unsetting retry policy in execution profile:

    • Passing null pointer as a retry policy to cass_execution_profile_retry_policy shall unset the retry policy, making the driver use the retry policy set in the cluster. This commit makes it true.
  10. Crate-expose CassPtr::null(_mut) constructors:

    • For use in tests.
  11. Unit test fetching unset settings from cluster:

  • Verifies that settings that are unset on execution profile are fetched from the cluster.

The remaining commits are not described here. I decided it's enough to read the commit messages.

Discussion Needed

Point 7: Should we reconsider the default load balancing policy to match the CPP Driver's behavior (DCAware with automatic local DC initialization)? While the current approach avoids potential pitfalls, it diverges from the CPP Driver's default.

Summary

These changes improve the correctness, maintainability, and alignment of the Scylla Rust wrapper with the CPP Driver's behavior. They also enhance code clarity and defensive programming practices.

Testing

I included one unit test for the execution profiles. The test verifies that each setting is inherited from the cluster iff it's unset in the execution profile.

Integration tests for consistency and serial consistency settings are planned as a follow-up.

Pre-review Checklist

  • I have split my patch into logically separate commits.
  • All commit messages clearly explain what they change and why.
  • PR description sums up the changes and reasons why they should be introduced.
  • I have implemented Rust unit tests for the features/changes introduced.
  • [ ] I have enabled appropriate tests in Makefile in {SCYLLA,CASSANDRA}_(NO_VALGRIND_)TEST_FILTER.
  • [ ] I added appropriate Fixes: annotations to PR description.

@wprzytula wprzytula self-assigned this Jun 29, 2025
@wprzytula wprzytula added bug Something isn't working P1 P1 priority item - very important labels Jun 29, 2025
@wprzytula wprzytula added this to the 0.6 milestone Jun 29, 2025
@wprzytula wprzytula force-pushed the fix-driver-defaults branch 4 times, most recently from f9ba8ff to 19b229e Compare June 30, 2025 16:46
@wprzytula wprzytula marked this pull request as ready for review June 30, 2025 17:24
@wprzytula wprzytula requested review from Copilot and Lorak-mmk June 30, 2025 17:24
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR aligns the Rust driver’s configuration defaults and execution profile behavior with the CPP Driver, while refactoring and improving code clarity.

  • Refactors consistency and serial consistency handling using new Rust syntax and a MaybeUnset enum
  • Moves build_session_builder to a method on CassCluster and updates related default configurations
  • Adjusts default TCP keepalive, timestamp generator settings, and enhances execution profile defaults management

Reviewed Changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
scylla-rust-wrapper/src/statement.rs Updates consistency handling with new Result pattern and explicit error paths
scylla-rust-wrapper/src/session.rs Changes build_session_builder usage to a method on CassCluster
scylla-rust-wrapper/src/load_balancing.rs Updates load balancing comments to explain deviation from CPP Driver defaults
scylla-rust-wrapper/src/exec_profile.rs Refactors execution profile settings inheritance and overrides tracking
scylla-rust-wrapper/src/cluster.rs Adjusts cluster defaults including TCP keepalive and timestamp generator
Other files Various style, typo fixes and API consistency improvements
Comments suppressed due to low confidence (2)

scylla-rust-wrapper/src/statement.rs:337

  • Ensure to revisit the FIXME comments when unsetting consistency becomes supported, and consider adding a note in the public API documentation to explain the current limitation.
    let Ok(maybe_set_consistency) = get_consistency_from_cass_consistency(consistency) else {

scylla-rust-wrapper/src/cluster.rs:304

  • [nitpick] The new TCP keepalive interval of 2 seconds is well-documented; verify cross-environment support to prevent unexpected connection issues.
            .tcp_keepalive_interval(DEFAULT_TCP_KEEPALIVE_INTERVAL)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"(whether it makes sense or not is another topic
for dicussion): if consistency is not set in the execution profile,
it is set to the hardcoded default consistency, not the one from
the cluster's default profile."

We are aiming for a "drop-in" replacement, but I think that as with most other things it is a matter of balance.
For example, we are not interested in replicating cpp-drivers bugs.
When migrating to this driver, people will have to pay more attention that with ordinary update, so I think it is a good idea to have a reasonable defaults in cpp-driver, instead of tying everything to cpp-driver.
Not that cpp-driver changed its default consistency multiple times, so why can't we when implementing new driver?

Copy link
Collaborator Author

@wprzytula wprzytula Jul 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that cpp-driver changed its default consistency multiple times, so why can't we when implementing new driver?

Well, but the last such change occurred 10 years ago...
Since then, users could have become very used to the driver's behaviour and defaults.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, but this change makes sense and fixes a possible footgun (I'm talking about fallback to cluster execution profile). I don't think it will break any workflows.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's do it the following way:

  1. I'll open an issue about making this consistent (but diverging from the CPP Driver).
  2. We'll merge this PR as-is.
  3. I'll open another PR that addresses the issue.
  4. We'll merge that one as well.

Rationale: I want to keep bug fixes (making the behaviour correct wrt expectations from the CPP Driver) separate from improvements (making the behaviour more correct than the CPP Driver). Let the points of divergence be as discoverable and standing out as possible.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@wprzytula wprzytula force-pushed the fix-driver-defaults branch from 19b229e to 92576fd Compare July 1, 2025 06:19
@wprzytula wprzytula requested review from Lorak-mmk and removed request for Lorak-mmk July 1, 2025 06:20
@wprzytula wprzytula force-pushed the fix-driver-defaults branch 3 times, most recently from e27b16f to f890430 Compare July 5, 2025 17:52
@wprzytula wprzytula requested a review from Lorak-mmk July 5, 2025 18:00
@wprzytula wprzytula modified the milestones: 0.6, 0.5.1 Jul 6, 2025
@wprzytula wprzytula force-pushed the fix-driver-defaults branch from f890430 to c3abe5a Compare July 7, 2025 12:57
@wprzytula
Copy link
Collaborator Author

Rebased on master.
Now I'm going to use the new Rust Driver API to fix statement and batch.

@wprzytula wprzytula force-pushed the fix-driver-defaults branch from c3abe5a to f319dd5 Compare July 7, 2025 13:52
@wprzytula
Copy link
Collaborator Author

Fixes thanks to new API exposed by Rust Driver:

  1. request timeout is now properly set in batches.
  2. (serial) consistency can now be unset in statements/batches.

@wprzytula wprzytula force-pushed the fix-driver-defaults branch from f319dd5 to 01b55ec Compare July 8, 2025 09:19
@wprzytula
Copy link
Collaborator Author

Rebased on master.

@wprzytula wprzytula force-pushed the fix-driver-defaults branch from 01b55ec to de7b88b Compare July 8, 2025 09:53
Comment on lines +191 to +196
// Load balancing is set in LoadBalancingConfig::build().
// The default load balancing policy is DCAware in CPP Driver.
// NOTE: CPP Driver initializes the local DC to the first node the client
// connects to. This is tricky, a possible footgun, and hard to be done with
// the current Rust driver implementation, which is why do don't use DC awareness
// by default.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe something like this should be described in user-facing documentation?

Copy link
Collaborator Author

@wprzytula wprzytula Jul 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What piece of documentation do you mean?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably cassandra.h. Is there any other user-facing documentation?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we'll have a Scylla Documentation, as for the other drivers.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nowhere in cassandra.h there is any notice about this behaviour. I mean, cassandra.h says nothing about the peculiar behaviour of CPP Driver that the first contact point's DC is considered local.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only the changelog mentions it:

1.0.0-rc1
===========
Dec 22, 2014

Features
--------
* DC-aware load balancing policy is now the default and uses the DC of the first
  connected contact point. The DC-aware policy now has settings to control the
  number of remote hosts used in each DC (after exhausting the hosts in the
  local DC) and whether to ignore local consistency levels.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once we have the Scylla Documentation for this driver, we should mention this divergence.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, but this change makes sense and fixes a possible footgun (I'm talking about fallback to cluster execution profile). I don't think it will break any workflows.

wprzytula added 2 commits July 8, 2025 20:12
The number of defaults in the CPP Driver is quite large, so not to
accidentally omit any, the whole list of them, as well as the
configuration code, is pasted in a comment in `cass_cluster_new`.
@wprzytula wprzytula force-pushed the fix-driver-defaults branch from de7b88b to 3d8b0ee Compare July 8, 2025 18:12
@wprzytula
Copy link
Collaborator Author

Rebased on master.

wprzytula added 16 commits July 8, 2025 20:14
This is in line with the CPP Driver.

Also, fixed the leftover old comment about the default value of the
timestamp generator, which has not correct anymore for some years.
When the DataStax changed the default value of the timestamp generator
from `ServerTimestampGenerator` to `MonotonicTimestampGenerator`, they
forgot to update the comment in `cassandra.h`.
The keepalive is enabled by default in the CPP driver.
There is one difference: the default TCP keepalive interval (delay
after the last message sent/received before the first keepalive probe
is sent) is set to 0 sec in the CPP driver. However, libuv started
to reject such configuration on purpose since 1.49, so let's follow
its reasoning and set it to 2 sec (Rust driver warns that 1 sec
is too short and would drag a performance penalty, whatever).
See:
- <https://docs.libuv.org/en/v1.x/tcp.html#c.uv_tcp_keepalive>
- <https://github.com/libuv/libuv/blob/513751e2fcf1b5be5238de66b2c06f3e0623aca0/src/unix/tcp.c#L591>
- <https://github.com/libuv/libuv/blob/513751e2fcf1b5be5238de66b2c06f3e0623aca0/src/unix/tcp.c#L545>
`build_session_builder` makes perfect sense as a method of `CassCluster`
rather than a standalone function.

This commit is viewed best without whitespace changes.
The default execution profile in the Scylla Rust wrapper
has been setting default retry policy and speculative execution policy
correctly (in line with the CPP Driver). However, it's a better
defensive pattern to set these explicitly in the default execution
profile builder, so that any future changes to the default execution
profile in the Rust Driver do not break the Scylla Rust wrapper.
CPP Driver uses DCAware load balancing policy by default.
The tricky part is that it initializes the local DC to the first node
the client connects to. This is tricky, a possible footgun, and hard to
be done with the current Rust driver implementation, which is why we use
RoundRobin by default (we turn off DC awareness by default, requiring
the user to provide the local DC explicitly if they want to use it).

The comments describing this behaviour are added to the `cluster.rs`
and `load_balancing.rs` files for better discoverability and
understanding of this divergence.
This commit modifies the `CassExecProfile` to use the defaults from
the cluster's default profile for settings that are not explicitly
set in the execution profile. This is consistent with the behavior
of the CPP driver.
Note that the consistency is handled separately, as it has a separate
logic in CPP Driver (whether it makes sense or not is another topic
for dicussion): if consistency is not set in the execution profile,
it is set to the hardcoded default consistency, not the one from
the cluster's default profile.
According to the CPP Driver's code, passing a null retry policy pointer
to `cass_execution_profile_set_retry_policy` shall unset the retry
policy in the execution profile, making the driver use the retry
policy from the cluster's default profile.
This has been different in the Rust wrapper, which has returned
`CASS_ERROR_LIB_BAD_PARAMS` when passed a null pointer.
This commit fixes this, using the new execution profile overrides
mechanism.
CassPtr has two constructors that are useful for testing:
- `null()` for creating a null pointer
- `null_mut()` for creating a null mutable pointer
Both are safe to be used, because the unsafe code that CassPtr uses
always safely dereferences the pointer, i.e., only after nullity checks.
A unit test is added to ensure that when an execution profile does not
set specific settings (serial consistency, request timeout, retry policy,
and speculative execution policy), as well as if it sets them once
and later unsets them, the settings are fetched from the default
execution profile of the cluster.
The semantics of `CASS_CONSISTENCY_UNKNOWN` is that the corresponding
setting should be unset, so that the default from the cluster or
execution profile is used. Now that the Rust Driver exposes API
for unsetting (serial) consistency on statements and batches, we can
use it in the wrapper.
This refactors the function `cass_execution_profile_set_request_timeout`
to use the `MaybeUnsetConfig` type for the timeout argument. This is
another step towards unifying the way maybe-unset configuration options
are handled in the Scylla Rust wrapper.

Additionally, this change fixes a bug that the request timeout was
not being turned off when the timeout was set to 0. The bug would cause
immediate timeouts in the driver, as the request timeout was set to 0.

The unit test for execution profiles has been updated to check
correctness after this change.
As @Lorak-mmk smartly pointed out, the `MaybeUnsetConfigValue` trait
should use a type parameter for the C value instead of an associated
type. This allows us to implement conversion from multiple C value types
to the same Rust type. Another benefit is that it allows us to
implement `MaybeUnsetConfigValue` for types that are not
`'static`, which will soon be useful for `Option<&RetryPolicy>`.

This commit refactors the `MaybeUnsetConfigValue` trait, as well as
adds a `CValue` type parameter to the `MaybeUnsetConfig` enum.
Without that, Rust refused `CValue` blanket type parameters in the
`impl` block for `MaybeUnsetConfig`, because it was not constrained
in any way.
Fortunately, this type parameter will be easily inferred in most cases,
so this won't bring any fuss.
Implementation for handling retry policies as `MaybeUnsetConfigValue`
is added. It will be used in further commits.
This refactors the function `cass_cluster_set_retry_policy` to use
the `MaybeUnsetConfig` type for the retry policy argument. This is
another step towards unifying the way maybe-unset configuration options
are handled in the Scylla Rust wrapper.
This refactors the function `cass_execution_profile_set_retry_policy` to
use the `MaybeUnsetConfig` type for the retry policy argument. This is
another step towards unifying the way maybe-unset configuration options
are handled in the Scylla Rust wrapper.
This refactors the statement's and batch's retry policy setters to
use the `MaybeUnsetConfig` type for the retry policy argument. This is
another step towards unifying the way maybe-unset configuration options
are handled in the Scylla Rust wrapper.
@wprzytula wprzytula force-pushed the fix-driver-defaults branch from 3d8b0ee to af1bb73 Compare July 8, 2025 19:03
@wprzytula
Copy link
Collaborator Author

Made CValue a type parameter instead of an associated type on MaybeUnsetConfigValue.

@wprzytula wprzytula requested a review from Lorak-mmk July 8, 2025 20:00
Comment on lines +191 to +196
// Load balancing is set in LoadBalancingConfig::build().
// The default load balancing policy is DCAware in CPP Driver.
// NOTE: CPP Driver initializes the local DC to the first node the client
// connects to. This is tricky, a possible footgun, and hard to be done with
// the current Rust driver implementation, which is why do don't use DC awareness
// by default.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably cassandra.h. Is there any other user-facing documentation?

@wprzytula wprzytula merged commit c0ddcc3 into scylladb:master Jul 10, 2025
13 of 17 checks passed
@wprzytula wprzytula deleted the fix-driver-defaults branch July 10, 2025 12:34
@wprzytula wprzytula mentioned this pull request Jul 15, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working P1 P1 priority item - very important
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants