Skip to content

[CDRIVER-6048] Add New Time and Duration Functionality #2074

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 55 commits into
base: master
Choose a base branch
from

Conversation

vector-of-bool
Copy link
Contributor

@vector-of-bool vector-of-bool commented Jul 24, 2025

Refer: CDRIVER-6048

Background

This is a preliminary work for CSOT, to give us a stable foundation for working with time going forward.

The C driver has had some duration/time types (mcd_time_point and mcd_duration), but they weren't widely used in the codebase and could use some TLC. This PR description will only present a high-level overview, as much of the implementation explanation has been added as commentary within the source files themselves.

This PR also updates one core internal submodule to use the new time APIs, as a test-run of deploying them into the codebase in preparation for CSOT.

Fortunately, the vast majority of code additions in this changeset are explanatory comments.

New Time APIs

The new time types are similar to the previous mcd_ time types:

  • struct mlib_duration represents a (possibly negative) difference between two points in time, with a fixed resolution (currently it has microsecond granularity). This is meant to replace any plain integers in the codebase that represent durations as a count of units. Using plain integers is problematic since the associated time unit cannot be statically enforced, and doing arithmetic on plain integers is inherently unsafe. The mlib_duration arithmetic is fully well-defined to use saturating arithmetic, so attempting to make a "too large" duration value will simply clamp to the nearest representable duration. While this is potentially problematic in extreme conditions, it only appears when juggling durations of hundreds of millennia, which reconnects Russia+Alaska and deletes Japan.
  • struct mlib_time_point represents a fixed point-in-time. Currently the implementation is written against the program's monotonic clock. It is encoded as an mlib_duration relative to the unspecified monotonic clock epoch. Because the epoch is unspecified, this type cannot be reliably converted to any "IRL" time point. (I tried writing an lldb pretty-printer to do this for debugging purposes, but it turns out to be very non-trivial and platform-dependent.)
    • The main API surface for interacting with the underlying clock is mlib_now() which simply returns a time-point for the moment that the function is called.
    • The mlib_now() function actually replaces the implementation of bson_get_monotonic_time function, which incidentally fixes CDRIVER-4526, since mlib_now uses the Win32 fine-grained monotonic clock.
    • Adding support for different clocks would be possible, but would require that the struct carry a clock identifier, and would make things significantly more complicated, as doing comparisons/differences between time points of different clocks requires knowing how the clock epochs are related, which, as mentioned, is non-trivial, enough that no one really wants to do this. For C++, for example, there is no stdlib conversation between the monotonic clock time points and the "IRL" clock time points.
  • mlib_timer represents a deadline. This simply stores an "expires-at" mlib_time_point. The main reason for this being a separate type is to provide distinct types and functions specific to manipulating and inspecting deadlines.

Creating and Manipulating Durations

The initial duration API was extremely verbose, and looked like:

int foo(int a, int b) {
  mlib_duration d = mlib_duration_add(mlib_milliseconds(a), mlib_seconds(b));
}

while this works, its incredibly tedious to write. Instead, an mlib_duration() function-like macro was added, which can be used used with two arguments to create a duration from a unit count, or three arguments to do duration arithmetic, and supports nested argument lists:

int foo(int a, int b) {
  mlib_duration d = mlib_duration((a, ms), plus, (b, sec));
}

See the doc comments in mlib/duration.h for an explanation. The macro trickery isn't pretty, but it is heavily commented for future maintainers.

Usage in mongoc-async-cmd

The mongoc-async-cmd component was chosen arbitrarily as a test bed for the new APIs. Unfortunately, the code in there was heavily under-commented and used non-descriptive struct member names, so a lot of effort was spent just deducing what it is actually trying to do with the dozen int64_t values it was using for timeouts/timestamps/deadlines.

A heavy amount of comments and API renames have been applied to the module, to ensure that we're doing what we expect, and so future refactors won't need to spend a whole 3+ days trying to deduce what everything means. This doesn't give me a great outlook on future CSOT refactor work, depending on whether all time-handling code is similarly obtuse or I just happened to pick a module that was particularly problematic.

Additionally, my refactor of mongoc-async-cmd code led to the discovery that command deadlines can be violated (up to 2x or possibly greater) because of a hack that was added for CDRIVER-1571. Since this is a core module underlying all command execution, refactors will be required to prevent this if we want a successful CSOT implementation. An inline function doc-comment explains the problem.

Review Order

Initially, there was attempt to do isolated commits in logical order, but it somewhat broke down towards the end. Instead, it is recommended to review the final changes in the following order:

  1. mlib/platform.h is added to just #include some common platform headers "correctly", since that's actually non-trivial.
  2. mlib/duration.h is the basis of all time functions.
  3. mlib/time_point.h builds upon duration to create points-in-time.
  4. mlib/timer.h a simpler file that adds deadline-specific functions.
  5. The mongoc-async and mongoc-async-cmd files, which refactor to use the time types.
  6. Everything else: There are several other minor changes across the codebase.

This is intended to replace `mcd_duration` as a more
capable duration type.
This uses preprocessor trickery to support more
terse conversion and arithmetic of duration types.
The following changes are made, in order of significance:

1. The `mongoc_stream_poll` function declaration now says the units on its
   timeout.
2. Add a `_mongoc_stream_poll_internal` that takes a deadline timer rather than
   a timeout duration.
3. Move async-command typedefs into the async-command header.
4. async command creation now takes typed durations instead of integers
5. Async command internally uses timers, time points, and durations,
   rather than juggling `int64_t` for everything.
7. Heavy comments in async-cmd files are added all over the place to explain
   what is going on.
@vector-of-bool vector-of-bool marked this pull request as ready for review July 24, 2025 22:33
@vector-of-bool vector-of-bool requested a review from a team as a code owner July 24, 2025 22:33
@vector-of-bool vector-of-bool requested a review from kevinAlbs July 24, 2025 22:33
@kevinAlbs kevinAlbs requested a review from eramongodb July 29, 2025 14:27
Copy link
Contributor

@eramongodb eramongodb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice improvements. Initial feedback, more pending.

Comment on lines +51 to +52
#define MONGOC_SECURE_CHANNEL_ENABLED() @MONGOC_ENABLE_SSL_SECURE_CHANNEL@
#if MONGOC_SECURE_CHANNEL_ENABLED() != 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes might be premature potential API breaking changes (e.g. note, here). Consider reverting them for now and deferring public header function-macro improvements to a separate ticket/PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if it would be a breaking change, as the func-macro takes the same value from the @replacement@ that is used for the obj-macro (note the func-macro has a different name, so the obj-macro is still available).

The reason I added these drive-by changes is that some changes cause a "dead-write" warning to spookily appear from the scan-build task in mongoc-async-cmd.cm, and it was easier to rewrite the offending block as a ?: than to dance around the conditional compilation, so I added these macros.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be a breaking change due to users who may be providing their own mongoc-config.h header file (as a whole), not merely setting the MONGOC_ENABLE_SSL_SECURE_CHANNEL macro (like we would prefer them to do instead), per the first linked note:

The internal preprocessor symbol HAVE_STRINGS_H has been renamed BSON_HAVE_STRINGS_H. If you maintain a handwritten bson-config.h you must rename this symbol.

Then again, I suppose this note suggests changing these config macros does not constitute an API breaking change (1.10.0 release notes)? More API ambiguity that needs to be addressed at some point. 😅

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

users who may be providing their own mongoc-config.h header file (as a whole)

I hope most users do not. I expect their copies need to be updated when new macros are added. I see a semi-relevant result on GitHub code search (from 7 years ago):

# Rather than run automake and autoconf, etc we simply copy the file in
$(CWD)/mongo-c-driver/src/mongoc/mongoc-config.h: $(CWD)/mongoc-config.h
	@cp $< $@~ && mv $@~ $@

More API ambiguity that needs to be addressed at some point. 😅

Agreed. Erring towards caution: consider defining the MONGOC_SECURE_CHANNEL_ENABLED() function macros in a private header (maybe common-config.h.in?). That may side-step the issue until we have a better defined API policy (CDRIVER-5705).

* - `mlib_duration(<dur>, <op>, <operand>)`
* Manipulates a duration according to `<op>`.
*
* In the above, `<dur>` may be a parenthsized `mlib_duration` argument list or a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* In the above, `<dur>` may be a parenthsized `mlib_duration` argument list or a
* In the above, `<dur>` may be a parenthesized `mlib_duration` argument list or a

* @brief Obtain the count of microseconds represented by the duration (round
* toward zero)
*/
static inline mlib_duration_rep_t
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not BSON_INLINE? Can/should we remove use of BSON_INLINE for functions? (Not necessarily in this PR.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we now require C99, BSON_INLINE is no longer necessary, and should probably be avoided going forward since it has non-standard behavior in some edge conditions. We can likely change it to basic inline in all public headers in a future PR.

// Duration scalar multiply
#define _mlibDurationInfixOperator_mul(LHS, Fac) _mlibDurationMultiply (_mlibDurationArgument (LHS), (Fac))
static inline mlib_duration
_mlibDurationMultiply (const mlib_duration dur, int fac) mlib_noexcept
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
_mlibDurationMultiply (const mlib_duration dur, int fac) mlib_noexcept
_mlibDurationMultiply (const mlib_duration dur, mlib_upsized_integer fac) mlib_noexcept

Can these functions be made to accept an mlib_upsized_integer?

// Check for POSIX clock functions functions
#if (defined(_POSIX_C_SOURCE) && _POSIX_C_SOURCE >= 199309L) || (defined(_DEFAULT_SOURCE) && !defined(_WIN32))
#include <sys/types.h>
#define mlib_have_posix_clocks() 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#define mlib_have_posix_clocks() 1
#undef mlib_have_posix_clocks
#define mlib_have_posix_clocks() 1

Safe-guard against pre-existing definitions.

* @brief Given two time points, selects the time point that occurs earliest
*/
static inline mlib_time_point
mlib_soonest (mlib_time_point l, mlib_time_point r)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
mlib_soonest (mlib_time_point l, mlib_time_point r)
mlib_earliest (mlib_time_point l, mlib_time_point r)

I understand it's meant to be "soonest" as in "sooner vs. later", but this is nevertheless odd/confusing to me.

Suggest borrowing the doc comment and renaming this to mlib_earliest instead, as in "early vs. late".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that name better, yes.

Comment on lines 142 to 143
* @brief User-provided command event callback. Invoked with the final result
* and once when the command connects.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* @brief User-provided command event callback. Invoked with the final result
* and once when the command connects.
* @brief User-provided command event callback. Invoked after a new
* connection is established, and again when the command completes.

Confused by this wording. Suggest reusing the wording used for the typedef, which is clearer to me.

*/
static inline mlib_timer
_acmd_deadline (const mongoc_async_cmd_t *self)
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
{
{
BSON_ASSERT_PARAM (self);

*/
static inline void
_acmd_cancel (mongoc_async_cmd_t *self)
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
{
{
BSON_ASSERT_PARAM (self);

static inline void
_acmd_cancel (mongoc_async_cmd_t *self)
{
// XXX: Should this check if ther command has already finished/failed?
Copy link
Contributor

@eramongodb eramongodb Jul 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// XXX: Should this check if ther command has already finished/failed?
// XXX: Should this check if the command has already finished/failed?

I think the answer might be yes, to ensure:

The [event] callback will be invoked after a new connection is established, and again when the command completes.

If the command is already completed ("finished"), but then the state transitions to cancelled (is this even possible by the time we've left the if (nactive > 0) branch in mongoc_async_run()?), the event callback might be invoked again in mongoc_async_run() with *_ERROR after already being invoked in mongoc_async_cmd_run() with *_(SUCCESS|ERROR|TIMEOUT).

{
BSON_ASSERT_PARAM (cmd);
BSON_ASSERT_PARAM (dbname);

mongoc_async_cmd_t *const acmd = BSON_ALIGNED_ALLOC0 (mongoc_async_cmd_t);
acmd->_connect_delay_timer = mlib_expires_after (connect_delay);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a call to mlib_now() both here for _connect_delay_timer and for _start_time below. Can/should they be reduced to a single mlib_now()?

Copy link
Collaborator

@kevinAlbs kevinAlbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The added comments on the async APIs are very much appreciated.

#define _mlibCommaIfParens(...) ,

/**
* @brief Expands to `1` if the given macro argument is a parenthesized group of
* tokens, otherwize `0`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* tokens, otherwize `0`
* tokens, otherwise `0`

// Feature detection
#ifdef __has_include
#if __has_include(<features.h>)
#include <features.h>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is features.h needed? If no, consider removing. https://man7.org/linux/man-pages/man7/feature_test_macros.7.html notes:

Note: applications do not need to directly include <features.h>; indeed, doing so is actively discouraged.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it looks like it all builds fine without this header. Can't recall why I initially thought it was necessary.

ret.time_since_monotonic_start = mlib_duration_from_timespec (ts);
return ret;
#elif defined(_WIN32)
// Win32 APIs for the high-performance monotonic counter. These APIs never fail after Windows XP
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest using mlib_check to verify the returns of QueryPerformanceFrequency and QueryPerformanceCounter are successful.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel safe omitting these checks as even the standard libray implementation does it

// Number of microseconds beyond the last whole second:
const int64_t subsecond_us = ((ticks % ticks_per_second) * one_million) / ticks_per_second;
mlib_time_point ret;
ret.time_since_monotonic_start = mlib_duration ((whole_seconds_1m, us), plus, (subsecond_us, us));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ret.time_since_monotonic_start = mlib_duration ((whole_seconds_1m, us), plus, (subsecond_us, us));
ret.time_since_monotonic_start = mlib_duration ((whole_seconds, sec), plus, (subsecond_us, us));

Suggest removing whole_seconds_1m to simplify.

* @brief Timer types and functions
* @date 2025-04-18
*
* This file contains APIs for creating fixed-deadline timer object that represent
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* This file contains APIs for creating fixed-deadline timer object that represent
* This file contains APIs for creating fixed-deadline timer object that represents

MONGOC_ASYNC_CMD_SETUP,
// The command has no stream and needs to connect to a peer
MONGOC_ASYNC_CMD_PENDING_CONNECT,
// The command has connected and has a stream, but needs to run stream setup
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest noting setup can be the TLS handshake (which I think is the only setup currently done).

Suggested change
// The command has connected and has a stream, but needs to run stream setup
// The command has connected and has a stream, but needs to run stream setup (e.g. TLS handshake)

@@ -2214,7 +2216,7 @@ test_mongoc_client_descriptions_pooled (void *unused)
/* wait for background thread to discover all members */
start = bson_get_monotonic_time ();
do {
_mongoc_usleep (1000);
mlib_sleep_for (1, ms);
/* Windows IPv4 tasks may take longer to connect since connection to the
* first address returned by getaddrinfo may be IPv6, and failure to
* connect may take a couple seconds. See CDRIVER-3639. */
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: but consider also updating the below check with mlib_timer:

mlib_timer tm = mlib_expires_after (3, sec);
// ...
if (mlib_timer_is_expired (tm)) {

Comment on lines +51 to +52
#define MONGOC_SECURE_CHANNEL_ENABLED() @MONGOC_ENABLE_SSL_SECURE_CHANNEL@
#if MONGOC_SECURE_CHANNEL_ENABLED() != 1
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

users who may be providing their own mongoc-config.h header file (as a whole)

I hope most users do not. I expect their copies need to be updated when new macros are added. I see a semi-relevant result on GitHub code search (from 7 years ago):

# Rather than run automake and autoconf, etc we simply copy the file in
$(CWD)/mongo-c-driver/src/mongoc/mongoc-config.h: $(CWD)/mongoc-config.h
	@cp $< $@~ && mv $@~ $@

More API ambiguity that needs to be addressed at some point. 😅

Agreed. Erring towards caution: consider defining the MONGOC_SECURE_CHANNEL_ENABLED() function macros in a private header (maybe common-config.h.in?). That may side-step the issue until we have a better defined API policy (CDRIVER-5705).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants