Commit b403fa9

docs(gctx): a bit more of how config deserialization works (#16094)
### What does this PR try to resolve?

Enhance the documentation of how config deserialization works.

### How to test and review this PR?

Read the doc?
2 parents 6f3896d + 0fba11f commit b403fa9

3 files changed, +72 -34 lines changed

src/cargo/util/context/de.rs

Lines changed: 22 additions & 1 deletion
@@ -1,4 +1,25 @@
-//! Support for deserializing configuration via `serde`
+//! Deserialization for converting [`ConfigValue`] instances to target types.
+//!
+//! The [`Deserializer`] type is the main driver of deserialization.
+//! The workflow is roughly:
+//!
+//! 1. [`GlobalContext::get<T>()`] creates [`Deserializer`] and calls `T::deserialize()`
+//! 2. Then call type-specific deserialize methods as in normal serde deserialization.
+//!    - For primitives, `deserialize_*` methods look up [`ConfigValue`] instances
+//!      in [`GlobalContext`] and convert.
+//!    - Structs and maps are handled by [`ConfigMapAccess`].
+//!    - Sequences are handled by [`ConfigSeqAccess`],
+//!      which later uses [`ArrayItemDeserializer`] for each array item.
+//!    - [`Value<T>`] is delegated to [`ValueDeserializer`] in `deserialize_struct`.
+//!
+//! The purpose of this workflow is to:
+//!
+//! - Retrieve the correct config value based on source location precedence
+//! - Provide richer error context showing where a config is defined
+//! - Provide a richer internal API to map to concrete config types
+//!   without touching underlying [`ConfigValue`] directly
+//!
+//! [`ConfigValue`]: CV

 use crate::util::context::value;
 use crate::util::context::{ConfigError, ConfigKey, GlobalContext};
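
Review note: the workflow described in the new `de.rs` module docs can be modelled with a small std-only sketch. This is not Cargo's code and does not use serde. `Context`, `FromConfig`, and the simplified `Definition` and `ConfigValue` enums are hypothetical stand-ins chosen for illustration, and only the overall shape (a `get::<T>()` that builds a deserializer, which then dispatches to type-specific methods that look up stored values) is taken from the docs.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for Cargo's `Definition`.
#[derive(Debug, Clone, PartialEq)]
enum Definition {
    Path(String),
    Environment(String),
}

/// Simplified stand-in for `ConfigValue`: each value records where it was defined.
#[derive(Debug, Clone, PartialEq)]
enum ConfigValue {
    Integer(i64, Definition),
    String(String, Definition),
    List(Vec<(String, Definition)>, Definition),
}

/// Toy context: a flat map from dotted keys to already-parsed values.
struct Context {
    values: HashMap<String, ConfigValue>,
}

/// Toy deserializer: it knows the context and the key it is responsible for.
struct Deserializer<'a> {
    ctx: &'a Context,
    key: &'a str,
}

impl<'a> Deserializer<'a> {
    fn lookup(&self) -> Result<&'a ConfigValue, String> {
        self.ctx
            .values
            .get(self.key)
            .ok_or_else(|| format!("missing config key `{}`", self.key))
    }

    /// Primitive path: look the value up and convert it.
    fn deserialize_i64(&self) -> Result<i64, String> {
        match self.lookup()? {
            ConfigValue::Integer(i, _) => Ok(*i),
            other => Err(self.type_error("an integer", other)),
        }
    }

    /// Sequence path: each item is converted on its own.
    fn deserialize_seq(&self) -> Result<Vec<String>, String> {
        match self.lookup()? {
            ConfigValue::List(items, _) => Ok(items.iter().map(|(s, _)| s.clone()).collect()),
            other => Err(self.type_error("a list", other)),
        }
    }

    /// Errors mention where the offending value was defined.
    fn type_error(&self, expected: &str, found: &ConfigValue) -> String {
        let def = match found {
            ConfigValue::Integer(_, d) | ConfigValue::String(_, d) | ConfigValue::List(_, d) => d,
        };
        format!("`{}` expected {}, but the value is defined in {:?}", self.key, expected, def)
    }
}

/// Hypothetical replacement for serde's `Deserialize` in this sketch.
trait FromConfig: Sized {
    fn deserialize(de: Deserializer) -> Result<Self, String>;
}

impl FromConfig for i64 {
    fn deserialize(de: Deserializer) -> Result<Self, String> {
        de.deserialize_i64()
    }
}

impl FromConfig for Vec<String> {
    fn deserialize(de: Deserializer) -> Result<Self, String> {
        de.deserialize_seq()
    }
}

impl Context {
    /// Step 1 of the workflow: create the deserializer and call `T::deserialize`.
    fn get<T: FromConfig>(&self, key: &str) -> Result<T, String> {
        T::deserialize(Deserializer { ctx: self, key })
    }
}
```

With a `Context` holding `build.jobs` as an integer, `ctx.get::<i64>("build.jobs")` succeeds, while `ctx.get::<Vec<String>>("build.jobs")` returns an error that names the definition, which is the "richer error context" goal listed in the docs.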

src/cargo/util/context/mod.rs

Lines changed: 22 additions & 13 deletions
@@ -11,15 +11,31 @@
 //!
 //! There are a variety of helper types for deserializing some common formats:
 //!
-//! - `value::Value`: This type provides access to the location where the
+//! - [`value::Value`]: This type provides access to the location where the
 //!   config value was defined.
-//! - `ConfigRelativePath`: For a path that is relative to where it is
+//! - [`ConfigRelativePath`]: For a path that is relative to where it is
 //!   defined.
-//! - `PathAndArgs`: Similar to `ConfigRelativePath`, but also supports a list
-//!   of arguments, useful for programs to execute.
-//! - `StringList`: Get a value that is either a list or a whitespace split
+//! - [`PathAndArgs`]: Similar to [`ConfigRelativePath`],
+//!   but also supports a list of arguments, useful for programs to execute.
+//! - [`StringList`]: Get a value that is either a list or a whitespace split
 //!   string.
 //!
+//! ## Config deserialization
+//!
+//! Cargo uses a two-layer deserialization approach:
+//!
+//! 1. **External sources → `ConfigValue`** ---
+//!    Configuration files, environment variables, and CLI `--config` arguments
+//!    are parsed into [`ConfigValue`] instances via [`ConfigValue::from_toml`].
+//!    These parsed results are stored in [`GlobalContext`].
+//!
+//! 2. **`ConfigValue` → Target types** ---
+//!    The [`GlobalContext::get`] method uses a [custom serde deserializer](Deserializer)
+//!    to convert [`ConfigValue`] instances to the caller's desired type.
+//!    Precedence between [`ConfigValue`] sources is resolved during retrieval
+//!    based on [`Definition`] priority.
+//!    See the top-level documentation of the [`de`] module for more.
+//!
 //! ## Map key recommendations
 //!
 //! Handling tables that have arbitrary keys can be tricky, particularly if it
@@ -40,14 +56,6 @@
 //! structs/maps, but if it is a struct or map, then it will not be able to
 //! read the environment variable due to ambiguity. (See `ConfigMapAccess` for
 //! more details.)
-//!
-//! ## Internal API
-//!
-//! Internally config values are stored with the `ConfigValue` type after they
-//! have been loaded from disk. This is similar to the `toml::Value` type, but
-//! includes the definition location. The `get()` method uses serde to
-//! translate from `ConfigValue` and environment variables to the caller's
-//! desired type.

 use crate::util::cache_lock::{CacheLock, CacheLockMode, CacheLocker};
 use std::borrow::Cow;
@@ -2061,6 +2069,7 @@ enum KeyOrIdx {
     Idx(usize),
 }

+/// Similar to [`toml::Value`] but includes the source location where it is defined.
 #[derive(Eq, PartialEq, Clone)]
 pub enum ConfigValue {
     Integer(i64, Definition),
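
Review note: the two-layer approach in the new "Config deserialization" section can be illustrated with a minimal sketch. Everything here is a hypothetical simplification, not Cargo's implementation: `parse_scalar` stands in for the first layer (the docs name `ConfigValue::from_toml` for that role), `resolve` stands in for the precedence step of the second layer, and the ordering used (CLI over environment over config file) is an assumption made for this sketch.

```rust
/// Hypothetical, simplified stand-in for Cargo's `Definition`.
#[derive(Debug, Clone, PartialEq)]
enum Definition {
    Path(String),
    Environment(String),
    Cli,
}

impl Definition {
    /// Assumed ordering for this sketch: CLI > environment > config file.
    fn rank(&self) -> u8 {
        match self {
            Definition::Path(_) => 0,
            Definition::Environment(_) => 1,
            Definition::Cli => 2,
        }
    }

    fn is_higher_priority(&self, other: &Definition) -> bool {
        self.rank() > other.rank()
    }
}

/// Simplified stand-in for `ConfigValue`: a value plus where it was defined.
#[derive(Debug, Clone, PartialEq)]
enum ConfigValue {
    Integer(i64, Definition),
    String(String, Definition),
}

impl ConfigValue {
    fn definition(&self) -> &Definition {
        match self {
            ConfigValue::Integer(_, d) | ConfigValue::String(_, d) => d,
        }
    }
}

/// Layer 1 (toy version): turn raw text from some source into a `ConfigValue`.
fn parse_scalar(raw: &str, def: Definition) -> ConfigValue {
    match raw.trim().parse::<i64>() {
        Ok(i) => ConfigValue::Integer(i, def),
        Err(_) => ConfigValue::String(raw.trim().to_string(), def),
    }
}

/// Layer 2 (precedence part only): pick the candidate whose definition wins.
fn resolve(candidates: Vec<ConfigValue>) -> Option<ConfigValue> {
    let mut best: Option<ConfigValue> = None;
    for cv in candidates {
        let wins = match &best {
            Some(current) => cv.definition().is_higher_priority(current.definition()),
            None => true,
        };
        if wins {
            best = Some(cv);
        }
    }
    best
}
```

Under these assumptions, a value parsed from an environment variable beats one parsed from a config file, and a CLI value beats both, regardless of the order in which the candidates are supplied.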

src/cargo/util/context/value.rs

Lines changed: 28 additions & 20 deletions
@@ -1,12 +1,38 @@
-//! Deserialization of a `Value<T>` type which tracks where it was deserialized
-//! from.
+//! Deserialization of a [`Value<T>`] type which tracks where it was deserialized from.
+//!
+//! ## Rationale for `Value<T>`
 //!
 //! Often Cargo wants to report semantic error information or other sorts of
 //! error information about configuration keys but it also may wish to indicate
 //! as an error context where the key was defined as well (to help user
 //! debugging). The `Value<T>` type here can be used to deserialize a `T` value
 //! from configuration, but also record where it was deserialized from when it
 //! was read.
+//!
+//! ## How `Value<T>` deserialization works
+//!
+//! Deserializing `Value<T>` is pretty special, and serde doesn't have built-in
+//! support for this operation. To implement this we extend serde's "data model"
+//! a bit. We configure deserialization of `Value<T>` to basically only work with
+//! our one deserializer using configuration.
+//!
+//! We define that `Value<T>` deserialization asks the deserializer for a very
+//! special [struct name](NAME) and [struct field names](FIELDS). In doing so,
+//! the deserializer will recognize this and synthesize a magical value for the
+//! `definition` field when we deserialize it. This protocol is how we're able
+//! to have a channel of information flowing from the configuration deserializer
+//! into the deserialization implementation here.
+//!
+//! You'll want to also check out the implementation of `ValueDeserializer` in
+//! the [`de`] module. Also note that the names below are intended to be invalid
+//! Rust identifiers to avoid conflicts with other valid structures.
+//!
+//! Finally the `definition` field is transmitted as a tuple of i32/string,
+//! which is effectively a tagged union of [`Definition`] itself. You should
+//! update both places here and in the impl of [`serde::de::MapAccess`] for
+//! `ValueDeserializer` when adding or modifying enum variants of [`Definition`].
+//!
+//! [`de`]: crate::util::context::de

 use crate::util::context::GlobalContext;
 use serde::de;
@@ -29,24 +55,6 @@ pub struct Value<T> {

 pub type OptValue<T> = Option<Value<T>>;

-// Deserializing `Value<T>` is pretty special, and serde doesn't have built-in
-// support for this operation. To implement this we extend serde's "data model"
-// a bit. We configure deserialization of `Value<T>` to basically only work with
-// our one deserializer using configuration.
-//
-// We define that `Value<T>` deserialization asks the deserializer for a very
-// special struct name and struct field names. In doing so the deserializer will
-// recognize this and synthesize a magical value for the `definition` field when
-// we deserialize it. This protocol is how we're able to have a channel of
-// information flowing from the configuration deserializer into the
-// deserialization implementation here.
-//
-// You'll want to also check out the implementation of `ValueDeserializer` in
-// `de.rs`. Also note that the names below are intended to be invalid Rust
-// identifiers to avoid how they might conflict with other valid structures.
-// Finally the `definition` field is transmitted as a tuple of i32/string, which
-// is effectively a tagged union of `Definition` itself.
-
 pub(crate) const VALUE_FIELD: &str = "$__cargo_private_value";
 pub(crate) const DEFINITION_FIELD: &str = "$__cargo_private_definition";
 pub(crate) const NAME: &str = "$__cargo_private_Value";
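
Review note: the "magic struct name" protocol described in the `value.rs` docs can be sketched without serde. The three string constants are quoted from the diff. Everything else is an assumption for illustration: the shape of `FIELDS`, the `ToyDeserializer` and `Field` types, the simplified `Definition` enum, and the tag numbers used in the i32/string tuple. The real implementation goes through serde's `deserialize_struct` and `MapAccess`, which this sketch only imitates.

```rust
// These three constants are quoted from the diff.
const VALUE_FIELD: &str = "$__cargo_private_value";
const DEFINITION_FIELD: &str = "$__cargo_private_definition";
const NAME: &str = "$__cargo_private_Value";
// Assumed shape: the docs reference `FIELDS`, but the diff does not show it.
const FIELDS: [&str; 2] = [VALUE_FIELD, DEFINITION_FIELD];

/// Hypothetical, simplified stand-in for Cargo's `Definition`.
#[derive(Debug, Clone, PartialEq)]
enum Definition {
    Path(String),
    Environment(String),
    Cli,
}

/// Encode a definition as an (i32, String) tagged tuple.
/// The tag numbers are arbitrary choices for this sketch.
fn encode(def: &Definition) -> (i32, String) {
    match def {
        Definition::Path(p) => (0, p.clone()),
        Definition::Environment(e) => (1, e.clone()),
        Definition::Cli => (2, String::new()),
    }
}

/// Decode the tagged tuple. This must stay in sync with `encode`,
/// which mirrors the "update both places" advice in the docs.
fn decode(tag: i32, payload: String) -> Result<Definition, String> {
    match tag {
        0 => Ok(Definition::Path(payload)),
        1 => Ok(Definition::Environment(payload)),
        2 => Ok(Definition::Cli),
        other => Err(format!("unknown definition tag {}", other)),
    }
}

/// What the toy deserializer hands back for one struct field.
#[derive(Debug, PartialEq)]
enum Field {
    Plain(String),
    Tagged(i32, String),
}

/// Toy deserializer holding one raw value and where it was defined.
struct ToyDeserializer {
    raw: String,
    definition: Definition,
}

impl ToyDeserializer {
    /// Recognize the special name and fields, then synthesize the definition field.
    fn deserialize_struct(&self, name: &str, fields: &[&str]) -> Vec<(&'static str, Field)> {
        if name == NAME && fields == FIELDS.as_slice() {
            let (tag, payload) = encode(&self.definition);
            vec![
                (VALUE_FIELD, Field::Plain(self.raw.clone())),
                (DEFINITION_FIELD, Field::Tagged(tag, payload)),
            ]
        } else {
            // Ordinary struct handling is elided in this sketch.
            Vec::new()
        }
    }
}

/// Toy counterpart of `Value<T>`: the value plus where it came from.
#[derive(Debug, PartialEq)]
struct Value<T> {
    val: T,
    definition: Definition,
}

/// The `Value` side of the protocol: ask for the special name, read both fields.
fn deserialize_value(de: &ToyDeserializer) -> Result<Value<String>, String> {
    let mut val = None;
    let mut definition = None;
    for (name, field) in de.deserialize_struct(NAME, &FIELDS) {
        match (name, field) {
            (VALUE_FIELD, Field::Plain(s)) => val = Some(s),
            (DEFINITION_FIELD, Field::Tagged(tag, payload)) => {
                definition = Some(decode(tag, payload)?)
            }
            (other, _) => return Err(format!("unexpected field `{}`", other)),
        }
    }
    Ok(Value {
        val: val.ok_or("missing value field")?,
        definition: definition.ok_or("missing definition field")?,
    })
}
```

Because `$` cannot appear in a Rust identifier, no ordinary struct can accidentally request these names, so the special branch is only reached by the `Value` side of the protocol.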
