Skip to content
Closed
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -31,7 +31,7 @@ syn = { version = "1.0.109", optional = true }

[features]
assign = []
default = ["std", "serde", "json", "resolve", "assign", "delete"]
default = ["std", "serde", "json", "toml", "resolve", "assign", "delete"]
delete = ["resolve"]
json = ["dep:serde_json", "serde"]
resolve = []
Expand Down
126 changes: 84 additions & 42 deletions src/assign.rs
Original file line number Diff line number Diff line change
Expand Up @@ -133,55 +133,66 @@ pub trait Assign {
░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
╔══════════════════════════════════════════════════════════════════════════════╗
║ ║
AssignError
║ ¯¯¯¯¯¯¯¯¯¯¯¯¯
Error
¯¯¯¯¯¯¯
╚══════════════════════════════════════════════════════════════════════════════╝
░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
*/

// TODO: should AssignError be deprecated?
/// Alias for [`Error`].
///
/// Possible error returned from [`Assign`] implementations for
/// [`serde_json::Value`] and
/// [`toml::Value`](https://docs.rs/toml/0.8.14/toml/index.html).
pub type AssignError = Error;

/// Possible error returned from [`Assign`] implementations for
/// [`serde_json::Value`] and
/// [`toml::Value`](https://docs.rs/toml/0.8.14/toml/index.html).
#[derive(Debug, PartialEq, Eq)]
pub enum AssignError {
/// A `Token` within the `Pointer` failed to be parsed as an array index.
pub enum Error {
/// A [`Token`] within the [`Pointer`] failed to be parsed as an array index.
FailedToParseIndex {
/// Position (index) of the token which failed to parse as an `Index`
position: usize,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this is to aid with the error message, right? Thinking about it, we could technically derive it on-the-fly from the offset + token. I know this was part of the raison d'etre for this PR, but I just realised I don't really understand the motivation.

/// Offset of the partial pointer starting with the invalid index.
offset: usize,
/// The source [`ParseIndexError`]
source: ParseIndexError,
},

/// target array.
/// A [`Token`] within the [`Pointer`] contains an [`Index`] which is out of bounds.
///
/// The current or resulting array's length is less than the index.
OutOfBounds {
/// Position (index) of the token which failed to parse as an `Index`
position: usize,
/// Offset of the partial pointer starting with the invalid index.
offset: usize,
/// The source [`OutOfBoundsError`]
source: OutOfBoundsError,
},
}

impl fmt::Display for AssignError {
impl fmt::Display for Error {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
match self {
Self::FailedToParseIndex { offset, .. } => {
write!(
f,
"assignment failed due to an invalid index at offset {offset}"
)
Self::FailedToParseIndex { .. } => {
write!(f, "assignment failed due to an invalid index")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, if I'm reading this correctly we'll be deprecating the offset field entirely later on? I'm not sure it makes sense to do this in stages, because even adding position is already breaking, unfortunately. Might as well rip the band-aid off in one go.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I don't think we should deprecate it. It actually is incredibly useful to have and a positive consequence of my blunder to opt for offset to begin with.

Folks can have pretty printed errors that do something like

/some/example/invalid/more
             ^^^^^^^^

without much additional effort to determine where that starts.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see! In that case, why did you remove it from the message? 😄

Copy link
Owner Author

@chanced chanced Oct 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should have mentioned that I went through the exact same thought process as you.

I started with "ugh, I guess I add position, deprecate offset, and plan on removing it." But that path has so many breaking changes across numerous releases.

Then it dawned on me that it's actually really useful.

Copy link
Owner Author

@chanced chanced Nov 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope - I'm back on this was a mistake. I'd like to remove offset but I think we may be stuck with it. Or we potentially break a lot.

I just discovered mini-crater, which may give some insight into what deleting offset would likely mean, at least to known dependents.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, have been busy, finally getting through my backlog.

No need to apologize about that at all.

because if you want to keep ownership you can always pass a reference. This will create a clone internally, but consider the alternative of taking a reference — then we'll always clone, even if the caller doesn't need the string anymore

So for Pointer::parse, I don't think we should return the string. For one, they already have the string - we literally cannot take ownership of it. Two, a reference would introduce lifetimes which would make the errors a pain. The only upside is reporting which not everyone needs but introduces additional work for everyone else to rid themselves of lifetimes with a call to into_owned.

PointerBuf::parse is different. In the case of &str, we have already allocated with .into(). In the case of an owned String, we have an existing allocation which we are currently dropping on the error path.

If we require the caller to keep a copy for their error path, we are causing an additional allocation on the happy path.

Right now, in #93 I have

pub struct ParseBufError {
    pub value: String,
    pub source: ParseError,
}

and ParseError implements From<ParseBufError>.

I've got to run for now - I'll reply to the other when I'm in front of my laptop.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So for Pointer::parse, I don't think we should return the string.

Agreed, although including an owned copy there would allow us to report with the #[source_code] attribute without the annoying lifetime. Given this is the unhappy path I think it'd be reasonable to allocate for that.

If we require the caller to keep a copy for their error path, we are causing an additional allocation on the happy path.

Yep, hence the suggestion to make it use a Cow. See my full suggestion in #96 (I think you meant to link that one?).

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see! In that case, why did you remove it from the message? 😄

sorry, I missed this question and I can't recall :(

Copy link
Owner Author

@chanced chanced Nov 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh!! You meant the error message! Sorry, I should have held off on replying until I could devote enough energy to properly think through the question.

I removed the offset from the error message initially because I wasn't sure how much value it was adding vs making the error message more verbose and confusing. In retrospect, I'm not sure either are the case. Some may appreciate the offset in the message and I think the error message is now longer than what it was before.

I'll add them back.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, just to clarify it's not that I think we need the offset in the message, but just that the fact it's not used makes it seem like it's not useful (and if it isn't, perhaps it should be deprecated). But with miette in the picture I think it definitely is useful because then it can be used for the label.

}
Self::OutOfBounds { offset, .. } => {
Self::OutOfBounds { .. } => {
write!(
f,
"assignment failed due to index at offset {offset} being out of bounds"
"assignment failed due to index an index being out of bounds"
)
}
}
}
}

#[cfg(feature = "std")]
impl std::error::Error for AssignError {
impl std::error::Error for Error {
fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
match self {
Self::FailedToParseIndex { source, .. } => Some(source),
Expand All @@ -207,7 +218,7 @@ enum Assigned<'v, V> {

#[cfg(feature = "json")]
mod json {
use super::{Assign, AssignError, Assigned};
use super::{Assign, Assigned, Error};
use crate::{Pointer, Token};
use alloc::{
string::{String, ToString},
Expand Down Expand Up @@ -235,7 +246,7 @@ mod json {
}
impl Assign for Value {
type Value = Value;
type Error = AssignError;
type Error = Error;
fn assign<V>(&mut self, ptr: &Pointer, value: V) -> Result<Option<Self::Value>, Self::Error>
where
V: Into<Self::Value>,
Expand All @@ -248,14 +259,15 @@ mod json {
mut ptr: &Pointer,
mut dest: &mut Value,
mut value: Value,
) -> Result<Option<Value>, AssignError> {
) -> Result<Option<Value>, Error> {
let mut offset = 0;

let mut position = 0;
while let Some((token, tail)) = ptr.split_front() {
let tok_len = token.encoded().len();

let assigned = match dest {
Value::Array(array) => assign_array(token, tail, array, value, offset)?,
Value::Array(array) => assign_array(token, tail, array, value, position, offset)?,
Value::Object(obj) => assign_object(token, tail, obj, value),
_ => assign_scalar(ptr, dest, value),
};
Expand All @@ -273,6 +285,7 @@ mod json {
}
}
offset += 1 + tok_len;
position += 1;
}

// Pointer is root, we can replace `dest` directly
Expand All @@ -285,14 +298,23 @@ mod json {
remaining: &Pointer,
array: &'v mut Vec<Value>,
src: Value,
position: usize,
offset: usize,
) -> Result<Assigned<'v, Value>, AssignError> {
) -> Result<Assigned<'v, Value>, Error> {
// parsing the index
let idx = token
.to_index()
.map_err(|source| AssignError::FailedToParseIndex { offset, source })?
.map_err(|source| Error::FailedToParseIndex {
position,
offset,
source,
})?
.for_len_incl(array.len())
.map_err(|source| AssignError::OutOfBounds { offset, source })?;
.map_err(|source| Error::OutOfBounds {
position,
offset,
source,
})?;

debug_assert!(idx <= array.len());

Expand Down Expand Up @@ -381,7 +403,7 @@ mod json {

#[cfg(feature = "toml")]
mod toml {
use super::{Assign, AssignError, Assigned};
use super::{Assign, Assigned, Error};
use crate::{Pointer, Token};
use alloc::{string::String, vec, vec::Vec};
use core::mem;
Expand All @@ -406,7 +428,7 @@ mod toml {

impl Assign for Value {
type Value = Value;
type Error = AssignError;
type Error = Error;
fn assign<V>(&mut self, ptr: &Pointer, value: V) -> Result<Option<Self::Value>, Self::Error>
where
V: Into<Self::Value>,
Expand All @@ -419,14 +441,15 @@ mod toml {
mut ptr: &Pointer,
mut dest: &mut Value,
mut value: Value,
) -> Result<Option<Value>, AssignError> {
) -> Result<Option<Value>, Error> {
let mut offset = 0;
let mut position = 0;

while let Some((token, tail)) = ptr.split_front() {
let tok_len = token.encoded().len();

let assigned = match dest {
Value::Array(array) => assign_array(token, tail, array, value, offset)?,
Value::Array(array) => assign_array(token, tail, array, value, position, offset)?,
Value::Table(tbl) => assign_object(token, tail, tbl, value),
_ => assign_scalar(ptr, dest, value),
};
Expand All @@ -444,6 +467,7 @@ mod toml {
}
}
offset += 1 + tok_len;
position += 1;
}

// Pointer is root, we can replace `dest` directly
Expand All @@ -457,14 +481,23 @@ mod toml {
remaining: &Pointer,
array: &'v mut Vec<Value>,
src: Value,
position: usize,
offset: usize,
) -> Result<Assigned<'v, Value>, AssignError> {
) -> Result<Assigned<'v, Value>, Error> {
// parsing the index
let idx = token
.to_index()
.map_err(|source| AssignError::FailedToParseIndex { offset, source })?
.map_err(|source| Error::FailedToParseIndex {
position,
offset,
source,
})?
.for_len_incl(array.len())
.map_err(|source| AssignError::OutOfBounds { offset, source })?;
.map_err(|source| Error::OutOfBounds {
position,
offset,
source,
})?;

debug_assert!(idx <= array.len());

Expand Down Expand Up @@ -550,7 +583,7 @@ mod toml {
#[cfg(test)]
#[allow(clippy::too_many_lines)]
mod tests {
use super::{Assign, AssignError};
use super::{Assign, Error};
use crate::{
index::{OutOfBoundsError, ParseIndexError},
Pointer,
Expand All @@ -574,9 +607,6 @@ mod tests {
V::Error: Debug + PartialEq,
Result<Option<V>, V::Error>: PartialEq<Result<Option<V::Value>, V::Error>>,
{
fn all(tests: impl IntoIterator<Item = Test<V>>) {
tests.into_iter().enumerate().for_each(|(i, t)| t.run(i));
}
fn run(self, i: usize) {
let Test {
ptr,
Expand Down Expand Up @@ -607,7 +637,7 @@ mod tests {
fn assign_json() {
use alloc::vec;
use serde_json::json;
Test::all([
[
Test {
ptr: "/foo",
data: json!({}),
Expand Down Expand Up @@ -731,7 +761,8 @@ mod tests {
ptr: "/1",
data: json!([]),
assign: json!("foo"),
expected: Err(AssignError::OutOfBounds {
expected: Err(Error::OutOfBounds {
position: 0,
offset: 0,
source: OutOfBoundsError {
index: 1,
Expand All @@ -751,7 +782,8 @@ mod tests {
ptr: "/a",
data: json!([]),
assign: json!("foo"),
expected: Err(AssignError::FailedToParseIndex {
expected: Err(Error::FailedToParseIndex {
position: 0,
offset: 0,
source: ParseIndexError::InvalidInteger(usize::from_str("foo").unwrap_err()),
}),
Expand All @@ -761,7 +793,8 @@ mod tests {
ptr: "/002",
data: json!([]),
assign: json!("foo"),
expected: Err(AssignError::FailedToParseIndex {
expected: Err(Error::FailedToParseIndex {
position: 0,
offset: 0,
source: ParseIndexError::LeadingZeros,
}),
Expand All @@ -771,13 +804,17 @@ mod tests {
ptr: "/+23",
data: json!([]),
assign: json!("foo"),
expected: Err(AssignError::FailedToParseIndex {
expected: Err(Error::FailedToParseIndex {
position: 0,
offset: 0,
source: ParseIndexError::InvalidCharacters("+".into()),
}),
expected_data: json!([]),
},
]);
]
.into_iter()
.enumerate()
.for_each(|(i, t)| t.run(i));
}

/*
Expand All @@ -791,7 +828,7 @@ mod tests {
fn assign_toml() {
use alloc::vec;
use toml::{toml, Table, Value};
Test::all([
[
Test {
data: Value::Table(toml::Table::new()),
ptr: "/foo",
Expand Down Expand Up @@ -910,7 +947,8 @@ mod tests {
data: Value::Array(vec![]),
ptr: "/1",
assign: "foo".into(),
expected: Err(AssignError::OutOfBounds {
expected: Err(Error::OutOfBounds {
position: 0,
offset: 0,
source: OutOfBoundsError {
index: 1,
Expand All @@ -923,12 +961,16 @@ mod tests {
data: Value::Array(vec![]),
ptr: "/a",
assign: "foo".into(),
expected: Err(AssignError::FailedToParseIndex {
expected: Err(Error::FailedToParseIndex {
position: 0,
offset: 0,
source: ParseIndexError::InvalidInteger(usize::from_str("foo").unwrap_err()),
}),
expected_data: Value::Array(vec![]),
},
]);
]
.into_iter()
.enumerate()
.for_each(|(i, t)| t.run(i));
}
}
Loading
Loading