fix: hierarchical json flattening restriction #1058
Pull Request Test Coverage Report for Build 12618356442 (Coveralls)
src/utils/json/flatten.rs
Outdated
/// 5. `{"a":{"b":{"c":{"d":{"e":["a","b"]}}}}}` ~> returns error - heavily nested, cannot flatten this JSON
pub fn flatten_json(value: &Value) -> Result<Vec<Value>, anyhow::Error> {
    if has_more_than_four_levels(value, 1) {
        return Err(anyhow!("heavily nested, cannot flatten this JSON"));
Why error out? Why not just flatten up to a certain depth and keep the type as is afterwards?
i.e. {"a": [{"b":{"c":{"d":["e","f"]}}}, {"c": {}}]}
~> [{"a":{"b":{"c":{"d":["e","f"]}}}}, {"a":{"c": {}}}]
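The suggestion above can be read as: expand arrays at the top level into separate records and leave everything nested below them untouched. A minimal sketch of that reading follows. The `Json` enum is a hypothetical stand-in for `serde_json::Value`, `fan_out` is an illustrative name, and the cross-product handling for several array-valued keys is an assumption; none of this is taken from the PR itself.

```rust
use std::collections::BTreeMap;

/// Hypothetical, minimal stand-in for `serde_json::Value` (illustration only).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Str(String),
    Array(Vec<Json>),
    Object(BTreeMap<String, Json>),
}

/// Expands every non-empty array found directly under a top-level key into
/// one record per element. Nested values are copied as is, not flattened.
/// Assumption: several array-valued keys yield the cross product of elements.
#[allow(dead_code)]
fn fan_out(record: &BTreeMap<String, Json>) -> Vec<BTreeMap<String, Json>> {
    let mut out = vec![BTreeMap::new()];
    for (key, value) in record {
        match value {
            Json::Array(items) if !items.is_empty() => {
                let mut next = Vec::with_capacity(out.len() * items.len());
                for partial in &out {
                    for item in items {
                        let mut expanded = partial.clone();
                        expanded.insert(key.clone(), item.clone());
                        next.push(expanded);
                    }
                }
                out = next;
            }
            other => {
                // Scalars, objects and empty arrays keep their type unchanged.
                for partial in &mut out {
                    partial.insert(key.clone(), other.clone());
                }
            }
        }
    }
    out
}
```

Applied to the example in the comment, a record whose key "a" holds a two-element array would produce two records, each with "a" bound to one of the original elements.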
get the level of hierarchy from the json
perform generic flattening only if level of nesting is <=4
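For readers following the commit message, here is a rough sketch of what "get the level of hierarchy" and the `<=4` gate might look like. It is not the code from this PR: the `Json` enum is a hypothetical stand-in for `serde_json::Value`, the names `nesting_level`, `should_flatten` and `MAX_LEVEL` are illustrative, and the choice to count only objects as levels (arrays are treated as transparent) is an assumption.

```rust
use std::collections::BTreeMap;

/// Hypothetical, minimal stand-in for `serde_json::Value` (illustration only).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Str(String),
    Array(Vec<Json>),
    Object(BTreeMap<String, Json>),
}

/// Assumed limit, taken from the "<=4" in the commit message.
#[allow(dead_code)]
const MAX_LEVEL: usize = 4;

/// Returns the number of nested object levels in `value`.
/// Assumption: arrays do not add a level, and scalars count as 0.
#[allow(dead_code)]
fn nesting_level(value: &Json) -> usize {
    match value {
        Json::Object(map) => 1 + map.values().map(nesting_level).max().unwrap_or(0),
        Json::Array(items) => items.iter().map(nesting_level).max().unwrap_or(0),
        Json::Str(_) => 0,
    }
}

/// Generic flattening is attempted only when the level is within the limit.
#[allow(dead_code)]
fn should_flatten(value: &Json) -> bool {
    nesting_level(value) <= MAX_LEVEL
}

/// Convenience constructor for a single-key object.
#[allow(dead_code)]
fn obj(key: &str, value: Json) -> Json {
    let mut map = BTreeMap::new();
    map.insert(key.to_string(), value);
    Json::Object(map)
}
```

Computing the level once and passing the result along, rather than re-walking the document at each call site, is in the spirit of the follow-up commit "perf: test nesting level only once".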
78e24a2 to adf87dd
perf: test nesting level only once
let size: usize = body.len();
let mut parsed_timestamp = Utc::now().naive_utc();
if time_partition.is_none() {
    if custom_partition.is_none() {
        let size = size as u64;
        create_process_record_batch(
            stream_name,
            req,
            body_val,
            static_schema_flag.as_ref(),
            None,
            parsed_timestamp,
            &HashMap::new(),
            size,
            schema_version,
        )
        .await?;
    } else {
        let data = convert_array_to_object(
            body_val,
            None,
            None,
            custom_partition.as_ref(),
            schema_version,
        )?;
Would the following fix the issue of not wanting to convert array to object?
let data = if time_partition.is_some() || custom_partition.is_some() {
convert_array_to_object(
body_val,
time_partition.as_ref(),
time_partition_limit,
custom_partition.as_ref(),
schema_version,
)?
} else {
vec![body_val]
};
I need to test all the scenarios before I can confirm the changes. As discussed, let's take up the optimisation of this one as a separate issue.
hierarchical json flattening restriction
get the level of hierarchy from the json
perform generic flattening only if level of nesting is <=4
Co-authored-by: Devdutt Shenoi <[email protected]>