Skip to content

Conversation

@TonyXiang8787
Copy link
Member

@TonyXiang8787 TonyXiang8787 commented Nov 11, 2025

This PR experiments an idea of multi-dimensional batch calculation in the C-API without introducing breaking changes. The Python API is also adjusted without breaking changes.

Signed-off-by: Tony Xiang <[email protected]>
Signed-off-by: Tony Xiang <[email protected]>
Signed-off-by: Tony Xiang <[email protected]>
Signed-off-by: Tony Xiang <[email protected]>
Signed-off-by: Tony Xiang <[email protected]>
Signed-off-by: Tony Xiang <[email protected]>
Signed-off-by: Tony Xiang <[email protected]>
Signed-off-by: Tony Xiang <[email protected]>
Signed-off-by: Tony Xiang <[email protected]>
Signed-off-by: Tony Xiang <[email protected]>
Signed-off-by: Tony Xiang <[email protected]>
Signed-off-by: Tony Xiang <[email protected]>
Signed-off-by: Tony Xiang <[email protected]>
@TonyXiang8787 TonyXiang8787 self-assigned this Nov 11, 2025
@TonyXiang8787 TonyXiang8787 added feature New feature or request do-not-merge This should not be merged labels Nov 11, 2025
Signed-off-by: Tony Xiang <[email protected]>
Signed-off-by: Tony Xiang <[email protected]>
Signed-off-by: Tony Xiang <[email protected]>
Signed-off-by: Tony Xiang <[email protected]>
Signed-off-by: Tony Xiang <[email protected]>
Signed-off-by: Tony Xiang <[email protected]>
Signed-off-by: Tony Xiang <[email protected]>
Signed-off-by: Tony Xiang <[email protected]>
@TonyXiang8787 TonyXiang8787 changed the title Feature: support multi-dimensional batch calculation from C-API [experimental] Feature: support multi-dimensional batch calculation[experimental] Nov 11, 2025
Signed-off-by: Tony Xiang <[email protected]>
@sonarqubecloud
Copy link

Comment on lines +312 to +341
/**
* @brief Create a PGM_MultiDimensionalDataset from multiple PGM_ConstDataset instances
*
* @param handle
* @param const_datasets
* @param n_datasets
* @return
*/
PGM_API PGM_MultiDimensionalDataset*
PGM_dataset_create_multidimensional_from_const(PGM_Handle* handle, PGM_ConstDataset const** const_datasets,
PGM_Idx n_datasets);

/**
* @brief Get the array pointer from a PGM_MultiDimensionalDataset
*
* @param handle
* @param multidimensional_dataset
* @return
*/
PGM_API PGM_ConstDataset const*
PGM_get_array_pointer_from_multidimensional(PGM_Handle* handle,
PGM_MultiDimensionalDataset const* multidimensional_dataset);

/**
* @brief destroy the multidimensional dataset object
*
* @param multidimensional_dataset
*/
PGM_API void PGM_destroy_multidimensional_dataset(PGM_MultiDimensionalDataset* multidimensional_dataset);

Copy link
Member Author

@TonyXiang8787 TonyXiang8787 Nov 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mgovers This is one option to pass a MD datasets into the C-API. You need to have additional helper functions to create an array of datasets. We need a new opaque struct.

Another interesting approach would be using chaining. So define a function called dataset_const_chain_another, which store a pointer to the next dimension dataset into this one. Internnaly we can check if the next_dataset pointer is nullptr or not to traverse dimensions. This way is less intuitive, but still used in many other APIs. And you only need one extra function. We don't even need to change the options.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i prefer this separate ND approach, but how would we use ND-datasets, e.g. a 3-dimensional one instead of only 2-dimensional

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not really a ND, but a list of datasets. The core will make a cross-join ND calculation of combinations in the list.

The questions, do we create separate opaque struct to store an array of datasets? This requires a lot of new functions. Or do we do the chaining, this only needs one new function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i do believe under the current implementation, sure, but it also restricts us going forward. i feel like if we are going to support 1D and have a separate object for 2D, then we might as well go for ND. otherwise, we'll fall into the same rabbit hole that C++ has been in (watch any video on std::mdspan and you'll know, they are better at explaining what i mean than i am)

Copy link
Member Author

@TonyXiang8787 TonyXiang8787 Nov 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current experiment is about ND, not just 2D.

The ND calculation is created by a list of datasets. Each dataset in the list represents the mutation of that dimension. The core creates a cross-join on the mutations.

requires(!is_indptr_mutable_v<dataset_type>)
{
assert(0 <= scenario && scenario < batch_size());
assert(0 <= scenario && scenario <= batch_size());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this will crash because buffer.indptr[scenario + 1] is called. In python, it's also not possible to call [1, 2, 3][3] (IndexOutOfRange)

}

// get slice dataset from batch
Dataset get_slice_scenario(Idx begin, Idx end) const
Copy link
Member

@mgovers mgovers Nov 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

really useful 👍 If we add an additional increment here, we can actually also use this in the dispatching of multithreaded batch calculations

using PGM_MutableDataset = power_grid_model::meta_data::Dataset<power_grid_model::mutable_dataset_t>;
using PGM_WritableDataset = power_grid_model::meta_data::Dataset<power_grid_model::writable_dataset_t>;
using PGM_DatasetInfo = power_grid_model::meta_data::DatasetInfo;
using PGM_MultiDimensionalDataset = std::vector<PGM_ConstDataset>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i feel like this should be able to use arbitrary dimensions

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

C++23 has std::mdspan, which would probably solve this

*
* @param multidimensional_dataset
*/
PGM_API void PGM_destroy_multidimensional_dataset(PGM_MultiDimensionalDataset* multidimensional_dataset);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inconsistent PGM_dataset_create_* vs PGM_destroy_*

* @brief Opaque struct for the multi dimensional dataset class.
*
*/
typedef struct PGM_MultiDimensionalDataset PGM_MultiDimensionalDataset;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe following the conventions of std::mdspan:

Suggested change
typedef struct PGM_MultiDimensionalDataset PGM_MultiDimensionalDataset;
typedef struct PGM_MDDataset PGM_MDDataset;

*/
PGM_API PGM_ConstDataset const*
PGM_get_array_pointer_from_multidimensional(PGM_Handle* handle,
PGM_MultiDimensionalDataset const* multidimensional_dataset);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we can do a multi-indexing approach instead? e.g.

PGM_mddataset_get_data(PGM_Handle* handle, PGM_MDDataset const* mddataset);

PGM_mddataset_get_flat_element(PGM_Handle* handle, PGM_MDDataset const* mddataset, PGM_Idx flattened_index);

PGM_mddataset_get_scenario(PGM_Handle* handle, PGM_MDDataset const* mddataset, PGM_Idx** multi_index, PGM_Idx n_dimensions);

(we can even omit n_dimensions, although it's probably safer to keep it)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

do-not-merge This should not be merged feature New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[FEATURE] 2-D or cross-join of batch updates

3 participants