@saralhamo

This PR extends the functionality of the map() kernel to support row- and column-wise mapping in addition to the existing element-wise mapping. This means that now, UDFs that operate on entire rows/columns are supported. If the input of the UDF is a row matrix, the output can be a row or a scalar; if the input is a column matrix, the output can be a column or a scalar.

Changes to DaphneDSL parser

Introduced two additional optional parameters: axis (int64_t) and udfReturnsScalar (bool). If both are omitted, the existing element-wise map operation is performed unchanged. For a row-wise map, axis must be set to 0; for a column-wise map, to 1. Additionally, if the UDF returns a scalar rather than a matrix, udfReturnsScalar must be set to true; it defaults to false. If this value does not match the UDF's output type, an error is thrown during lowering of the UDF.
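To make the intended semantics of axis concrete, here is a minimal standalone sketch. It does not use DAPHNE's data structures or the actual map kernel: Mat, VecUdf and mapAlongAxis are hypothetical names introduced only for illustration, and a scalar UDF result is modeled as a vector of length 1.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a dense matrix (NOT DAPHNE's DenseMatrix).
struct Mat {
    std::size_t rows, cols;
    std::vector<double> vals; // row-major storage
    double at(std::size_t r, std::size_t c) const { return vals[r * cols + c]; }
};

// Hypothetical UDF type: receives one row or column, returns a vector.
// A scalar result is modeled as a vector with a single element.
using VecUdf = std::function<std::vector<double>(const std::vector<double> &)>;

// axis == 0: apply udf to every row; axis == 1: apply udf to every column.
Mat mapAlongAxis(const Mat &in, const VecUdf &udf, int axis) {
    if (axis != 0 && axis != 1)
        throw std::invalid_argument("axis must be 0 (row-wise) or 1 (column-wise)");
    const std::size_t numSlices = (axis == 0) ? in.rows : in.cols;
    const std::size_t sliceLen = (axis == 0) ? in.cols : in.rows;

    std::vector<std::vector<double>> results;
    for (std::size_t i = 0; i < numSlices; i++) {
        // Extract the current row/column as the UDF input.
        std::vector<double> slice(sliceLen);
        for (std::size_t j = 0; j < sliceLen; j++)
            slice[j] = (axis == 0) ? in.at(i, j) : in.at(j, i);
        results.push_back(udf(slice));
        if (results.back().size() != results.front().size())
            throw std::runtime_error("UDF results must all have the same length");
    }

    // The mapped dimension keeps its size; the other one follows the UDF output.
    const std::size_t outLen = results.empty() ? 0 : results.front().size();
    const std::size_t outRows = (axis == 0) ? numSlices : outLen;
    const std::size_t outCols = (axis == 0) ? outLen : numSlices;
    Mat out{outRows, outCols, std::vector<double>(outRows * outCols)};
    for (std::size_t i = 0; i < numSlices; i++)
        for (std::size_t j = 0; j < outLen; j++) {
            if (axis == 0)
                out.vals[i * outCols + j] = results[i][j]; // result i becomes row i
            else
                out.vals[j * outCols + i] = results[i][j]; // result i becomes column i
        }
    return out;
}
```

With a summing UDF, a 2x3 input yields a 2x1 result for axis 0 and a 1x3 result for axis 1.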

Changes to DAPHNE compiler and IR

The shape of the result matrix is no longer always the same as that of the input matrix; it now depends on both axis and the output type of the UDF (matrix or scalar). The correct shape is set during the shape inference pass: for an element-wise map, the shape of the input matrix is preserved, while for a row- or column-wise map, one dimension stays the same and the other is unknown. Later, once the UDF's output type is known, the unknown dimension is set to 1 if the return type is a scalar (see SpecializeGenericFunctionsPass.cpp).
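The two-step shape logic described above can be sketched roughly as follows. This is not the code from the compiler passes: Shape, kUnknown, inferMapShape and refineMapShape are hypothetical names, and representing an unknown dimension as -1 is an assumption made for this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <utility>

// Assumed marker for a dimension that is not known yet.
constexpr int64_t kUnknown = -1;

// (numRows, numCols)
using Shape = std::pair<int64_t, int64_t>;

// Step 1 (shape inference): only axis is known at this point.
Shape inferMapShape(Shape in, std::optional<int64_t> axis) {
    if (!axis)
        return in; // element-wise map: shape is preserved
    if (*axis == 0)
        return {in.first, kUnknown}; // row-wise: row count stays, column count unknown
    if (*axis == 1)
        return {kUnknown, in.second}; // column-wise: column count stays, row count unknown
    throw std::invalid_argument("axis must be 0 or 1");
}

// Step 2 (once the UDF's output type is known): a scalar result
// collapses the unknown dimension to 1; a matrix result leaves it unknown.
Shape refineMapShape(Shape inferred, bool udfReturnsScalar) {
    if (udfReturnsScalar) {
        if (inferred.first == kUnknown)
            inferred.first = 1;
        if (inferred.second == kUnknown)
            inferred.second = 1;
    }
    return inferred;
}
```

For a 4x3 input, a row-wise map with a scalar-returning UDF ends up as 4x1, and a column-wise one as 1x3.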

Testing

Extensive test cases were added for row- and column-wise mapping, covering both matrices and scalars as possible UDF outputs. These test cases also show how the shape of the result matrix can shrink/grow compared to the input matrix. Some tests expect an error, e.g. if axis or udfReturnsScalar has the wrong value. All new tests can be found under test/api/cli/secondorder, as well as test/runtime/local/kernels/MapTest.cpp.

Limitations

  • During row-/column-wise mapping, the current row/column must be extracted to obtain the input for the UDF. This extracted object is not cleaned up with DataObjectFactory::destroy(), because doing so turns the result matrix into a view and causes a segmentation fault when the result is printed.
  • The map kernel for type Matrix (Map<Matrix<VTRes>, Matrix<VTArg>>) currently does not support column-wise mapping if the UDF result is a matrix. This is because the function append() works only on strictly increasing coordinates.
  • axis and udfReturnsScalar have to be set explicitly and cannot be inferred.
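The append() limitation can be illustrated with a small standalone check. It assumes that "strictly increasing coordinates" means row-major order (row first, then column), which is not stated above; Coord, writeOrder and strictlyIncreasingRowMajor are hypothetical helpers and not part of DAPHNE.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Coord = std::pair<std::size_t, std::size_t>; // (row, col)

// Order in which result cells would be written when UDF results
// are stored one row (axis 0) or one column (axis 1) at a time.
std::vector<Coord> writeOrder(std::size_t rows, std::size_t cols, int axis) {
    std::vector<Coord> order;
    if (axis == 0) {
        for (std::size_t r = 0; r < rows; r++)
            for (std::size_t c = 0; c < cols; c++)
                order.push_back({r, c});
    } else {
        for (std::size_t c = 0; c < cols; c++)
            for (std::size_t r = 0; r < rows; r++)
                order.push_back({r, c});
    }
    return order;
}

// std::pair compares lexicographically, i.e. by row and then by column,
// which matches the assumed row-major ordering.
bool strictlyIncreasingRowMajor(const std::vector<Coord> &order) {
    for (std::size_t i = 1; i < order.size(); i++)
        if (!(order[i - 1] < order[i]))
            return false;
    return true;
}
```

Under that assumption, writing a matrix result row by row keeps the coordinates increasing, whereas writing it column by column jumps back to row 0 at the start of each new column.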
