sys: util: Add fast version of FOR_EACH_x, GET_ARG and GET_ARGS_LESS #31904

nordic-krch · 2021-02-03T08:12:51Z

Apparently, iterative macros (FOR_EACH_x, GET_ARG_N, GET_ARGS_LESS_N) requires a lot of preprocessing and when widely used (especially recursively) it can significantly increase compilation time. There are 2 limitations:

number of arguments limited to 32 (however it can be increased)
FOR_EACH_IDX_x macros are called with reversed, descending order of indexes.

If macros are widely used then compile time is significantly shorter (order of magnitude). For example unit test tests/unit/cbprintf in #31102 compiles in 0.8s with fast version, > 7seconds with the standard one.

jukkar · 2021-02-03T08:16:40Z

Could the fast version be the default, and user could use the slow version if really needed?

nordic-krch · 2021-02-03T11:40:38Z

@jukkar yes, i guess that would be ok except for FOR_EACH_IDX.. where indexes comes in reverse order, that is not one to one and in many cases user may rely on the assumption that first element from the list gets 0 index.

I will try to do the switch and check if CI accepts. Not sure if there are lists longer than 32 used somewhere (DT?).

mbolivar-nordic · 2021-02-03T18:07:37Z

Not sure if there are lists longer than 32 used somewhere (DT?).

DT doesn't use any FOR_EACH or GET_ARG macros.

de-nordic · 2021-02-15T09:00:16Z

Could the fast version be the default, and user could use the slow version if really needed?

I am also in favor in making this default and actually renaming macros, this way code that somehow can not compile with FAST macros could still use the old but all other would be forced to use the FAST. Additionally Kconfig options could be added to roll back to using "old" macros in all places.

pabigot · 2021-02-15T12:00:50Z

I am also in favor in making this default and actually renaming macros

Agreed.

Additionally Kconfig options could be added to roll back to using "old" macros in all places.

IMO this should not be necessary; I'd be concerned that the non-default behavior wouldn't be adequately tested over all places the macros are used.

de-nordic · 2021-02-15T12:43:50Z

IMO this should not be necessary; I'd be concerned that the non-default behavior wouldn't be adequately tested over all places the macros are used.

I am concerned that some user code that is not available to us, for testing, may get broken, for some reason; adding such option, to rollback to original macros, would give us time to gather information on such instances and users to adapt new macros.
The original macros may get later deprecated together with the Kconfig option.
Does that make sense?

pabigot · 2021-02-15T13:51:08Z

The original macros may get later deprecated together with the Kconfig option.

IMO the renamed original macros should remain available for the cases where we don't have a known upper bound on the number of iterations, as may be the case in future uses in devictree.

Does that make sense?

It does, but I think the cost is too high. We can't promise perfect backwards compatibility, and can't make every case where that's so support opt-out. I think it's fine to require out of tree users to modify their code to use the renamed generic versions if the new ones don't work for them because they need an unbounded limit.

mbolivar-nordic · 2021-02-16T21:00:55Z

as may be the case in future uses in devictree.

I wouldn't make any decisions that assume DT might use these later. DT actually deliberately avoids using the util_macro.h FOR_EACH macros.

DT's for-each macros use helpers that are generated at build time in devicetree_unfixed.h, like this:

/* in devicetree_unfixed.h */
#define <node_id>_FOREACH_CHILD(fn) fn(<child 0 node_id1>) fn(<child 1 node_id>) [...] fn(<child n-1 node_id>)

/* in devicetree.h */
#define DT_FOREACH_CHILD(node_id, fn) DT_CAT(node_id, _FOREACH_CHILD)(fn)

Similarly, whenever the DT APIs need to index into something, we avoid using GET_ARG and instead just expand each array element in the generated header. For array type properties, that looks like this:

#define <node_id>_P_<property>_IDX_0 1342177280
#define <node_id>_P_<property>_IDX_1 512
#define <node_id>_P_<property>_IDX_2 1342178560
#define <node_id>_P_<property>_IDX_3 768

Then macros like DT_PROP_BY_IDX(node_id, prop, idx) do their job by concatenation and recursive macro expansion.

TL;DR I think it is safe to say the devicetree does not now and will not ever need the FOR_EACH or GET_ARG macros.

The only macro that devicetree.h really relies on that takes a variable number of arguments is MACRO_MAP_CAT. (Remaining uses of UTIL_CAT could easily be switched to the internal DT_CAT macro instead.) Looking around, devicetree.h is the only user of MACRO_MAP_CAT in the tree, so we can always make that a private helper for DT if you need to change the util.h implementation in an incompatible way.

pabigot · 2021-02-16T21:40:21Z

If nobody thinks we need the old ones, we can just replace the implementation and beg forgiveness when it breaks somebody.

nordic-krch · 2021-02-17T08:49:40Z

good news is that i've managed to resolve issue with reverse indexing and increased support to 64 arguments. Currently i don't see any behavioral differences. I will replace the implementation then.

Recursive macros are more generic but they are very depending for preprocessor. When they are used extensively they can prolong compilation even ten times. Replaced them with brute force implementation for: - FOR_EACH macros - GET_N_ARG - GET_ARGS_LESS_N - UTIL_LISTIFY - UTIL_REPEAT New implementation provides same functionality but it is limited to 64 input arguments. This is not a hard limitation and can be increased in the future. Signed-off-by: Krzysztof Chruscinski <[email protected]>

Added test for new macro. Signed-off-by: Krzysztof Chruscinski <[email protected]>

Extended test to check that multiple arguments are passed in UTIL_LISTIFY macro. Signed-off-by: Krzysztof Chruscinski <[email protected]>

pabigot

One optional comment because I think what UTIL_LISTIFY does given arguments is a little opaque.

pabigot · 2021-02-21T21:43:39Z

tests/unit/util/test.inc

+	UTIL_LISTIFY(4, INC, 1, 2)

-	zassert_equal(i, 0 + 1 + 2 + 3, NULL);
+	zassert_equal(i, 0 + 1 + 2 + 3 + 4 * (1 + 2), NULL);


This would be a little more clear expressed as:

zassert_equal(i, (0 + 1 + 2) + (1 + 1 + 2) + (2 + 1 + 2) + (3 + 1 + 2), NULL);

since that represents the effect of the four calls.

nordic-krch requested review from MaureenHelm, carlescufi, galak and nashif as code owners February 3, 2021 08:12

github-actions bot added area: API Changes to public APIs area: Tests Issues related to a particular existing or missing test labels Feb 3, 2021

nordic-krch requested review from de-nordic, mbolivar-nordic and pabigot February 3, 2021 08:13

nordic-krch force-pushed the util_fast_for_each branch 3 times, most recently from 52a7e13 to 6547389 Compare February 3, 2021 10:22

zephyrbot added area: Base OS Base OS Library (lib/os) area: Testsuite Testsuite labels Feb 3, 2021

zephyrbot requested review from andyross and dcpleung February 3, 2021 15:25

zephyrbot assigned andyross Feb 3, 2021

nashif unassigned andyross Feb 15, 2021

nordic-krch force-pushed the util_fast_for_each branch 2 times, most recently from 869b406 to 741a83a Compare February 17, 2021 10:39

nordic-krch force-pushed the util_fast_for_each branch 6 times, most recently from ec516b2 to 91b2e25 Compare February 18, 2021 13:34

nordic-krch added 3 commits February 18, 2021 21:36

tests: unit: util: Add test for REVERSE_ARGS

ee4feef

Added test for new macro. Signed-off-by: Krzysztof Chruscinski <[email protected]>

tests: unit: util: Extended UTIL_LISTIFY test

0b6845f

Extended test to check that multiple arguments are passed in UTIL_LISTIFY macro. Signed-off-by: Krzysztof Chruscinski <[email protected]>

nordic-krch force-pushed the util_fast_for_each branch from 91b2e25 to 0b6845f Compare February 18, 2021 20:36

pabigot approved these changes Feb 21, 2021

View reviewed changes

mbolivar-nordic approved these changes Feb 22, 2021

View reviewed changes

carlescufi merged commit 6f5f0b2 into zephyrproject-rtos:master Feb 23, 2021

nordic-krch mentioned this pull request Feb 26, 2021

sys: util: Update implementation of loop macros #32692

Merged

sys: util: Add fast version of FOR_EACH_x, GET_ARG and GET_ARGS_LESS #31904

sys: util: Add fast version of FOR_EACH_x, GET_ARG and GET_ARGS_LESS #31904

Uh oh!

Conversation

nordic-krch commented Feb 3, 2021

Uh oh!

jukkar commented Feb 3, 2021

Uh oh!

nordic-krch commented Feb 3, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mbolivar-nordic commented Feb 3, 2021

Uh oh!

de-nordic commented Feb 15, 2021

Uh oh!

pabigot commented Feb 15, 2021

Uh oh!

de-nordic commented Feb 15, 2021

Uh oh!

pabigot commented Feb 15, 2021

Uh oh!

mbolivar-nordic commented Feb 16, 2021

Uh oh!

pabigot commented Feb 16, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

nordic-krch commented Feb 17, 2021

Uh oh!

pabigot left a comment

Choose a reason for hiding this comment

Uh oh!

pabigot Feb 21, 2021

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

8 participants

nordic-krch commented Feb 3, 2021 •

edited

Loading

pabigot commented Feb 16, 2021 •

edited

Loading