linker: aarch32: automatic derivation of region names from devicetree nodes #36365

JordanYates · 2021-06-17T12:43:13Z

Update the aarch32 linker file to automatically derive the name of memory regions from devicetree nodes instead of providing hardcoded values. This is required for automatic generation of regions based on compatibles, and also allows chosen and alias nodes to be used to control variable placement (as desired in #36152).

This PR is the first step to enable automatically generating regions from devicetree.
The next step is to transition to something like:

    DT_REGIONS_FROM_COMPAT(mmio_sram);

For SRAM generation, to eliminate SoC specific node labels from the core linker script.

mbolivar-nordic

Just focusing on the main change here.

Have you considered using a string-valued property instead?

You could use the DT_STRING_TOKEN macro being added in #35460 to tokenize a string. That would in turn avoid the fragility of this approach.

mbolivar-nordic · 2021-06-17T20:42:55Z

include/linker/devicetree_regions.h

This doesn't seem like a good idea to me. It seems quite fragile. Users should be free to add whatever kind of node labels they want to any nodes without modifying things like this.

As long as users use DT_REGION_NAME(node_id) in linker scripts, it is not possible to break this setup AFAICT.
Any changes in the name caused by overlays will automatically propagate through to the linker scripts.
Users are absolutely free to add labels as they wish, its just that the last one applied will be the name generated by the linker.

JordanYates · 2021-06-18T08:18:09Z

Have you considered using a string-valued property instead?

This was the approach taken in #36196. The downsides listed there were:

No guarantee of uniqueness
Zephyr has been generally transitioning away from label property usage

Adding a new property is the wrong path forward IMO as by definition no existing nodes will have it.
Most nodes already have a node label, minimizing diffs.

You could use the DT_STRING_TOKEN macro being added in #35460 to tokenize a string. That would in turn avoid the fragility of this approach.

Quoted vs unquoted strings doesn't actually make a difference for the region names. I am unsure how this affects fragility?

mbolivar-nordic · 2021-06-18T15:56:38Z

OK, I thought the region names were being used to form variable names and therefore need to be unquoted. If they're always just strings then DT_STRING_TOKEN is not needed and you can just use a string valued property.

I still feel pretty strongly that "pick whatever the last node label is" is the wrong approach though, for the reasons I've already given. If you're not comfortable with a string property, can you suggest an alternative?

JordanYates · 2021-06-19T00:44:07Z

I still feel pretty strongly that "pick whatever the last node label is" is the wrong approach though, for the reasons I've already given. If you're not comfortable with a string property, can you suggest an alternative?

Personally I don't mind whether its node label or string property. The node label approach was suggested by @henrikbrixandersen and @erwango and I agreed that their reasons were sound.

I don't understand the reasons you have given. Users are free to give whatever node labels they want to nodes. This isn't fragile because you can always get the output region name via DT_REGION_NAME. I can trivially change the PR so that it always picks the first node label, but that eliminates the option for users to rename the node if they wished.

At the end of the day it doesn't really matter what that name is, as long as it is derived from the node. Why introduce another property when we already have a human readable, unique, node label to base it on?

mbolivar-nordic · 2021-07-14T22:23:56Z

Why introduce another property when we already have a human readable, unique, node label to base it on?

Precisely because node labels are not unique. Picking one arbitrarily feels really wrong to me. Node labels are just arbitrary ways to refer to nodes and I don't think it's sound to just pick one and use it as a name in code.

JordanYates · 2021-07-14T22:54:05Z

Precisely because node labels are not unique.

#36196 (comment) suggested they were unique, and testing this morning appears to validate that:

	test: device_0 {};
	test: device_1 {};

results in this when compiling:

baord.dts.pre.tmp:409.17-20: ERROR (duplicate_label): /device_1: Duplicate label 'test' on /device_1 and /device_0
ERROR: Input tree has errors, aborting (use -f to force output)
CMake Error at C:/Users/Jordan/code/zephyr/zephyr/cmake/dts.cmake:214 (message):
  command failed with return code: 2

I don't think it's sound to just pick one and use it as a name in code.

It doesn't matter which property or label we choose to generate the names from, users don't interact with it in any way apart from seeing the name in the final memory usage summary. All usage in code/linker scripts is via DT_REGION_NAME, which will accept any reference to a node (DT_ALIAS, DT_NODELABEL, etc).

mbolivar-nordic · 2021-07-14T23:52:46Z

#36196 (comment) suggested they were unique, and testing this morning appears to validate that:

OK, we're using 'unique' in different ways. A single node label is unique treewide (your meaning), but a node does not have a single unique node label (my meaning).

It doesn't matter which property or label we choose to generate the names from, users don't interact with it in any way apart from seeing the name in the final memory usage summary.

If it's not visible to the user at all, why not just use ordinal numbers then?

JordanYates · 2021-07-15T01:12:14Z

If it's not visible to the user at all, why not just use ordinal numbers then?

They're not visible in code, it is visible in the final memory usage summary:

[19/19] Linking C executable zephyr/zephyr.elf
Memory region         Used Size  Region Size  %age Used
           FLASH:      274768 B         1 MB     26.20%
            SRAM:       59204 B       256 KB     22.58%
        IDT_LIST:          0 GB         2 KB      0.00%

Replacing SRAM with sram0 is good (arguably an improvement because it shows the user a direct relation between the dts node and linker), but replacing it with dts_ord_23 is not acceptable IMO. All that "use the last node label" means is that if someone is using sram3 for IPC, they could add an additional node label in the dts to get a more specific name in the memory summary:

          sram3:       59204 B       256 KB     22.58%

vs

     my_ipc_ram:       59204 B       256 KB     22.58%

tejlmand

I think the general approach looks sane.
Will let @mbolivar-nordic and @galak drive the naming part / node label as they are the experts here.

Found a couple of minor nits to be fixed.

scripts/dts/gen_defines.py

include/devicetree.h

tejlmand · 2021-07-15T21:35:51Z

However, using the full path seems even clearer, and can't cause conflicts e.g. if the user adds a FLASH node label to a node somewhere in the DT.

@mbolivar-nordic maybe you can elaborate ?
Already today those node labels are used in linker scripts, like here:

zephyr/include/arch/arm/aarch32/cortex_m/scripts/linker.ld

Line 91 in f5e5f7d

    
           DT_REGION_FROM_NODE_STATUS_OKAY(FLASH_CCFG, rwx, DT_NODELABEL(ti_ccfg_partition))

which then corresponds to the node label here:

zephyr/dts/arm/ti/cc2652r.dtsi

Line 24 in f5e5f7d

ti_ccfg_partition: partition@57fa8 {

Now, currently on master, that memory section would be named FLASH_CCFG, but with @JordanYates changes the default section name will now be identical to the node label, in this case ti_ccfg_partition.
But with the added possibility that users can now create an overlay that adds an extra label (name) that will instead be used for the section name.

So already today, the node label ti_ccfg_partition is in use, and thus can be considered reserved.
Of course, if user decides he wants a better name, and then adds an additional node label my_foo_ccfg in addition to ti_ccfg_partition, then it is the user's responsibility to ensure that my_foo_ccfg doesn't collide with anything.
Also in that case, I do think it makes sense to use the last node label given as the section name in the generated linker script.

The labels in use before and after this PR haven't changed, from what I can see.
@JordanYates is that correctly understood and observed ?

mbolivar-nordic · 2021-07-15T21:50:17Z

Already today those node labels are used in linker scripts, like here:

Right, but that is different: that is taking a node label, getting a node identifier from it. DT_NODELABEL takes a node label and returns a node identifier from it. It does not return a node label.

Now, currently on master, that memory section would be named FLASH_CCFG

Right, so the linker script author is responsible for naming the memory region using the first argument to DT_REGION_FROM_NODE_STATUS_OKAY, and we can see the list of them in one place.

But with the added possibility that users can now create an overlay that adds an extra label (name) that will instead be used for the section name.

Right, and that is still what I'm not convinced about. The fact that the memory section names are not predictable and depend on whatever node label the tooling happens to choose feels like the wrong thing to me. I can understand @JordanYates 's objection that having a bunch of ordinal numbers is not good for the --print-memory-usage output and I agree. But I think that if we're going to get the region names from devicetree nodes, we should be using a single, canonical piece of information that we get from the node. The fact that nodes can have zero, one, or many node labels means there is no good canonical choice. That is why I originally suggested getting the value out of a property in the node. When that was rejected, I suggested using the path, since that doesn't require extra boilerplate in the DTS but is canonical and unique.

tejlmand · 2021-07-16T07:17:05Z

But with the added possibility that users can now create an overlay that adds an extra label (name) that will instead be used for the section name.

Right, and that is still what I'm not convinced about. The fact that the memory section names are not predictable and depend on whatever node label the tooling happens to choose feels like the wrong thing to me.

Fair enough.
I'm not having a strong preference for one or the other atm, so good arguments can persuade me in either direction.
I see the benefit in today's solution where the section names are known and located together in the linker script, but I also see the benefit of allowing the users to rename memory sections to their liking, without having to modify the linker script template.

Question is, is the DT the logical place for controlling the memory section names by adding an additional naming 🤔 ?
And also, are we solving a real issue here ?
What is the real problem in today's solution, that requires us to control those names through devicetree, and what are the consequences ?

Imaging on slack, someone posting a problem:

Hi guys,
Since my last rebase the foo region overflows, can anyone help:

Memory region         Used Size  Region Size  %age Used
            foo:       26512 B        24 KB    107.88%
            bar:       24592 B     178968 B     13.74%
            baz:          0 GB         2 KB      0.00
... section `xyz' will not fit in region `foo'
... region `FLASH' overflowed by 1936 bytes

In most cases we can probably deduce that foo, bar, and baz are, but if a sample or the user has renamed sram3 into foo, then it can actually be harder for us to help that user as we don't know that it was actually sram3 that overflowed, compared to today's situation where sram3 is fixed in linker script.

I would really like to see an issue that clearly describes the problem with today's solution, and what use-cases such a change will handle, so that we can also consider if there might be other side-effects we have not considered (like helping users with issues later).

JordanYates · 2021-07-16T07:52:33Z

And also, are we solving a real issue here ?

The goal of this PR is to enable automatically generating regions from devicetree.
What the final solution looks like would be something like:

    DT_REGIONS_FROM_COMPAT(mmio_sram);

Which would just automatically create a region for every sram node on the device, instead of the current setup where every new SRAM label on some SoC requires adding SoC specific code to the main linker scripts (sram1 through 4, sdram1 and sdram2, etc). This is required for all automatic integration between devicetree and the linker, not just SRAM.

For this to work the name needs to be derived from the node itself, it can't be manually provided. Being able to rename the regions is merely a side effect of the name now coming from the node.

JordanYates · 2021-07-16T08:27:02Z

I'm not a huge fan of the node path option. It looks okay for SRAM nodes, /soc/memory@20000000, but It quickly gets unwieldy, e.g. /soc/flash-controller@4001e000/flash@0/partitions/partition@da000

These paths also don't contain the information that actually differentiates multiple instances of the same node from each other (beyond the address). Without digging into the .dts, you don't have any idea whether that second path is a TI_CCFG region, space assigned to the bootloader or the application space.

In short I think the path is great as a unique identifier, but bad at containing the information you actually want in the memory summary. Whereas the node label is explicitly chosen as a short, human readable string that has some meaning.

If the fact that the name can change based on what the user does is a significant problem and not a benefit, then we can just use the first label instead of the last. This will just result in regions always having the label from their original definition. (sram0, etc).

tejlmand · 2021-07-16T11:21:22Z

And also, are we solving a real issue here ?

The goal of this PR is to enable automatically generating regions from devicetree.

But this is not what this PR does now.
All regions are described in the linker script, and only names are chosen based on the node label.

What the final solution looks like would be something like:
    DT_REGIONS_FROM_COMPAT(mmio_sram);
Which would just automatically create a region for every sram node on the device, instead of the current setup where every new SRAM label on some SoC requires adding SoC specific code to the main linker scripts (sram1 through 4, sdram1 and sdram2, etc). This is required for all automatic integration between devicetree and the linker, not just SRAM.

Now, that sounds like a good use-case.
So should this PR be considered a draft until that part is also ready ?

btw. thanks for the extended reasoning. I think some of that info should be copied to the description of this PR.

JordanYates · 2021-07-16T11:55:41Z

So should this PR be considered a draft until that part is also ready ?

That part has been ready in various incarnations for several months (#34537, #33656).
This was an attempt to split the larger body of work to ease review.

mbolivar-nordic · 2021-07-16T18:02:58Z

I think we all agree that node labels are meaningful identifiers. The disagreement seems to be about whether they are the right identifiers to be using in this context.

It seems like @tejlmand shares my concerns about making an arbitrary choice. I'm not convinced about the proposal "just pick the first node label" as a workaround. I generally remain unconvinced that any arbitrary choice (first, last, middle, whatever) is going to produce a good result in the case of multiple node labels. And thinking about it a bit more, requiring users to put at least one node label seems like an avoidable requirement also.

Using a property like zephyr,region-name would resolve my concern. If missing, you could always generate a short one using f'{node.name} [{node.dep_ordinal}]' or so, which is shorter than the path but should still be unique. I get that there's a potential problem if two regions share a zephyr,region-name, but that seems like the sort of thing that gen_defines.py could catch and error out on.

mbolivar-nordic · 2021-07-16T18:05:50Z

(Or if a zephyr-specific property won't work, label as discussed by @JordanYates a while ago could serve.)

JordanYates · 2021-07-17T01:57:27Z

A summary of my arguments against each option. Each option that requires some value to exist could instead have some sane fallback, be that the path or some new value as proposed by @mbolivar-nordic. I am still of the opinion that the arbitrariness of the name doesn't really matter, as long as it is predictable, consistent and useful. After all, FLASH and SRAM are arbitrary names that have worked fine for years.

Node Label:

Arbitrary number of labels per node requires picking one
Requires nodes resulting in a memory region to have a node label (few nodes are missing this currently)

Node Path:

Long and unwieldy
Information sparse/poor

Custom Property:

Requires modifying every node that should result in a memory region (all nodes are missing this currently)
Introduces a 5th identifier/name to the node (node label, node name, label property, path)
No uniqueness guarantees

Label Property:

Zephyr appears to be moving away from label properties in general
Makes the label property required: true; for all ram and flash nodes (many nodes are missing this currently)
Labels tend to just duplicate the node label
No uniqueness guarantees

mbolivar-nordic · 2021-07-19T20:21:33Z

Thanks for the summary; very useful.

No uniqueness guarantees

A reminder that I pointed out uniqueness guarantees for a custom property can be enforced by gen_defines.py. And we could revive #32682 for the label property. So I don't think the points about lack of uniqueness guarantees hold.

Requires modifying every node that should result in a memory region (all nodes are missing this currently)

I think I already addressed that too above:

If missing, you could always generate a short one using f'{node.name} [{node.dep_ordinal}]' or so, which is shorter than the path but should still be unique.

`UTIL_DECR` was removed in #de8457 with the transition to brute force `FOREACH` macros. This macro is however still useful in the context of converting a `_NUM` define to an index to the last value, which would otherwise require some combination of `NUM_VA_ARGS_LESS_1` and `UTIL_LISTIFY`. The special case of `#define Z_DECR_0 0` was copied from the previous implementation. Signed-off-by: Jordan Yates <[email protected]>

Test the re-introduced `UTIL_DECR` macro. Signed-off-by: Jordan Yates <[email protected]>

Add definitions that allow users to go from a node identifier back to the original node labels. Signed-off-by: Jordan Yates <[email protected]>

Add a comment to separate the children related macros from whats above in the generated `devicetree_unfixed.h` file. Signed-off-by: Jordan Yates <[email protected]>

Adds functions to query the node labels of a node. The awkward naming of `DT_NODE_NUM_NODELABEL` and `DT_NODE_NODELABEL_BY_IDX` is due to `DT_NODELABEL` already being defined to get a node from a nodelabel. Signed-off-by: Jordan Yates <[email protected]>

Test the new node label retrieval API's. Signed-off-by: Jordan Yates <[email protected]>

Generate memory region names from devicetree nodes instead of using hardcoded values. Region names can be overridden by using an overlay similar to the following: ``` new_region_name: &sram1 {}; ``` Signed-off-by: Jordan Yates <[email protected]>

As memory region names are now derived purely from devicetree, remove the `name` parameter from `DT_REGION_FROM_NODE_STATUS_OKAY`. Signed-off-by: Jordan Yates <[email protected]>

galak · 2021-07-22T15:29:59Z

dev-review (Jul 22, 2021):

Suggestion is to use a custom zephyr specific property zephyr,linker...
try and maintain some backward compat for some period for know things in linker script today
have dt script, check for uniqueness of the zephyr,linker.. property values
tooling will scan for zephyr,linker.. property to allow these on any "valid" node which one can determine a base address/size for

galak · 2021-07-29T15:24:27Z

Clearing dev-review label, please add back if this still needs discussion in that forum.

github-actions · 2021-09-28T00:28:52Z

This pull request has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this pull request will automatically be closed in 14 days. Note, that you can always re-open a closed pull request at any time.

JordanYates requested review from MaureenHelm, andyross, bwitherspoon, carlescufi, dcpleung, erwango, galak, ioannisg, mbolivar-nordic and nashif as code owners June 17, 2021 12:43

github-actions bot added area: API Changes to public APIs area: ARM ARM (32-bit) Architecture area: Devicetree area: Devicetree Tooling PR modifies or adds a Device Tree tooling area: Tests Issues related to a particular existing or missing test platform: STM32 ST Micro STM32 labels Jun 17, 2021

JordanYates mentioned this pull request Jun 17, 2021

linker: aarch32: automatic derivation of region names from devicetree nodes #36196

Closed

henrikbrixandersen self-requested a review June 17, 2021 13:22

mbolivar-nordic suggested changes Jun 17, 2021

View reviewed changes

JordanYates added the dev-review To be discussed in dev-review meeting label Jul 7, 2021

carlescufi requested review from de-nordic, mbolivar-nordic and tejlmand July 12, 2021 13:15

tejlmand requested changes Jul 15, 2021

View reviewed changes

scripts/dts/gen_defines.py Outdated Show resolved Hide resolved

scripts/dts/gen_defines.py Outdated Show resolved Hide resolved

scripts/dts/gen_defines.py Outdated Show resolved Hide resolved

include/devicetree.h Outdated Show resolved Hide resolved

Jordan Yates added 8 commits July 22, 2021 21:50

tests: util: test UTIL_DECR

97f2a86

Test the re-introduced `UTIL_DECR` macro. Signed-off-by: Jordan Yates <[email protected]>

dts: gen_defines: generate nodelabels

5221345

Add definitions that allow users to go from a node identifier back to the original node labels. Signed-off-by: Jordan Yates <[email protected]>

dts: gen_defines: comment to separate children

330c975

Add a comment to separate the children related macros from whats above in the generated `devicetree_unfixed.h` file. Signed-off-by: Jordan Yates <[email protected]>

tests: devicetree: api: test node label retrieval

5ae4a87

Test the new node label retrieval API's. Signed-off-by: Jordan Yates <[email protected]>

linker: remove manual name specification

f2f4f0f

As memory region names are now derived purely from devicetree, remove the `name` parameter from `DT_REGION_FROM_NODE_STATUS_OKAY`. Signed-off-by: Jordan Yates <[email protected]>

JordanYates mentioned this pull request Jul 28, 2021

linker: aarch32: automatic derivation of region names from devicetree nodes #37279

Merged

JordanYates marked this pull request as draft July 28, 2021 11:19

galak removed the dev-review To be discussed in dev-review meeting label Jul 29, 2021

github-actions bot added the Stale label Sep 28, 2021

JordanYates closed this Oct 6, 2021

JordanYates mentioned this pull request Nov 23, 2021

[POC | RFC | DNM] Introducing framework to configure MPU regions from DT-defined SRAM regions #40422

Closed

JordanYates deleted the 210616_linker_dt_nodelabels branch August 15, 2022 11:42

linker: aarch32: automatic derivation of region names from devicetree nodes #36365

linker: aarch32: automatic derivation of region names from devicetree nodes #36365

Uh oh!

Conversation

JordanYates commented Jun 17, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mbolivar-nordic left a comment

Choose a reason for hiding this comment

Uh oh!

mbolivar-nordic Jun 17, 2021

Choose a reason for hiding this comment

Uh oh!

JordanYates Jun 18, 2021

Choose a reason for hiding this comment

Uh oh!

JordanYates commented Jun 18, 2021

Uh oh!

mbolivar-nordic commented Jun 18, 2021

Uh oh!

JordanYates commented Jun 19, 2021

Uh oh!

mbolivar-nordic commented Jul 14, 2021

Uh oh!

JordanYates commented Jul 14, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mbolivar-nordic commented Jul 14, 2021

Uh oh!

JordanYates commented Jul 15, 2021

Uh oh!

tejlmand left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

tejlmand commented Jul 15, 2021

Uh oh!

mbolivar-nordic commented Jul 15, 2021

Uh oh!

tejlmand commented Jul 16, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

JordanYates commented Jul 16, 2021

Uh oh!

JordanYates commented Jul 16, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tejlmand commented Jul 16, 2021

Uh oh!

JordanYates commented Jul 16, 2021

Uh oh!

mbolivar-nordic commented Jul 16, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mbolivar-nordic commented Jul 16, 2021

Uh oh!

JordanYates commented Jul 17, 2021

Uh oh!

mbolivar-nordic commented Jul 19, 2021

Uh oh!

galak commented Jul 22, 2021

Uh oh!

galak commented Jul 29, 2021

Uh oh!

github-actions bot commented Sep 28, 2021

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

JordanYates commented Jun 17, 2021 •

edited

Loading

JordanYates commented Jul 14, 2021 •

edited

Loading

tejlmand commented Jul 16, 2021 •

edited

Loading

JordanYates commented Jul 16, 2021 •

edited

Loading

mbolivar-nordic commented Jul 16, 2021 •

edited

Loading