Skip to content

Conversation

@JordanYates
Copy link
Contributor

@JordanYates JordanYates commented Jun 17, 2021

Update the aarch32 linker file to automatically derive the name of memory regions from devicetree nodes instead of providing hardcoded values. This is required for automatic generation of regions based on compatibles, and also allows chosen and alias nodes to be used to control variable placement (as desired in #36152).

This PR is the first step to enable automatically generating regions from devicetree.
The next step is to transition to something like:

    DT_REGIONS_FROM_COMPAT(mmio_sram);

For SRAM generation, to eliminate SoC specific node labels from the core linker script.

@github-actions github-actions bot added area: API Changes to public APIs area: ARM ARM (32-bit) Architecture area: Devicetree area: Devicetree Tooling PR modifies or adds a Device Tree tooling area: Tests Issues related to a particular existing or missing test platform: STM32 ST Micro STM32 labels Jun 17, 2021
@henrikbrixandersen henrikbrixandersen self-requested a review June 17, 2021 13:22
Copy link
Contributor

@mbolivar-nordic mbolivar-nordic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just focusing on the main change here.

Have you considered using a string-valued property instead?

You could use the DT_STRING_TOKEN macro being added in #35460 to tokenize a string. That would in turn avoid the fragility of this approach.

Comment on lines 34 to 35
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem like a good idea to me. It seems quite fragile. Users should be free to add whatever kind of node labels they want to any nodes without modifying things like this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As long as users use DT_REGION_NAME(node_id) in linker scripts, it is not possible to break this setup AFAICT.
Any changes in the name caused by overlays will automatically propagate through to the linker scripts.
Users are absolutely free to add labels as they wish, its just that the last one applied will be the name generated by the linker.

@JordanYates
Copy link
Contributor Author

Have you considered using a string-valued property instead?

This was the approach taken in #36196. The downsides listed there were:

  • No guarantee of uniqueness
  • Zephyr has been generally transitioning away from label property usage

Adding a new property is the wrong path forward IMO as by definition no existing nodes will have it.
Most nodes already have a node label, minimizing diffs.

You could use the DT_STRING_TOKEN macro being added in #35460 to tokenize a string. That would in turn avoid the fragility of this approach.

Quoted vs unquoted strings doesn't actually make a difference for the region names. I am unsure how this affects fragility?

@mbolivar-nordic
Copy link
Contributor

OK, I thought the region names were being used to form variable names and therefore need to be unquoted. If they're always just strings then DT_STRING_TOKEN is not needed and you can just use a string valued property.

I still feel pretty strongly that "pick whatever the last node label is" is the wrong approach though, for the reasons I've already given. If you're not comfortable with a string property, can you suggest an alternative?

@JordanYates
Copy link
Contributor Author

I still feel pretty strongly that "pick whatever the last node label is" is the wrong approach though, for the reasons I've already given. If you're not comfortable with a string property, can you suggest an alternative?

Personally I don't mind whether its node label or string property. The node label approach was suggested by @henrikbrixandersen and @erwango and I agreed that their reasons were sound.

I don't understand the reasons you have given. Users are free to give whatever node labels they want to nodes. This isn't fragile because you can always get the output region name via DT_REGION_NAME. I can trivially change the PR so that it always picks the first node label, but that eliminates the option for users to rename the node if they wished.

At the end of the day it doesn't really matter what that name is, as long as it is derived from the node. Why introduce another property when we already have a human readable, unique, node label to base it on?

@JordanYates JordanYates added the dev-review To be discussed in dev-review meeting label Jul 7, 2021
@mbolivar-nordic
Copy link
Contributor

Why introduce another property when we already have a human readable, unique, node label to base it on?

Precisely because node labels are not unique. Picking one arbitrarily feels really wrong to me. Node labels are just arbitrary ways to refer to nodes and I don't think it's sound to just pick one and use it as a name in code.

@JordanYates
Copy link
Contributor Author

JordanYates commented Jul 14, 2021

Precisely because node labels are not unique.

#36196 (comment) suggested they were unique, and testing this morning appears to validate that:

	test: device_0 {};
	test: device_1 {};

results in this when compiling:

baord.dts.pre.tmp:409.17-20: ERROR (duplicate_label): /device_1: Duplicate label 'test' on /device_1 and /device_0
ERROR: Input tree has errors, aborting (use -f to force output)
CMake Error at C:/Users/Jordan/code/zephyr/zephyr/cmake/dts.cmake:214 (message):
  command failed with return code: 2

I don't think it's sound to just pick one and use it as a name in code.

It doesn't matter which property or label we choose to generate the names from, users don't interact with it in any way apart from seeing the name in the final memory usage summary. All usage in code/linker scripts is via DT_REGION_NAME, which will accept any reference to a node (DT_ALIAS, DT_NODELABEL, etc).

@mbolivar-nordic
Copy link
Contributor

#36196 (comment) suggested they were unique, and testing this morning appears to validate that:

OK, we're using 'unique' in different ways. A single node label is unique treewide (your meaning), but a node does not have a single unique node label (my meaning).

It doesn't matter which property or label we choose to generate the names from, users don't interact with it in any way apart from seeing the name in the final memory usage summary.

If it's not visible to the user at all, why not just use ordinal numbers then?

@JordanYates
Copy link
Contributor Author

If it's not visible to the user at all, why not just use ordinal numbers then?

They're not visible in code, it is visible in the final memory usage summary:

[19/19] Linking C executable zephyr/zephyr.elf
Memory region         Used Size  Region Size  %age Used
           FLASH:      274768 B         1 MB     26.20%
            SRAM:       59204 B       256 KB     22.58%
        IDT_LIST:          0 GB         2 KB      0.00%

Replacing SRAM with sram0 is good (arguably an improvement because it shows the user a direct relation between the dts node and linker), but replacing it with dts_ord_23 is not acceptable IMO. All that "use the last node label" means is that if someone is using sram3 for IPC, they could add an additional node label in the dts to get a more specific name in the memory summary:

          sram3:       59204 B       256 KB     22.58%

vs

     my_ipc_ram:       59204 B       256 KB     22.58%

Copy link
Contributor

@tejlmand tejlmand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the general approach looks sane.
Will let @mbolivar-nordic and @galak drive the naming part / node label as they are the experts here.

Found a couple of minor nits to be fixed.

@tejlmand
Copy link
Contributor

However, using the full path seems even clearer, and can't cause conflicts e.g. if the user adds a FLASH node label to a node somewhere in the DT.

@mbolivar-nordic maybe you can elaborate ?
Already today those node labels are used in linker scripts, like here:

DT_REGION_FROM_NODE_STATUS_OKAY(FLASH_CCFG, rwx, DT_NODELABEL(ti_ccfg_partition))

which then corresponds to the node label here:
ti_ccfg_partition: partition@57fa8 {

Now, currently on master, that memory section would be named FLASH_CCFG, but with @JordanYates changes the default section name will now be identical to the node label, in this case ti_ccfg_partition.
But with the added possibility that users can now create an overlay that adds an extra label (name) that will instead be used for the section name.

So already today, the node label ti_ccfg_partition is in use, and thus can be considered reserved.
Of course, if user decides he wants a better name, and then adds an additional node label my_foo_ccfg in addition to ti_ccfg_partition, then it is the user's responsibility to ensure that my_foo_ccfg doesn't collide with anything.
Also in that case, I do think it makes sense to use the last node label given as the section name in the generated linker script.

The labels in use before and after this PR haven't changed, from what I can see.
@JordanYates is that correctly understood and observed ?

@mbolivar-nordic
Copy link
Contributor

Already today those node labels are used in linker scripts, like here:

Right, but that is different: that is taking a node label, getting a node identifier from it. DT_NODELABEL takes a node label and returns a node identifier from it. It does not return a node label.

Now, currently on master, that memory section would be named FLASH_CCFG

Right, so the linker script author is responsible for naming the memory region using the first argument to DT_REGION_FROM_NODE_STATUS_OKAY, and we can see the list of them in one place.

But with the added possibility that users can now create an overlay that adds an extra label (name) that will instead be used for the section name.

Right, and that is still what I'm not convinced about. The fact that the memory section names are not predictable and depend on whatever node label the tooling happens to choose feels like the wrong thing to me. I can understand @JordanYates 's objection that having a bunch of ordinal numbers is not good for the --print-memory-usage output and I agree. But I think that if we're going to get the region names from devicetree nodes, we should be using a single, canonical piece of information that we get from the node. The fact that nodes can have zero, one, or many node labels means there is no good canonical choice. That is why I originally suggested getting the value out of a property in the node. When that was rejected, I suggested using the path, since that doesn't require extra boilerplate in the DTS but is canonical and unique.

@tejlmand
Copy link
Contributor

tejlmand commented Jul 16, 2021

But with the added possibility that users can now create an overlay that adds an extra label (name) that will instead be used for the section name.

Right, and that is still what I'm not convinced about. The fact that the memory section names are not predictable and depend on whatever node label the tooling happens to choose feels like the wrong thing to me.

Fair enough.
I'm not having a strong preference for one or the other atm, so good arguments can persuade me in either direction.
I see the benefit in today's solution where the section names are known and located together in the linker script, but I also see the benefit of allowing the users to rename memory sections to their liking, without having to modify the linker script template.

Question is, is the DT the logical place for controlling the memory section names by adding an additional naming 🤔 ?
And also, are we solving a real issue here ?
What is the real problem in today's solution, that requires us to control those names through devicetree, and what are the consequences ?

Imaging on slack, someone posting a problem:

Hi guys,
Since my last rebase the foo region overflows, can anyone help:

Memory region         Used Size  Region Size  %age Used
            foo:       26512 B        24 KB    107.88%
            bar:       24592 B     178968 B     13.74%
            baz:          0 GB         2 KB      0.00
... section `xyz' will not fit in region `foo'
... region `FLASH' overflowed by 1936 bytes

In most cases we can probably deduce that foo, bar, and baz are, but if a sample or the user has renamed sram3 into foo, then it can actually be harder for us to help that user as we don't know that it was actually sram3 that overflowed, compared to today's situation where sram3 is fixed in linker script.

I would really like to see an issue that clearly describes the problem with today's solution, and what use-cases such a change will handle, so that we can also consider if there might be other side-effects we have not considered (like helping users with issues later).

@JordanYates
Copy link
Contributor Author

And also, are we solving a real issue here ?

The goal of this PR is to enable automatically generating regions from devicetree.
What the final solution looks like would be something like:

    DT_REGIONS_FROM_COMPAT(mmio_sram);

Which would just automatically create a region for every sram node on the device, instead of the current setup where every new SRAM label on some SoC requires adding SoC specific code to the main linker scripts (sram1 through 4, sdram1 and sdram2, etc). This is required for all automatic integration between devicetree and the linker, not just SRAM.

For this to work the name needs to be derived from the node itself, it can't be manually provided. Being able to rename the regions is merely a side effect of the name now coming from the node.

@JordanYates
Copy link
Contributor Author

JordanYates commented Jul 16, 2021

I'm not a huge fan of the node path option. It looks okay for SRAM nodes, /soc/memory@20000000, but It quickly gets unwieldy, e.g. /soc/flash-controller@4001e000/flash@0/partitions/partition@da000

These paths also don't contain the information that actually differentiates multiple instances of the same node from each other (beyond the address). Without digging into the .dts, you don't have any idea whether that second path is a TI_CCFG region, space assigned to the bootloader or the application space.

In short I think the path is great as a unique identifier, but bad at containing the information you actually want in the memory summary. Whereas the node label is explicitly chosen as a short, human readable string that has some meaning.

If the fact that the name can change based on what the user does is a significant problem and not a benefit, then we can just use the first label instead of the last. This will just result in regions always having the label from their original definition. (sram0, etc).

@tejlmand
Copy link
Contributor

And also, are we solving a real issue here ?

The goal of this PR is to enable automatically generating regions from devicetree.

But this is not what this PR does now.
All regions are described in the linker script, and only names are chosen based on the node label.

What the final solution looks like would be something like:

    DT_REGIONS_FROM_COMPAT(mmio_sram);

Which would just automatically create a region for every sram node on the device, instead of the current setup where every new SRAM label on some SoC requires adding SoC specific code to the main linker scripts (sram1 through 4, sdram1 and sdram2, etc). This is required for all automatic integration between devicetree and the linker, not just SRAM.

Now, that sounds like a good use-case.
So should this PR be considered a draft until that part is also ready ?

btw. thanks for the extended reasoning. I think some of that info should be copied to the description of this PR.

@JordanYates
Copy link
Contributor Author

So should this PR be considered a draft until that part is also ready ?

That part has been ready in various incarnations for several months (#34537, #33656).
This was an attempt to split the larger body of work to ease review.

@mbolivar-nordic
Copy link
Contributor

mbolivar-nordic commented Jul 16, 2021

I think we all agree that node labels are meaningful identifiers. The disagreement seems to be about whether they are the right identifiers to be using in this context.

It seems like @tejlmand shares my concerns about making an arbitrary choice. I'm not convinced about the proposal "just pick the first node label" as a workaround. I generally remain unconvinced that any arbitrary choice (first, last, middle, whatever) is going to produce a good result in the case of multiple node labels. And thinking about it a bit more, requiring users to put at least one node label seems like an avoidable requirement also.

Using a property like zephyr,region-name would resolve my concern. If missing, you could always generate a short one using f'{node.name} [{node.dep_ordinal}]' or so, which is shorter than the path but should still be unique. I get that there's a potential problem if two regions share a zephyr,region-name, but that seems like the sort of thing that gen_defines.py could catch and error out on.

@mbolivar-nordic
Copy link
Contributor

(Or if a zephyr-specific property won't work, label as discussed by @JordanYates a while ago could serve.)

@JordanYates
Copy link
Contributor Author

A summary of my arguments against each option. Each option that requires some value to exist could instead have some sane fallback, be that the path or some new value as proposed by @mbolivar-nordic. I am still of the opinion that the arbitrariness of the name doesn't really matter, as long as it is predictable, consistent and useful. After all, FLASH and SRAM are arbitrary names that have worked fine for years.

Node Label:

  • Arbitrary number of labels per node requires picking one
  • Requires nodes resulting in a memory region to have a node label (few nodes are missing this currently)

Node Path:

  • Long and unwieldy
  • Information sparse/poor

Custom Property:

  • Requires modifying every node that should result in a memory region (all nodes are missing this currently)
  • Introduces a 5th identifier/name to the node (node label, node name, label property, path)
  • No uniqueness guarantees

Label Property:

  • Zephyr appears to be moving away from label properties in general
  • Makes the label property required: true; for all ram and flash nodes (many nodes are missing this currently)
  • Labels tend to just duplicate the node label
  • No uniqueness guarantees

@mbolivar-nordic
Copy link
Contributor

Thanks for the summary; very useful.

No uniqueness guarantees

A reminder that I pointed out uniqueness guarantees for a custom property can be enforced by gen_defines.py. And we could revive #32682 for the label property. So I don't think the points about lack of uniqueness guarantees hold.

Requires modifying every node that should result in a memory region (all nodes are missing this currently)

I think I already addressed that too above:

If missing, you could always generate a short one using f'{node.name} [{node.dep_ordinal}]' or so, which is shorter than the path but should still be unique.

Jordan Yates added 8 commits July 22, 2021 21:50
`UTIL_DECR` was removed in #de8457 with the transition to brute force
`FOREACH` macros. This macro is however still useful in the context of
converting a `_NUM` define to an index to the last value, which would
otherwise require some combination of `NUM_VA_ARGS_LESS_1` and
`UTIL_LISTIFY`.

The special case of `#define Z_DECR_0 0` was copied from the previous
implementation.

Signed-off-by: Jordan Yates <[email protected]>
Test the re-introduced `UTIL_DECR` macro.

Signed-off-by: Jordan Yates <[email protected]>
Add definitions that allow users to go from a node identifier back to
the original node labels.

Signed-off-by: Jordan Yates <[email protected]>
Add a comment to separate the children related macros from whats above
in the generated `devicetree_unfixed.h` file.

Signed-off-by: Jordan Yates <[email protected]>
Adds functions to query the node labels of a node. The awkward naming
of `DT_NODE_NUM_NODELABEL` and `DT_NODE_NODELABEL_BY_IDX` is due to
`DT_NODELABEL` already being defined to get a node from a nodelabel.

Signed-off-by: Jordan Yates <[email protected]>
Test the new node label retrieval API's.

Signed-off-by: Jordan Yates <[email protected]>
Generate memory region names from devicetree nodes instead of using
hardcoded values. Region names can be overridden by using an overlay
similar to the following:
```
new_region_name: &sram1 {};
```

Signed-off-by: Jordan Yates <[email protected]>
As memory region names are now derived purely from devicetree, remove
the `name` parameter from `DT_REGION_FROM_NODE_STATUS_OKAY`.

Signed-off-by: Jordan Yates <[email protected]>
@galak
Copy link
Contributor

galak commented Jul 22, 2021

dev-review (Jul 22, 2021):

  • Suggestion is to use a custom zephyr specific property zephyr,linker...
  • try and maintain some backward compat for some period for know things in linker script today
  • have dt script, check for uniqueness of the zephyr,linker.. property values
  • tooling will scan for zephyr,linker.. property to allow these on any "valid" node which one can determine a base address/size for

@JordanYates JordanYates marked this pull request as draft July 28, 2021 11:19
@galak galak removed the dev-review To be discussed in dev-review meeting label Jul 29, 2021
@galak
Copy link
Contributor

galak commented Jul 29, 2021

Clearing dev-review label, please add back if this still needs discussion in that forum.

@github-actions
Copy link

This pull request has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this pull request will automatically be closed in 14 days. Note, that you can always re-open a closed pull request at any time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: API Changes to public APIs area: ARM ARM (32-bit) Architecture area: Devicetree Tooling PR modifies or adds a Device Tree tooling area: Devicetree area: Tests Issues related to a particular existing or missing test platform: STM32 ST Micro STM32 Stale

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants