Skip to content

Conversation

@lyakh
Copy link
Contributor

@lyakh lyakh commented Nov 8, 2024

this is just two commits on top of #80626 - it will be rebased once it's merged.

The LLEXT linker for Xtensa cannot relocate FLIX commands, disable
them in extensions only until we have a solution. Note, that this
only affects extensions, the main Zephyr binary is still built with
FLIX commands.

Signed-off-by: Guennadi Liakhovetski <[email protected]>
When linking and relocations are performed on the ELF object itself
with no copying, also global binding linking can break references.
Disable linking globally for such cases.

Signed-off-by: Guennadi Liakhovetski <[email protected]>
@lyakh lyakh changed the title [TO BE REBASED] disable FLIX commands disable FLIX commands Nov 18, 2024
@lyakh lyakh changed the title disable FLIX commands LLEXT/Xtensa/detached: disable FLIX commands and link once Nov 18, 2024
@lyakh lyakh marked this pull request as ready for review November 18, 2024 10:03
@zephyrbot zephyrbot added area: Build System area: Toolchains Toolchains area: llext Linkable Loadable Extensions labels Nov 18, 2024
Copy link
Contributor

@pillo79 pillo79 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love this 🚀 (TBH, never understood the reason for disabling only a specific a class of relocations in the first place). I think it's another step in the right direction for Xtensa!

With this PR relocate_local does not do what its name suggests anymore. I would suggest to go one step further and add a different parameter, skip_relocation, and skip the link process if either is not the default value,
This would give a few advantages:

  • we could deprecate the now confusingly named relocate_local in a future release (once SOF updates);
  • once that is deprecated, all load_params would default to 0, so you could directly initialize a proper config with only your needed fields (struct llext_load_param cfg = { .nonzero = 1 }).

@teburd what do you think about this?

@teburd
Copy link
Contributor

teburd commented Nov 18, 2024

I love this 🚀 (TBH, never understood the reason for disabling only a specific a class of relocations in the first place). I think it's another step in the right direction for Xtensa!

With this PR relocate_local does not do what its name suggests anymore. I would suggest to go one step further and add a different parameter, skip_relocation, and skip the link process if either is not the default value, This would give a few advantages:

* we could deprecate the now confusingly named `relocate_local` in a future release (once SOF updates);

* once that is deprecated, all load_params would default to 0, so you could directly initialize a proper config with only your needed fields (`struct llext_load_param cfg = { .nonzero = 1 }`).

@teburd what do you think about this?

I think load_param as a way of trying to direct llext_load is the wrong approach to solving SoF's problem. I re-iterate my stance here, I believe exposing more verbs (functions) is the better route, and will lead to simpler code everywhere. I keep thinking about this, and I keep thinking we've taken a mis-step somewhere to lead to having SoF adding these flags, and having to do work itself rather than providing the tools to make it happen.

llext_load should be the Easy Button, not the it does everything function.

We should support custom allocation and placement strategies, but not with the same function that gets overloaded with flags that lead to confusing branching in the load path.

My recommendation here would be to add some small state tracking to the extension. Noting where the sections are to be placed, where the source material is, and if they have been copied or not. Having instead a set of functions that basically do what SoF does today but in a more general and broadly useful way.

E.g. perhaps some functions like the following...

/**
 * @brief Link the extension with itself and others
 *
 * Assumes the address of each ELF section is given by dest_addrs and uses this memory
 * location as if the ELF section were already located there while linking. The address
 * of the sections may be the same as the destination address.
 * 
 * Assumes that the sections are available in the extension already and are writable.
 *
 * @retval -EINVAL if any destination addresses are invalid
 */
int llext_link(struct llext *ext, void *dest_addrs[LLEXT_MEM_COUNT]); 

(copy the map, link the elf with the destination addresses in mind, could be the same as the source address!)

and llext_copy, would presume that the destination addresses are already allocated and would take a slotmap of what to copy

/** 
 * @brief Copies (without allocating) any sections marked to be copied in the slot map|
 *
 * Copies any sections marked in the copy_sect slot map to their destination addresses
 * if the source and destination addresses are not equal.
 * 
 * Assumes no overlap
 *
 * @retval -EINVAL if any source overlaps with any destination
 */
int llext_copy(struct llext *ext, bool copy_sect[LLEXT_MEM_COUNT]);

Adding and allocating the struct llext in its initial state can be done in a separate function.

/**
 * @brief Initialize, parse, and add an extension
 *
 * Initializes an allocation llext struct with a pointer to an elf data source.
 */
int llext_mem_init(struct llext *ext, const void * const elf);

Now the sof code...

https://github.com/thesofproject/sof/blob/main/src/library_manager/llext_manager.c#L107

Can look more like its using llext rather than trying to work around everything.

llext_manager_load(...) {
   llext_copy(...)
}

llext_manager_link(...) {
   llext_link(...)
}

llext_manager_allocate_module(...) {
   struct llext *ext = /* sof allocated extension struct */
   llext_mem_init(...);
}

All memory allocation is now directly expected to occur in SoF code as it already does. And SoF code is completely in control over the flow as it already is.

The llext_load() function can remain being an Easy Button option for things like the shell, demos, and a good starting point for anyone not interested in custom placement and allocation. We can also completely drop the buf loader idea, maybe in fact dropping the loader as a whole and instead expecting people to provide a pointer to the ELF data. The FS loader feels misguided at this point given the way people are actually using this stuff.

If custom section placement/linking/copying is needed (as SoF does) then we'd make the parameters perhaps a struct instead, containing both the predefined slot maps as well as stringly maps with sizes known...

struct llext_link {
   void *dest_addrs[LLEXT_MEM_COUNT];
   const char **cust_sect_names;
   void *cust_sect_addrs;
   size_t cust_sects;
}

int llext_link(struct llext *ext, const struct llext_link *link);

@pillo79
Copy link
Contributor

pillo79 commented Nov 18, 2024

My question was very much related to this PR and how to move LLEXT forward incrementally: deprecate or live with legacy choices?

Your proposal is instead a rather sizable effort and a major API change. You should make an RFC from that post so we can discuss it in more detail (I am not opposed, though I remain kinda skeptical about the benefits). I didn't see an answer in there, but I guess the approval is my clue! 😅

The FS loader feels misguided at this point given the way people are actually using this stuff.

Not sure I understand this part. Do we have real-world examples of LLEXT use (or specifically of the FS loader) that abuse something in the API?

@teburd
Copy link
Contributor

teburd commented Nov 19, 2024

My question was very much related to this PR and how to move LLEXT forward incrementally: deprecate or live with legacy choices?

Your proposal is instead a rather sizable effort and a major API change. You should make an RFC from that post so we can discuss it in more detail (I am not opposed, though I remain kinda skeptical about the benefits), I didn't see an answer in there, but I guess the approval is my clue! 😅

Agreed, I've been mentally building the picture up of how this might look. I'll try and gather my thoughts by the end of the year on all this.

The FS loader feels misguided at this point given the way people are actually using this stuff.

Not sure I understand this part. Do we have real-world examples of LLEXT use (or specifically of the FS loader) that abuse something in the API?

It's more this discrepancy with the file like API the loader provides vs how LLEXT is actually being used. I didn't mean to speak of the fs loader itself but the API the loader provides being file-like rather than memory-like.

@lyakh
Copy link
Contributor Author

lyakh commented Nov 19, 2024

E.g. perhaps some functions like the following...

@teburd shall we move this to a "feature?" Yes, if we could do as you describe above in SOF, which would supposedly mean preserving what currently is the loader context for the entire life-time of the extension, that would stream-line it a good deal.

@lyakh
Copy link
Contributor Author

lyakh commented Nov 19, 2024

we could deprecate the now confusingly named relocate_local in a future release (once SOF updates);

@pillo79 yes, the name isn't fitting very well now, we can either do as you propose - update the current API, or leave it as is for now and concentrate on a more global change along the lines, proposed by @teburd

@lyakh lyakh added the bug The issue is a bug, or the PR is fixing a bug label Nov 19, 2024
@lyakh
Copy link
Contributor Author

lyakh commented Nov 19, 2024

@fabiobaltieri fabiobaltieri merged commit cbb6199 into zephyrproject-rtos:main Nov 20, 2024
34 checks passed
@lyakh lyakh deleted the flix branch November 20, 2024 10:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: Build System area: llext Linkable Loadable Extensions area: Toolchains Toolchains bug The issue is a bug, or the PR is fixing a bug

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants