Skip to content

Conversation

dcpleung
Copy link
Member

@dcpleung dcpleung commented Jul 31, 2025

This adds handlers for DTLB multihit and load/store ring exceptions. These are usually the result of having both kernel and user threads accessing the same memory page.

  • In some situations, both kernel and user PTEs get pulled into the TLB caches, resulting in multihit exception. So we need to remove those entries from TLB cache and let hardware reloads the correct PTE from page table.
  • For the load/store ring exception, if only the kernel PTE is populated in TLB cache, the hardware raises the load/store exception as a type of access violation. However, we need to determine manually if that is actually a valid violation by testing the associated PTE in the user page table. If it is a valid access, we need to invalidate the cached TLB entry and let the hardware reload the user PTE for access.

To test this, some context switching tests have been added to the userspace test suite.

@dcpleung dcpleung force-pushed the tests/mem_domain_userspace_switching branch from fd6d3e0 to 2a4e384 Compare July 31, 2025 17:15
@dcpleung dcpleung marked this pull request as ready for review July 31, 2025 19:15
@zephyrbot zephyrbot added area: Xtensa Xtensa Architecture area: Userspace Userspace labels Jul 31, 2025
@zephyrbot zephyrbot requested review from andyross, ceolin and nashif July 31, 2025 19:16
@dcpleung dcpleung force-pushed the tests/mem_domain_userspace_switching branch 2 times, most recently from 18cc903 to be8dff4 Compare July 31, 2025 20:24
@softwarecki softwarecki self-requested a review August 1, 2025 10:45
@dcpleung dcpleung force-pushed the tests/mem_domain_userspace_switching branch from be8dff4 to 09db043 Compare August 5, 2025 00:12
@dcpleung
Copy link
Member Author

dcpleung commented Aug 5, 2025

Compliance does not like labels in assembly.

softwarecki
softwarecki previously approved these changes Aug 5, 2025
@dcpleung dcpleung force-pushed the tests/mem_domain_userspace_switching branch from 09db043 to e3bc863 Compare August 7, 2025 01:44
@dcpleung
Copy link
Member Author

dcpleung commented Aug 7, 2025

I removed the part to stash permissions into SW bits on L1 PTEs. Come to think of it, it is not very useful. L1 PTEs need only binary access permission at this point: pointing to L2 table, or not pointer anywhere. So we are always reset back to not pointing anywhere.

@dcpleung dcpleung force-pushed the tests/mem_domain_userspace_switching branch 4 times, most recently from 6258141 to 23fcd7d Compare August 11, 2025 19:47
@dcpleung dcpleung marked this pull request as draft August 11, 2025 21:42
@dcpleung dcpleung force-pushed the tests/mem_domain_userspace_switching branch from 23fcd7d to 310fa4a Compare August 13, 2025 22:09
@dcpleung dcpleung marked this pull request as ready for review August 28, 2025 17:37
@dcpleung dcpleung force-pushed the tests/mem_domain_userspace_switching branch from 310fa4a to ca3c125 Compare August 28, 2025 17:38
@dcpleung
Copy link
Member Author

Plain rebase.

lyakh added a commit to lyakh/sof that referenced this pull request Sep 16, 2025
This tests zephyrproject-rtos/zephyr#93947

Signed-off-by: Guennadi Liakhovetski <[email protected]>
lyakh
lyakh previously approved these changes Sep 16, 2025
Copy link
Contributor

@lyakh lyakh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while waiting for thesofproject/sof#10242 to finish the test, I confirm, that this PR fixes the zephyr/samples/userspace/prod_consumer test on PTL ADSP

@dcpleung
Copy link
Member Author

dcpleung commented Oct 8, 2025

Found an issue that would make this set not bisect-able. So fixed that.

And I have also renamed xtensa_asm2_s.h as I am tired of checkpatch.pl treating it as C file.

@zephyrbot zephyrbot added the area: Tests Issues related to a particular existing or missing test label Oct 8, 2025
@dcpleung
Copy link
Member Author

dcpleung commented Oct 8, 2025

As for those indentations, it is done so that we would have only one return at the end of function. IIRC, it was one of the MISRA requirements. Since it's not a short function, I would like to keep it this way.

@lyakh
Copy link
Contributor

lyakh commented Oct 9, 2025

As for those indentations, it is done so that we would have only one return at the end of function. IIRC, it was one of the MISRA requirements. Since it's not a short function, I would like to keep it this way.

@dcpleung This post seems to provide a plausible explanation behind the "single return" policy: https://stackoverflow.com/a/268218 So, I'd say, as long as no clean up is needed - early returns make reading much easier.

teburd
teburd previously approved these changes Oct 9, 2025
@teburd
Copy link
Contributor

teburd commented Oct 9, 2025

As for those indentations, it is done so that we would have only one return at the end of function. IIRC, it was one of the MISRA requirements. Since it's not a short function, I would like to keep it this way.

@dcpleung This post seems to provide a plausible explanation behind the "single return" policy: https://stackoverflow.com/a/268218 So, I'd say, as long as no clean up is needed - early returns make reading much easier.

This is the problem with trying to adhere to some rules we can't actually see :-/ everything is conjecture.

@dcpleung
Copy link
Member Author

dcpleung commented Oct 9, 2025

I think that's MISRA 15.5

The software bits inside PTE are used to store original PTE
attributes and ring value, and those bits are used to
restore PTE to previous state.

This modifies reset_region() to properly restore attributes
and ring value when resetting memory regions to the same as
when the system boots.

Signed-off-by: Daniel Leung <[email protected]>
This saves the EXCCAUSE and EXCVADDR into BSA during exception.
These will then be used during exception handling. The reason
for doing this instead of reading from both registers during
exception handing is that another exception will override
the value in these register (e.g. DTLB miss). When returning to
the previous exception handler, these register no longer
contains the original exception cause and address. We need to
save it so that we are actually handling the orignal exception.

Coredump also now uses the EXCCAUSE and EXCVADDR from BSA
instead of reading the registers.

Signed-off-by: Daniel Leung <[email protected]>
@dcpleung dcpleung force-pushed the tests/mem_domain_userspace_switching branch from e28b01e to efce11a Compare October 9, 2025 18:22
@dcpleung
Copy link
Member Author

dcpleung commented Oct 9, 2025

Fixed PTE restoration that I found working on something else.

And I have changed to early returns. The source already has bunch of them in the file anyway.

teburd
teburd previously approved these changes Oct 9, 2025
ring = (bsa->ps & XCHAL_PS_RING_MASK) >> XCHAL_PS_RING_SHIFT;

if (ring != XTENSA_MMU_USER_RING) {
return false;
Copy link
Contributor

@lyakh lyakh Oct 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dcpleung this case was returning true before, was that wrong? I suppose it was, right? If access was performed from the kernel ring, then it isn't an access violation exception? So this seems to actually be a fix too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to return true. So fixed that.

return false;
}

xtensa_dtlb_entry_invalidate_sync(dtlb_entry);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function isn't just checking, it's also performing some non-trivial action. Can we update its doxygen description? Can be a follow-up PR

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the wording on the doxygen.

uint32_t restored_pte;

uint32_t original_sw = XTENSA_MMU_PTE_SW_GET(pte);
uint32_t original_attr = XTENSA_MMU_PTE_SW_ATTR_GET(original_sw);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

an earlier version of this commit had

uint32_t original_attr = XTENSA_MMU_PTE_SW_ATTR_GET(pte);

here and that worked for SOF. This version breaks SOF user-space. Is this something that I have to change in SOF user-space implementation?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The earlier version was incorrect as it used the wrong input to get the backup attributes. The current version (with input as original_sw) has the correct behavior.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe there is a missing mapping function somewhere on your side?

When a page can be accessed by both kernel and user threads,
the autofill DTLB may contain an entry for kernel thread.
This will result in load/store ring exception when it is
accessed by user thread later. In this case, we need to
invalidate all associated TLBs related to kernel access so
hardware can reload the page table the correct permission
for user thread.

Signed-off-by: Daniel Leung <[email protected]>
This adds a function to handle DTLB multihit exception when
userspace is enabled, as this exception may be raised due to
the same page being able to accessed by both kernel and user
threads. The auto-refill DTLBs may contain entries for same
page, one for kernel and one for user under some situations.
We need to invalidate those existing DTLB entries so that
hardware can reload from the page table.

This is an alternative to the kconfig invalidating DTLBs on
every swap: CONFIG_XTENSA_MMU_FLUSH_AUTOREFILL_DTLBS_ON_SWAP.
Although this does not have extra processing per context
switching, exception handling itself has a high cost. So
care needs to be taken on whether to enable that kconfig.

Signed-off-by: Daniel Leung <[email protected]>
This moves the block of FPU and HIFI registers in the Base Save
Area (BSA), used during interrupts and exceptions, to the end of
the block. This is done to minimize code usage in the vector
text section. During interrupt entrance, the code stores a small
set of registers first before jumping to the long handler. When
the offset to these registers from the beginning of BSA is small
enough, the compiler will emit S32I.N instead of S32I. This
saves us one byte per store (2 vs 3 bytes). This is mainly done
to avoid overflowing the vector text area.

Signed-off-by: Daniel Leung <[email protected]>
xtensa_asm2_s.h is not exactly a C header. Rename it to
xtensa_asm2.inc.S to clearly state that it is has assembly
code in it as its main purpose is to declare assembly macros.
This is being done to keep checkpatch.pl from treating it as
C file and complaining about non-C syntax.

Signed-off-by: Daniel Leung <[email protected]>
This is simply done to conserve code space in the vector text
areas. These vector text areas are very small and we should
only put code that is absolutely necessary for interrupt and
exception entrance. The saving of PS into the thread struct
can be deferred a bit. So do that.

Signed-off-by: Daniel Leung <[email protected]>
The assert error messages when l2_page_table_map() fails are not
correct. It returns false when it cannot allocate new L2 table,
and it is not able the address having already mapped. So correct
the error message.

Signed-off-by: Daniel Leung <[email protected]>
In functions which manipulate both L1 and L2 tables, make
the variable names obvious by prefixing them with l1_ or l2_.
This is mainly done to avoid confusion when reading through
those functions.

Signed-off-by: Daniel Leung <[email protected]>
arch_mem_map() takes in some flags to describe the to-be mapped
memory regions' permissions and cache status. When the flags are
translated, they become attributes in PTEs. So for functions
being called by arch_mem_map() and beyond, rename flags to
attrs to better describe its purpose.

Signed-off-by: Daniel Leung <[email protected]>
The priv_stack_ptr for reading and writing tests was not set
correctly on Xtensa, and was always 0. Test would pass since
the NULL page should always generate access faults on most
Xtensa platforms. However, those tests are supposed to test
reading and writing to the privilege stack. So fix that by
populating the priv_stack_ptr correctly.

Signed-off-by: Daniel Leung <[email protected]>
This adds a few tests to test switching between threads where
they are under different memory domains. This is mainly to test
if permissions are set correctly during context switch.

By default the new tests are disabled and must be explicitly
enabled. This is due to some architectures requiring some config
changes (for example, x86 needing more reserved memory domains,
and ARM64 needing more translation tables). To avoid causing
mass CI failures, only enable platforms that have been tested,
for now.

Signed-off-by: Daniel Leung <[email protected]>
@dcpleung dcpleung force-pushed the tests/mem_domain_userspace_switching branch from efce11a to d8b8697 Compare October 10, 2025 16:47
Copy link

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: Tests Issues related to a particular existing or missing test area: Userspace Userspace area: Xtensa Xtensa Architecture

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants