Collaborator

@ArcaneNibble ArcaneNibble commented Aug 5, 2025

  • Enable EV3 speaker only upon first sound, as suggested here
  • Rename EV3 caching macros, as suggested here
  • Fix MicroPython crash mentioned here
  • Mark .dma buffers as NOBITS

This prevents hearing a pop sound whenever the brick boots up.
The amplifier stays on after any sound is played in order to
prevent repeated popping noises.

coveralls commented Aug 5, 2025

Coverage Status

coverage: 56.973% (+0.02%) from 56.958%
when pulling 3142d3c on ArcaneNibble:fixups
into b77115b on pybricks:master.

@ArcaneNibble ArcaneNibble changed the title [EV3] Two misc fixups [EV3] Three misc fixups Aug 5, 2025

// Accesses a variable via the uncached memory alias
-#define PBDRV_UNCACHED(x) (*(volatile __typeof__(x) *)((uint32_t)(&(x)) + 0x10000000))
+#define PBDRV_EV3_UNCACHED(x) (*(volatile __typeof__(x) *)((uintptr_t)(&(x)) + PBDRV_EV3_UNCACHED_OFFSET))
Member

You explained that these aren't necessarily specific to the EV3, so why are we making this change?

Collaborator Author

Er, it's specific to the way we're currently configuring the EV3, and you had originally suggested renaming it.

Perhaps I misunderstood what you were asking for?

Member

@dlech dlech Aug 5, 2025

If we are putting it in include/pbdrv then it shouldn't be anything platform-specific by name. If we expect other platforms with memory managers that do something similar, then we could make a PBDRV_CONFIG_MM_UNCACHED_OFFSET config option, e.g. for the 0x10000000 and leave the macro names generic (no EV3).

Or if this is something that only EV3 drivers will ever use, we should move this to an "internal" header file that is specific to the EV3 memory manager.

Collaborator Author

I was thinking that the macros would be generic but would have platform-specific definitions.

Is there a generic "target is an EV3" macro? Perhaps these can be surrounded by #if PBDRV_EV3?

Collaborator Author

I just added a new PBDRV_CONFIG_CACHE and PBDRV_CONFIG_CACHE_EV3 for this purpose.
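The generic-macro idea discussed above could look roughly like the following. This is a minimal sketch, not the code that was merged: `PBDRV_CONFIG_MM_UNCACHED_OFFSET` is the option name floated in the review, its value is the `0x10000000` from the original macro, and `uncached_alias_addr` is a hypothetical helper added here for illustration.

```c
#include <stdint.h>

// Hypothetical config option (name suggested in the review above); the value
// is the offset that was hard-coded in the original macro.
#define PBDRV_CONFIG_MM_UNCACHED_OFFSET 0x10000000u

// Returns the address of the uncached alias for a cached address.
// Hypothetical helper, not part of the existing headers.
static inline uintptr_t uncached_alias_addr(uintptr_t cached_addr) {
    return cached_addr + PBDRV_CONFIG_MM_UNCACHED_OFFSET;
}

// Generic accessor built on the helper. It is only meaningful on a target
// whose MMU actually maps the alias; do not dereference it on a host machine.
#define PBDRV_UNCACHED(x) \
    (*(volatile __typeof__(x) *)uncached_alias_addr((uintptr_t)&(x)))
```

With this shape the macro name stays platform-neutral and only the offset is platform configuration.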

const char *name_str = qstr_str(name);

-for (mpy_info_t *info = mpy_first; info < mpy_end;
+for (mpy_info_t *info = mpy_first; (uintptr_t)info + sizeof(uint32_t) < (uintptr_t)mpy_end;
Member

We currently only support 32-bit platforms, but I think this should be pointer-size aligned rather than 32-bit aligned.

Collaborator Author

mpy_info_t is explicitly defined with a 32-bit size regardless of platform pointer size

 uint8_t mpy_size[4];

Member

Not sure what that has to do with the binary file itself needing to be pointer-aligned in memory.

Collaborator Author

It sounds like we are planning to fix alignment issues later.

The check here is sufficient for preventing an invalid read.
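To illustrate why a header-size bounds check stops the walk at trailing padding, here is a self-contained sketch. It assumes a simplified, hypothetical record layout (a 4-byte little-endian size followed by that many payload bytes), not the real `mpy_info_t`, and `count_records` is a made-up name.

```c
#include <stddef.h>
#include <stdint.h>

// Decodes a 4-byte little-endian value without requiring alignment.
static uint32_t read_le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

// Counts complete records in [buf, buf + len). Hypothetical layout:
// 4-byte size header, then `size` payload bytes. Trailing padding shorter
// than a header is ignored instead of being read as a record.
size_t count_records(const uint8_t *buf, size_t len) {
    size_t count = 0;
    size_t pos = 0;
    // Only read a header if all four of its bytes lie inside the buffer.
    while (len - pos >= sizeof(uint32_t)) {
        uint32_t size = read_le32(buf + pos);
        // Stop if the payload would run past the end of the buffer.
        if (size > len - pos - sizeof(uint32_t)) {
            break;
        }
        pos += sizeof(uint32_t) + size;
        count++;
    }
    return count;
}
```

A plain `pos < len` condition would try to read a header from the padding bytes, which is the kind of out-of-bounds read described in this thread.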

@ArcaneNibble ArcaneNibble changed the title [EV3] Three misc fixups [EV3] Four misc fixups Aug 5, 2025
@ArcaneNibble
Collaborator Author

Added yet another miscellaneous fix which was found as I was going through other work

 for (unsigned int i = 0; i < SYSTEM_RAM_SZ_MB; i++) {
     uint32_t addr_phys = 0xC0000000 + i * MMU_SECTION_SZ;
-    uint32_t addr_virt = 0xD0000000 + i * MMU_SECTION_SZ;
+    uint32_t addr_virt = addr_phys + PBDRV_EV3_UNCACHED_OFFSET;
Member

We should move this function (renamed to pbdrv_cache_ev3_early_init) and pbdrv_cache_* out of platform.c to lib/pbio/drv/cache/cache_ev3.c to make a proper driver file for it.

Collaborator Author

This seems problematic to move because of all the MMU-related macros. Moving the other functions makes sense though.

This makes this common header extensible in case future platforms
also require cache management. It also makes it clear which functionality
is specific to how Pybricks sets up the EV3.

Because the storage layer rounds up code size to a multiple of
the word size, it is possible for the set of code modules to have
trailing padding. Previously, this could cause the module search
function to not terminate as expected, read out of bounds, and crash.

Otherwise the recent addition of the .dma section defaults to PROGBITS
and causes the entire .bss section to be filled with 0s *on disk*,
thus bloating the firmware.elf file.
Member

dlech commented Aug 5, 2025

What is here is good, but I think the .dma section needs rethinking.

For example:

static struct {
    // This is used when transmitting so that the last byte clears CSHOLD.
    uint32_t tx_last_word;
    // This is used to hold the initial command to the SPI peripheral.
    uint8_t spi_cmd_buf_tx[SPI_CMD_BUF_SZ];
    // This is used to hold the replies to commands to the SPI peripheral.
    uint8_t spi_cmd_buf_rx[SPI_CMD_BUF_SZ];
    // This is used when SPI only needs to receive. It should always stay as 0.
    uint8_t tx_dummy_byte;
    // This is used when received data is to be discarded. Its value should be ignored.
    uint8_t rx_dummy_byte;
} spi_dev_bufs PBDRV_DMA_BUF;

Should be:

static struct {
    // This is used when transmitting so that the last byte clears CSHOLD.
    uint32_t tx_last_word;
    // This is used to hold the initial command to the SPI peripheral.
    uint8_t spi_cmd_buf_tx[SPI_CMD_BUF_SZ] PBDRV_DMA_BUF;
    // This is used to hold the replies to commands to the SPI peripheral.
    uint8_t spi_cmd_buf_rx[SPI_CMD_BUF_SZ] PBDRV_DMA_BUF;
    // This is used when SPI only needs to receive. It should always stay as 0.
    uint8_t tx_dummy_byte;
    // This is used when received data is to be discarded. Its value should be ignored.
    uint8_t rx_dummy_byte;
} spi_dev_bufs;

So that, e.g., tx_dummy_byte is not in the same cache line as spi_cmd_buf_rx.

And the align of the .dma section in the linker script is redundant with the align in the PBDRV_DMA_BUF macro. So I think we can just drop the .dma section and I would rename PBDRV_DMA_BUF to PBDRV_DMA_ALIGN for clarity.

Member

dlech commented Aug 5, 2025

Actually...

static struct {
    // This is used when transmitting so that the last byte clears CSHOLD.
    uint32_t tx_last_word;
    // This is used when SPI only needs to receive. It should always stay as 0.
    uint8_t tx_dummy_byte;
    // This is used when received data is to be discarded. Its value should be ignored.
    uint8_t rx_dummy_byte;
    // This is used to hold the initial command to the SPI peripheral.
    uint8_t spi_cmd_buf_tx[SPI_CMD_BUF_SZ] PBDRV_DMA_BUF;
    // This is used to hold the replies to commands to the SPI peripheral.
    uint8_t spi_cmd_buf_rx[SPI_CMD_BUF_SZ];
} spi_dev_bufs;

Is probably sufficient.

@ArcaneNibble
Collaborator Author

tx_last_word is accessed via DMA as well, so it does need a cache flush. This struct fits entirely within one cache line, which is why I've declared it the way it's been done.

The alignment is not entirely sufficient, because there is a need to bump subsequent data out of the same cache line as well. This was the entire reason for the separate memory section.

Collaborator Author

ArcaneNibble commented Aug 5, 2025

Having the TX and RX buffers share a cache line is fine (edit: and was intentional) in this particular case. The cache line is flushed for TX and invalidated when RX is finished, and only one operation can be in flight at a given time, so they cannot overlap.

Member

dlech commented Aug 5, 2025

> This struct fits entirely within one cache line, which is why I've declared it the way it's been done.

ok

> The alignment is not entirely sufficient, because there is a need to bump subsequent data out of the same cache line as well. This was the entire reason for the separate memory section.

The align attribute does (should do?) this. This came up in some work I was doing in Linux recently and pretty sure this is what I saw.

@ArcaneNibble
Collaborator Author

I specifically saw the attribute being insufficient for this. For example, with this change:

diff --git a/lib/pbio/include/pbdrv/cache.h b/lib/pbio/include/pbdrv/cache.h
index 2f036ff2..611d2464 100644
--- a/lib/pbio/include/pbdrv/cache.h
+++ b/lib/pbio/include/pbdrv/cache.h
@@ -28,7 +28,7 @@ void pbdrv_cache_prepare_after_dma(const void *buf, size_t sz);
 #define PBDRV_CACHE_LINE_SZ         32

 // Align data to a cache line, which is needed for clean RX DMA
-#define PBDRV_DMA_BUF               __attribute__((aligned(PBDRV_CACHE_LINE_SZ), section(".dma")))
+#define PBDRV_DMA_BUF               __attribute__((aligned(PBDRV_CACHE_LINE_SZ)))

 #endif // PBDRV_CONFIG_CACHE

diff --git a/lib/pbio/platform/ev3/platform.ld b/lib/pbio/platform/ev3/platform.ld
index aaf520c2..b16b0608 100644
--- a/lib/pbio/platform/ev3/platform.ld
+++ b/lib/pbio/platform/ev3/platform.ld
@@ -66,11 +66,6 @@ SECTIONS
         _bss_start = .;
         *(.bss)
         *(.bss.*)
-        /* Put DMA RX buffers here so that they never share cache lines with
-           unrelated data. This makes sure that they can be invalidated properly. */
-        . = ALIGN(32);
-        *(.dma)
-        . = ALIGN(32);
         _bss_end = .;
     } > DDR

I see in the map file

 .bss.spi_dev.lto_priv.0
                0x00000000c00592d8        0xc /var/folders/k0/7hfmqmf961x2hhkjbvr3w4_w0000gn/T//firmware.elf.sEYL01.ltrans4.ltrans.o
                0x00000000c00592d8                spi_dev.lto_priv.0
 *fill*         0x00000000c00592e4       0x1c 
 .bss.spi_dev_bufs.lto_priv.0
                0x00000000c0059300       0x10 /var/folders/k0/7hfmqmf961x2hhkjbvr3w4_w0000gn/T//firmware.elf.sEYL01.ltrans4.ltrans.o
                0x00000000c0059300                spi_dev_bufs.lto_priv.0
 .bss.stdin_buf.0
                0x00000000c0059310       0x15 /var/folders/k0/7hfmqmf961x2hhkjbvr3w4_w0000gn/T//firmware.elf.sEYL01.ltrans4.ltrans.o

So the spi_dev_bufs variable has been aligned to a cache line for its start, but stdin_buf is sharing the same cache line (at offset 0x10, where the cache line size is 0x20)
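The overlap can be checked mechanically from the map-file addresses. The helper below is a hypothetical sketch (the name `shares_cache_line` is not from the codebase); it assumes the 32-byte line size given by `PBDRV_CACHE_LINE_SZ` and non-empty ranges.

```c
#include <stdbool.h>
#include <stdint.h>

// Matches PBDRV_CACHE_LINE_SZ in the diff above.
#define CACHE_LINE_SZ 32u

// Index of the cache line that contains addr.
static uint32_t line_of(uint32_t addr) {
    return addr / CACHE_LINE_SZ;
}

// Returns true if two non-empty byte ranges touch at least one common
// cache line. Hypothetical helper for illustration only.
bool shares_cache_line(uint32_t a, uint32_t a_sz, uint32_t b, uint32_t b_sz) {
    uint32_t a_first = line_of(a), a_last = line_of(a + a_sz - 1);
    uint32_t b_first = line_of(b), b_last = line_of(b + b_sz - 1);
    return a_first <= b_last && b_first <= a_last;
}
```

Using the values above, `shares_cache_line(0xc0059300, 0x10, 0xc0059310, 0x15)` is true: `spi_dev_bufs` ends at offset 0x10 of its line and `stdin_buf` starts right there, so invalidating the line for RX DMA would also discard cached `stdin_buf` data.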

Member

dlech commented Aug 5, 2025

Ah, OK. For it to work correctly, it looks like the align attribute needs to be inside of a struct (this is how we are doing it in the Linux stuff I mentioned).

static struct {
    // This is used when transmitting so that the last byte clears CSHOLD.
    uint32_t tx_last_word PBDRV_DMA_BUF;
    // This is used to hold the initial command to the SPI peripheral.
    uint8_t spi_cmd_buf_tx[SPI_CMD_BUF_SZ];
    // This is used to hold the replies to commands to the SPI peripheral.
    uint8_t spi_cmd_buf_rx[SPI_CMD_BUF_SZ];
    // This is used when SPI only needs to receive. It should always stay as 0.
    uint8_t tx_dummy_byte;
    // This is used when received data is to be discarded. Its value should be ignored.
    uint8_t rx_dummy_byte;
} spi_dev_bufs;

Then we get the expected alignment before and after:

                0xc005a010        0x4 /tmp/cciKeVm6.ltrans4.ltrans.o
 *fill*         0xc005a014        0xc 
 .bss.spi_dev_bufs.lto_priv.0
                0xc005a020       0x20 /tmp/cciKeVm6.ltrans4.ltrans.o
                0xc005a020                spi_dev_bufs.lto_priv.0
 .bss.spi_dev.lto_priv.0
                0xc005a040        0xc /tmp/cciKeVm6.ltrans4.ltrans.o
                0xc005a040                spi_dev.lto_priv.0

So I think we should do it like this everywhere. Otherwise, we will need a separate section for every single buffer to do the alignment in the linker script.

@ArcaneNibble
Collaborator Author

That doesn't work for the ADC's channel_data though (which isn't a struct and is just a bare array).

Collaborator Author

ArcaneNibble commented Aug 5, 2025

It currently doesn't create a unique section for each buffer. They all go into a single .dma section. Once in there, the align(32) is sufficient, since everything in .dma is aligned.

Member

dlech commented Aug 5, 2025

> That doesn't work for the ADC's channel_data though (which isn't a struct and is just a bare array).

What is stopping us from putting it in a struct?

@ArcaneNibble
Collaborator Author

We could. It seems slightly more error-prone than using a separate .dma section, since the .dma solution works automatically for structs or arrays.

Member

dlech commented Aug 5, 2025

> It currently doesn't create a unique section for each buffer. They all go into a single .dma section. Once in there, the align(32) is sufficient, since everything in .dma is aligned.

OK, I guess that works too. Still have a slight preference for not having an extra section though.

@ArcaneNibble
Collaborator Author

The section doesn't appear in the final firmware.elf, since the linker script merges it into .bss. It's only present in intermediate .o files.

Member

dlech commented Aug 5, 2025

I know. The issue I have with the separate section is that we can't apply the attribute to a member of a struct if we ever need to do that. We can only apply it to a whole static struct.

@ArcaneNibble
Collaborator Author

Ah, that is indeed the case. However, applying alignment to a single field of a struct also has potentially unexpected effects? (It'll change the alignment of the entire struct), so I would've never even thought to try doing that... (I always put it on the whole struct)
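The difference between the two placements of the attribute can be seen in a small sketch. It assumes GCC or Clang attribute syntax and a typical ABI where `uint32_t` is 4-byte aligned; the struct and variable names are hypothetical and only loosely modeled on `spi_dev_bufs`.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SZ 32
#define DMA_ALIGN __attribute__((aligned(CACHE_LINE_SZ)))

// No alignment attribute: 4 + 6 bytes, padded to the 4-byte struct alignment.
struct plain_bufs {
    uint32_t word;
    uint8_t cmd[6];
};

// Attribute on a member: the whole struct type becomes 32-byte aligned and
// its size is rounded up to 32, so nothing can follow it in the same line.
struct member_aligned_bufs {
    uint32_t word DMA_ALIGN;
    uint8_t cmd[6];
};

// Attribute on the variable: only the start address is aligned. The object
// keeps the size of the plain type, so the linker may place other data in
// the remaining bytes of the cache line.
struct plain_bufs var_aligned DMA_ALIGN;
```

This matches both map-file excerpts: the variable-level attribute left `spi_dev_bufs` at size 0x10 with `stdin_buf` directly after it, while the member-level attribute grew it to 0x20. It also shows the side effect mentioned here: aligning one field raises the alignment of the entire struct type.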

@dlech dlech merged commit 7d9b601 into pybricks:master Aug 5, 2025
17 checks passed