Skip to content

Conversation

@rogeryou
Copy link

This PR is updated for #43915.

Add SPI NAND block device driver for using SPI NAND Flash like Macronix Flash MX31LF4GE4BC.
This driver is tested on STM32L562E-DK.

The license of BCH code is updated.

Copy link
Member

@erwango erwango left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this contribution.
Some clean up required.
My main request would be to remove STM32 mentions, since there is nothing STM32 specific here aside from the board used for test.

@erwango erwango removed the platform: STM32 ST Micro STM32 label Sep 27, 2022
@carlescufi
Copy link
Member

@rogeryou can you please format your commits as described here:
https://docs.zephyrproject.org/latest/contribute/guidelines.html#commit-guidelines

@rogeryou rogeryou force-pushed the macronix_spi_nand_v1 branch from 9755c14 to 901b863 Compare September 28, 2022 03:16
@rogeryou
Copy link
Author

@rogeryou can you please format your commits as described here: https://docs.zephyrproject.org/latest/contribute/guidelines.html#commit-guidelines

I have formated my commits.

@Laczen
Copy link
Contributor

Laczen commented Sep 29, 2022

@rogeryou, thank you for this contribution. It would be very nice to have nand flash support in zephyr.

Regarding the PR, I have seen that the ecc is hidden for the user as to be able to use the standard flash interface. This is a good thing, but doesn't this also make any other part of the oob data unavailable ? IMHO it would be good to extend the flash driver for nand flashes with read/write possibilities of the oob data. This would allow adding bad block info to the oob area.

@RyKolb
Copy link

RyKolb commented Nov 1, 2022

@rogeryou
It seems as though part of this driver is written for either a 2048 or 4096 page size NAND flash, and yet other parts use the defined page/block offset/mask. Is there reasoning behind this?

@carlescufi carlescufi requested review from Laczen and removed request for GeorgeCGV, gmarull and mbolivar-nordic November 2, 2022 14:12
erwango
erwango previously requested changes Nov 8, 2022
@rogeryou
Copy link
Author

@rogeryou, thank you for this contribution. It would be very nice to have nand flash support in zephyr.

Regarding the PR, I have seen that the ecc is hidden for the user as to be able to use the standard flash interface. This is a good thing, but doesn't this also make any other part of the oob data unavailable ? IMHO it would be good to extend the flash driver for nand flashes with read/write possibilities of the oob data. This would allow adding bad block info to the oob area.

Thank you for your suggestion. We will add APIs that operate OOB later.

@rogeryou rogeryou closed this Nov 23, 2022
@rogeryou
Copy link
Author

@rogeryou It seems as though part of this driver is written for either a 2048 or 4096 page size NAND flash, and yet other parts use the defined page/block offset/mask. Is there reasoning behind this?

Could you tell me which part of the program you mentioned?

@rogeryou rogeryou reopened this Nov 23, 2022
@zephyrbot zephyrbot added the area: Devicetree Binding PR modifies or adds a Device Tree binding label Nov 23, 2022
@daniel-0723 daniel-0723 force-pushed the macronix_spi_nand_v1 branch from e352357 to 84f8da1 Compare June 9, 2025 13:51
@sonarqubecloud
Copy link

sonarqubecloud bot commented Jun 9, 2025

@daniel-0723
Copy link

@de-nordic Please review again.

@danieldegrasse danieldegrasse modified the milestones: v4.2.0, v4.3.0 Jul 14, 2025
@github-actions
Copy link

This pull request has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this pull request will automatically be closed in 14 days. Note, that you can always re-open a closed pull request at any time.

@github-actions github-actions bot added the Stale label Sep 13, 2025
@maxmclau
Copy link
Contributor

would love to see this cleaned up and merged

@arnout
Copy link

arnout commented Sep 17, 2025

+1 to getting this merged. In a couple of weeks time we'll have the hardware to test this if that is useful to make it progress.

Please remove the stale label.

@pdgendt
Copy link
Contributor

pdgendt commented Sep 22, 2025

There's a build issue for the documentation, but the action is no longer available. Try to rebase and fix the doc issues.

@daniel-0723
Copy link

There's a build issue for the documentation, but the action is no longer available. Try to rebase and fix the doc issues.

OK, I will try to fix it.

struct spi_nand_config {
/* Devicetree SPI configuration */
struct spi_dt_spec spi;
uint8_t id[2];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use SPI_NAND_ID_LEN here?

Comment on lines +606 to +608
if ((addr < 0) || ((addr + size) > data->flash_size)) {
return -EINVAL;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With size == 0 we are supposed to just return 0.
Additionally, to be safe against int overflows, the condition should be:

if ((addr < 0) || (addr >= data->flash_size) || (size > data->flash_size) || ((data->flash_size - addr) < size))

uint32_t written_bytes = 0;

/* should be between 0 and flash size */
if ((addr < 0) || ((addr + size) > flash_size)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if ((addr < 0) || (addr >= data->flash_size) || (size > data->flash_size) || ((data->flash_size - addr) < size))

Comment on lines +662 to +663
} else {
#if IS_ENABLED(CONFIG_SPI_NAND_SOFTWARE_ECC)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

else if (IS_ENABLED(...)

struct spi_nand_data *data = dev->data;
int ret = 0;
/* erase area must be subregion of device */
if ((addr < 0) || ((size + addr) > data->flash_size)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if ((addr < 0) || (addr >= data->flash_size) || (size > data->flash_size) || ((data->flash_size - addr) < size))

Copy link

@embed embed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix the errors and try compiling with CONFIG_SPI_NAND_SOFTWARE_ECC=y to test the bch code as well.

spi-max-frequency = <500000>;
id = [c2 1E];
cs-wait-delay = <0>;
page-size = <2048>;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
page-size = <2048>;
page-size = <2048>;
flash-size = <536870912>;

Since flash-size is mandatory too :-)

k_heap_free(&bch_heap, bch->buf2);
k_heap_free(&bch_heap, bch->buf3);
k_heap_free(&bch_heap, bch->g);
k_heap_free(bch);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found by copilot, this is not how you call this function.

Suggested change
k_heap_free(bch);
k_heap_free(&bch_heap, bch);

return ret;
}

if (!(reg & SPI_NAND_WIP_BIT) == 0U) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is there the inversion? WIP bit must be zero for us to return. (thank you copilot)

Suggested change
if (!(reg & SPI_NAND_WIP_BIT) == 0U) {
if ((reg & SPI_NAND_WIP_BIT) == 0U) {

LOG_DBG("bch init failed, params should be m: 8 ~ 13, t: 1 ~ 12\r\n");
return -EINVAL;
}
bch = bch_alloc(sizeof(bch_t), &err);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The struct members bch->n and other fields are used before being initialized. The struct should be initialized with proper values (m, t, n) before using them for allocation sizes. (copilot)

Suggested change
bch = bch_alloc(sizeof(bch_t), &err);
bch = bch_alloc(sizeof(bch_t), &err);
if (bch == NULL) {
return -ENOMEM;
}
bch->m = m;
bch->t = t;
bch->n = (1 << m) - 1;
bch->ecc_words = (m * t + 31) / 32;
bch->len = (bch->n + 1) / 8;

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also delete the duplicate bch->m = m; etc a few lines down below.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does not seem to be tested...

LOG_ERR("invalid eccbytes %d, should be %d\n", eccbytes, data->nbc.bch->ecc_words);
return -EINVAL;
}
data->page_buf = (uint8_t *)k_malloc(data->page_size + data->oob_size);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines +367 to +377
bch->a_pow = bch_alloc((bch->n + 1) * sizeof(*bch->a_pow), &err);
bch->a_log = bch_alloc((bch->n + 1) * sizeof(*bch->a_log), &err);
bch->mod_tab = bch_alloc(bch->ecc_words * 16 * 4 * sizeof(*bch->mod_tab), &err);
bch->ecc = bch_alloc(bch->ecc_words * sizeof(*bch->ecc), &err);
bch->ecc2 = bch_alloc(bch->ecc_words * sizeof(*bch->ecc2), &err);
bch->buf = bch_alloc((t + 1) * sizeof(*bch->buf), &err);
bch->buf2 = bch_alloc((t + 1) * sizeof(*bch->buf2), &err);
bch->buf3 = bch_alloc(bch->len, &err);
bch->syn = bch_alloc(t * 2 * sizeof(*bch->syn), &err);
bch->elp = bch_alloc((t + 1) * sizeof(*bch->elp), &err);
bch->g = bch_alloc((bch->ecc_words + 1) * sizeof(*bch->g), &err);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This breaks https://docs.zephyrproject.org/latest/contribute/coding_guidelines/index.html, rule 14. Haven't this already been mentioned in the review for this part?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The heap is defined with K_HEAP_DEFINE, which is static allocation. Is this not allowed in Zephyr?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@de-nordic , Could you explain more about the dynamic memory allocation issue. I followed your suggestion to modify the memory allocation method to static allocation before. And I found this in drivers/flash/flash_cadence_nand_ll.c.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@de-nordic , Could you explain more about the dynamic memory allocation issue. I followed your suggestion to modify the memory allocation method to static allocation before. And I found this in drivers/flash/flash_cadence_nand_ll.c.

They should not be doing that either. But they potentially kick their users where it hurts, and here seems to be an universal NAND driver.

What you have here is not static. The buffer allocated for your local heap is static, but allocation here is dynamic, aside for literally calling alloc, in lines like

bch->buf3 = bch_alloc(bch->len, &err);

where size of allocated buffer is evaluated at run-time. This also means that overall size of bch object tree is only known at run-time, which makes it hard to evaluate CONFIG_SPI_NAND_BCH_HEAP_SIZE and probably nobody would be able to give you the right value to serve their system, as doing the math requires literally looking at the alloc sequence and calculating the worst case scenario for it.

This also means you are risking situations where order of bch object allocations, because they may be variable in size, can make a difference between success and a failure.
Failure, from lack of mem, is a problem in embedded system that is hard to address; yes, I know that there is nothing you can do when NAND fails, SPI fails, hardware starts to behave wobbly, but with RAM it is configuration or design issue, that will be artificial risk added on top. Both hardware and lack of RAM errors can lead to device replacement, the former just happens the later should be avoided.

It will be also hard to estimate correct size of HEAP here, so probably it will be too much, just in case - which is what?

I am not even sure I want to kick at this, because I want the driver. I just know that one of the first questions I will be asked is how to calculate the memory size for it properly, because people want to use just enough RAM for their case and stop getting at random read/write errors.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we just take ecc_bits from the device tree instead of the ONFI table, then everything can be calculated statically (other parameters like page size already come from device tree).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arnout , you are right. The ecc_bits is only runtime factor that affects the heap size. And it can be specified in devicetree. Then what's your suggestion? @de-nordic

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it would be perfect if we could calculate all the things at compile time, by looking at devices we want to serve. Generally I would prefer to get close, as much as possible, to just allowing user to select how many bch objects can be used in parallel and do the math for them, for the worst scenario in the system - biggest "ecc" requirements for a single object of the "worst" device.
note also that the bch support may end up being used not only by your code here. In the future I would preferably moved it out to ECC subsystem.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not actually possible to do parallel writes (or reads) anyway, so one ECC object per spinand device should be OK, right? You'd need to define a macro to instantiate that object based on DTS parameters, so it's a little bit finnicky.

note also that the bch support may end up being used not only by your code here.

Are you thinking of other (non-SPI) NAND, like parallel NAND? Or something else entirely? It is my impression that NAND is on its way out (replaced by eMMC used in SPI mode). And parallel NAND has been declining for some time already.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use device tree to get as much compile time characteristics as possible and calculate the the heap size automatically, or at least calculate the biggest bch object and allow user to select number of such objects. You can also set minimal requirement for number of such objects in Kconfig specifically for your driver.

if (ptr == NULL) {
*err = 1;
}
return ptr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am quite sure there is memset missing here, for zeroing the reserved buffer. Otherwise sequence

void *ptr = bch_alloc(something, something); // success here
bch_free(ptr);

may end up in attempt to free some bogus pointers.

#include <zephyr/logging/log.h>

#ifndef CONFIG_SPI_NAND_BCH_HEAP_SIZE
#define CONFIG_SPI_NAND_BCH_HEAP_SIZE 51200
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No non-Kconfig local CONFIG_ definitions, use some alias for that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.