Skip to content

Conversation

@EmilioCBen
Copy link
Contributor

Fixes #87422
When users try to debug while flashing into ram,
JLink/Linkserver would fail to flash straight to ram. Each with their own issues. Both would reset after load which would flush the ram and point the PC to flash. Linkserver treated the ram as inaccessible, and the added memory region in board.cmake has the incorrect size for the region and is already included inside linkserver.

decsny
decsny previously requested changes Mar 20, 2025
@EmilioCBen EmilioCBen changed the title Fix for debugging when trying to flash to ram. Fix for debugging when trying to load to ram. Mar 20, 2025
if runner_cls.capabilities().reset and '--no-reset' not in final_argv:
if runner_cls.capabilities().reset and '--no-reset' not in final_argv and len(board_image_count):
if board_image_count is not None:
reset = True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nordicjm your PR #69748 made the initial change here with respect to board_image_count and the forced addition of reset. What we found is that the test here for not None is always true because board_image_count is instantiated but doesn't have any elements in it... so not None. This results in the runner always adding a reset to the gdb commands. I suspect, from your comment block above, that this wasn't your intention.

We didn't notice this change because in the normal debug situation, you wouldn't care if the gdb did reset, load, reset if the image was being put in flash. But if you are targeting an image to a RAM space, the reset, load, reset causes the bootrom to load the flash image preventing you from debugging the image you created for RAM loading/debugging.

In parallel to this PR being added, I just saw this other PR [#87138] (@VynDragon) that just merged to force the larger logic to NOT ignore the no-reset runner option. We could have done the same thing now that I see this but I'm still questioning the logic of "if board_image_count not None" when combined with the comment block about multiple images.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

board_image_count is init in the common function, it should not not ever be none. As for why I put the none check in, I do not remember

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so in this particular case, the len(board_image_count) was 0 and we needed to have this check to keep the rest of your logic from doing another reset.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nordicjm Do you have any more comments on this PR?

Copy link
Contributor

@nordicjm nordicjm Jun 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the change here doesn't make sense, it should never have a len of 0 so you need to look into what is causing that, this change is not acceptable though without some in depth analysis showing there is a bug and given this code is in use and works fine on multiple boards by multiple vendors... I'm not seeing it

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nordicjm it is working because likely most platforms have no problem with back to back resets from the debugger.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can disable reset so that's not an issue

@decsny decsny dismissed their stale review March 31, 2025 22:02

removing block because lost track of this PR

@EmilioCBen EmilioCBen force-pushed the fix/mimxrt1170_flash_ram branch from cd5a1a8 to edae439 Compare April 1, 2025 14:18
@EmilioCBen EmilioCBen force-pushed the fix/mimxrt1170_flash_ram branch from edae439 to add46b5 Compare April 25, 2025 16:21
@pdgendt pdgendt assigned dleach02 and unassigned pdgendt Apr 29, 2025
@pdgendt
Copy link
Contributor

pdgendt commented Apr 29, 2025

Re-assigned as I can't comment on the changes.

@EmilioCBen
Copy link
Contributor Author

EmilioCBen commented May 28, 2025

After some time has passed, I noticed some context has been lost on this PR allow me to give a quick summary.
I originally was looking at debugging the 1170 when the app was loading from e.g. in the dts

/ {
	chosen {
		zephyr,flash = &itcm;
		zephyr,sram = &dtcm;
	};
};

This did not seem to work for Linkserver but I noticed it also did not work for jlink, this PR solves both issues.

@EmilioCBen EmilioCBen force-pushed the fix/mimxrt1170_flash_ram branch 2 times, most recently from dea5d4f to 440b5fa Compare May 29, 2025 15:28
Copy link
Member

@decsny decsny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am having a hard time understanding the commit message, and this PR looks like it is fixing 3 separate things, should it be 3 separate commits ?

@EmilioCBen
Copy link
Contributor Author

I am having a hard time understanding the commit message, and this PR looks like it is fixing 3 separate things, should it be 3 separate commits ?

You make a good point, I certainly can clean up the commit message and I will test to see if nothing breaks when I split these up, if not I can separate them into different commits.

@EmilioCBen EmilioCBen force-pushed the fix/mimxrt1170_flash_ram branch from 440b5fa to ab87489 Compare June 2, 2025 16:02
# If the flash node points to ram, linserver treats
# the ram as inaccessible and does not flash.
gdb_cmd += ['-ex', 'set mem inaccessible-by-default off']
gdb_cmd += ['-ex', 'monitor reset', '-ex', 'load']
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is the order of load, reset being reversed to reset, load?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are going to reset, we would want it to be before a load.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I do see a typo I will fix in the next push if I don't get further comments

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are going to reset, we would want it to be before a load.

Yes, I see that is what you are doing, you are restating what it is that I said that am observing, my question was why. You should explain in the commit message also why you are reversing this order. I don't know why.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated the commit message, let me know if that helps!

@EmilioCBen EmilioCBen force-pushed the fix/mimxrt1170_flash_ram branch 2 times, most recently from 72b66e7 to b983d94 Compare June 4, 2025 14:46
decsny
decsny previously approved these changes Jun 5, 2025
# reset or not. If this is not specified in the board/soc file, leave it up to
# the runner's default configuration to decide if a reset should occur.
if runner_cls.capabilities().reset and '--no-reset' not in final_argv:
if runner_cls.capabilities().reset and '--no-reset' not in final_argv and len(board_image_count):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code change is unexplained and does not make sense so requires a valid justification for why this would ever be none, there is a --no-reset argument for west flash already that can be used

Copy link
Contributor Author

@EmilioCBen EmilioCBen Jun 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be completely fair this PR has nothing to do with me justifying that this value should be zero, I am only adding a check to ensure that this value is NOT zero. The fact is right now the value literally is zero during a flash and probably is for multiple platforms. The check only stops it from running the code intended for multi-image flashing. It's probably never been caught before since the mutli image code probably can handle a single image too. The thing that breaks right now and that is making this apparent is the additional reset that is added when going down this path for the "second" image.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

@EmilioCBen EmilioCBen Jun 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for pointing this out. I was not aware of the configuration pointed out by this documentation, but maybe I am not being clear enough on the issue as the documentation seems to be calling out a workaround for sysbuild. So explicitly just before this function now I have added a

board_count

just before this if statement occurs in the python.

I built then flash hello world using nrf5340dk/nrf5340/cpuapp and python hit the breakpoint and
a print out shows board_image_count == 0

board_count_2

Is this not an issue? I only bring up the nrf5340dk to show that I see this with multiple platforms.
I would not be surprised if this is happening to all the platforms in zephyr? But please let me know if I am
misunderstanding or missing some context?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the issue? You are flashing with one image and without --no-reset and the runner supports reset, so it is doing a reset which is exactly what people would expect it to do. Can see the folow here:

diff --git a/scripts/west_commands/run_common.py b/scripts/west_commands/run_common.py
index b13ca502eb6..0022691a395 100644
--- a/scripts/west_commands/run_common.py
+++ b/scripts/west_commands/run_common.py
@@ -421,8 +421,13 @@ def do_run_common_image(command, user_args, user_runner_args, used_cmds,
     # the board has enabled this functionality, check if the board should be
     # reset or not. If this is not specified in the board/soc file, leave it up to
     # the runner's default configuration to decide if a reset should occur.
+    print("aa")
+    print(len(board_image_count))
+    print(board_image_count)
     if runner_cls.capabilities().reset and '--no-reset' not in final_argv:
+        print("bb")
         if board_image_count is not None:
+            print("cc")
             reset = True
 
             for cmd in used_cmds:

which for a sysbuild project gives:

-- west flash: using runner nrfutil
aa
2
defaultdict(<class 'run_common.ImagesFlashed'>, {'nrf5340dk/nrf5340/cpunet': ImagesFlashed(flashed=1, total=1), 'nrf5340dk/nrf5340/cpuapp': ImagesFlashed(flashed=0, total=1)})
bb
cc
-- runners.nrfutil: reset after flashing requested
Using board 001050097030
-- runners.nrfutil: Flashing file: /tmp/bb/zephyr/samples/subsys/mgmt/mcumgr/smp_svr/_AA/merged_CPUNET.hex
-- runners.nrfutil: Erasing address ranges touched by firmware
-- runners.nrfutil: Programming image
-- runners.nrfutil: Verifying image
-- runners.nrfutil: Reset
-- runners.nrfutil: Board(s) with serial number(s) 1050097030 flashed successfully.
-- west flash: using runner nrfutil
aa
2
defaultdict(<class 'run_common.ImagesFlashed'>, {'nrf5340dk/nrf5340/cpunet': ImagesFlashed(flashed=1, total=1), 'nrf5340dk/nrf5340/cpuapp': ImagesFlashed(flashed=1, total=1)})
bb
cc
-- runners.nrfutil: reset after flashing requested
-- runners.nrfutil: Flashing file: /tmp/bb/zephyr/samples/subsys/mgmt/mcumgr/smp_svr/_AA/merged.hex
-- runners.nrfutil: Erasing address ranges touched by firmware
-- runners.nrfutil: Programming image
-- runners.nrfutil: Verifying image
-- runners.nrfutil: Reset
-- runners.nrfutil: Board(s) with serial number(s) 1050097030 flashed successfully.

and for a single image for nrf5340dk//cpunet gives:

-- west flash: rebuilding
ninja: no work to do.
-- west flash: using runner nrfutil
aa
0
defaultdict(<class 'run_common.ImagesFlashed'>, {})
bb
cc
-- runners.nrfutil: reset after flashing requested
Using board 001050097030
-- runners.nrfutil: Flashing file: /tmp/bb/zephyr/samples/hello_world/_AA/merged.hex
-- runners.nrfutil: Erasing address ranges touched by firmware
-- runners.nrfutil: Programming image
-- runners.nrfutil: Verifying image
-- runners.nrfutil: Reset
-- runners.nrfutil: Board(s) with serial number(s) 1050097030 flashed successfully.

and for the same as before but with --no-reset gives:

-- west flash: rebuilding
ninja: no work to do.
-- west flash: using runner nrfutil
aa
0
defaultdict(<class 'run_common.ImagesFlashed'>, {})
Using board 001050097030
-- runners.nrfutil: Flashing file: /tmp/bb/zephyr/samples/hello_world/_AA/merged.hex
-- runners.nrfutil: Erasing address ranges touched by firmware
-- runners.nrfutil: Programming image
-- runners.nrfutil: Verifying image
-- runners.nrfutil: Board(s) with serial number(s) 1050097030 flashed successfully.

so it works fine

Copy link
Contributor Author

@EmilioCBen EmilioCBen Jun 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for confirming this on your side, so it seems board_image_count can be 0. This was confusing to me because here I read "it should never be zero". #87423 (comment)
So I started to worry that this might be an intended issue.

I have just tested with the specific case of adding '--no-reset' and this does seem to work for me. But this PR is specific to the case that is dependent of setting zephyr,flash to a ram sector. As a result, the user will have to manually have to add --no-reset as I am not sure yet if I can make this automatic using the yml files.
But as you pointed out, this will not need to depend on the len(board_image_count) so I am going to drop this change.
Thank you for working with me to try and figure out if this check is needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made an update to add if(NOT SYSBUILD) in the board.cmake instead, gives the same effect I was trying to accomplish.

Comment on lines 34 to 35
board_runner_args(jlink "--no-reset")
board_runner_args(linkserver "--no-reset")
Copy link
Contributor

@nordicjm nordicjm Jun 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 space indent for cmake, is also double space here between e.g. jlink and "

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Double check on this, you did mean for me to put two spaces to indent these lines correct? The rest of the file is using a single space but due to the suggestion I used double space here, I may need to update the rest of the file in the future.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 space indent for cmake e.g.:

if(BLAH)
  set()

1 space delimiter for cmake e.g.:

set(my_var "my_val")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, my recent update fixed this, but I may need to update the rest of the file in a different PR.

@nordicjm nordicjm dismissed their stale review June 26, 2025 08:18

blocking issue addressed

Updated board.cmake to check for single
image builds to indicate rather an additional
reset is needed for cases such as sysbuild.

Signed-off-by: Emilio Benavente <[email protected]>
Remove cmake code that updated the linkserver memory view, however
the values passed here are wrong and linkserver already includes
the correct space e.g location = 0x0 && size = 0x00080000

Signed-off-by: Emilio Benavente <[email protected]>
Update gdb arguments to enable linkserver
debugging when app does not reset after load.
This is useful for when the app is loading to
ram. In this case if gdb reset after loaded the
ram would be flushed when debugging and there would be
nothing no image in ram to debug.

Signed-off-by: Emilio Benavente <[email protected]>
@EmilioCBen EmilioCBen force-pushed the fix/mimxrt1170_flash_ram branch from 317b0cf to 9dc3b5d Compare June 27, 2025 18:10
@sonarqubecloud
Copy link

Copy link
Contributor

@hakehuang hakehuang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in my mimxrt1170_evk board rev A, west flash does not, but west debug works

((.venv) ) ubuntu@ubuntu-OptiPlex-7050:/home/shared/disk/zephyr_project/zephyr_test/zephyr/tests/drivers/flash/common$ LinkServer --version
LinkServer v25.3.31 [Build 31] [2025-03-26 14:12:20]

what I change is

diff --git a/boards/nxp/mimxrt1170_evk/mimxrt1170_evk_mimxrt1176_cm7.dts b/boards/nxp/mimxrt1170_evk/mimxrt1170_evk_mimxrt1176_cm7.dts
index f6216018957..71ef4f19443 100644
--- a/boards/nxp/mimxrt1170_evk/mimxrt1170_evk_mimxrt1176_cm7.dts
+++ b/boards/nxp/mimxrt1170_evk/mimxrt1170_evk_mimxrt1176_cm7.dts
@@ -26,7 +26,7 @@
                zephyr,shell-uart = &lpuart1;
                zephyr,canbus = &flexcan3;
                zephyr,flash-controller = &is25wp128;
-               zephyr,flash = &is25wp128;
+               zephyr,flash = &itcm;
                zephyr,code-partition = &slot0_partition;
                zephyr,uart-mcumgr = &lpuart1;
                zephyr,cpu1-region = &ocram;

run with

west flash --runner linserver --probe=<linkserver id>
Nc: Debug protocol: SWD. RTCK: Disabled. Vector catch: Disabled.
Ns: Content of CoreSight Debug ROM(s):
Nc: RBASE E00FD000: CID B105100D PID 000008E88C ROM (type 0x1)
Nc: ROM 1 E00FE000: CID B105100D PID 04000BB4C8 ROM (type 0x1)
Nc: ROM 2 E00FF000: CID B105100D PID 04000BB4C7 ROM (type 0x1)
Nc: ROM 3 E000E000: CID B105E00D PID 04000BB00C Gen SCS (type 0x0)
Nc: ROM 3 E0001000: CID B105E00D PID 04000BB002 Gen DWT (type 0x0)
Nc: ROM 3 E0002000: CID B105E00D PID 04000BB00E Gen (type 0x0)
Nc: ROM 3 E0000000: CID B105E00D PID 04000BB001 Gen ITM (type 0x0)
Nc: ROM 2 E0041000: CID B105900D PID 04001BB975 CSt ARM ETMv4.0 type 0x13 Trace Source - Core
Nc: ROM 2 E0042000: CID B105900D PID 04004BB906 CSt type 0x14 Debug Control - Trigger, e.g. ECT
Nc: ROM 1 E0043000: CID B105900D PID 04001BB908 CSt CSTF type 0x12 Trace Link - Trace funnel/router
Nc: NXP: MIMXRT1176xxxxx
Nc: DAP stride is 1024 bytes (256 words)
Nc: Inspected v.2 External Flash Device on SPI using SFDP JEDEC ID MIMXRT1170_SFDP_QSPI.cfx
Nc: Image 'iMXRT1170_SFDP_FlexSPI1_A_QSPI Mar 25 2025 17:33:21'
Nc: Opening flash driver MIMXRT1170_SFDP_QSPI.cfx
Nc: Sending VECTRESET to run flash driver
Nc: Flash variant 'iMXRT1170_SFDP_FlexSPI1_A_QSPI Mar 25 2025 17:33:21' detected (16MB = 256*64K at 0x30000000)
Nc: Closing flash driver MIMXRT1170_SFDP_QSPI.cfx
Pc: ( 65) Chip Setup Complete
Pc: ( 70) License Check Complete
Nt: Loading 'zephyr.bin' Binary 0x00000000 len 0x109EC
Ed:05: File 'zephyr.bin' load failure: Ef(11). No flash configured at the specified address.
Pc: (100) Target Operation Failed
CRITICAL: Critical error
ERRMSG: Exception: Flash operation exited with code 1
FATAL ERROR: command exited with status 255: /usr/local/LinkServer/LinkServer flash --probe JTAQCQAR MIMXRT1176xxxxx:MIMXRT1170-EVK load --addr 0 /home/shared/disk/zephyr_project/zephyr_test/zephyr/tests/drivers/flash/common/build/zephyr/zephyr.bin
=== Traceback (enabled by -vvv):
Traceback (most recent call last):
  File "/home/shared/disk/zephyr_project/.venv/lib/python3.12/site-packages/west/app/main.py", line 584, in run_command
    self.run_extension(args.command, argv)
  File "/home/shared/disk/zephyr_project/.venv/lib/python3.12/site-packages/west/app/main.py", line 739, in run_extension
    self.cmd.run(args, unknown, self.topdir, manifest=self.manifest,
  File "/home/shared/disk/zephyr_project/.venv/lib/python3.12/site-packages/west/commands.py", line 200, in run
    self.do_run(args, unknown)
  File "/home/shared/disk/zephyr_project/zephyr_test/zephyr/scripts/west_commands/flash.py", line 32, in do_run
    do_run_common(self, my_args, runner_args, domain_file=domains_file)
  File "/home/shared/disk/zephyr_project/zephyr_test/zephyr/scripts/west_commands/run_common.py", line 349, in do_run_common
    prev_runner = do_run_common_image(command, user_args, user_runner_args, used_cmds,
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shared/disk/zephyr_project/zephyr_test/zephyr/scripts/west_commands/run_common.py", line 524, in do_run_common_image
    runner.run(command_name)
  File "/home/shared/disk/zephyr_project/zephyr_test/zephyr/scripts/west_commands/runners/core.py", line 743, in run
    self.do_run(command, **kwargs)
  File "/home/shared/disk/zephyr_project/zephyr_test/zephyr/scripts/west_commands/runners/linkserver.py", line 125, in do_run
    self.flash(**kwargs)
  File "/home/shared/disk/zephyr_project/zephyr_test/zephyr/scripts/west_commands/runners/linkserver.py", line 227, in flash
    self.check_call(linkserver_cmd, **kwargs)
  File "/home/shared/disk/zephyr_project/zephyr_test/zephyr/scripts/west_commands/runners/core.py", line 890, in check_call
    subprocess.check_call(cmd, **kwargs)
  File "/usr/lib/python3.12/subprocess.py", line 413, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/usr/local/LinkServer/LinkServer', 'flash', '--probe', 'JTAQCQAR', 'MIMXRT1176xxxxx:MIMXRT1170-EVK', 'load', '--addr', '0', '/home/shared/disk/zephyr_project/zephyr_test/zephyr/tests/drivers/flash/common/build/zephyr/zephyr.bin']' returned non-zero exit status 255.

but if I using

west debug --runner linkserver --probe=<linkserver id>

it is OK

@EmilioCBen
Copy link
Contributor Author

@hakehuang previous comment raises a good point.

This behavior will happen in main too when changing the flash node
to itcm. This probably should be the expected behavior since the ram will receive
the image and not the flash but I don't want to rule out any alternatives with
this comment alone.

@dkalowsk
Copy link
Contributor

dkalowsk commented Jul 3, 2025

@nordicjm looking for your input on this for 4.2 inclusion

@nordicjm
Copy link
Contributor

nordicjm commented Jul 4, 2025

@nordicjm looking for your input on this for 4.2 inclusion

Is fixing a problem with a vendor's flash programming so could go in

@dkalowsk dkalowsk added this to the v4.2.0 milestone Jul 5, 2025
@dkalowsk dkalowsk merged commit 1142c40 into zephyrproject-rtos:main Jul 5, 2025
41 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

RT1170 fails to load program when debugging from ram.

8 participants