35 changes: 33 additions & 2 deletions src/machine/machine_rp2_cores.go
@@ -4,15 +4,46 @@ package machine

const numCPU = 2 // RP2040 and RP2350 both have 2 cores

// LockCore implementation for the cores scheduler.
// LockCore sets the affinity for the current goroutine to the specified core.
// This does not immediately migrate the goroutine; migration occurs at the next
// scheduling point. See machine_rp2.go for full documentation.
// Important: LockCore sets the affinity but does not immediately migrate the
// goroutine to the target core. The actual migration happens at the next
// scheduling point (e.g., channel operation, time.Sleep, or Gosched). After
// that point, the goroutine will wait in the target core's queue if that core
// is busy running another goroutine.
//
// To avoid potential blocking on a busy core, consider calling LockCore in an
// init function before any other goroutines have started. This guarantees the
// target core is available.
Comment on lines +11 to +13
Contributor

I think this should be a hard requirement; that is, LockCore should panic if any other goroutine has started.

Author

Regarding "panic if goroutines started" - I have a specific use case for dynamic pinning:

Motion control board where:

  • Core 0: Communications and non-critical tasks
  • Core 1: Hard real-time step generation (must not be interrupted)

The pattern I need is:

func main() {
	go func() {
		machine.LockCore(1) // Pin worker to core 1
		stepGenerationLoop()
	}()
	machine.LockCore(0) // Pin main to core 0
	commsLoop()
}
If LockCore panics when goroutines have started, this pattern wouldn't work.

Would you accept one of these:

  • Adding Gosched() in LockCore (so it actually migrates before returning)
  • Keeping dynamic pinning allowed, with clear documentation of risks
  • Perhaps a convention that each core should only have ONE pinned goroutine?

The deadlock risk is manageable if users follow the pattern of pinning early and using one goroutine per core for pinned work.
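As an aside, a minimal sketch of what option 1 would mean from the caller's side today: yield explicitly right after LockCore so the migration happens before the hot loop starts. machine.LockCore is the function added in this PR; stepGenerationLoop and commsLoop are placeholder stand-ins for the loops in the pattern above.

package main

import (
	"machine"
	"runtime"
)

func stepGenerationLoop() {
	for {
		// tight timing work; stays on core 1 once the goroutine has migrated
	}
}

func commsLoop() {
	for {
		// non-critical work; runs on core 0
		runtime.Gosched()
	}
}

func main() {
	go func() {
		machine.LockCore(1) // set affinity to core 1
		runtime.Gosched()   // yield so the migration happens before the timing-critical loop
		stepGenerationLoop()
	}()
	machine.LockCore(0) // pin main to core 0
	runtime.Gosched()
	commsLoop()
}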

Contributor

> Adding Gosched() in LockCore (so it actually migrates before returning)

Yes, if we do this LockCore should not return before the core is pinned.

However, given your use case: is it possible to run the step generation off a hardware timer and an interrupt handler? That seems a better fit, and you can keep both cores running non-critical code.

Author

No, that is precisely the implementation that I am trying to get away from.

Contributor

Can you elaborate why? Assuming your stepGenerationLoop sometimes sleeps, it seems much less efficient to lock a core 100% of the time, even during sleeps, than to run an interrupt off a timer.

Author

The interrupt-based approach has several issues for precision step generation:

  1. Interrupt Latency & Jitter. Interrupts can be delayed by:
    • Other higher-priority interrupts
    • Critical sections in the runtime (GC, scheduler locks)
    • interrupt.Disable() calls in drivers
    For step generation, even microseconds of jitter causes visible artifacts (rough surfaces, inaccurate positioning). A dedicated core can be made to have deterministic timing.

  2. My step generation loop doesn't call time.Sleep(); it does tight timing loops or busy-waits on hardware timers for sub-microsecond precision:

func stepGenerationLoop() {
	for {
		waitForNextStepTime() // Busy-wait or timer poll
		pulseStepPins()       // Must happen at EXACT time
		calculateNextStep()   // Acceleration/deceleration curves
	}
}

The core isn't idle - it's maintaining precise timing. An interrupt would add latency between "timer fired" and "step pin toggled."

  3. Interrupt Context Limitations. In an interrupt handler, I can't:
    • Use blocking operations
    • Allocate memory (GC issues)
    • Have complex state machines easily
    • Use goroutine synchronization primitives
    With a dedicated core, I can write normal Go code with loops, state, and even coordinate multiple axes cleanly.

  4. The Alternative is Worse. Without core pinning, my options are:
    • Interrupt-based (jitter issues above)
    • PIO-only (limited for complex multi-axis coordination)
    • External step generator chip (added cost/complexity)
    Core pinning gives me a way to isolate real-time work.

Contributor
@eliasnaur Nov 23, 2025

I see. I believe I'm doing something similar on a PIO-capable chip (rp2350). My solution is a 3-layer architecture:

  • Regular Go code generates high-level primitives (line segments, bezier curves etc.) and sends them over a channel. This code is only timing sensitive to the extent that it needs to run often enough to not starve the interrupt handler. Depending on your application, you can get rid of the timing requirement by generating batches of primitives where each batch ends with the machine at a safe point (zero velocity/acceleration etc.).
  • Interrupt handler receives from the channel and chops the primitives into fixed-duration updates and packs them into DMA buffers. For example, 2 steppers is 4 bits (2xdir, 2xstep) per update, and 8 updates per 32-bit PIO FIFO word (a packing sketch follows after this list).
    The interrupt handler is only timing sensitive insofar as it must run often enough to fill the DMA buffers. If you don't have too much going on, you may not even need DMA and can get away with feeding the PIO FIFO buffer (8 words I believe) directly.
  • A PIO state machine receives updates from its FIFO (fed directly or through DMA) and multiplexes them onto the corresponding GPIO pins. The nature of PIOs makes the timing very robust and decoupled from the activity of the system, except for contention on the memory bus.
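Purely as an illustration of the packing step described in the second bullet, here is a small sketch of how eight fixed-duration updates for two steppers could be packed into one 32-bit FIFO word. The update type, field names, and nibble layout are assumptions made for this sketch, not from the PR; the real layout has to match whatever the PIO program expects.

package main

import "fmt"

// One fixed-duration update for a 2-axis machine: a step and a dir bit per axis.
type update struct {
	stepX, dirX bool
	stepY, dirY bool
}

// packWord packs 8 consecutive updates into one 32-bit PIO FIFO word,
// 4 bits per update (stepX, dirX, stepY, dirY), earliest update in the low nibble.
func packWord(u [8]update) uint32 {
	var w uint32
	for i, s := range u {
		var nibble uint32
		if s.stepX {
			nibble |= 1 << 0
		}
		if s.dirX {
			nibble |= 1 << 1
		}
		if s.stepY {
			nibble |= 1 << 2
		}
		if s.dirY {
			nibble |= 1 << 3
		}
		w |= nibble << (4 * uint(i))
	}
	return w
}

func main() {
	var batch [8]update
	batch[0] = update{stepX: true, dirX: true} // step X forward in the first slot
	fmt.Printf("0x%08x\n", packWord(batch))    // prints 0x00000003
	// In the real system this word would go to the PIO TX FIFO,
	// either written directly or staged in a DMA buffer.
}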

Author

There are always multiple ways to skin a cat. I am using an approach similar to what you described as well, but core pinning has its own advantages. Also, it seems like several people have asked for it previously. I appreciate your reviews and responses.

//
// This is useful for:
// - Isolating time-critical operations to a dedicated core
// - Improving cache locality for performance-sensitive code
// - Exclusive access to core-local resources
//
// Warning: Pinning goroutines can lead to load imbalance. The goroutine will
// wait in the specified core's queue even if other cores are idle. If a
// long-running goroutine occupies the target core, LockCore may appear to
// block indefinitely (until the next scheduling point on the target core).
//
// Valid core values are 0 and 1. Panics if core is out of range.
//
// Only available on RP2040 and RP2350 with the "cores" scheduler.
func LockCore(core int) {
	if core < 0 || core >= numCPU {
		panic("machine: core out of range")
	}
	machineLockCore(core)
}

// UnlockCore implementation for the cores scheduler.
// UnlockCore unpins the calling goroutine, allowing it to run on any available core.
// This undoes a previous call to LockCore.
//
// After calling UnlockCore, the scheduler is free to schedule the goroutine on
// any core for automatic load balancing.
//
// Only available on RP2040 and RP2350 with the "cores" scheduler.
Contributor

Superfluous comment.

func UnlockCore() {
	machineUnlockCore()
}
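As a usage note, a minimal sketch of the init-based pinning the LockCore doc comment above recommends, assuming the cores scheduler on RP2040/RP2350. Pinning in init runs before any other goroutine exists, so the target core is guaranteed to be free.

package main

import "machine"

// init runs before main and before any user goroutine is started, so pinning
// here cannot race with another goroutine already occupying a core.
func init() {
	machine.LockCore(0) // keep the main goroutine on core 0
}

func main() {
	// Core 1 is still free at this point and can be claimed by a worker,
	// e.g. with machine.LockCore(1) as the first thing that worker does.
}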
5 changes: 5 additions & 0 deletions src/runtime/runtime.go
@@ -100,6 +100,9 @@ func os_sigpipe() {
// LockOSThread wires the calling goroutine to its current operating system thread.
// On microcontrollers with multiple cores (e.g., RP2040/RP2350), this pins the
// goroutine to the core it's currently running on.
// With the "cores" scheduler on RP2040/RP2350, this pins the goroutine to the
// core it's currently running on. The pinning takes effect at the next
// scheduling point (e.g., channel operation, time.Sleep, or Gosched).
// Called by go1.18 standard library on windows, see https://github.com/golang/go/issues/49320
Contributor

While here, remove this now irrelevant comment.

func LockOSThread() {
	lockOSThreadImpl()
@@ -108,6 +111,8 @@ func LockOSThread() {
// UnlockOSThread undoes an earlier call to LockOSThread.
// On microcontrollers with multiple cores, this unpins the goroutine, allowing
// it to run on any available core.
// With the "cores" scheduler, this unpins the goroutine, allowing it to run on
// any available core.
func UnlockOSThread() {
	unlockOSThreadImpl()
}
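For completeness, a sketch of how the portable LockOSThread/UnlockOSThread pair maps onto core pinning under the cores scheduler, per the doc comments above; the bracket pattern itself is ordinary Go.

package main

import "runtime"

// With the cores scheduler, LockOSThread pins the goroutine to the core it is
// currently running on; UnlockOSThread releases it for load balancing again.
func timingSensitive() {
	runtime.LockOSThread()
	defer runtime.UnlockOSThread()
	// work that should stay on this core while the pin is held
}

func main() {
	timingSensitive()
}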