Skip to content

Smooth Linear Advance (Testing)#204

Closed
classicrocker883 wants to merge 46 commits into2025-Junefrom
2025-April-LA-Test
Closed

Smooth Linear Advance (Testing)#204
classicrocker883 wants to merge 46 commits into2025-Junefrom
2025-April-LA-Test

Conversation

@classicrocker883
Copy link
Copy Markdown
Owner

Description

Testing MarlinFirmware/Marlin#27710

Requirements

Benefits

Configurations

Related Issues

dbuezas and others added 26 commits February 24, 2025 19:15
(cherry picked from commit 29720dce9cff7a8990bb7d7b0c72776ee79ee18a)
(cherry picked from commit 28300f8cdff96399c5836cacb74c964da45fc453)
(cherry picked from commit 0f10d91b73d30c7f5fec71364ef0708aab896421)
(cherry picked from commit 2d74be5860748f5c72f2d71e45c5c1cfd015a96e)
(cherry picked from commit 4115faf52216c832e4562d1fc68d1ea524507a64)
(cherry picked from commit adb6c0087dc3bb210fcaa3fd58c3f5edd01e8ba1)
(cherry picked from commit 3722520134abb5a2a061027b415569847d192f47)
(cherry picked from commit 17ec816c0284f55c6fbc16cddcc16c49fa19a8d9)
(cherry picked from commit 3b5a4a6baffa31e1f92d7ed93b7911bc8a15f1e4)
(cherry picked from commit 68624a514cd9f19034f3652948638197884ddaaa)
(cherry picked from commit ff053d207b78e4fd6ea0e20cfb151aeb6df71a50)
(cherry picked from commit c4e7fd0bc7aa6e6fda3c7344061a08bf4f67390c)
(cherry picked from commit b8de47c618367580377425f8a8a88e6fd9873b1f)
(cherry picked from commit 9b5998d3dd34165c3d4962381cc76e4726d13389)
(cherry picked from commit 5c57e7033649b2fba441d252226fcd1ba577ab79)
(cherry picked from commit e25c0198ca681164e29526b31c8b5d14fc4de695)
(cherry picked from commit b41e8634b4d53ea3a9ba894b7f8fd1252bac6797)
This may improve high speed seams a small bit but it is very dependant on machine properties and can result in underextrusion too

(cherry picked from commit adc7ff273dd2b33a6364d26bd949e0f5a130e203)
(cherry picked from commit 2fb6c6b28cef48a0414f1479564f01cdee04d0d6)
@dbuezas
Copy link
Copy Markdown

dbuezas commented Apr 9, 2025

Things to consider when testing:

  • If extruder skips, increase tau.
  • Extruding moves will now use the full max acceleration & jerk, which will now not be limited by linear advance. If it was set too high or too low before you wouldn't have noticed it. Now you will
  • Same applies to JD
  • Best k value will likely be a bit different, particularly for high accelerations and speeds, so make sure to recalibrate.
  • The biggest time savings are best done in inner walls and infills, not on outer walls (good quality/speed balance by keeping accels & jerks lower for outer walls). Obvious but it was less relevant before.
  • Check your stepper driver temperatures when using high accels.
  • S-Curve may or may not work correctly. This algorithm can be more easily updated to be more tightly integrated with S-Curve in the future for optimal results.

Looking forward to hear your reaction to it

@classicrocker883
Copy link
Copy Markdown
Owner Author

I see there is an option to sync with Input Shaping, but INPUT_SHAPING_* doesnt need to be enabled right?

and I wouldn't know what to look for whether S curve is enabled or not. another thing is with the Aquila mainboard, OLD_ADAPTIVE_MULTISTEPPING is enabled following a PR (a while ago) otherwise it would make noises with the motors. maybe this can be disabled now. I upgraded my X/Y motors to 42-40, which run cool, so I can max things out.

I'm not sure what differences this would make but do you have any ideas of like what model to test print or what would be a suggested starting value, and what to look out for as to changing them?

@dbuezas
Copy link
Copy Markdown

dbuezas commented Apr 9, 2025

synching extrusion to input shaping is not required but if you use input shaping it does improve quality. This is because input shaping changes the trapezoid or s curve profile, and if extrusion is not synchronized, then the extrusion happens thinking the profile was unchanged (will over and under extrude in different parts of the profile)

This is also the case before smooth advance btw, but since it limits acceleration, the problem is less "exposed"

if (delayBuffer.index == IS_COMPENSATION_BUFFER_SIZE) {
delayBuffer.index = 0;
}
delayBuffer.index = (delayBuffer.index + 1) % IS_COMPENSATION_BUFFER_SIZE; // Wrap around
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

watch out, % is expensive

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, if it needs to wrap around twice, the buffer is too small and taking that index will take the wrong value

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

meaning uses more flash/ram?

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

appears to only add 8 bytes.
I compiled using % and without by reverting void add_to_buffer() and xy_float_t lookback()

here it is using the updated with %:

RAM:   [====      ]  37.9% (used 24812 bytes from 65536 bytes)
Flash: [====      ]  42.6% (used 223488 bytes from 524288 bytes)

here it is without (originally):

RAM:   [====      ]  37.9% (used 24812 bytes from 65536 bytes)
Flash: [====      ]  42.6% (used 223480 bytes from 524288 bytes)

are there any improvements you suggest?

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about changing to:

- delayBuffer.index = (delayBuffer.index + 1) % IS_COMPENSATION_BUFFER_SIZE; // Wrap around
+ delayBuffer.index = (delayBuffer.index + 1) & (IS_COMPENSATION_BUFFER_SIZE - 1); // Wrap around (optimized for power-of-2 sizes)

This approach is faster because bitwise operations are generally more efficient than division. However, it only works if IS_COMPENSATION_BUFFER_SIZE is a power of two

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The index increments by one, so the original approach is enough imo.
The buffer size depends on the minimum input shaping frequency, which actually also limits input shaping min frequency itself, so the comment about informing the user is not really necessary (if that were to happen, input shaping itself would be failing)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modulo takes more cpu cycles, it is as costly as division. Since the index is always incremented by 1, the cheap original solution is correct (and the fastest)

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok I updated this, it should be like the original, and cleaned up.

@classicrocker883
Copy link
Copy Markdown
Owner Author

classicrocker883 commented Apr 10, 2025

I made some updates that might improve some things. go ahead if you'd like to look through and copy over to your Marlin PR.

so some things like spacing and indentation needed fixing.
Made so it is more concise and efficient, leveraging modulo operations for circular buffer management.


1. add_to_buffer:

delayBuffer.buffer[delayBuffer.index] = input;
delayBuffer.index = (delayBuffer.index + 1) % IS_COMPENSATION_BUFFER_SIZE;

2. lookback:

uint16_t past_i = (delayBuffer.index + IS_COMPENSATION_BUFFER_SIZE - delay_steps) % IS_COMPENSATION_BUFFER_SIZE;
  • Uses modulo (%) to calculate the wrapped-around index (past_i) directly.
  • Uses modulo (%) to wrap the index around when it reaches the buffer size.
    This is a concise and efficient way to handle circular buffers.

3. Error Handling for Buffer Overflow:

  • Updated:

    • Assumes the buffer size is sufficient and does not explicitly handle overflow beyond a comment (TODO: How to inform user?).
  • Previously:

    • Explicitly resets past_i to delayBuffer.index if delay_steps exceeds the buffer size, ensuring no out-of-bounds access.

@classicrocker883
Copy link
Copy Markdown
Owner Author

Update

  • Reverted/removed % changes
  • Renamed t and past_i in stepper.cpp lookback() and add_to_buffer()
  • Added wrap_index() function to clean up the code.

@classicrocker883 classicrocker883 changed the base branch from 2025-April to 2025-June June 6, 2025 22:31
@classicrocker883 classicrocker883 deleted the 2025-April-LA-Test branch August 8, 2025 04:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants