Conversation

@WilkAndy (Contributor) commented Jan 7, 2026

Created some stack variables to replace heap variables in plasticity algorithms. A number of other heap variables should be put on the stack to close the issue; this is a first pass only. Refs #32178


@moosebuild (Contributor):

Job Precheck, step Clang format on 5fc246c wanted to post the following:

Your code requires style changes.

A patch was auto generated and copied here
You can apply the patch directly by running, from the top level of your repository:

```shell
curl -s https://mooseframework.inl.gov/docs/PRs/32180/clang_format/style.patch | git apply -v
```

Alternatively, with your repository up to date and from the top level of your repository:

```shell
git clang-format 8af10d8af7285032eae5d1e1be3ca2b69ea79b39
```

@moosebuild (Contributor) commented Jan 7, 2026

Job Documentation, step Docs: sync website on 81747b5 wanted to post the following:

View the site here

This comment will be updated on new commits.

@WilkAndy (Author) commented Jan 7, 2026

Hi @loganharbour. While at INL, we did some profiling and found that heap allocation/deallocation was taking about 20% of the compute time in plasticity models. This is my attempt at addressing part of that problem (by making class variables). Could you please take a look and comment? There are probably better ways of doing this. I'm worried about:

  • Are there any problems with threading and these mutable variables?
  • I have no idea how to change mastodon/blackbear at the same time (they fail to compile with this change).
  • There is at least one internal app that also fails to compile, because I've changed the signature of dstress_param_dstress.
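The change described above can be sketched as follows. This is a minimal illustration, not MOOSE's actual code: `PlasticModel`, `_dsp_dstress`, and the placeholder arithmetic are all hypothetical, and `std::vector<double>` stands in for MOOSE's tensor types. The point is the pattern: a buffer that used to be a heap-allocated local is pre-allocated once as a mutable member, so a const hot-path function can reuse it without a per-call allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch: a per-object, pre-allocated scratch buffer replaces a
// local std::vector that used to be heap-allocated on every call. The member
// is mutable so the hot function can stay const.
class PlasticModel
{
public:
  explicit PlasticModel(std::size_t n) : _dsp_dstress(n, 0.0) {}

  // const signals "does not change model state"; only scratch is written
  const std::vector<double> & dstress_param_dstress(double stress) const
  {
    for (std::size_t i = 0; i < _dsp_dstress.size(); ++i)
      _dsp_dstress[i] = stress * static_cast<double>(i); // placeholder math
    return _dsp_dstress;
  }

private:
  // Allocated once in the constructor, reused on every call.
  // NOTE: not thread-safe if one object is shared across threads.
  mutable std::vector<double> _dsp_dstress;
};
```

The trade-off is exactly the one discussed below: `mutable` preserves the const interface, at the cost of hidden shared state.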

@moosebuild (Contributor) commented Jan 7, 2026

Job Coverage, step Generate coverage on 81747b5 wanted to post the following:

Framework coverage

|        | 5eee4e (Total) | #32180 81747b (Total) | +/-    | New |
|--------|----------------|-----------------------|--------|-----|
| Rate   | 85.81%         | 85.81%                | -0.00% | -   |
| Hits   | 126281         | 126280                | -1     | 0   |
| Misses | 20876          | 20877                 | +1     | 0   |

Diff coverage report

Full coverage report

Modules coverage

Solid mechanics

|        | 5eee4e (Total) | #32180 81747b (Total) | +/-    | New     |
|--------|----------------|-----------------------|--------|---------|
| Rate   | 84.98%         | 84.98%                | +0.00% | 100.00% |
| Hits   | 28199          | 28220                 | +21    | 117     |
| Misses | 4983           | 4986                  | +3     | 0       |

Diff coverage report

Full coverage report

Full coverage reports

Reports

This comment will be updated on new commits.

@loganharbour (Member) left a comment

The mutable is mostly okay.

Do the functions that call it need to be const? If not, it's probably best to just make the functions non-const, so that it's clear that they change state. And note in the docstring that they change that shared state.

For mastodon - you can actually re-add the old methods and just have them call your new function (returning a copy of the tensor). We can merge this, fix mastodon, and then remove the old methods.
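The backward-compatibility shim suggested here might look like the following sketch. The class and the placeholder computation are hypothetical (again using `std::vector<double>` in place of a tensor type); it just shows the old return-by-value method delegating to the new out-parameter method, so external apps keep compiling until they are patched, after which the old method can be deleted.

```cpp
#include <cassert>
#include <vector>

// Hypothetical sketch of the compatibility shim: keep the old return-by-value
// method, implement it in terms of the new out-parameter method, and remove
// it once downstream apps (e.g. mastodon/blackbear) have been updated.
class Model
{
public:
  // New fast API: the caller supplies the output buffer, so no allocation
  // happens here when the buffer already has capacity.
  void dstress_param_dstress(double stress, std::vector<double> & out) const
  {
    out.assign(3, stress); // placeholder computation
  }

  // Old API, retained temporarily for backward compatibility; returns a
  // copy, so it still pays one allocation per call.
  std::vector<double> dstress_param_dstress(double stress) const
  {
    std::vector<double> out;
    dstress_param_dstress(stress, out);
    return out;
  }
};
```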

@WilkAndy force-pushed the no_heap_A_32178 branch 2 times, most recently from 3152c75 to 497d3aa on January 8, 2026 02:48
@WilkAndy (Author) commented Jan 8, 2026

Hi @loganharbour, thank you for your input so far. Can you have another look and tell me whether this approach is good? If so, I'll make a bunch of other similar changes. I suspect not only mastodon/blackbear will need modification - there might be closed-source apps that need small modifications too.

Perhaps mutable+const is actually better, as "const" tells the reader "this isn't changing class members", which is reasonable conceptually for the dstress_param_dstress function, which doesn't change important things like stress. But then mutable is needed, and I don't see many mutable things in MOOSE.

@loganharbour (Member):

> I suspect not only mastodon/blackbear will need modification - there might be closed-source apps that need small modification too.

Yeah, because of this I think it's best that we keep the old functions around (in their non-optimized state) so that we can cleanly patch them and eventually remove the old functions in MOOSE once the apps are patched.

@WilkAndy (Author) commented Jan 8, 2026

OK, thanks @loganharbour. Today at some stage, I'm going to go ahead and make more similar changes in a similar way, assuming you think this approach (passing the "out" result as a function argument instead of returning it) is optimal.

@WilkAndy (Author) commented Jan 9, 2026

Hi @loganharbour - could you please confirm that these mutables won't cause problems with multi-threading?
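To illustrate the concern raised here: a mutable scratch member is plain shared state, so two threads calling a const method on the *same* object would race on it; the pattern is safe only when each thread works on its own object instance. The sketch below is hypothetical (names and arithmetic invented, unrelated to MOOSE's actual classes) and shows the safe one-object-per-thread usage.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical sketch of the threading hazard: _buf is mutable shared state,
// so concurrent calls on ONE Scratch object would race. Giving each thread
// its own Scratch instance avoids the race.
class Scratch
{
public:
  double sum_squares(const std::vector<double> & x) const
  {
    _buf.assign(x.size(), 0.0); // reuses the per-object scratch buffer
    double s = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i)
    {
      _buf[i] = x[i] * x[i];
      s += _buf[i];
    }
    return s;
  }

private:
  mutable std::vector<double> _buf; // NOT safe to share across threads
};
```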

@WilkAndy force-pushed the no_heap_A_32178 branch 2 times, most recently from 3ed96fa to 33defc3 on January 9, 2026 07:20
@WilkAndy marked this pull request as ready for review January 9, 2026 07:28
@GiudGiud (Contributor) left a comment

I don't think these are stack variables. They are just pre-allocated, and still on the heap.

I would rename the PR to "Pre-allocate computation arrays for plasticity".

@WilkAndy changed the title from "Created some stack variables to replace heap variables in plasticity" to "Pre-allocate computation arrays for plasticity" on Jan 11, 2026
@WilkAndy (Author):

By the way, @loganharbour and @GiudGiud , with this PR, the walltime of residual calculations (which are the bulk of compute time in explicit time stepping) is about 0.7x what it was before this PR. This is in realistic models. I'm super pleased with that speedup.

…ely MohrCoulombStressUpdate, TensileStressUpdate and TwoParameterPlasticityStressUpdate, and classes derived from those, such as CappedDruckerPragerStressUpdate and CappedDruckerPragerCosseratStressUpdate, now have no variables that are repeatedly heap-allocated/deallocated in functions that are relevant to explicit time stepping (functions relevant to Jacobian-only calculations have largely been left untouched). Testing on realistic models shows this speeds up residual computations by around 1.4x, which is especially important when using explicit time stepping, where residual calculations often take the bulk of the simulation runtime. Refs idaholab#32178.

Most of the strategy boils down to making a few mutable scratch class variables that are allocated just once. Mutable variables have been used so that const could be retained, which I think makes most sense in this situation.

Most of the changes occur within MultiParameterPlasticityStressUpdate and do not impact derived classes. However, the signatures of two important functions - dstress_param_dstress and d2stress_param_dstress - have been changed. The derived classes within solid_mechanics have all had their signatures updated, but external apps such as Blackbear should be updated to take advantage of all the speedups associated with avoiding repeated heap alloc/dealloc. When that is completed, the old versions may be deleted from MultiParameterPlasticityStressUpdate, and I've included notes on how to do that.

@moosebuild (Contributor):

Job Test, step Results summary on 81747b5 wanted to post the following:

Framework test summary

Compared against 5eee4e5 in job civet.inl.gov/job/3497366.

No change

Modules test summary

Compared against 5eee4e5 in job civet.inl.gov/job/3497366.

No change

@GiudGiud (Contributor):

Code looks good.

There's quite a bit of uncovered code, so we can't really know whether the changes break things:
https://mooseframework.inl.gov/docs/PRs/32180/coverage_diff/solid_mechanics/

Are these routines tested in combined, in a module other than solid mechanics, or in an app? The two main uncovered parts are TwoParameterPlasticityStressUpdate and the dstress_dparam parts of MultiParameterPlasticityStressUpdate.

@GiudGiud (Contributor):

> with this PR, the walltime of residual calculations (which are the bulk of compute time in explicit time stepping) is about 0.7x what it was before this PR

This is really great. We should take another look at any code allocating tensors on the fly.

@loganharbour requested a review from GiudGiud on January 12, 2026 14:46
@WilkAndy (Author) commented Jan 12, 2026

> Code looks good
>
> There's quite a bit of not-covered code, so we can't really know if the changes break things https://mooseframework.inl.gov/docs/PRs/32180/coverage_diff/solid_mechanics/
>
> Are these routines tested in combined or in another module than solid mechanics? Or in an app? The two main parts uncovered are TwoParameterPlasticityStressUpdate and MultiParameterPlasticityStressUpdate the dstress_dparam parts

Hi @GiudGiud - thank you... I want to address the reduced coverage. (But not all of it: I note that about half the "misses" are on functions that are now deprecated, so nothing in solid mechanics should hit those functions.) Could you please:

  1. Confirm that PetscJacobianTester is covered by "coverage"? I'm surprised that those misses occur, because I thought I wrote lots of tests for the d2stress_param_dstress stuff.
  2. I can't remember how to build the coverage report on my own machine, and couldn't find instructions on mooseframework or in the discussions - could you please tell me how to do it?

@GiudGiud (Contributor):

> Confirm that PetscJacobianTester is covered by "coverage"? I'm surprised that those misses occur, because I thought I wrote lots of tests for the d2stress_param_dstress stuff.

Yes, unless they are marked "heavy".

> I can't remember how to build the coverage report on my own machine, and couldn't find instructions on mooseframework or in the discussions - could you please tell me how to do it?

I never do it that way; I just push to civet and civet takes care of generating the coverage report.

@GiudGiud self-assigned this Jan 26, 2026