Network tweak for DVs defined in parallel scenarios #229

Merged: kejacobson merged 9 commits into OpenMDAO:main from Asthelen:network_tweak on Mar 16, 2026
Conversation

Asthelen (Collaborator) commented Jan 27, 2026

My benchmark aeroelastic optimization models, when evaluated with remote components, consistently hit an error upon server restart. I think it was caused by the parallel scenarios' angles of attack (AoA) being defined in a parallel group. Here are a couple of excerpts from the error, in case anyone else sees this:

  File "/lustre3/hpnobackup2/asthelen/Codes/Repositories/aeroelastic_optimization_benchmark/optimization_case2/sfe_pyshell_inviscid/mphys/network/server.py", line 414, in _set_design_variables_into_the_server_problem
    self.prob.get_val(key, get_remote=True)
  File "/hpnobackup2/asthelen/Codes/Repositories/OM_test/OpenMDAO_2025jan24/openmdao/core/problem.py", line 526, in get_val
    val = self.model._get_cached_val(name, abs_names, get_remote=get_remote)
UnboundLocalError: local variable 'val' referenced before assignment             
    if val is not _UNDEFINED:
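
For readers unfamiliar with this error class, here is a minimal sketch of how an UnboundLocalError of this shape can arise in plain Python. It is not OpenMDAO's code: the function name, the `has_local_copy` flag, and the cache dictionary are hypothetical, and only the `_UNDEFINED` name and the `if val is not _UNDEFINED:` line are taken from the traceback. The exact wording of the message depends on the Python version.

```python
_UNDEFINED = object()  # hypothetical sentinel, named after the one in the traceback


def get_cached_val(cache, name, has_local_copy):
    # 'val' is bound only on the branch where this process holds the variable.
    if has_local_copy:
        val = cache.get(name, _UNDEFINED)
    # Without a local copy, 'val' was never assigned, so reading it here
    # raises UnboundLocalError.
    if val is not _UNDEFINED:
        return val
    return None


try:
    get_cached_val({}, "angle_of_attack", has_local_copy=False)
except UnboundLocalError as err:
    message = str(err)
```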

A simple try/except block around the prob.get_val call, with design_changed set to True in the server when the call fails, seems to get around the issue. I don't think this adds any cost to the optimization, as long as those (or probably any) DVs differ from the baseline, which they likely would if the server is being restarted during an optimization.
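
The fallback described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: `FakeProblem` and `design_has_changed` are hypothetical names, the stand-in only mimics a `get_val(name, get_remote=True)` call, and plain lists are compared instead of arrays so the sketch runs without extra dependencies.

```python
class FakeProblem:
    """Hypothetical stand-in for the server's problem object."""

    def __init__(self, values, broken=()):
        self._values = values
        self._broken = set(broken)

    def get_val(self, name, get_remote=True):
        # Simulate the failure seen after a server restart for selected names.
        if name in self._broken:
            raise UnboundLocalError("local variable 'val' referenced before assignment")
        return self._values[name]


def design_has_changed(prob, design_vars):
    """Return True if any DV differs from the model value or cannot be read."""
    for key, entry in design_vars.items():
        try:
            current = prob.get_val(key, get_remote=True)
        except Exception:
            # The current value could not be read; assume the design changed
            # so that the model is re-evaluated rather than stopping the run.
            return True
        if list(current) != list(entry["val"]):
            return True
    return False
```

Treating an unreadable value as "changed" only triggers a re-evaluation, which is why the cost should be negligible whenever the DVs have actually moved from the baseline.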

!= input_dict["design_vars"][key]["val"]
).any():
design_changed = True
except Exception as e:

Collaborator

I think the flake error should go away if you catch a specific exception type, like:

except UnboundLocalError:

Collaborator Author

Ah sorry--just saw this. It seems to be okay with "except Exception". I don't intend for it to throw an error and stop, but rather continue running and just assume the design has changed (and therefore must be evaluated).

Collaborator

@Asthelen, I agree with @timryanb. It's not clear which part of this check you expect to throw an exception. It would be nice to catch a specific exception, like a KeyError or ValueError, and add a comment saying why you might expect it to happen.
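
As a sketch of what this suggestion could look like, the handler below catches only UnboundLocalError (the type shown in the traceback) and documents why. The helper `read_design_var` and the callables passed to it are hypothetical stand-ins for the prob.get_val call; any other exception type is left to propagate.

```python
def read_design_var(get_val, key):
    """Return (value, readable); 'get_val' stands in for prob.get_val."""
    try:
        return get_val(key), True
    except UnboundLocalError:
        # Reported in this thread for some OpenMDAO versions after a server
        # restart, when DVs are defined within parallel groups.
        return None, False


def unbound(key):
    # Simulates the failure from the traceback.
    raise UnboundLocalError("local variable 'val' referenced before assignment")


def missing(key):
    # Simulates an unrelated failure that should not be swallowed.
    raise KeyError(key)
```

With a narrow handler like this, an unexpected failure such as a misspelled variable name still surfaces as an error instead of being silently treated as a design change.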

Collaborator Author

Working on this now, with a newer environment... it seems that with the latest OM, this may no longer be an issue. However, the model initialization fails with the current version due to broken connections... and weirdly enough, writing an n2 after setup fixes it. My OM is pretty new but I'll plan on pulling again to see if it's still an issue then.

Is it worth finding the older OM version that gave me the above error and adding this fix (with a better description of the error)?

Collaborator Author

Ok, the n2 issue with this model was due to a bug I introduced a few weeks ago. It was trying to connect to inputs that didn't exist, which only gave me an error when the n2 wasn't written before a model evaluation.

Anyway, OM 3.42.1-dev works fine without this fix. 3.39.0, which I think I was using previously, does give me the error without the try/except block. I'll have it write out a more helpful warning, in case anyone is on an older OM, but since it's not an issue with the latest OM I don't mind just closing the PR instead.

Collaborator Author

Current push gives me this warning message when using OM 3.39.0:

SERVER: Unable to get val of pullup.pullup.conditions.angle_of_attack due to the following error:
cannot access local variable 'val' where it is not associated with a value       
This can occur with certain versions of OpenMDAO upon server restart, typically due to IVCs/DVs defined within parallel groups. Assuming design has changed...
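
A message of this shape could be assembled along the following lines. This is a sketch only: `format_get_val_warning` is a hypothetical helper name, and the wording is copied from the warning quoted above rather than from the merged code.

```python
def format_get_val_warning(key, err):
    """Build the three-line server warning for a design variable that could not be read."""
    return (
        f"SERVER: Unable to get val of {key} due to the following error:\n"
        f"{err}\n"
        "This can occur with certain versions of OpenMDAO upon server restart, "
        "typically due to IVCs/DVs defined within parallel groups. "
        "Assuming design has changed..."
    )


warning = format_get_val_warning(
    "pullup.pullup.conditions.angle_of_attack",
    UnboundLocalError("cannot access local variable 'val' where it is not associated with a value"),
)
```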

kejacobson merged commit 7840e32 into OpenMDAO:main on Mar 16, 2026
4 checks passed
Asthelen deleted the network_tweak branch on March 17, 2026 at 00:06

3 participants