@gmarciani (Contributor) commented Jan 8, 2026

Description of changes

Add support for selective ExtraChefAttributes updates.
Accept updates for the Chef attribute cluster/slurm/reconfigure_timeout.
The new Chef attribute is introduced in aws/aws-parallelcluster-cookbook#3087.

Use the new attribute in the existing cluster update executed by test_slurm, to validate that the update succeeds.

User Experience

[UC1] Setting reconfigure_timeout > 300, should succeed

# From
DevSettings:
  AmiSearchFilters:
    Owner: "447714826191"

# To
DevSettings:
  AmiSearchFilters:
    Owner: "447714826191"
  Cookbook:
    ExtraChefAttributes: |
      { "cluster" : { "slurm" : { "reconfigure_timeout": 600 } } }

# Result
{
  "cluster": {
    "clusterName": "simple-0107-3",
    "cloudformationStackStatus": "UPDATE_IN_PROGRESS",
    "cloudformationStackArn": "arn:aws:cloudformation:us-east-1:319414405305:stack/simple-0107-3/a00668f0-ec0d-11f0-b576-0affc7d57d0d",
    "region": "us-east-1",
    "version": "3.15.0",
    "clusterStatus": "UPDATE_IN_PROGRESS",
    "scheduler": {
      "type": "slurm"
    }
  },
  "changeSet": [
    {
      "parameter": "DevSettings.Cookbook.ExtraChefAttributes",
      "requestedValue": "{ \"cluster\" : { \"slurm\" : { \"reconfigure_timeout\": 600 } } }\n",
      "currentValue": "-"
    }
  ]
}

[UC2] Setting reconfigure_timeout to a non-integer value, should fail

# From
DevSettings:
  AmiSearchFilters:
    Owner: "447714826191"

# To
DevSettings:
  AmiSearchFilters:
    Owner: "447714826191"
  Cookbook:
    ExtraChefAttributes: |
      { "cluster" : { "slurm" : { "reconfigure_timeout":"xyz" } } }

# Result
{
  "configurationValidationErrors": [
    {
      "level": "ERROR",
      "type": "ExtraChefAttributesValidator",
      "message": "Invalid value in DevSettings/Cookbook/ExtraChefAttributes: attribute 'cluster/slurm/reconfigure_timeout' must be an integer."
    }
  ],
  "message": "Invalid cluster configuration."
}

[UC3] Setting reconfigure_timeout < 300, should fail

# From
DevSettings:
  AmiSearchFilters:
    Owner: "447714826191"

# To
DevSettings:
  AmiSearchFilters:
    Owner: "447714826191"
  Cookbook:
    ExtraChefAttributes: |
      { "cluster" : { "slurm" : { "reconfigure_timeout": 200 } } }

# Result
{
  "configurationValidationErrors": [
    {
      "level": "ERROR",
      "type": "ExtraChefAttributesValidator",
      "message": "Invalid value in DevSettings/Cookbook/ExtraChefAttributes: attribute 'cluster/slurm/reconfigure_timeout' must be greater than 300."
    }
  ],
  "message": "Invalid cluster configuration."
}
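
The checks exercised by UC2 and UC3 can be sketched as follows. This is a minimal illustration, not the validator shipped in this PR: the function name `validate_reconfigure_timeout` and the constant `MIN_RECONFIGURE_TIMEOUT` are hypothetical, and the threshold of 300 and the message wording are taken from the results above.

```python
import json
from typing import List

# Assumed threshold, taken from the UC3 error message ("must be greater than 300").
MIN_RECONFIGURE_TIMEOUT = 300


def validate_reconfigure_timeout(extra_chef_attributes: str) -> List[str]:
    """Return a list of error messages; the list is empty when the value is acceptable."""
    attributes = json.loads(extra_chef_attributes)
    slurm = attributes.get("cluster", {}).get("slurm", {})
    if "reconfigure_timeout" not in slurm:
        return []  # nothing to validate
    value = slurm["reconfigure_timeout"]
    prefix = (
        "Invalid value in DevSettings/Cookbook/ExtraChefAttributes: "
        "attribute 'cluster/slurm/reconfigure_timeout'"
    )
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        return [f"{prefix} must be an integer."]
    if value <= MIN_RECONFIGURE_TIMEOUT:
        return [f"{prefix} must be greater than {MIN_RECONFIGURE_TIMEOUT}."]
    return []
```

Returning a list of messages rather than raising keeps the sketch close to the `configurationValidationErrors` array shown in the results, where several errors could be reported at once.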

[UC4] Setting a nonexistent attribute, should fail

# From
DevSettings:
  AmiSearchFilters:
    Owner: "447714826191"

# To
DevSettings:
  AmiSearchFilters:
    Owner: "447714826191"
  Cookbook:
    ExtraChefAttributes: |
      { "cluster" : { "slurm" : { "reconfigure_timeoutX": 600 } } }

# Result
{
  "message": "Update failure",
  "updateValidationErrors": [
    {
      "parameter": "DevSettings.Cookbook.ExtraChefAttributes",
      "requestedValue": "{ \"cluster\" : { \"slurm\" : { \"reconfigure_timeoutX\": 600 } } }",
      "message": "The following ExtraChefAttributes fields cannot be updated: cluster.slurm.reconfigure_timeoutX. Revert the non-updatable ExtraChefAttributes fields to their original values.",
      "currentValue": "-"
    }
  ],
  "changeSet": [
    {
      "parameter": "DevSettings.Cookbook.ExtraChefAttributes",
      "requestedValue": "{ \"cluster\" : { \"slurm\" : { \"reconfigure_timeoutX\": 600 } } }",
      "currentValue": "-"
    }
  ]
}

[UC5] Setting a non-updatable attribute, should fail

# From
DevSettings:
  AmiSearchFilters:
    Owner: "447714826191"

# To
DevSettings:
  AmiSearchFilters:
    Owner: "447714826191"
  Cookbook:
    ExtraChefAttributes: |
      { "cluster" : { "slurm" : { "reconfigure_timeout": 600 }, "in_place_update_on_fleet_enabled" : "false"} }

# Result
{
  "message": "Update failure",
  "updateValidationErrors": [
    {
      "parameter": "DevSettings.Cookbook.ExtraChefAttributes",
      "requestedValue": "{ \"cluster\" : { \"slurm\" : { \"reconfigure_timeout\": 600 }, \"in_place_update_on_fleet_enabled\" : \"false\"} }\n",
      "message": "The following ExtraChefAttributes fields cannot be updated: cluster.in_place_update_on_fleet_enabled. Revert the non-updatable ExtraChefAttributes fields to their original values.",
      "currentValue": "-"
    }
  ],
  "changeSet": [
    {
      "parameter": "DevSettings.Cookbook.ExtraChefAttributes",
      "requestedValue": "{ \"cluster\" : { \"slurm\" : { \"reconfigure_timeout\": 600 }, \"in_place_update_on_fleet_enabled\" : \"false\"} }\n",
      "currentValue": "-"
    }
  ]
}
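
The selective update behavior shown in UC1, UC4 and UC5 can be approximated by flattening the old and new ExtraChefAttributes into dotted paths and comparing the changed paths against an allowlist. The sketch below is an assumption about how such a check could work, not the code in this PR: `UPDATABLE_ATTRIBUTES`, `flatten` and `non_updatable_changes` are hypothetical names, and the allowlist contains only the one attribute this PR describes as updatable.

```python
import json

# Assumed allowlist: the PR only states that this attribute accepts updates.
UPDATABLE_ATTRIBUTES = {"cluster.slurm.reconfigure_timeout"}

_MISSING = object()  # sentinel for "attribute not present"


def flatten(attributes, prefix=""):
    """Flatten nested dicts into {"a.b.c": value} form."""
    flat = {}
    for key, value in attributes.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def non_updatable_changes(current, requested):
    """Return the sorted dotted paths that changed but are not in the allowlist.

    `current` and `requested` are JSON strings, or None when the
    ExtraChefAttributes parameter is absent from the configuration.
    """
    old = flatten(json.loads(current)) if current else {}
    new = flatten(json.loads(requested)) if requested else {}
    changed = {
        path
        for path in old.keys() | new.keys()
        if old.get(path, _MISSING) != new.get(path, _MISSING)
    }
    return sorted(changed - UPDATABLE_ATTRIBUTES)
```

With the UC5 input and no previous value, this returns `["cluster.in_place_update_on_fleet_enabled"]`, which matches the path named in the update validation error above.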

Tests

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@gmarciani gmarciani force-pushed the wip/mgiacomo/3150/slurm-reconfigure-timeout branch from 20cefbd to 63c9068 Compare January 8, 2026 16:32
@gmarciani gmarciani force-pushed the wip/mgiacomo/3150/slurm-reconfigure-timeout branch from 63c9068 to e1f5d14 Compare January 8, 2026 16:34
@gmarciani gmarciani changed the title from "Wip/mgiacomo/3150/slurm reconfigure timeout" to "[DevSettings] Expose a new Chef attribute cluster/slurm/reconfigure_timeout." Jan 8, 2026
@gmarciani gmarciani changed the title from "[DevSettings] Expose a new Chef attribute cluster/slurm/reconfigure_timeout." to "[DevSettings] Add support for selective Chef Attributes updates + mark cluster/slurm/reconfigure_timeout as updatable" Jan 8, 2026
@gmarciani gmarciani added the skip-changelog-update (disables the check that enforces changelog updates in PRs) and 3.x labels Jan 8, 2026
@gmarciani gmarciani force-pushed the wip/mgiacomo/3150/slurm-reconfigure-timeout branch 2 times, most recently from 4decced to 76813d4 Compare January 8, 2026 17:13
@gmarciani gmarciani marked this pull request as ready for review January 8, 2026 17:16
@gmarciani gmarciani requested review from a team as code owners January 8, 2026 17:16
@gmarciani gmarciani force-pushed the wip/mgiacomo/3150/slurm-reconfigure-timeout branch from 76813d4 to ac09bdf Compare January 8, 2026 17:23
Accept updates for the attribute `cluster/slurm/reconfigure_timeout`.
@gmarciani gmarciani force-pushed the wip/mgiacomo/3150/slurm-reconfigure-timeout branch from ac09bdf to 6f68322 Compare January 8, 2026 17:36
  """Represent the common schema of Dev Setting for ImageBuilder and Cluster."""

- cookbook = fields.Nested(CookbookSchema, metadata={"update_policy": UpdatePolicy.UNSUPPORTED})
+ cookbook = fields.Nested(CookbookSchema, metadata={"update_policy": UpdatePolicy.IGNORED})
Contributor:

Why is this change needed?

Contributor Author:

The IGNORED policy on a parameter delegates the decision to accept or reject an update to its subsections.
This is required to accept an update from a config where the Cookbook section is missing to a config where Cookbook/ExtraChefAttributes is present and contains only updatable attributes.

Example.

# From
DevSettings:
  ... (no Cookbook section)

# To
DevSettings:
  Cookbook:
    ExtraChefAttributes: |
      {"cluster": {"slurm": {"reconfigure_timeout": 600 }}}
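
The difference between the two policies can be illustrated with a toy model. This is not ParallelCluster's update-policy machinery: `ToyUpdatePolicy` and `section_update_allowed` are hypothetical names, and the model only captures the delegation described in this comment.

```python
from enum import Enum


class ToyUpdatePolicy(Enum):
    """Toy stand-in for the two policies discussed in this thread."""

    UNSUPPORTED = "unsupported"  # any change to the section is rejected outright
    IGNORED = "ignored"          # the section defers to its child parameters


def section_update_allowed(section_policy, child_results):
    """Decide whether a changed section may be updated.

    `child_results` holds one boolean per changed child parameter,
    True when that child's own check accepts the change.
    """
    if section_policy is ToyUpdatePolicy.UNSUPPORTED:
        # Adding the Cookbook section is itself a change, so it is rejected
        # before the children are ever consulted.
        return False
    return all(child_results)
```

Under this model, adding a Cookbook section whose only child change is an updatable `reconfigure_timeout` is rejected with `UNSUPPORTED` and accepted with `IGNORED`, which is the case the example above describes.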

@gmarciani gmarciani merged commit 3427e70 into aws:develop Jan 8, 2026
24 checks passed
@gmarciani gmarciani deleted the wip/mgiacomo/3150/slurm-reconfigure-timeout branch January 8, 2026 22:08