Conversation
| is_b300 = instance_family == P6_B300 | ||
| efa_enabled = compute_resource.efa and compute_resource.efa.enabled | ||
| interface_type = "efa" if efa_enabled and not is_gb200 else None | ||
| interface_type = "efa" if efa_enabled and not is_gb200 and not is_b300 else None |
There was a problem hiding this comment.
[minor] Documentation: I suggest to make this logic more talkative by either:
- add a comment here specifying that those instance families require the first interface to be ENA even if EFA is enabled.
- define a constant
INSTANCE_TYPES_WITH_FIRST_INTERFACE_ENA = [P6E_GB200,P6_B300]and reference the constant in this logic
| Networking: | ||
| PlacementGroup: | ||
| Enabled: {% if instance not in ["p4d.24xlarge", "p6-b200.48xlarge"] %}true{% else %}false{% endif %} | ||
| Enabled: {% if instance not in ["p4d.24xlarge", "p6-b200.48xlarge", "p6-b300.48xlarge"] %}true{% else %}false{% endif %} |
There was a problem hiding this comment.
[minor] In the test code we have the logic to determine the capacity reservation and placement group. We could improve config readability by putting here the variable placement_group_enabled and set this variable in the test code accordingly.
I know that this approach was here before your PR, so not a blocker for this PR.
| PCLUSTER_BUILD_IMAGE_CLEANUP_ROLE_BOOTSTRAP_TAG_KEY = "parallelcluster:build-image-cleanup-role-bootstrapped" | ||
|
|
||
| P6E_GB200 = "p6e-gb200" | ||
| P6_B300 = "p6-b300" |
There was a problem hiding this comment.
when we introduced GB200 we did for a limited set of OSs. Are we sure we do not have any OS limitation for B300? For instance, according to documentation for NVIDIA 580, AL2 is not mentioned as a supported OS: https://docs.nvidia.com/datacenter/tesla/tesla-release-notes-580-126-09/index.html
There was a problem hiding this comment.
I addded a validator to prevent al2 and b300 from being used together.
| Networking: | ||
| PlacementGroup: | ||
| Enabled: {% if instance not in ["p4d.24xlarge", "p6-b200.48xlarge"] %}true{% else %}false{% endif %} | ||
| Enabled: {% if instance not in ["p4d.24xlarge", "p6-b200.48xlarge", "p6-b300.48xlarge"] %}true{% else %}false{% endif %} |
There was a problem hiding this comment.
[Minor-NON Blocking] Lets change this to check for p family instance instead of the whole instance
| is_b300 = instance_family == P6_B300 | ||
| efa_enabled = compute_resource.efa and compute_resource.efa.enabled | ||
| interface_type = "efa" if efa_enabled and not is_gb200 else None | ||
| interface_type = "efa" if efa_enabled and not is_gb200 and not is_b300 else None |
There was a problem hiding this comment.
[Test] I suggest to cover the logic that determines the type of the interfaces with unit tests. To make it better I suggest to capture the logic into dedicated helper function and cover such function with unit test rather than the resulting template.
CHANGELOG.md
Outdated
| - Add validator that warns against the downsides of disabling in-place updates on compute and login nodes through DevSettings. | ||
| - Upgrade jmespath to ~=1.0 (from ~=0.10). | ||
| - Upgrade tabulate to <=0.9.0 (from <=0.8.10). | ||
| - Add support for p6-b300 instances by having a default network configuration for those instances. |
There was a problem hiding this comment.
We should surface in the changelog that the support is for all OS except for AL2
| " Please use one of the following OS: {2}".format(instance_type, os, SUPPORTED_OSES_FOR_P6E_GB200), | ||
| FailureLevel.ERROR, | ||
| ) | ||
| if instance_type.startswith("p6-b300") and os in UNSUPPORTED_OSES_FOR_P6_B300: |
There was a problem hiding this comment.
[Test] We should capture this change with unit tests.
| " Please use one of the following OS: {2}".format(instance_type, os, SUPPORTED_OSES_FOR_P6E_GB200), | ||
| FailureLevel.ERROR, | ||
| ) | ||
| if instance_type.startswith("p6-b300") and os in UNSUPPORTED_OSES_FOR_P6_B300: |
There was a problem hiding this comment.
[minor] We can refer to the constant P6_B300 rather than redefining the string.
| ) | ||
| if instance_type.startswith("p6-b300") and os in UNSUPPORTED_OSES_FOR_P6_B300: | ||
| self._add_failure( | ||
| "The instance type {0} is not officially supported with OS {1}." |
There was a problem hiding this comment.
I know we use the term "officially" elsewhere. However, I think we should avoid it to prevent misunderstanding. Let;'s simply say that it is not supported.
| None, | ||
| ), | ||
| ( | ||
| "p6-b300.48xlarge", |
There was a problem hiding this comment.
[minor] If you use p6-b300.WHATEVER_SIZE rather than p6-b300.48xlarge it will be clearer that the behavior applies whatever the size is. Otherwiose there could be a doubt that it is something specific to 48xlarge.
You applied this testing best practice in the unit test above. I suggest to apply it here as well.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## develop #7210 +/- ##
===========================================
+ Coverage 90.21% 90.27% +0.05%
===========================================
Files 183 183
Lines 16566 16645 +79
===========================================
+ Hits 14945 15026 +81
+ Misses 1621 1619 -2
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Description of changes
2.28.91.18.02.4.0Tests
Please review the guidelines for contributing and Pull Request Instructions.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.