@AgustinBettati AgustinBettati commented Aug 4, 2025

Description

Link to any related issue(s): CLOUDP-321150

Problem

Upgrading a tenant cluster (M0) to an NVMe instance (e.g. M40_NVME) requires backup to be enabled. Our current implementation does not send the backup-enabled setting in the tenant upgrade request.

Solution

Include the backup-enabled setting in tenant and flex upgrade requests so that the upgrade succeeds when an NVMe instance and backup are both defined.

Example tenant upgrade request when an NVMe instance and backup enabled are defined:

{
 "name": "SomeCluster",
 "providerBackupEnabled": true,
 "providerSettings": {
  "providerName": "AWS",
  "instanceSizeName": "M40_NVME",
  "regionName": "US_EAST_1"
 }
}

The following cases are considered:

Before (all cases failing)

  • tenant → (NVMe + backup true): clusters/tenantUpgrade POST: HTTP 400 Bad Request, Detail: Cloud backups must be enabled for deployments with NVMe storage.
  • flex → (NVMe + backup true): flexClusters:tenantUpgrade POST: HTTP 400 Bad Request, Reason: Cannot create an NVMe cluster without Cloud Backup enabled.
  • tenant → flex → (NVMe + backup true): flexClusters:tenantUpgrade POST: HTTP 400 Bad Request, Reason: Cannot create an NVMe cluster without Cloud Backup enabled.

After

  • [fixed] tenant → (NVMe + backup true): Now works correctly; covered by the adjusted TestAccMockableAdvancedCluster_tenantUpgrade test.
  • [fixed] flex → (NVMe + backup true): Now works; no test explicitly covers this case.
  • [fixed*] tenant → flex → (NVMe + backup true)
    • SDK v2: This case still fails, but with a different error, "Cannot modify disk size for an NVMe cluster.", because the computed diskSizeGB is sent as part of the request.
    • TPF: Now works correctly. Unlike the SDK v2 implementation, TPF does not include the computed diskSizeGB in the request. No test explicitly covers this case; potentially after 2.0.0 we can adjust testAccAdvancedClusterFlexUpgrade to always upgrade to an NVMe instance with backup.

Type of change:

  • Bug fix (non-breaking change which fixes an issue). Please, add the "bug" label to the PR.
  • New feature (non-breaking change which adds functionality). Please, add the "enhancement" label to the PR. A migration guide must be created or updated if the new feature will go in a major version.
  • Breaking change (fix or feature that would cause existing functionality to not work as expected). Please, add the "breaking change" label to the PR. A migration guide must be created or updated.
  • This change requires a documentation update
  • Documentation fix/enhancement

Required Checklist:

  • I have signed the MongoDB CLA
  • I have read the contributing guides
  • I have checked that this change does not generate any credentials and that they are NOT accidentally logged anywhere.
  • I have added tests that prove my fix is effective or that my feature works per HashiCorp requirements
  • I have added any necessary documentation (if appropriate)
  • I have run make fmt and formatted my code
  • If changes include deprecations or removals I have added appropriate changelog entries.
  • If changes include removal or addition of 3rd party GitHub actions, I updated our internal document. Reach out to the APIx Integration slack channel to get access to the internal document.

Further comments

@github-actions github-actions bot added the bug label Aug 4, 2025
Contributor

This PR has gone 7 days without any activity and meets the project’s definition of "stale". This will be auto-closed if there is no new activity over the next 7 days. If the issue is still relevant and active, you can simply comment with a "bump" to keep it open, or add the label "not_stale". Thanks for keeping our repository healthy!

@github-actions github-actions bot added the stale label Aug 10, 2025
@github-actions github-actions bot closed this Aug 13, 2025
@github-actions github-actions bot removed the stale label Aug 27, 2025
@AgustinBettati AgustinBettati changed the title fix: Supporting upgrade with NMVe instance fix: Supporting advanced_cluster upgrade to dedicated with NMVe instance Aug 27, 2025
@AgustinBettati AgustinBettati marked this pull request as ready for review August 27, 2025 09:18
@Copilot Copilot AI review requested due to automatic review settings August 27, 2025 09:18
@AgustinBettati AgustinBettati requested review from a team as code owners August 27, 2025 09:18
Contributor

APIx bot: a message has been sent to Docs Slack channel

Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR fixes an issue where upgrading advanced clusters from tenant (M0) or flex clusters to dedicated NVMe instances would fail due to missing backup configuration in the upgrade request.

  • Adds providerBackupEnabled field to tenant upgrade requests when backup is enabled
  • Adds backupEnabled field to flex-to-dedicated upgrade requests when backup is enabled
  • Updates test configuration to use NVMe instance with backup enabled for tenant upgrade testing

Reviewed Changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| internal/service/advancedclustertpf/resource_upgrade.go | Adds backup enabled logic to both tenant and flex upgrade request builders |
| internal/service/advancedcluster/resource_advanced_cluster.go | Adds backup enabled logic to legacy tenant upgrade request builder |
| internal/service/advancedcluster/model_flex.go | Adds backup enabled logic to flex-to-dedicated upgrade request builder |
| internal/testutil/acc/advanced_cluster.go | Adds new test configuration for dedicated NVMe cluster with backup enabled |
| internal/service/advancedcluster/resource_advanced_cluster_test.go | Updates tenant upgrade test to use NVMe configuration and adds validation checks |
| internal/service/advancedcluster/testdata/TestAccMockableAdvancedCluster_tenantUpgrade/02_01_POST__api_atlas_v2_groups_{groupId}_clusters_tenantUpgrade_2023-01-01.json | Updates mock request to include backup enabled and NVMe instance size |
| .changelog/3549.txt | Adds changelog entries for the bug fix |

Comment on lines +234 to +235
ebs_volume_type = "PROVISIONED"
node_count = 3

Copilot AI Aug 27, 2025

The indentation is inconsistent. All three fields should use the same indentation level as the surrounding code.

Suggested change
ebs_volume_type = "PROVISIONED"
node_count = 3
ebs_volume_type = "PROVISIONED"
node_count = 3

Comment on lines +56 to +60
backupEnabled := state.BackupEnabled // a flex cluster can already have backup enabled
if patch.BackupEnabled != nil {
	backupEnabled = patch.BackupEnabled
}
if backupEnabled != nil && *backupEnabled {

Copilot AI Aug 27, 2025

[nitpick] The backup enabled logic could be simplified. Consider using a more direct approach: if (state.GetBackupEnabled() || patch.GetBackupEnabled()) { req.BackupEnabled = conversion.Pointer(true) }

Suggested change
- backupEnabled := state.BackupEnabled // a flex cluster can already have backup enabled
- if patch.BackupEnabled != nil {
- 	backupEnabled = patch.BackupEnabled
- }
- if backupEnabled != nil && *backupEnabled {
+ if state.GetBackupEnabled() || patch.GetBackupEnabled() {

backupEnabled := state.BackupEnabled // a flex cluster can already have backup enabled
if patch.BackupEnabled != nil {
	backupEnabled = patch.BackupEnabled
}
Collaborator

Why not
req.BackupEnabled = backupEnabled?
Or do we want to avoid setting it if it is false?

Member Author

It would not make a difference; I just thought of keeping the upgrade request compact for the regular cases.
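
For reference, the behaviour discussed in this thread can be sketched in isolation as follows. This is only an illustration under stated assumptions: flexUpgradeRequest, effectiveBackup, applyBackup, and boolPtr are hypothetical names, and the request type models nothing beyond the single field in question. The patch value takes precedence over the state value, and the request field is populated only when the resulting value is true.

```go
package main

import "fmt"

// flexUpgradeRequest is a hypothetical stand-in for the flex-to-dedicated
// upgrade request; only the field discussed in this thread is modelled.
type flexUpgradeRequest struct {
	BackupEnabled *bool
}

// effectiveBackup returns the patch value when it is set and falls back to
// the state value otherwise (a flex cluster can already have backup enabled).
func effectiveBackup(state, patch *bool) *bool {
	if patch != nil {
		return patch
	}
	return state
}

// applyBackup populates BackupEnabled only when the effective value is true,
// which keeps the request compact for the regular (non-backup) cases.
func applyBackup(req *flexUpgradeRequest, state, patch *bool) {
	if b := effectiveBackup(state, patch); b != nil && *b {
		enabled := true
		req.BackupEnabled = &enabled
	}
}

// boolPtr is a small helper for building pointer values in the examples.
func boolPtr(b bool) *bool { return &b }

func main() {
	cases := []struct {
		name         string
		state, patch *bool
	}{
		{"state true, patch unset", boolPtr(true), nil},
		{"state unset, patch true", nil, boolPtr(true)},
		{"state true, patch false", boolPtr(true), boolPtr(false)},
		{"both unset", nil, nil},
	}
	for _, c := range cases {
		var req flexUpgradeRequest
		applyBackup(&req, c.state, c.patch)
		fmt.Printf("%s: field set = %t\n", c.name, req.BackupEnabled != nil)
	}
}
```

Under this reading an explicit false in the patch overrides a true value in the state, so the field stays unset in the third case; a plain OR of the two values would set it instead, assuming nil-safe getters that return false for unset pointers.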

@antellezr-mdb antellezr-mdb left a comment

LGTM!
