Stack update and rollback failed because c5.2xlarge servers were not available on AWS. #3342
-
@robinkwilson Hi Robin, we tried to update the stack of our cloud enterprise instance to a larger server size, c5.2xlarge, for an event we have been running. We got the following error message from AWS stating that the server size is not available in our region, Ireland (eu-west-1). This caused the update to fail. The stack went into rollback, but the rollback failed as well. Now the stack is stuck in the "UPDATE_ROLLBACK_FAILED" state.

In the CloudFormation section the update option is no longer available. Only the other stack actions (e.g. "Continue update rollback") are there, but I don't know how or whether I should use those. So we can't change the server size back down again. Interestingly, we can still reach our website, www.xrcreatorspaces.com. Maybe it's running on a backup instance, but I don't know.

Note: this is independent of the other discussion we had about updating from a personal to an enterprise template by swapping the templates. Thanks for answering that! The current issue happens on a stack we hosted directly as an enterprise version.

Any idea what we can do here?

Greetings! Tim

Here is the original AWS error message:
2020-11-13 21:22:02 UTC+0100 | StreamASG | UPDATE_FAILED | Received 0 SUCCESS signal(s) out of 2. Unable to satisfy 100% MinSuccessfulInstancesPercent requirement |
-
Ah, unfortunately this is an AWS error. They don't have enough c5.2xlarge machines available in your region, so you'll need to update to a different instance size. I'd be surprised if your site experienced any downtime during this update > error > rollback sequence, since the rollback returns the stack to its last working state. The rollback will bring your instance state back to what it was before the update, so if you were running t3.medium instances, it will roll back to t3.medium instances. Everything should be working properly; you'll just need to pick fewer machines or use different instance sizes. I'm curious: how many machines are you running?
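One check that can be done before picking a different instance size is to ask EC2 which Availability Zones list the instance type at all. The sketch below is a minimal illustration, not something from this thread: it assumes boto3 is installed and credentials are configured, and the helper names are made up for the example. Note that an offering being listed does not guarantee that capacity is free at launch time, which appears to be what went wrong here.

```python
from typing import Dict, List


def zones_offering(response: Dict, instance_type: str) -> List[str]:
    """Return the sorted zone names that list `instance_type`.

    `response` is expected to have the shape returned by the EC2
    DescribeInstanceTypeOfferings API (key "InstanceTypeOfferings").
    """
    offerings = response.get("InstanceTypeOfferings", [])
    return sorted(
        o["Location"] for o in offerings if o.get("InstanceType") == instance_type
    )


def fetch_offerings(instance_type: str, region: str = "eu-west-1") -> Dict:
    """Query EC2 for per-zone offerings (hypothetical helper; needs boto3)."""
    import boto3  # third-party; imported lazily so the parser above runs without it

    ec2 = boto3.client("ec2", region_name=region)
    return ec2.describe_instance_type_offerings(
        LocationType="availability-zone",
        Filters=[{"Name": "instance-type", "Values": [instance_type]}],
    )


if __name__ == "__main__":
    # Illustrative response only; real zone names depend on the account and region.
    sample = {
        "InstanceTypeOfferings": [
            {"InstanceType": "c5.2xlarge", "LocationType": "availability-zone",
             "Location": "eu-west-1b"},
            {"InstanceType": "c5.2xlarge", "LocationType": "availability-zone",
             "Location": "eu-west-1a"},
        ]
    }
    print(zones_offering(sample, "c5.2xlarge"))
```

With real credentials, `zones_offering(fetch_offerings("c5.2xlarge"), "c5.2xlarge")` would list the zones in eu-west-1 that advertise the type. An empty list means the type is not offered there; a non-empty list still says nothing about current capacity.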
-
Thanks for your support @robinkwilson! The rollback worked. We were only running 4 servers (2 for the instance and 2 for the streaming). This was actually the second time that this happened. It's somehow strange that AWS does not have enough large-server capacity in Europe. Do you know about the new Gaia-X initiative of the EU? https://en.wikipedia.org/wiki/GAIA-X It would be really cool for many reasons if Mozilla Hubs could be part of that.

The rollback error was solved by following your advice. AWS had also written to me. Maybe that is helpful for other similar cases, so I will share it here:

Reviewing the case details, I see that a recent update to your CFN stack "XRcreatorspaces3" failed to update autoscaling resource "AppASG" due to lack of c5.2xlarge capacity in the specified AZ. The rollback process however cleaned up this resource, but this time failed at resource "StreamASG". This StreamASG resource, during the rollback process, had initiated a rolling update, setting the MinSize and DesiredCapacity to 4 temporarily. It spun up 2 new instances, i-0b55c05accf3cea77 and i-053fb2ac0897ebe1f, with the old Launch Config XRcreatorspaces3-StreamLaunchConfiguration-14AJUY2GOFK59, but appears to have failed to signal back to CFN within the specified timeout period, causing the rolling update to be stuck midway.

As the main stack "XRcreatorspaces3" is now in the Update Rollback Failed state, it wouldn't let you do further updates to it until we fix the failed resource StreamASG. To do this:
[+] https://aws.amazon.com/premiumsupport/knowledge-center/cloudformation-update-rollback-failed/
I also notice that the 5 child stacks managed by "XRcreatorspaces3" are all stuck in the UPDATE_ROLLBACK_COMPLETE_CLEANUP_IN_PROGRESS state, which is unstable. While you do the above recommended steps, I'll be reaching out to our service team to flip the status of these child stacks to Update Complete so that all the resources of "XRcreatorspaces3" are in sync and in a workable state. Once done, your stack will be ready for further updates. I'll keep you posted once the child stacks are fixed from our end. Meanwhile, if you have any further questions, please feel free to get back to me and I'll be happy to address them for you.
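For readers who end up in the same UPDATE_ROLLBACK_FAILED state, the "Continue update rollback" action mentioned above can also be triggered through the API. What follows is a minimal sketch under assumptions, not the exact procedure AWS support used: it assumes boto3 is installed with credentials configured, the helper names are made up for the example, and the failed resource (here StreamASG) should be fixed first as the linked article describes. Skipping resources with ResourcesToSkip can leave the stack out of sync with reality, so it is left off by default.

```python
from typing import Dict, Iterable, Optional

ROLLBACK_FAILED = "UPDATE_ROLLBACK_FAILED"


def can_continue_rollback(stack_status: str) -> bool:
    """ContinueUpdateRollback only applies to a stack in UPDATE_ROLLBACK_FAILED."""
    return stack_status == ROLLBACK_FAILED


def build_continue_rollback_args(
    stack_name: str, resources_to_skip: Optional[Iterable[str]] = None
) -> Dict[str, object]:
    """Build keyword arguments for the CloudFormation ContinueUpdateRollback call."""
    args: Dict[str, object] = {"StackName": stack_name}
    if resources_to_skip:
        # Logical resource IDs that CloudFormation should mark as rolled back
        # without touching them. Use with care.
        args["ResourcesToSkip"] = list(resources_to_skip)
    return args


def continue_rollback(
    stack_name: str,
    resources_to_skip: Optional[Iterable[str]] = None,
    region: str = "eu-west-1",
) -> str:
    """Resume a failed rollback and return the stack status seen afterwards."""
    import boto3  # third-party; imported lazily so the helpers above run without it

    cfn = boto3.client("cloudformation", region_name=region)
    status = cfn.describe_stacks(StackName=stack_name)["Stacks"][0]["StackStatus"]
    if not can_continue_rollback(status):
        return status  # nothing to resume
    cfn.continue_update_rollback(
        **build_continue_rollback_args(stack_name, resources_to_skip)
    )
    # The rollback runs asynchronously; this is only the status right after the call.
    return cfn.describe_stacks(StackName=stack_name)["Stacks"][0]["StackStatus"]
```

Calling `continue_rollback("XRcreatorspaces3")` would correspond to choosing "Continue update rollback" in the console for the stack discussed here. The rollback itself takes time, so the stack status needs to be checked again afterwards.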