Replies: 8 comments
-
We are sorry to hear about the inconvenience and the issues you have had due to the move to another app service plan. This should not happen, and I will look into the specific issue and verify that all procedures are in place so it won't happen again. Thanks for the feature request. I have created a product backlog item for it and noted that, besides mail and portal notifications to admin and technical contacts, we would like to offer a log listing when, and possibly why, a project has been moved to a new app service plan.
-
This happened to me again. The same site was moved to yet another server, for whatever reason, and apparently no check is done to verify it restarted properly. It was moved just before midnight CEST, which is the start of the business day in my time zone, and has apparently been down for an entire day. The problem was a continual YSOD loop - failing on a line of code using Umbraco.Web.PublishedContentExtensions.IsDocumentType() - which is presumably some sort of cache issue with the DocTypes. All it took to fix was restarting the environment, but detecting it requires more than a ping test to ensure the site is up. The fact that the site was moved, that no notification of the move was given, and that no check was done to ensure it was actually running is problematic, and the client is obviously less than impressed. If I had received a notification that the site had been moved, then at least I would have known to check it first thing in the morning, and could have avoided it being down for a day.
-
Hey @c9mbundy, could you reach out to support about this, so that we can exchange more details and investigate it further? It could just as well be Azure moving stuff around, but I'm quite curious about what's happening. Please link this thread in the ticket if you end up creating one. You're also welcome to mention me by name, as I would like to look into it myself.
-
A support ticket was raised and the site observed during reassignment of the worker, but support was unable to reproduce the issue, and quite honestly I'm not surprised by this. Nonetheless I appreciate the efforts made trying to identify and resolve the issue - above and beyond as always. However, I actually wasn't expecting support to be able to reproduce or resolve it, as I suspect it's a complex scenario that depends on prevailing conditions when some automatic procedures are implemented by Azure. The issue is a failure to resolve a DocType, and is cleared by restarting the environment, which is suggestive of some type of cache issue rather than code/config, but identifying it becomes fuzzy after that point, and expecting support to resolve an issue I can't identify is a big ask, which is why I'd not done it. Regardless, the bottom line is that this makes the status of the site somewhat unstable, and essentially means I need to check every day to ensure it's functional, and given that all our sites are built on the same baseline, this instability now applies to all of them by association. This is why I was more focused on a portal feature for subscribing to monitoring alerts from Azure when major events are triggered - such as moving workers - so I am able to check sites when notified and ensure they are functioning as expected, rather than needing to continually check them all 'just in case'. Obviously, Azure provides a myriad of alert subscription options in its Monitoring services (I've played with some of these in the past) and they can be scripted, but I'd imagine the big trick will be how to turn that capability into a portal notification function for the admin to manage and enable/disable on a per-project and perhaps per-alert basis.
-
I've been looking into this case since Dennis escalated it to us, and the feature request has proven to be much more complex than I initially thought. I tested moving your project between app service plans myself to try and trigger this, but I wasn't able to reproduce it on my own, which makes me believe that Azure moving processes between workers works in a completely different way. Azure has a service that warns you before planned maintenance, and neither of the two dates you gave Dennis is even remotely close to the maintenance dates, so we can safely assume that it's neither the planned maintenance nor us moving projects around but simply Azure moving processes between workers. Azure also doesn't seem to log these as a restart in Insights, even though I could clearly see in the logs that the project restarted at both of the timestamps you've provided. For some reason, there's very little information about this part of Azure's infrastructure work, and I think that it's actually because App Service is a PaaS offering, so they might simply believe that it's not our business. I plan to talk with my team after Easter about it to see if we can get some ideas about this together. For now, I recommend that you set up a ping for your projects. I personally use uptimerobot.com, and the free account can have up to 50 different pings at 5-minute intervals. For this you could set up a keyword ping and then choose whether you want to be notified by mail or via the mobile app.
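For anyone unsure what a keyword ping adds over a plain ping, here is a minimal Python sketch of the idea. It is not how uptimerobot.com implements it. The function names, the `YSOD_MARKER` text (assumed to be the title of the classic ASP.NET error page) and the example URL and keyword are all assumptions for illustration.

```python
import http.client
import urllib.request

# Assumption: the classic ASP.NET error page (YSOD) contains this text.
YSOD_MARKER = "Server Error in"


def page_looks_healthy(status: int, body: str, keyword: str) -> bool:
    """Return True only for a 200 response that contains the expected
    keyword and does not look like an ASP.NET error page."""
    if status != 200:
        return False
    if YSOD_MARKER in body:
        return False
    return keyword in body


def check_url(url: str, keyword: str, timeout: float = 10.0) -> bool:
    """Fetch the page and apply the keyword check. Any network, HTTP,
    or malformed-URL error counts as unhealthy."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
            return page_looks_healthy(response.status, body, keyword)
    except (OSError, ValueError, http.client.HTTPException):
        # URLError, HTTPError and timeouts are all OSError subclasses.
        return False


if __name__ == "__main__":
    # Hypothetical site and keyword; replace with text that only appears
    # when the page has rendered correctly.
    print(check_url("https://example.com/", "Example Domain"))
```

Choosing a keyword that only appears on a correctly rendered page is what lets this kind of check catch a site that responds but serves an error page.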
-
Just a thought, but maybe some of the health checks that can be set up/integrated with Azure App Service could be of use in these types of scenarios.
-
Hey @c9mbundy, I've investigated the feature further and passed it over to the feature team. They'll plan it from there :)
-
Sounds promising - and thanks for considering this.
-
Issue description
I had an issue where a site was moved to a new server 4 days ago and has been unavailable ever since, with nobody being aware there was a problem.
As far as I can tell, the issue was to do with cache not being cleared when the site restarted on the new server, which resulted in a continual YSOD error being logged - 8,000-odd times.
Restarting the environment provided the 'smack up the side of the head' required to clear the problem.
However, if I'd been alerted to the fact that the site had been moved, I would have checked that everything was ok, and avoided the embarrassing situation of the client discovering their site was throwing errors.