Skip to content

Conversation

@lbec-octopus
Copy link
Contributor

@lbec-octopus lbec-octopus commented Dec 9, 2025

[sc-130509]

This updates the TeamCityCompatibleAgentsScraper.cs to only alert if the build has been waiting for over 30 minutes.

Previously, TeamCityCompatibleAgentsScraper.cs would send info about all builds with no compatible agents and sumologic would alert if any were stuck longer than 30 minutes. However, that strategy did not work because of how sumologic works.

This new strategy is expected to work.

Test

I tested this in Preprod. I set it to alert after 7 minutes of queued build with no compatible agents. After 7 minutes, the scraper correctly found the build that was stuck.

These images include debugging output that has been removed.

image

Output after removing the stuck build.

image

@lbec-octopus lbec-octopus marked this pull request as ready for review December 9, 2025 04:53
@lbec-octopus lbec-octopus requested a review from a team as a code owner December 9, 2025 04:53
@lbec-octopus lbec-octopus force-pushed the fix-no-comaptible-agents-metric branch from 4ed67e2 to 34ce660 Compare December 9, 2025 22:44
Copy link
Contributor

@cailyoung cailyoung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some non blocking comments. Nice work.

readonly IMetricFactory metricFactory;
readonly IConfiguration configuration;
readonly HashSet<(string buildTypeId, string buildId, string queuedDateTime)> seenBuildsNoAgents = new();
readonly HttpClient httpClient = new();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we discussed; sometimes this isn't good, but for right now let's leave it here and not instantiate one per invocation (because I honestly don't remember which way would be better given the request pattern here!)


// Check if this build has the "no compatible agents" wait reason
var noAgentsWaitReason = waitReasonsResponse?.QueuedWaitReasons?.Property
?.FirstOrDefault(p => p.Name == "There are no idle compatible agents which can run this build");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually we would extract this kind of 'magic string' out to a const with a useful name like 'NoCompatibleAgentWaitReason' - not blocker, just FYI

}

var currentBuildsNoAgents = buildsNoCompatibleAgents.Select(b => (b.BuildTypeId, b.Id, b.QueuedDate.ToString("yyyy-MM-ddTHH:mm:ssZ"))).ToArray();
seenBuildsNoAgents.UnionWith(currentBuildsNoAgents);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason we aren't unioning any more? I think the absent builds won't work without it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, just saw it lower down. Is that intentional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't impact the logic if it happens above or below. It just made sense to me for it to happen at the end of the function.

Comment on lines +124 to +138
class QueuedWaitReasonsResponse
{
public QueuedWaitReasons QueuedWaitReasons { get; set; }
}

class QueuedWaitReasons
{
public List<WaitReasonProperty> Property { get; set; }
}

class WaitReasonProperty
{
public string Name { get; set; }
public string Value { get; set; }
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These might be able to be record instead, which makes all the getter/setter stuff simpler, but that's a later problem (if ever).

Comment on lines +90 to +98
if (noAgentsWaitReason != null && !string.IsNullOrEmpty(noAgentsWaitReason.Value))
{
// Parse the wait time in milliseconds and convert to minutes
if (long.TryParse(noAgentsWaitReason.Value, out var milliseconds))
{
var waitTimeMinutes = Math.Round(milliseconds / (60.0 * 1000.0));

// Only track builds that have been waiting for more than 30 minutes
if (waitTimeMinutes > 30)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect we could refactor this as LINQ, but for now keep it if it makes the logic easier to understand.

@lbec-octopus lbec-octopus merged commit 2d5d0c8 into main Dec 11, 2025
4 checks passed
@lbec-octopus lbec-octopus deleted the fix-no-comaptible-agents-metric branch December 11, 2025 21:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants