Refine the doubling logic #418 #420

duyhuynhdev · 2025-12-10T02:18:34Z

Update the doubling logic according to the first approach, as the second approach still has corner cases where a single log event can dominate others if it has a large event count.
Remove the timer callback and use the profiler flush callback instead. In addition, expose the new maxsample setting to support the flush callback.
Update the relevant classes and unit tests to align with the new workflow and doubling logic.
Add new unit tests to cover the updated behaviour.
Fix CI complaints, including unsorted language files.

bwalkerl

This is a big change so we're going to need to test this thoroughly. I've started with comments from a code review, but I haven't tested this personally yet.

My main concerns so far are how this affects task detection and whether we are losing any sample data when we change the structure while combining components.

bwalkerl · 2025-12-18T00:50:19Z

classes/cron_processor.php

        // We want to prevent doubling up of processing, so skip if an existing process is still executing.
        // The profile logs will be kept and processed the next time.
+        self::$logs[] = $log;
+        $this->logcount += $log->count();
+        // Doubling sampling period if it reaches the limit.
+        if ($this->logcount >= $this->samplelimit) {
+            $this->on_reach_limit($manager);
+            $this->logcount = $this->logcount - $this->samplelimit;
+        }
        if (self::$alreadyprofiling) {
            debugging('tool_excimer: starting cron_processor::on_interval when previous call has not yet finished');
-            if ($isfinal) {
+            // The final flush call when profiler is destroyed.
+            if ($log->count() < $this->samplelimit) {
                // This should never happen.
                debugging('tool_excimer: alreadyprofiling is true during final on_interval.');
            }
            return;
        }
        self::$alreadyprofiling = true;


The logic to prevent doubling up of processing may no longer be needed now that we've switched from timers to flush callbacks (I'm not sure whether sampling is paused during the callbacks). I think it's worth looking into further, because it would be great if we could remove self::$logs entirely.

At this stage, the exact underlying scenario isn’t clear because the documentation doesn’t cover this case. However, this likely happens when maxSamples is set too low, causing the process to continue running while callbacks are triggered repeatedly. In this scenario, self::$logs can be used to store newly arriving samples.

Previously there were edge cases of the 10s interval overlapping, see #377

With your changes these should be much rarer as in the worst case the number of possible samples in on_interval should be very low. I'm not even sure if the profiler would be taking samples here.

I'm interested in what happens when max samples is set too low. The previous issue this fixed shouldn't be a concern if it only happens with bad config, but is there a chance of endless loops if this is set to something ridiculous like 1?

Yes, it should be possible, because we don’t have any constraint on the maxSamples value. That’s why I introduced the $logs array as a waiting list. New logs from the callback are collected in this array until the current process completes. I’m not sure how often overlaps will occur with the new logic, but $logs won’t impact the existing flow. On the contrary, it helps mitigate overlapping if it does occur.
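
For illustration, here is a minimal sketch of the waiting-list idea in plain PHP, with no Excimer dependency. The class `queued_processor`, its `handle()` method, and the `$work` hook are hypothetical names used only for this example. It assumes a static queue plus a re-entrancy flag, so that a callback firing while a run is in progress hands its log to the active run rather than starting a second one. The actual cron_processor may behave differently.

```php
<?php
// Hypothetical illustration only: not the plugin's cron_processor.
final class queued_processor {
    /** @var array Logs waiting to be processed. */
    private static array $logs = [];
    /** @var bool True while a processing run is in progress. */
    private static bool $alreadyprofiling = false;
    /** @var array Order in which logs were actually processed. */
    public array $processed = [];

    /**
     * Queue the log, then drain the queue unless a run is already active.
     *
     * @param mixed $log Stand-in for a log handed to a flush callback.
     * @param callable|null $work Optional hook run during processing, used to simulate a nested callback.
     */
    public function handle($log, ?callable $work = null): void {
        self::$logs[] = $log;
        if (self::$alreadyprofiling) {
            // An outer run is active; it will pick this log up from the queue.
            return;
        }
        self::$alreadyprofiling = true;
        try {
            while (self::$logs) {
                $next = array_shift(self::$logs);
                if ($work !== null) {
                    $work($next);
                }
                $this->processed[] = $next;
            }
        } finally {
            self::$alreadyprofiling = false;
        }
    }
}

$processor = new queued_processor();
// Simulate a callback firing while the first log is still being processed.
$processor->handle('log-1', function ($current) use ($processor) {
    if ($current === 'log-1') {
        $processor->handle('log-2');
    }
});
```

In this sketch the nested call only appends to the queue, and the outer loop processes `log-2` after `log-1`, so nothing is dropped and nothing is processed twice.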

bwalkerl · 2025-12-18T01:35:24Z

classes/web_processor.php

+    public function process($log, manager $manager, bool $isfinal = false) {
        // We want to prevent overlapping of processing, so skip if an existing process is still executing.
        // The profile logs will be kept and processed the next time.
+        self::$logs[] = $log;


See comments about overlapping processing in cron.

bwalkerl · 2025-12-18T01:37:14Z

classes/web_processor.php

+
+        // Doubling sampling period if it reaches the limit.
+        $this->logcount += $log->count();
+        if ($this->partialsave && $this->logcount >= $this->samplelimit) {


Do we want this applied to non-partial saves too? I believe it was previously applied there as well.

With non-partial saves we only receive the samples once, which means we don't set the callback for them (as before), so the sampling period doubling cannot be applied in this case. However, the merge is still applied when we process the samples.

Thanks, that clarifies things and looks correct.

In that case, what happens if partial save is enabled and this is called by the flush callback when the ExcimerProfiler object is destroyed? Restarting it there seems bad.

$this->logcount is used to manage increases to the samplingperiod. Even in the final flush callback, the only affected element is the samplingperiod, so it should not impact any profile, because the samplerate is no longer derived from the samplingperiod.

I'm less concerned about the profile and more about the harm of restarting the profiler while it's being destroyed. It could be handled gracefully behind the scenes, but we should be able to detect this ourselves and explicitly handle the logic for not restarting it in this case.

Thanks, I separated the on_flush and profile process functions.
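
As a rough sketch of what separating the two paths could look like, the example below keeps the period-doubling and restart logic in one method and gives the final flush its own method that never restarts anything. The class `flush_handler`, the method `on_final()`, and the `$restarts` counter are hypothetical and stand in for the real profiler interaction. The example assumes a positive sample limit.

```php
<?php
// Hypothetical illustration only: names do not come from the plugin.
final class flush_handler {
    /** @var int Current sampling period, in arbitrary units. */
    public int $samplingperiod = 1;
    /** @var int Number of simulated profiler restarts. */
    public int $restarts = 0;
    /** @var int Samples counted since the last doubling. */
    private int $logcount = 0;
    /** @var int Sample count that triggers a doubling. */
    private int $samplelimit;

    public function __construct(int $samplelimit) {
        $this->samplelimit = $samplelimit;
    }

    /** Mid-run flush: may double the period and (in a real setup) restart sampling. */
    public function on_flush(int $count): void {
        $this->logcount += $count;
        if ($this->logcount >= $this->samplelimit) {
            $this->samplingperiod *= 2;
            $this->restarts++;
            $this->logcount -= $this->samplelimit;
        }
    }

    /** Final flush at shutdown: counts the samples but never doubles or restarts. */
    public function on_final(int $count): void {
        $this->logcount += $count;
    }
}

$handler = new flush_handler(100);
$handler->on_flush(100); // Limit reached: period doubles and a restart is simulated.
$handler->on_final(100); // Same count on the final flush: no doubling, no restart.
```

Keeping the final path free of restart logic makes the "do not restart while the profiler is being destroyed" rule explicit rather than relying on a count comparison.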

bwalkerl

I have a couple more comments from testing. The transformations of the graphs after merging also don't look right to me (tested with partial saves to see before and after), so I will need to test that more.

bwalkerl · 2025-12-18T03:51:12Z

classes/web_processor.php

-            $manager->get_timer()->setCallback(function () use ($manager) {
+            $manager->get_profiler()->setFlushCallback(function ($log) use ($manager) {
                // Once overlapping has happened once, we prevent all future partial saving.
                if (!$this->hasoverlapped) {
-                    $this->process($manager, false);
+                    $this->process($log, $manager);
                }
-            });
+            }, $this->maxsamples);


With this change partial save never marks the profile as finished.

I'm also thinking that with this patch it might be OK to re-enable partial saves by default (and lighten the old warnings), as the number of samples won't rise quickly when the DB is having issues, which was the main reason we disabled it.

Yeah, it is a bug. Let me fix it

After the update partial save is no longer saving partial profiles every time there is a flush callback.

I will update it in the next commit

Now it's updating (or rather, will once you fix the typo ')'), but it will save the last profile twice. We want to avoid this redundancy - we shouldn't be adding extra DB updates when they're not needed.

Hey Ben, can you explain more about the typo and the issue of saving the last profile twice?

bwalkerl · 2025-12-18T04:00:07Z

lang/en/tool_excimer.php

+$string['field_month'] = 'Month';
+$string['field_name'] = 'Name';
+$string['field_numsamples'] = 'Number of samples';
+$string['field_numsamples_value'] = '{$a->samples} samples ({$a->events} events) @ ~{$a->samplerate}ms';


We need to make this consistent with the hover display on the graph, which currently only uses 'samples' for events. We probably need to iron out the terminology, since samples will have more meaning than events to most end users.

I also think it would be better to keep the old display without events when the number of events is 0 (old profiles etc)

The number of events should not be 0. If numevents is not set, numsamples is used instead.

'events' => number_format($data['numevents'] ?? $data['numsamples'], 0, $decsep, $thousandssep),

Old profiles prior to the upgrade are showing as 0 in the database for me, but this isn't a big concern as old profiles will be flushed out eventually. The main issue here is making the terminology consistent with the graph.
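
For reference, PHP's `??` operator only falls back when the left-hand side is unset or null, so a value stored as 0 is displayed as 0. The small sketch below shows the difference. The `$data` arrays and both helper functions are made up for illustration, and the `?:` variant is just one possible alternative, not something the PR uses.

```php
<?php
// Hypothetical rows: a new profile, one with no numevents key, and one stored as 0.
$newprofile = ['numsamples' => 120, 'numevents' => 480];
$missingkey = ['numsamples' => 120];
$storedzero = ['numsamples' => 120, 'numevents' => 0];

/** Mirrors the `??` fallback quoted above: only null or unset triggers it. */
function events_coalesce(array $data): int {
    return $data['numevents'] ?? $data['numsamples'];
}

/** Alternative for illustration: also treats a stored 0 as "no event count". */
function events_nonzero(array $data): int {
    return ($data['numevents'] ?? 0) ?: $data['numsamples'];
}

echo events_coalesce($newprofile), "\n"; // 480
echo events_coalesce($missingkey), "\n"; // 120
echo events_coalesce($storedzero), "\n"; // 0
echo events_nonzero($storedzero), "\n";  // 120
```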

bwalkerl

Thanks for the updates @duyhuynhdev

I've left a couple more comments, plus responses to some of your other comments that have actions.

bwalkerl · 2025-12-23T00:19:36Z

classes/sample_set.php

+                    $trace = $this->samples[$i + 1]['trace'];
+                }
+                $newsamples[] = [
+                    'eventcount' => ceil(($this->samples[$i]['eventcount'] + $this->samples[$i + 1]['eventcount']) / 2),


Is ceiling preferred here? Always rounding up will increase the margin of error. The number of samples won't be perfect unless we allow decimals, but maybe we can introduce some logic here to keep it more balanced.

I also still think we need more comments about what we're doing here.

Sorry. I've updated it to the round() function. I intended to use the round function but somehow used the ceil() function instead.

Round makes more sense, but it's going to run into the same problem, as the fractional part will always be .5.

Is rounding to the nearest 0.5 a problem? For example, rounding 3.5 to 4.0 or 3.4 to 3.0 is acceptable to me, because the small difference should not materially affect the analysis.

By itself it's not an issue, but since this will always be rounding up, the number of samples is going to be slightly overestimated, which can have flow-on effects on duration estimates.

I'm not sure how much of a problem this is in practice - if there's only one or two with more samples it's not an issue, but if say a tenth of the samples have this then it could have a larger impact.
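
One way to keep the halved counts balanced, sketched here under stated assumptions, is to carry the odd remainder from one pair into the next instead of rounding every pair up. Both helper functions are hypothetical and operate on a flat list of integer event counts rather than the real sample structure. They assume an even number of non-negative counts. This is only an illustration of the bias being discussed, not the approach the PR takes.

```php
<?php
/**
 * Halve the summed event count of each adjacent pair, carrying the odd
 * remainder into the next pair instead of always rounding up.
 * Hypothetical helper; assumes an even number of non-negative integer counts.
 */
function merge_eventcounts_with_carry(array $counts): array {
    $merged = [];
    $carry = 0;
    for ($i = 0; $i + 1 < count($counts); $i += 2) {
        $sum = $counts[$i] + $counts[$i + 1] + $carry;
        $merged[] = intdiv($sum, 2);
        $carry = $sum % 2;
    }
    return $merged;
}

/** Same pairing with plain round(), for comparison. */
function merge_eventcounts_rounded(array $counts): array {
    $merged = [];
    for ($i = 0; $i + 1 < count($counts); $i += 2) {
        $merged[] = (int) round(($counts[$i] + $counts[$i + 1]) / 2);
    }
    return $merged;
}

$counts = [1, 2, 1, 2, 1, 2, 1, 2]; // Total 12, so the ideal halved total is 6.
$rounded = merge_eventcounts_rounded($counts);
$carried = merge_eventcounts_with_carry($counts);
echo array_sum($rounded), "\n"; // 8
echo array_sum($carried), "\n"; // 6
```

With the carry, the merged total stays within one half-event of the true halved total regardless of how many pairs have an odd sum, at the cost of individual merged values occasionally rounding down.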

duyhuynhdev force-pushed the refine-the-doubling-logic-#418 branch 3 times, most recently from 67e5801 to 70663e9 · December 10, 2025 06:52

Refine the doubling logic in sample set #418

88b78dd

bwalkerl requested changes · Dec 18, 2025

bwalkerl reviewed · Dec 18, 2025

duyhuynhdev force-pushed the refine-the-doubling-logic-#418 branch from 70663e9 to 6137ab7 · December 22, 2025 10:43

Refine the doubling logic in sample set #418

52cfb71

duyhuynhdev force-pushed the refine-the-doubling-logic-#418 branch from 6137ab7 to 52cfb71 · December 22, 2025 22:33

bwalkerl reviewed · Dec 23, 2025

Refine the doubling logic in sample set #418

ecee240

duyhuynhdev force-pushed the refine-the-doubling-logic-#418 branch from d246be2 to ecee240 · December 29, 2025 23:04
