lmtpd SEGV assertion failed prom_metric_descs[metric_id].type != PROM_METRIC_COUNTER #5604

@MASHtm

Description

lmtpd on a 3.8.6 frontend dumped core after receiving a message with a size of 26892851 bytes and 83 recipients. The recipients were distributed over 14 backends. Delivery was successful, but at the end lmtpd crashed with the following log message:

Sep  2 09:39:51 jarvis Flmtp[3303721]: FATAL: Internal error: assertion failed: imap/prometheus.c: 300: prom_metric_descs[metric_id].type != PROM_METRIC_COUNTER

The backtrace itself is interesting because it has 24353 frames... the outermost frames show:

#24339 0x000055896ead2d19 in fatal (
    s=s@entry=0x7ffec351e140 "Internal error: assertion failed: lib/mappedfile.c: 243: mf->lock_status == MF_UNLOCKED", 
    code=code@entry=70) at imap/lmtpd.c:978
#24340 0x00007fec84207509 in assertionfailed (file=file@entry=0x7fec84522898 "lib/mappedfile.c", line=line@entry=243, 
    expr=expr@entry=0x7fec84522790 "mf->lock_status == MF_UNLOCKED") at lib/assert.c:57
#24341 0x00007fec8450f918 in mappedfile_writelock (mf=0x55896f692900) at lib/mappedfile.c:258
#24342 0x00007fec84a933d9 in prometheus_apply_delta (metric_id=metric_id@entry=CYRUS_LMTP_ACTIVE_CONNECTIONS, delta=delta@entry=-1)
    at imap/prometheus.c:289
#24343 0x000055896ead2d19 in fatal (
    s=s@entry=0x7ffec351e6a0 "Internal error: assertion failed: lib/mappedfile.c: 243: mf->lock_status == MF_UNLOCKED", 
    code=code@entry=70) at imap/lmtpd.c:978
#24344 0x00007fec84207509 in assertionfailed (file=file@entry=0x7fec84522898 "lib/mappedfile.c", line=line@entry=243, 
    expr=expr@entry=0x7fec84522790 "mf->lock_status == MF_UNLOCKED") at lib/assert.c:57
#24345 0x00007fec8450f918 in mappedfile_writelock (mf=0x55896f692900) at lib/mappedfile.c:258
#24346 0x00007fec84a933d9 in prometheus_apply_delta (metric_id=metric_id@entry=CYRUS_LMTP_ACTIVE_CONNECTIONS, delta=delta@entry=-1)
    at imap/prometheus.c:289
#24347 0x000055896ead12c1 in shut_down (code=code@entry=70) at imap/lmtpd.c:1043
#24348 0x000055896ead2d93 in fatal (
    s=s@entry=0x7ffec351ec30 "Internal error: assertion failed: imap/prometheus.c: 300: prom_metric_descs[metric_id].type != PROM_METRIC_COUNTER", code=code@entry=70) at imap/lmtpd.c:998
#24349 0x00007fec84207509 in assertionfailed (file=file@entry=0x7fec84aee920 "imap/prometheus.c", line=line@entry=300, 
    expr=expr@entry=0x7fec84aee818 "prom_metric_descs[metric_id].type != PROM_METRIC_COUNTER") at lib/assert.c:57
#24350 0x00007fec84a93524 in prometheus_apply_delta (metric_id=metric_id@entry=CYRUS_LMTP_TRANSMITTED_BYTES_TOTAL, 
    delta=<optimized out>) at imap/prometheus.c:300
#24351 0x000055896ead56c7 in lmtpmode (func=func@entry=0x55896ed2e260 <mylmtp>, pin=0x55896f694bc0, pout=0x55896f725ba0, 
    fd=fd@entry=0) at imap/lmtpengine.c:1174
#24352 0x000055896ead1403 in service_main (argc=1, argv=0x55896f678f10, envp=<optimized out>) at imap/lmtpd.c:289
#24353 0x000055896ead0564 in main (argc=<optimized out>, argv=<optimized out>, envp=0x7ffec3521bd8) at master/service.c:647

... and then it loops through frames like #24342 down to #24339, crashing in the end. So the SEGV seems to be caused by stack exhaustion from the endless loop.

(gdb) f 24351
#24351 0x000055896ead56c7 in lmtpmode (func=func@entry=0x55896ed2e260 <mylmtp>, pin=0x55896f694bc0, pout=0x55896f725ba0, 
    fd=fd@entry=0) at imap/lmtpengine.c:1174
1174	                prometheus_apply_delta(CYRUS_LMTP_TRANSMITTED_BYTES_TOTAL, delivered * msg->size);
(gdb) p msg->size
$4 = 26892851

This looks OK so far. But:

(gdb) f 24350
#24350 0x00007fec84a93524 in prometheus_apply_delta (metric_id=metric_id@entry=CYRUS_LMTP_TRANSMITTED_BYTES_TOTAL, 
    delta=<optimized out>) at imap/prometheus.c:300
300	        assert(prom_metric_descs[metric_id].type != PROM_METRIC_COUNTER);
(gdb) p delta
$5 = <optimized out>

This looks odd, since

(gdb) l
295	
296	    offset = offsetof(struct prom_stats, metrics) + metric_id * sizeof(metric);
297	    memcpy(&metric, mappedfile_base(promhandle->mf) + offset, sizeof(metric));
298	    if (delta < 0) {
299	        /* counters must not be decremented */
300	        assert(prom_metric_descs[metric_id].type != PROM_METRIC_COUNTER);
301	    }
302	    metric.value = metric.value + delta;
303	    metric.last_updated = now_ms();
304	

the assertion is only reached via "if (delta < 0)", i.e. delta must have been negative.

The metric itself looks fine so far:

(gdb) p metric
$6 = {value = 122116, last_updated = 1756798771350}

But why is delta "optimized out"? And delta is defined as a double, but msg->size is an int? No cast, no warning? If the multiplication delivered * msg->size is done in int arithmetic, then 83 * 26892851 = 2232106633 exceeds INT_MAX and wraps to a negative value before the implicit conversion to double, which would explain the negative delta.

The loop is most likely caused by the fact that the metrics file is not unlocked when the assertion fires, and fatal() itself tries to update the Prometheus metrics, which triggers another assert -> fatal() -> assert loop until the stack is exhausted.
