[fix] Generate sequenceID at send time to ensure sequential ordering #1411

intkuroky · 2025-08-27T07:43:41Z

Motivation

This PR fixes an issue where sequenceID was generated at message creation time instead of send time, causing out-of-order sequence IDs when batching is enabled.
The sequenceID lifecycle (generation -> local storage -> network transmission) must be serialized to ensure consistency.

Modifications

sequenceID generation is deferred until a message is actually processed (local storage and network transmission). The generation method is now non-thread-safe, since concurrent access is neither needed nor supported.
During message batching, messages are first stored together with a size estimate; sequenceIDs are then generated and assigned to the whole batch before final processing. This is a direct consequence of the deferred sequenceID generation.
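
The two modifications above can be illustrated with a minimal sketch. This is not the code from this PR: `pendingMessage`, `sequencer`, `batch`, and `sendSingle` are hypothetical names, and starting the counter at 1 is an assumption made only to keep the output easy to read. The sketch shows a counter that is not thread-safe (it is meant to be owned by a single goroutine) and a batch that stores messages with a size estimate first and assigns sequenceIDs only when it is flushed.

```go
package main

import "fmt"

// pendingMessage is a hypothetical stand-in for a queued producer message.
type pendingMessage struct {
	payload    []byte
	sequenceID uint64 // stays zero until an ID is assigned at send/flush time
}

// sequencer hands out consecutive IDs. It is deliberately not thread-safe:
// only one goroutine (the event loop in this sketch) is expected to call next.
type sequencer struct {
	last uint64
}

func (s *sequencer) next() uint64 {
	s.last++
	return s.last
}

// batch buffers messages without sequence IDs and tracks an estimated size.
type batch struct {
	msgs          []*pendingMessage
	estimatedSize int
}

// add stores the message and updates the size estimate; no ID is assigned yet.
func (b *batch) add(payload []byte) {
	b.msgs = append(b.msgs, &pendingMessage{payload: payload})
	b.estimatedSize += len(payload)
}

// flush assigns IDs to every buffered message in order, returns the messages
// for final processing, and resets the batch.
func (b *batch) flush(seq *sequencer) []*pendingMessage {
	for _, m := range b.msgs {
		m.sequenceID = seq.next()
	}
	out := b.msgs
	b.msgs = nil
	b.estimatedSize = 0
	return out
}

// sendSingle assigns an ID at the moment a non-batched message is sent.
func sendSingle(seq *sequencer, payload []byte) *pendingMessage {
	return &pendingMessage{payload: payload, sequenceID: seq.next()}
}

func main() {
	seq := &sequencer{}
	b := &batch{}
	b.add([]byte("a"))
	b.add([]byte("b"))

	// A single message is sent before the batch is flushed.
	single := sendSingle(seq, []byte("single"))
	flushed := b.flush(seq)

	// IDs follow the order in which messages actually leave the producer.
	fmt.Println(single.sequenceID, flushed[0].sequenceID, flushed[1].sequenceID)
}
```

Because IDs are taken from the counter only at the point where a message leaves the producer, the order of IDs matches the order of transmission regardless of when each message was created.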

RobertIndie

Are you able to reproduce the ordering issue?
Flushing batch messages and publishing a single message should happen in the same event loop:

pulsar-client-go/pulsar/producer_partition.go

Lines 563 to 591 in 7a9a33c

    
    func (p *partitionProducer) runEventsLoop() {
    	for {
    		select {
    		case data, ok := <-p.dataChan:
    			// when doClose() is call, p.dataChan will be closed, data will be nil
    			if !ok {
    				return
    			}
    			p.internalSend(data)
    		case cmd, ok := <-p.cmdChan:
    			// when doClose() is call, p.dataChan will be closed, cmd will be nil
    			if !ok {
    				return
    			}
    			switch v := cmd.(type) {
    			case *flushRequest:
    				p.internalFlush(v)
    			case *closeProducer:
    				p.internalClose(v)
    				return
    			}
    		case connectionClosed := <-p.connectClosedCh:
    			p.log.Info("runEventsLoop will reconnect in producer")
    			p.reconnectToBroker(connectionClosed)
    		case <-p.batchFlushTicker.C:
    			p.internalFlushCurrentBatch()
    		}
    	}
    }

How could the out-of-order issue happen?

intkuroky · 2025-09-09T03:05:10Z

How could the out-of-order issue happen?

@RobertIndie
Yes, the send operation does run within the same event loop. However, sequenceID generation does not.
Because the sequenceID is generated outside the event loop while the message is sent inside it, the IDs can end up out of order relative to the actual send order. This PR moves sequenceID generation into the same event loop.
For example:

    producer, err := client.CreateProducer(pulsar.ProducerOptions{
    	Topic:                   "my-topic",  // placeholder topic, not part of the original snippet
    	BatchingMaxPublishDelay: time.Second, // let the batch flush occur after the single-message send
    })
    if err != nil {
    	log.Fatal(err)
    }

    producer.SendAsync(context.Background(), &pulsar.ProducerMessage{
    	Payload: []byte("batch messages"),
    }, func(id pulsar.MessageID, message *pulsar.ProducerMessage, err error) {
    	// id: -1:-1:0
    })

    // Ensure that genSingleMessageMetadataInBatch is executed first with sequenceID set to 1.
    time.Sleep(500 * time.Millisecond)

    // 1. genSingleMessageMetadataInBatch -> updateSingleMessageMetadataSeqID -> sequenceID = 1

    producer.SendAsync(context.Background(), &pulsar.ProducerMessage{
    	Payload:   []byte("single message"),
    	DeliverAt: time.Now(),
    }, func(id pulsar.MessageID, message *pulsar.ProducerMessage, err error) {
    })
    // 2. updateMetaData -> sequenceID = 2
    // 3. The single message is published (payload: "single message", sequenceID: 2)
    // 4. The 1s interval ticker fires and the batch is flushed (payload: "batch messages", sequenceID: 1)

fix: generate sequenceID at send time to ensure sequential ordering

25dcf2d

RobertIndie reviewed Sep 8, 2025

xiezhiwei added 2 commits September 9, 2025 21:00

fix ci

c76098f

Merge branch 'master' into fix/sequenceid-ordering

3a0fa8b
