
Commit 263c86b

send queue: use being_sent as a lock for touching storage
There were two disconnected sources of truth for the state of an event to be sent:

- it can or cannot be in the in-memory `being_sent` map
- it can or cannot be in the database

Unfortunately, this led to subtle race conditions when it comes to editing/aborting. The following sequence of operations was possible:

- try to send an event: a local echo is added to storage, but it's not marked as being sent yet
- the task wakes up, finds the local echo in the storage,...
- try to edit/abort the event: the event is not marked as being sent yet, so we think we can edit/abort it
- ... having found the local echo, it is marked as being sent.

This would result in the event misleadingly not being aborted/edited, while it should have been.

Now, there's already a lock on the `being_sent` map, so we can hold onto it while we're touching storage, making sure that there aren't two callers trying to manipulate storage *and* `being_sent` at the same time.

This is pretty tricky to test properly, since it requires super precise timing control over the state store, so there's no test for this. I can confirm this avoids some weirdness I observed with `multiverse` though.
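The locking discipline described above can be sketched in isolation. This is a minimal, synchronous approximation and not the SDK's code: it assumes `std::sync` locks instead of the async `RwLock` used in the crate, and the `store` field and the `push` and `cancel` methods are hypothetical stand-ins for the state store and its operations.

```rust
use std::collections::BTreeSet;
use std::sync::{Mutex, RwLock};

/// Simplified, hypothetical stand-in for the send queue storage.
struct QueueStorage {
    /// Stand-in for the database: transaction IDs of queued local echoes.
    store: Mutex<Vec<String>>,
    /// Transaction IDs being sent; its lock also guards access to `store`.
    being_sent: RwLock<BTreeSet<String>>,
}

impl QueueStorage {
    fn new() -> Self {
        Self { store: Mutex::new(Vec::new()), being_sent: RwLock::new(BTreeSet::new()) }
    }

    /// Queue a new local echo (not yet marked as being sent).
    fn push(&self, txn_id: &str) {
        self.store.lock().unwrap().push(txn_id.to_owned());
    }

    /// Pick the next event and mark it as being sent.
    fn peek_next_to_send(&self) -> Option<String> {
        // Take the lock *before* reading the store, and keep it until the
        // event has been marked, so `cancel` cannot run in between.
        let mut being_sent = self.being_sent.write().unwrap();
        let next = self.store.lock().unwrap().first().cloned()?;
        being_sent.insert(next.clone());
        Some(next)
    }

    /// Remove an event unless it is being sent. Returns whether it was removed.
    fn cancel(&self, txn_id: &str) -> bool {
        // Keep the read guard alive across the store access, so the check
        // and the removal form one step relative to `peek_next_to_send`.
        let being_sent = self.being_sent.read().unwrap();
        if being_sent.contains(txn_id) {
            return false;
        }
        let mut store = self.store.lock().unwrap();
        let before = store.len();
        store.retain(|id| id != txn_id);
        store.len() != before
    }
}
```

With this shape, an event is either cancelled before the sending task sees it, or marked as being sent before the cancellation check runs. The interleaving listed in the commit message, where the check passes and the event is then marked anyway, is ruled out because both paths hold the `being_sent` lock for their whole duration.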
1 parent 2f125e9 commit 263c86b

1 file changed

+20
-9
lines changed

crates/matrix-sdk/src/send_queue.rs

Lines changed: 20 additions & 9 deletions
@@ -591,6 +591,8 @@ struct QueueStorage {
     room_id: OwnedRoomId,

     /// All the queued events that are being sent at the moment.
+    ///
+    /// It also serves as an internal lock on the storage backend.
     being_sent: Arc<RwLock<BTreeSet<OwnedTransactionId>>>,
 }

@@ -624,6 +626,9 @@ impl QueueStorage {
     /// It is required to call [`Self::mark_as_sent`] after it's been
     /// effectively sent.
     async fn peek_next_to_send(&self) -> Result<Option<QueuedEvent>, RoomSendQueueStorageError> {
+        // Keep the lock until we're done touching the storage.
+        let mut being_sent = self.being_sent.write().await;
+
         let queued_events = self
             .client
             .get()
@@ -633,7 +638,7 @@ impl QueueStorage {
             .await?;

         if let Some(event) = queued_events.iter().find(|queued| !queued.is_wedged) {
-            self.being_sent.write().await.insert(event.transaction_id.clone());
+            being_sent.insert(event.transaction_id.clone());

             Ok(Some(event.clone()))
         } else {
@@ -655,7 +660,9 @@ impl QueueStorage {
         &self,
         transaction_id: &TransactionId,
     ) -> Result<(), RoomSendQueueStorageError> {
-        self.mark_as_not_being_sent(transaction_id).await;
+        // Keep the lock until we're done touching the storage.
+        let mut being_sent = self.being_sent.write().await;
+        being_sent.remove(transaction_id);

         Ok(self
             .client
@@ -672,7 +679,9 @@ impl QueueStorage {
         &self,
         transaction_id: &TransactionId,
     ) -> Result<(), RoomSendQueueStorageError> {
-        self.mark_as_not_being_sent(transaction_id).await;
+        // Keep the lock until we're done touching the storage.
+        let mut being_sent = self.being_sent.write().await;
+        being_sent.remove(transaction_id);

         let removed = self
             .client
@@ -699,9 +708,10 @@ impl QueueStorage {
         &self,
         transaction_id: &TransactionId,
     ) -> Result<bool, RoomSendQueueStorageError> {
-        // Note: since there's a single caller (the room sending task, which processes
-        // events to send linearly), there's no risk for race conditions here.
-        if self.being_sent.read().await.contains(transaction_id) {
+        // Keep the lock until we're done touching the storage.
+        let being_sent = self.being_sent.read().await;
+
+        if being_sent.contains(transaction_id) {
             return Ok(false);
         }

@@ -728,9 +738,10 @@ impl QueueStorage {
         transaction_id: &TransactionId,
         serializable: SerializableEventContent,
     ) -> Result<bool, RoomSendQueueStorageError> {
-        // Note: since there's a single caller (the room sending task, which processes
-        // events to send linearly), there's no risk for race conditions here.
-        if self.being_sent.read().await.contains(transaction_id) {
+        // Keep the lock until we're done touching the storage.
+        let being_sent = self.being_sent.read().await;
+
+        if being_sent.contains(transaction_id) {
             return Ok(false);
         }
