Skip to content

Commit 615e017

Browse files
committed
#903 Make cancellation thread-safe
1 parent bcf2eaa commit 615e017

33 files changed

+1196
-618
lines changed

devdoc/jdp/jdp-2026-01-additional-locking-for-sending-in-wire-protocol.adoc

Lines changed: 49 additions & 34 deletions
Original file line numberDiff line numberDiff line change
@@ -5,8 +5,8 @@
55

66
== Status
77

8-
* Draft
9-
* Proposed for: Jaybird 7
8+
* Published: 2026-01-10
9+
* Implemented in: Jaybird 7
1010

1111
== Type
1212

@@ -18,13 +18,13 @@ With Jaybird 3, we implemented support for protocol 12 and cancellation.
1818
We wanted to avoid an overly complex solution with overlapping locks to perform separate locking for sending and receiving, or switching to NIO.
1919

2020
We decided to take the chance and implement cancellation in a way that is not thread-safe, by sending the cancellation message without taking out the connection lock.
21-
(The connection lock covers sending and receiving, making cancellation only occur after the operation if you try it under that lock.)
21+
(The connection lock covers sending and receiving, making cancellation only occur after the blocking operation if you try it under that lock.)
2222
This can lead to the cancellation message being sent in the middle of a normal message.
2323
This then results in the server receiving a corrupt message, with undefined results (e.g. errors, the server doing the wrong thing, or the connection being terminated).
24-
We tried to reduce this risk by first constructing the message on a separate `XdrOutputStream`, and then sending the resulting byte array at once.
24+
We tried to reduce this risk by first constructing the message on a separate `XdrOutputStream`, and then sending the resulting byte array at once, skipping the buffer of `XdrOutputStream`.
2525

26-
We've now received reports that indicate this can occur when performing rapid cancellation, where the cancellation might overlap with the statement execute (or another action).
27-
We've been able to produce a broken connection with rapid cancellation;
26+
We've now received reports that problems can occur when performing rapid cancellation, where the cancellation might overlap with the statement execute (or another action).
27+
We have been able to produce a broken connection with rapid cancellation;
2828
this was a different issue than reported, though.
2929

3030
The native implementation is not affected (and if it were, it is probably not an issue we can address within Jaybird).
@@ -33,60 +33,75 @@ The native implementation is not affected (and if it were, it is probably not an
3333

3434
As we now have a clear case that our not thread-safe solution can actually fail in production, we will need to address the thread-safety of cancellation.
3535

36-
For the wire protocol, we will add a "`transmit`"-lock which should _only_ cover transmitting (sending) messages to the server.
37-
This lock should _not_ cover receiving messages, and preferably not cover other actions unrelated to sending messages to the server.
38-
The "`transmit`"-lock should be held for the sending of an entire message.
36+
For the wire protocol, we will add a "`transmit`" lock which should _only_ cover transmitting (sending) messages to the server.
37+
This lock *must not* cover receiving messages, and preferably should not cover other actions unrelated to sending messages to the server.
38+
39+
The transmit lock *must* be held for the sending of an entire message.
3940
When multiple messages are batched without an interleaved receive of responses, it should be considered carefully if the lock should cover individual messages or the entire batch.
41+
When deferred response are registered, this *must* be done outside the transmit lock, as registering a deferred response can trigger a blocking read from the server if too many messages are pending.
42+
43+
Flushing *must* be done under transmit lock, but can be done separately if needed.
4044

41-
For normal operations, the "`transmit`"-lock should be taken out while the normal connection lock is held (i.e. within a try-with-resources calling `withLock()`).
42-
For the cancellation operation, and possibly other "`out-of-band`" transmissions, only the "`transmit`"-lock should be taken out;
43-
this can probably be done within implementations of `FbWireOperations.writeDirect(byte[])`.
45+
For normal operations, the transmit lock *must* be obtained, held, and released while the normal connection lock is held (i.e. within a try-with-resources calling `withLock()`).
46+
For the cancellation operation -- and other "`out-of-band`" transmissions -- only the transmit lock should be obtained.
4447

45-
The lock will use a `java.util.concurrent.locks.ReentrantLock` (see also <<rejected>>).
48+
The lock used is a `java.util.concurrent.locks.ReentrantLock` (see also <<rejected>>).
4649

47-
Exact details of use will be fleshed out during implementation, but `XdrStreamAccess` is probably the right place to provide access to this lock.
48-
We will provide two methods of taking out the lock:
50+
The transmit lock is located within the `XdrStreamAccess` implementation of `WireConnection`, and its API provides access to the lock.
51+
We provide one method for taking out the lock:
4952

50-
. `LockCloseable withTransmitLock()` -- for use with try-with-resources in the same fashion as the `withLock()` method(s) for the connection lock
51-
. `void withTransmitLock(TransmitAction transmitAction) throws SQLException, IOException` where `TransmitAction` is a functional interface with a method `void transmit(XdrOutputStream xdrOut) throws SQLException, IOException`
52-
+
53-
If possible, the `IOException` should be left out of the throws lists of `withTransmitLock(TransmitAction)` and maybe `transmit(XdrOutputStream)` by using `FbExcepionBuilder.ioWriteError(IOException)`, but right now we can't oversee if that is always possible.
54-
On the other hand, we could always fall back on the first option if there are cases where an `IOException` must be thrown.
53+
`void withTransmitLock(TransmitAction transmitAction) throws SQLException, IOException` where `TransmitAction` is a functional interface with a method `void transmit(XdrOutputStream xdrOut) throws SQLException, IOException`.
5554

56-
If needed, implementation classes (but not the `FbWireXXX` interfaces) may provide the same methods (e.g. as `protected`) for purposes of code simplication, as long as they ultimately call the actual locking methods.
55+
Implementation classes -- but not the `FbWireXXX` interfaces -- may provide the same method (e.g. as `protected`) for purposes of code simplication, as long as they ultimately call the actual locking method.
5756

58-
The latter should also be viewed as an experiment if in the future we can use a similar alternative for `withLock()`.
59-
If during implementation this turns out to be too cumbersome, this may need to be revisited, and maybe we'll only implement the first option.
57+
The `withTransmitLock(TransmitAction)` method should also be viewed as an experiment if in the future we can use a similar alternative for the `withLock()` method for obtaining the connection lock.
6058

61-
During implementation, this will be done first in the `FbWireStatement` implementations and the `FbWireOperations.writeDirect(byte[])` method as a proof of concept.
62-
If this proof of concept fails to resolve the broken connection issue we can reproduce, we will need to rethink this approach.
59+
The `writeDirect(byte[])` methods skipping the buffer are removed: out-of-band operations must be written through the normal `XdrOutputStream` under the transmit lock.
6360

6461
Given the invasiveness of this change, it will not be backported to Jaybird 5 and 6.
65-
We will consider a reduced solution, only covering `FbWireStatement` implementations, to address the immediate thread-safety problem of statement cancellation.
62+
We will consider a reduced solution, for example only covering `FbWireStatement` implementations, to address the immediate thread-safety problem of statement cancellation.
6663
Such a reduced solution could still cause problems if statement cancellation is done while the connection is performing a non-statement operation.
6764
If we decide to go for such a solution for Jaybird 5 and/or 6, it will be covered by a separate JDP.
6865

6966
[#rejected]
7067
=== Rejected options
7168

69+
==== Non-reentrant lock
70+
7271
As the lock is generally uncontested, we considered using a simple spinlock on an `AtomicBoolean` or a volatile `int` field (with help of `AtomicIntegerFieldUpdater`) instead of a full-blown lock.
73-
Unfortunately, there may be cases where we need reentrancy of the lock due to implementation of protocol versioning (e.g. an overloaded method calling its parent and then doing something that should be done within the "`transmit`"-lock).
72+
Such a lock would not be reentrant.
73+
Unfortunately, there may be cases where we need reentrancy due to implementation of protocol versioning (e.g. overloads calling different methods that also obtain the transmit lock while it is already held).
74+
75+
[NOTE]
76+
====
77+
Actual implementation has shown that reentrancy can be avoided.
78+
However, switching to a non-reentrant lock comes with risk of deadlock if we do -- intentionally or accidentally -- nest calls that (try to) obtain the transmit lock.
79+
We're also no convinced such a simpler lock would actually perform better in practice.
80+
81+
In short, we'll stick with the reentrant lock as that is less brittle, and less surprising.
82+
====
83+
84+
==== Providing a `withTransmitLock` method returning a `LockCloseable`
85+
86+
Initially, we proposed adding a second method to `XdrStreamAccess`, `LockCloseable withTransmitLock()` which works similar as the `withLock()` methods for obtaining the connection lock.
7487

75-
If during implementation it turns out that this is not a problem (e.g. because the lock can be taken out by the callers of such overloaded methods), we can revisit this decision.
88+
During implementation, we discovered that we don't really needed this method, and providing only the `withTransmitLock(TransmitAction)` methods provides a simpler, intention-revealing API.
89+
So, we removed this method from the design described under <<_decision>>.
7690

7791
== Consequences
7892

79-
The entire wire protocol implementation is revised to take out the "`transmit`"-lock during sending.
93+
The entire wire protocol implementation is revised to obtain the transmit lock for the duration of sending a message.
8094

81-
The javadoc of `FbWireOperations.writeDirect(byte[])` is revised to remove mention of race conditions.
82-
The javadoc of other methods, etc., are revised as needed.
95+
Various methods in the implementation classes change signature, or are removed or renamed for consistency.
96+
(Some more details are provided in the Jaybird 7 release notes.)
8397

8498
"`Normal`" users of the JDBC driver and the internal GDS-ng API, are not affected -- there is no change in the methods they call, or the operation or behaviour of the driver (except cancellation is now thread-safe).
8599

86-
Implementers of Jaybird forks or alternative wire protocol versions building on the `org.firebirdsql.gds.ng.wire` package and related packages and classes will need to change their code to correctly take out the "`transmit`"-lock during transmission to the server.
100+
Implementers of Jaybird forks or alternative wire protocol versions building on classes in `org.firebirdsql.gds.ng.wire` package and "`sub`"-packages will need to change their code to correctly obtain the transmit lock during transmission to the server.
87101

88-
It is possible that this change will not fully address cancellation issues (e.g. we suspect deferred response handling might need additional work, and result sets may need to be invalidated/closed by cancellation);
89-
this will be subject to further investigation after this thread-safety issue has been addressed.
102+
It is possible that this change will not fully address cancellation issues.
103+
We suspect deferred response handling might need additional work, and result sets may need to be invalidated/closed by cancellation.
104+
This will be subject to further investigation after this thread-safety issue has been addressed.
90105

91106
[appendix]
92107
== License Notice

src/docs/asciidoc/release_notes.adoc

Lines changed: 50 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -17,11 +17,11 @@
1717
:fb-canonical-html: https://firebirdsql.org/docs/drivers/java/7.0.x/release_notes.html
1818

1919
////
20-
SPDX-FileCopyrightText: Copyright 2021-2025 Firebird development team and individual contributors
20+
SPDX-FileCopyrightText: Copyright 2021-2026 Firebird development team and individual contributors
2121
SPDX-FileCopyrightText: Copyright 2002 David Jencks
2222
SPDX-FileCopyrightText: Copyright 2002-2003 Rick Fincher
2323
SPDX-FileCopyrightText: Copyright 2004-2010 Roman Rokytskyy
24-
SPDX-FileCopyrightText: Copyright 2012-2025 Mark Rotteveel
24+
SPDX-FileCopyrightText: Copyright 2012-2026 Mark Rotteveel
2525
SPDX-License-Identifier: LicenseRef-PDL-1.0
2626
////
2727

@@ -58,7 +58,7 @@ Possible workarounds:
5858
--
5959
+
6060
This is caused by https://github.com/FirebirdSQL/firebird/issues/4971[firebird#4971]
61-
* In Java 24, the mapping of some named `WITH TIME ZONE` values changed as Java's mapping of short IDs changed.
61+
* In Java 24 and higher, the mapping of some named `WITH TIME ZONE` values changed as Java's mapping of short IDs changed.
6262
This should only affect cases where you explicitly obtain `ZonedDateTime` or `ZonedTime` instances with these named zones.
6363
+
6464
--
@@ -627,6 +627,13 @@ For further details, see https://github.com/FirebirdSQL/jaybird/blob/master/devd
627627
628628
* Changed the documentation of `FirebirdDatabaseMetaData` methods `getProcedureSourceCode`, `getTriggerSourceCode` and `getViewSourceCode` that the object not being found is reported by a `null` value, not a `SQLException`.
629629
With this change, the documentation reflects the actual behaviour.
630+
* Fixed: race condition during cancellation (https://github.com/FirebirdSQL/jaybird/issues/903[#903])
631+
+
632+
This change introduce an additional lock during transmit (sending to the server).
633+
Maintainers of Jaybird forks, or implementers of custom wire protocol implementations subclassing classes in `org.firebirdsql.gds.ng.wire` and related packages will need to take out this transmit lock when sending to the server.
634+
The change also lead to incompatible changes to protected methods in the various protocol implementation classes, like adding an `XdrOutputStream` parameter to various methods, and methods renamed for consistency.
635+
+
636+
See also https://github.com/FirebirdSQL/jaybird/blob/master/devdoc/jdp/jdp-2026-01-additional-locking-for-sending-in-wire-protocol.adoc[jdp-2026-01: Additional locking for sending in wire protocol].
630637
* ...
631638
632639
[#compatibility-changes]
@@ -812,12 +819,49 @@ If you are confronted with such a change, let us know on {firebird-java}[firebir
812819
813820
* `FbAttachment`
814821
** The `close()` method should no longer throw an exception if already closed, or not connected/attached.
822+
* `WireConnection`
823+
** Method `writeDirect(byte[])` was removed;
824+
use the `XdrOutputStream` under the transmit lock.
815825
* `FbWireAsynchronousChannel`
816826
** `connect(String, int, int)` was replaced by `connect(String, int)`
817827
* `FbWireOperations`
818828
** The `ProcessAttachCallback` parameter of `authReceiveResponse` was removed, as all implementations did nothing, and since protocol 13, it wasn't only called for the attach response
819829
** Interface `ProcessAttachCallback` was removed
830+
** Method `writeDirect(byte[])` was removed;
831+
use the `XdrOutputStream` under the transmit lock.
820832
* The internal interface `org.firdsql.jdbc.field.BlobListenableField` was renamed to `BlobField` due to increased responsibilities
833+
* `XdrOutputStream`
834+
* `writeDirect(byte[])` was removed;
835+
use the `XdrOutputStream` under the transmit lock.
836+
* `XdrStreamAccess`
837+
** Method `withTransmitLock(TransmitAction)` was added for using the `XdrOutputStream` under the transmit lock.
838+
This method should be preferred over calling `getXdrOut()`.
839+
`TransmitAction` is a functional interface with method `transmit(XdrOutputStream xdrOut) throws IOException, SQLException`.
840+
* Various wire protocol implementation classes (in `org.firebirdsql.gds.ng.wire` and "`sub`"-packages)
841+
+
842+
See also https://github.com/FirebirdSQL/jaybird/blob/master/devdoc/jdp/jdp-2026-01-additional-locking-for-sending-in-wire-protocol.adoc[jdp-2026-01: Additional locking for sending in wire protocol].
843+
+
844+
** Methods `protected XdrOutputStream getXdrOut()` were removed;
845+
the replacement is `protected void withTransmitLock(TransmitAction)`.
846+
This was done to enforce use of the transmit lock.
847+
** When sending data to the server, the transmit lock *must* be taken out, and it *must* be released before performing another blocking action like reading (receiving) data from the server.
848+
For this purpose, use `withTransmitLock(TransmitAction)`.
849+
** Some protected methods for sending messages to the server were renamed for consistency;
850+
in general, methods of the form `sendXxx`, `sendXxxPacket`, `sendXxxMessage`, or `doXxxPacket`, or similar, to `sendXxxMsg` (where _Xxx_ is a name of the message action or type).
851+
** Some protected methods for sending messages or (partial) data to the server received an additional `XdrOutputStream` parameter.
852+
+
853+
Usually, callers of these methods are responsible for taking out the transmit lock.
854+
** The protected methods for sending messages to the server (the `sendXxxMsg` methods mentioned earlier) now consistently do not call `flush()` on the `XdrOutputStream` to simplify protocol versioning of the messages.
855+
The caller is now responsible for flushing (if needed).
856+
** Additional protected methods were added (or were changed from private) for sending messages to the server.
857+
* `V10Statement`
858+
** `sendPrepareMsg` (previously `sendPrepare`) no longer calls `switchState(StatementState.PREPARING)`;
859+
this is now done by its callers.
860+
+
861+
Subclasses overriding this method must ensure they do not switch state to `PREPARING` themselves, and callers must ensure they switch state before calling this method.
862+
* `V16Statement`
863+
** the `sendBatchMsg` protected method -- accepting `RowDescriptor` and `Collection<RowValue>` -- was made private.
864+
The new protected method of the same name is _only_ responsible for sending the `op_batch_msg` itself, not the subsequent row data.
821865
822866
[#breaking-changes-unlikely]
823867
=== Unlikely breaking changes
@@ -841,7 +885,7 @@ The following methods will be removed in Jaybird 8:
841885
842886
* `WireConnection.getProtocolMinimumType()` -- use `WireConnection.getProtocolType()`
843887
+
844-
Might still be removed before Jaybird 7 final release as this is internal API, and unlikely to be used in user code.
888+
This method might still be removed before Jaybird 7 final release as this is internal API, and unlikely to be used in user code.
845889
846890
[#removal-of-deprecated-classes-8]
847891
===== Removal of deprecated classes
@@ -904,9 +948,9 @@ you may only use this Documentation if you comply with the terms of this License
904948
A copy of the License is available at https://firebirdsql.org/en/public-documentation-license/.
905949
906950
The Original Documentation is "`Jaybird {version_wo_target} Release Notes`".
907-
The Initial Writer of the Original Documentation is Mark Rotteveel, Copyright © 2012-2025.
951+
The Initial Writer of the Original Documentation is Mark Rotteveel, Copyright © 2012-2026.
908952
All Rights Reserved.
909-
(Initial Writer contact(s): _unknown_).
953+
(Initial Writer contact(s): mark (at) lawinegevaar (dot) nl).
910954
911955
Contributor(s): David Jencks, Rick Fincher, Roman Rokytskyy. +
912956
Portions created by David Jencks are Copyright © 2002.

src/main/org/firebirdsql/gds/impl/wire/XdrOutputStream.java

Lines changed: 2 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -7,7 +7,7 @@
77
SPDX-FileCopyrightText: Copyright 2003 Ryan Baldwin
88
SPDX-FileCopyrightText: Copyright 2005 Steven Jardine
99
SPDX-FileCopyrightText: Copyright 2015 Hajime Nakagami
10-
SPDX-FileCopyrightText: Copyright 2011-2024 Mark Rotteveel
10+
SPDX-FileCopyrightText: Copyright 2011-2026 Mark Rotteveel
1111
SPDX-License-Identifier: LGPL-2.1-or-later
1212
*/
1313
package org.firebirdsql.gds.impl.wire;
@@ -44,7 +44,7 @@ public final class XdrOutputStream extends BufferedOutputStream implements Encry
4444
public static final int SPACE_BYTE = 0x20;
4545
public static final int NULL_BYTE = 0x0;
4646
private static final int TEXT_PAD_LENGTH = BUF_SIZE;
47-
private static final byte[] TEXT_PAD = createPadding(BUF_SIZE, SPACE_BYTE);
47+
private static final byte[] TEXT_PAD = createPadding(TEXT_PAD_LENGTH, SPACE_BYTE);
4848
private static final int ZERO_PAD_LENGTH = 3;
4949
private static final byte[] ZERO_PADDING = new byte[ZERO_PAD_LENGTH];
5050

@@ -363,19 +363,6 @@ public void setCipher(Cipher cipher) throws IOException {
363363
encrypted = true;
364364
}
365365

366-
/**
367-
* Writes directly to the {@code OutputStream} of the underlying socket.
368-
*
369-
* @param data
370-
* data to write
371-
* @throws IOException
372-
* for errors writing to the socket
373-
*/
374-
public void writeDirect(byte[] data) throws IOException {
375-
out.write(data);
376-
out.flush();
377-
}
378-
379366
@SuppressWarnings("java:S2177")
380367
private void flushBuffer() throws IOException {
381368
if (count > 0) {

0 commit comments

Comments
 (0)