Add rollback action for CTAS destination table#27702
Conversation
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/BaseJdbcClient.java
Outdated
Show resolved
Hide resolved
022746f to
d099899
Compare
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/DefaultJdbcMetadata.java
Outdated
Show resolved
Hide resolved
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/BaseJdbcClient.java
Outdated
Show resolved
Hide resolved
d099899 to
fb25fcf
Compare
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/BaseJdbcClient.java
Outdated
Show resolved
Hide resolved
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/BaseJdbcClient.java
Outdated
Show resolved
Hide resolved
fb25fcf to
b3142de
Compare
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/BaseJdbcClient.java
Outdated
Show resolved
Hide resolved
b3142de to
540828e
Compare
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/BaseJdbcClient.java
Outdated
Show resolved
Hide resolved
540828e to
a1e5a0c
Compare
plugin/trino-ignite/src/main/java/io/trino/plugin/ignite/IgniteMetadata.java
Show resolved
Hide resolved
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/DefaultJdbcMetadata.java
Show resolved
Hide resolved
plugin/trino-ignite/src/main/java/io/trino/plugin/ignite/IgniteClient.java
Outdated
Show resolved
Hide resolved
a1e5a0c to
88664e3
Compare
plugin/trino-ignite/src/main/java/io/trino/plugin/ignite/IgniteClient.java
Outdated
Show resolved
Hide resolved
88664e3 to
5ae171b
Compare
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/BaseJdbcClient.java
Outdated
Show resolved
Hide resolved
5ae171b to
007ac15
Compare
While CREATE TABLE is an atomic operation, CREATE TABLE AS SELECT statement happens in two phases for JDBC connectors from Trino: - creating the table - inserting content into the table If the insertion of the data fails, the CTAS may leave residual empty tables behind. The JDBC connectors, in dealing with CREATE TABLE AS SELECT scenarios, remove the destination table in case of dealing with exceptions while inserting the content.
007ac15 to
112deed
Compare
|
Adding the release note into |
| } | ||
| } | ||
|
|
||
| protected void rollbackCreateDestinationTable(ConnectorSession session, RemoteTableName remoteTableName) |
There was a problem hiding this comment.
rollbackTableCreation / rollbackCreateTable
| } | ||
| JdbcOutputTableHandle handle = jdbcClient.beginCreateTable(session, tableMetadata); | ||
| JdbcOutputTableHandle handle = jdbcClient.beginCreateTable(session, tableMetadata, rollbackActions::add); | ||
| rollbackActions.add(() -> jdbcClient.rollbackCreateTable(session, handle)); |
There was a problem hiding this comment.
we had rollback here. do we rollback twice now?
There was a problem hiding this comment.
the rollbackCreateTable is mainly for the clean ups of the temporary tables
There was a problem hiding this comment.
the responsibilities are now blurred, aren't they?
maybe we can clarify them with better naming or something?
There was a problem hiding this comment.
the responsibilities are now blurred, aren't they?
now, yes. But I think the name is ok since we are inside the "createTable" method, that's is fair to have a rollback method like "rollbackCreateTable", it just our implementation only do cleanups of temporary table. Ideally if we could do the cleanup of target table inside this method as well
@findinpath #27848 the method is going to split such cleanups into two methods(we already do), but finally, I think that's more elegant that we should only has one rollback methods for "createTable", caller don't care or needs to know the cleanups table is "temporary" or not
There was a problem hiding this comment.
While rolling back the creation of the destination table is characteristic only for CTAS statements, rolling back the creation of a temporary table can occur on any DML statement.
I see a good rationale on having two separate methods for doing rollback for table creation.
| throw new TrinoException(NOT_SUPPORTED, "This connector does not support replacing tables"); | ||
| } | ||
| igniteClient.beginCreateTable(session, tableMetadata); | ||
| igniteClient.beginCreateTable(session, tableMetadata, _ -> {}); |
There was a problem hiding this comment.
document why it's ok to ignore a rollback action
(it should be actually safer not to ignore -- this is CREATE TABLE flow. if nothing follows CREATE TABLE, nothing fails later and no rollback will take place)
| void createTable(ConnectorSession session, ConnectorTableMetadata tableMetadata); | ||
|
|
||
| JdbcOutputTableHandle beginCreateTable(ConnectorSession session, ConnectorTableMetadata tableMetadata); | ||
| JdbcOutputTableHandle beginCreateTable(ConnectorSession session, ConnectorTableMetadata tableMetadata, Consumer<Runnable> rollbackActionConsumer); |
There was a problem hiding this comment.
rollbackActionCollector would be a better name
more importantly, this is a rather odd abstraction and doesn't seem necessary.
so far it was JdbcMetadata who was respomnsible for instantiating and remembering rollback actions. let's not change that, if we don't have to
There was a problem hiding this comment.
rollbackActionCollector would be a better name
+1
so far it was JdbcMetadata who was respomnsible for instantiating and remembering rollback actions
We are entering a situation where it is unclear when it is safe to roll back the target table during CTAS. For example, in BaseJdbcClient#beginCreateTable, when FTE is enabled, both the target table and a temporary table are created in the method, during rollback in DefaultJdbcMetadata, we cannot reliably determine which tables should be cleaned up
let's not change that, if we don't have to
For now, this works as a solution, but I believe we could evolve the API to shift the responsibility back to DefaultJdbcMetadata. For example, we could change the method's return value so that DefaultJdbcMetadata knows exactly which tables need to be cleaned up
There was a problem hiding this comment.
rollbackActionCollector would be a better name
+1
Included in
however, the name may still be imperfect -- #27702 (comment)
in
BaseJdbcClient#beginCreateTable, when FTE is enabled, both the target table and a temporary table are created in the method
I made the same mistake. It's not FTE-related despite the code making it look so.
during rollback in DefaultJdbcMetadata, we cannot reliably determine which tables should be cleaned up
It's unclear why destination table is created there. I think it might be possible to defer that statement until later.
let's not change that, if we don't have to
For now, this works as a solution, but I believe we could evolve the API to shift the responsibility back to
DefaultJdbcMetadata. For example, we could change the method's return value so thatDefaultJdbcMetadataknows exactly which tables need to be cleaned up
That was also my intention.
I stopped short from execution on it in
when i saw the beginMerge flow. We create high number of tables there, and someone needs to clean them up if we fail mid-way.
|
Following up the open discussions in #27848 |
Description
While
CREATE TABLEis an atomic operation,CREATE TABLE AS SELECTstatement happens in two phasesfor JDBC connectors from Trino:
If the insertion of the data fails, the CTAS may leave residual empty tables behind.
The JDBC connectors, in dealing with
CREATE TABLE AS SELECTscenarios,remove the destination table in case of dealing with exceptions
while inserting the content.
NOTE that introducing the
rollbackConsumerin thebeginCreateTablemethod may come with potential changes in the functionality of the connectors that extend fromtrino-base-jdbcmodule.Additional context and related issues
Here is a relevant stacktrace gathered from CI that showcases the issue of leaving behind empty tables after a CTAS
Relevant code from the method
testRowLevelDelete()inBaseConnectorSmokeTest.java:trino/testing/trino-testing/src/main/java/io/trino/testing/BaseConnectorSmokeTest.java
Line 286 in d8c8057
Relevant issues and PRs
Inspired from the discussions done during #27677
Release notes
() This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text: