Skip to content

Conversation

@quaff
Copy link
Contributor

@quaff quaff commented Mar 27, 2025

  1. Use SQL order by clause instead of Java Comparator
  2. Limit result set size to 1

@quaff
Copy link
Contributor Author

quaff commented Mar 27, 2025

Alternatively, we could use SE.STEP_EXECUTION_ID IN (SELECT MAX(STEP_EXECUTION_ID) FROM ...) to align with other queries, but there is a risk to break existing behavior since CREATE_TIME is not considered for comparing.

https://github.com/spring-projects/spring-batch/compare/main...quaff:spring-batch:patch-16?expand=1

FROM %PREFIX%JOB_EXECUTION JE
JOIN %PREFIX%STEP_EXECUTION SE ON SE.JOB_EXECUTION_ID = JE.JOB_EXECUTION_ID
WHERE JE.JOB_INSTANCE_ID = ? AND SE.STEP_NAME = ?
ORDER BY SE.CREATE_TIME DESC, SE.JOB_EXECUTION_ID DESC
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is ordering by job execution id while the java comparator was based on the step execution id: .thenComparing(StepExecution::getId, Comparator.reverseOrder());. Is that intended?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad, fixed now.

@fmbenhassine fmbenhassine added the status: waiting-for-reporter Issues for which we are waiting for feedback from the reporter label Jun 3, 2025
@fmbenhassine fmbenhassine added status: feedback-provided Issues for which the feedback requested from the reporter was provided and removed status: waiting-for-reporter Issues for which we are waiting for feedback from the reporter labels Jun 4, 2025
@hpoettker
Copy link
Contributor

hpoettker commented Jun 4, 2025

The suggested change is basically a revert of the fix for #4657.

I'm not sure whether the performance gain is worth it, compared to the risk of re-introducing the previous issue.

@quaff
Copy link
Contributor Author

quaff commented Jun 5, 2025

The suggested change is basically a revert of the fix for #4657.

I'm not sure whether the performance gain is worth it, compared to the risk of re-introducing the previous issue.

This commit will not query all matched rows but only first row.

statement.setMaxRows(1); // limit max rows to 1

if (rs.next()) {
// only first row is used.
}

@quaff
Copy link
Contributor Author

quaff commented Jun 5, 2025

Our DBA has not been able to find a solution to improve the performance with an additional index.
What we did find out is that, by removing the ORDER BY, we get the result (typically just a single row in the normal case) in a matter of milliseconds. But with the ORDER BY, it takes ~ 60 seconds.

It's weird, but I'm OK if this PR is rejected.

@fmbenhassine fmbenhassine added this to the 6.0.0-RC1 milestone Sep 11, 2025
@fmbenhassine
Copy link
Contributor

I am not a big fan of sorting things on the application side if we can do it on the database side, so this PR has its added value if we do not include any regression or performance degradation as reported in #4657. So basically we moved this from Java to the DB in #891 with 62a8f44, then from the DB to Java in #4657 with 5a62de9, and now back again to the DB with this PR.

I could not find the reason we merged #4657, but I guess it was due to this #4657 (comment). I suggested to use LIMIT 1 or equivalent, but in fact the syntax could differ between DB providers which is more complex. Probably since we were targeting that fix for a patch release from what I see 5.1.3, we accepted to move the logic back from the DB to Java. But it's probably time to introduce a more sophisticated / proper approach to do it on the DB side.

This commit will not query all matched rows but only first row.

Is using statement.setMaxRows(1); // limit max rows to 1 somehow equivalent to something like LIMIT 1? I mean if we can avoid having to provide a different limiting query depending on the DB (TOP 1 or FETCH FIRST 1 ROW ONLY or WHERE ROWNUM <= 1), that would be better.

Thoughts?

I'm not sure whether the performance gain is worth it, compared to the risk of re-introducing the previous issue.

@hpoettker I agree. I don't know @quaff if you had a chance to benchmark this change? It would be great if we have a baseline to measure the performance improvement/degradation and decide if this change is worth it.

@fmbenhassine fmbenhassine added status: waiting-for-reporter Issues for which we are waiting for feedback from the reporter and removed status: feedback-provided Issues for which the feedback requested from the reporter was provided labels Oct 21, 2025
@fmbenhassine fmbenhassine removed this from the 6.0.0-RC1 milestone Oct 21, 2025
1. Use SQL order by clause instead of Java Comparator
2. Limit result set size to 1

Signed-off-by: Yanming Zhou <[email protected]>
@quaff
Copy link
Contributor Author

quaff commented Oct 22, 2025

Is using statement.setMaxRows(1); // limit max rows to 1 somehow equivalent to something like LIMIT 1?

I think most JDBC drivers honer the max rows.

I don't know @quaff if you had a chance to benchmark this change? It would be great if we have a baseline to measure the performance improvement/degradation and decide if this change is worth it.

I didn't, @jpraet could you verify this patch works for you since you reported #4657?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

status: waiting-for-reporter Issues for which we are waiting for feedback from the reporter

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants