Commit 40d18df

sollhui authored and Your Name committed
[enhance](job) optimize auto resume rule to adapt VCG failover (#59421)
### What problem does this PR solve?

#52515 introduced VCG (Virtual Compute Group) for multi-availability-zone disaster recovery, but routine load jobs do not fully adapt to it: if a cluster in one availability zone crashes, VCG provides disaster recovery, yet the job is not automatically resumed. This PR therefore removes the `dead BE count` check from `isNeedAutoSchedule`.

### Release note

None
1 parent d991ff2 commit 40d18df

2 files changed: 0 additions, 16 deletions


fe/fe-common/src/main/java/org/apache/doris/common/Config.java

Lines changed: 0 additions & 6 deletions
@@ -1381,12 +1381,6 @@ public class Config extends ConfigBase {
     @ConfField
     public static boolean check_java_version = true;
 
-    /**
-     * it can't auto-resume routine load job as long as one of the backends is down
-     */
-    @ConfField(mutable = true, masterOnly = true)
-    public static int max_tolerable_backend_down_num = 0;
-
     /**
      * a period for auto resume routine load
      */

fe/fe-core/src/main/java/org/apache/doris/load/routineload/ScheduleRule.java

Lines changed: 0 additions & 10 deletions
@@ -59,16 +59,6 @@ public static boolean isNeedAutoSchedule(RoutineLoadJob jobRoutine) {
                 && jobRoutine.pauseReason.getCode() != InternalErrorCode.MANUAL_PAUSE_ERR
                 && jobRoutine.pauseReason.getCode() != InternalErrorCode.TOO_MANY_FAILURE_ROWS_ERR
                 && jobRoutine.pauseReason.getCode() != InternalErrorCode.CANNOT_RESUME_ERR) {
-            int dead = deadBeCount();
-            if (dead > Config.max_tolerable_backend_down_num) {
-                if (LOG.isDebugEnabled()) {
-                    LOG.debug("dead backend num {} is larger than config {}, "
-                            + "routine load job {} can not be auto rescheduled",
-                            dead, Config.max_tolerable_backend_down_num, jobRoutine.id);
-                }
-                return false;
-            }
-
             if (jobRoutine.latestResumeTimestamp == 0) { //the first resume
                 jobRoutine.latestResumeTimestamp = System.currentTimeMillis();
                 jobRoutine.autoResumeCount = 1;
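
To make the behavioral change in this hunk easier to see, here is a minimal, self-contained sketch of the eligibility gate before and after the removal. It is not the Doris implementation: `AutoResumeSketch`, `PauseCode`, `OTHER_ERR`, `eligibleBefore`, and `eligibleAfter` are hypothetical names, only the three error codes visible in the hunk are modeled, and any conditions outside the hunk (as well as the resume-period and resume-count logic that follows it) are left out.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch; names do not exist in Doris.
final class AutoResumeSketch {
    // Stand-ins for the InternalErrorCode values shown in the diff.
    // OTHER_ERR is an assumed placeholder for any other pause reason.
    enum PauseCode { MANUAL_PAUSE_ERR, TOO_MANY_FAILURE_ROWS_ERR, CANNOT_RESUME_ERR, OTHER_ERR }

    // Pause reasons that the visible condition excludes from auto resume.
    private static final Set<PauseCode> NON_RESUMABLE = EnumSet.of(
            PauseCode.MANUAL_PAUSE_ERR,
            PauseCode.TOO_MANY_FAILURE_ROWS_ERR,
            PauseCode.CANNOT_RESUME_ERR);

    // Before this commit: the pause reason must be resumable AND the number of
    // dead backends must not exceed max_tolerable_backend_down_num.
    static boolean eligibleBefore(PauseCode reason, int deadBackends, int maxTolerableDown) {
        if (reason == null || NON_RESUMABLE.contains(reason)) {
            return false;
        }
        return deadBackends <= maxTolerableDown;
    }

    // After this commit: only the pause reason is considered at this point.
    static boolean eligibleAfter(PauseCode reason) {
        return reason != null && !NON_RESUMABLE.contains(reason);
    }

    public static void main(String[] args) {
        // Three backends down in one availability zone, old default tolerance of 0.
        System.out.println(eligibleBefore(PauseCode.OTHER_ERR, 3, 0)); // false
        System.out.println(eligibleAfter(PauseCode.OTHER_ERR));        // true
    }
}
```

With the old default of `max_tolerable_backend_down_num = 0`, a single dead backend was enough to block auto resume, which is the case the commit message describes for a VCG failover. Under the sketch's assumptions, the gate after the change depends only on the pause reason.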
