Commit d535f2b

[enhancement](recycle bin) optimize the recycle bin to reduce the potential of FE hang (#55753)
### What problem does this PR solve?

When the recycle bin holds a large amount of garbage (about 90,000 partitions), the FE's table lock can be held for a long time by the DynamicPartitionScheduler thread. The stacks look like this:

```
"recycle bin" #28 daemon prio=5 os_prio=0 cpu=73880509.81ms elapsed=96569.50s allocated=9212M defined_classes=9 tid=0x00007f0b545c1800 nid=0x2f4540 runnable [0x00007f0b251fd000]
   java.lang.Thread.State: RUNNABLE
    at org.apache.doris.catalog.CatalogRecycleBin.getSameNamePartitionIdListToErase(CatalogRecycleBin.java:539)
    - locked <0x000000020d6d6130> (a org.apache.doris.catalog.CatalogRecycleBin)
    at org.apache.doris.catalog.CatalogRecycleBin.erasePartitionWithSameName(CatalogRecycleBin.java:556)
    - locked <0x000000020d6d6130> (a org.apache.doris.catalog.CatalogRecycleBin)
    at org.apache.doris.catalog.CatalogRecycleBin.erasePartition(CatalogRecycleBin.java:510)
    - locked <0x000000020d6d6130> (a org.apache.doris.catalog.CatalogRecycleBin)
    at org.apache.doris.catalog.CatalogRecycleBin.runAfterCatalogReady(CatalogRecycleBin.java:1012)
    at org.apache.doris.common.util.MasterDaemon.runOneCycle(MasterDaemon.java:58)
    at org.apache.doris.common.util.Daemon.run(Daemon.java:119)

   Locked ownable synchronizers:
    - None

"DynamicPartitionScheduler" #41 daemon prio=5 os_prio=0 cpu=115405.50ms elapsed=87942.53s allocated=16637M defined_classes=96 tid=0x00007f0b545cc800 nid=0x2f4545 waiting for monitor entry [0x00007f0b247fe000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at org.apache.doris.catalog.CatalogRecycleBin.recyclePartition(CatalogRecycleBin.java:187)
    - waiting to lock <0x000000020d6d6130> (a org.apache.doris.catalog.CatalogRecycleBin)
    at org.apache.doris.catalog.OlapTable.dropPartition(OlapTable.java:1164)
    at org.apache.doris.catalog.OlapTable.dropPartition(OlapTable.java:1207)
    at org.apache.doris.datasource.InternalCatalog.dropPartitionWithoutCheck(InternalCatalog.java:1895)
    at org.apache.doris.datasource.InternalCatalog.dropPartition(InternalCatalog.java:1884)
    at org.apache.doris.catalog.Env.dropPartition(Env.java:3212)
    at org.apache.doris.clone.DynamicPartitionScheduler.executeDynamicPartition(DynamicPartitionScheduler.java:605)
    at org.apache.doris.clone.DynamicPartitionScheduler.runAfterCatalogReady(DynamicPartitionScheduler.java:729)
    at org.apache.doris.common.util.MasterDaemon.runOneCycle(MasterDaemon.java:58)
    at org.apache.doris.clone.DynamicPartitionScheduler.run(DynamicPartitionScheduler.java:688)
```

The DynamicPartitionScheduler thread is blocked waiting for the CatalogRecycleBin monitor, which the recycle bin thread holds, while the DynamicPartitionScheduler thread itself holds the table write lock.

The FE log shows that the recycle bin thread is doing heavy work that takes roughly 5 to 10 minutes per run:

```
fe.log.20250907-2:2025-09-07 04:15:50,740 INFO (recycle bin|28) [CatalogRecycleBin.erasePartition():516] erasePartition eraseNum: 0 cost: 375503ms
fe.log.20250907-2:2025-09-07 04:23:14,109 INFO (recycle bin|28) [CatalogRecycleBin.erasePartition():516] erasePartition eraseNum: 0 cost: 413369ms
fe.log.20250907-2:2025-09-07 04:30:01,187 INFO (recycle bin|28) [CatalogRecycleBin.erasePartition():516] erasePartition eraseNum: 0 cost: 377077ms
fe.log.20250907-2:2025-09-07 04:38:22,769 INFO (recycle bin|28) [CatalogRecycleBin.erasePartition():516] erasePartition eraseNum: 0 cost: 471581ms
fe.log.20250907-2:2025-09-07 04:45:42,552 INFO (recycle bin|28) [CatalogRecycleBin.erasePartition():516] erasePartition eraseNum: 0 cost: 409782ms
fe.log.20250907-2:2025-09-07 04:54:30,825 INFO (recycle bin|28) [CatalogRecycleBin.erasePartition():516] erasePartition eraseNum: 0 cost: 498272ms
fe.log.20250907-2:2025-09-07 05:01:36,311 INFO (recycle bin|28) [CatalogRecycleBin.erasePartition():516] erasePartition eraseNum: 0 cost: 395485ms
```

The most expensive task of the recycle bin thread is erasing partitions with the same name:

```
2025-09-07 04:16:20,884 INFO (recycle bin|28) [CatalogRecycleBin.erasePartitionWithSameName():569] erase partition[62638463] name: p_2019051116000 0_20190511170000 from table[32976073] from db[682022]
2025-09-07 04:16:20,994 INFO (recycle bin|28) [CatalogRecycleBin.erasePartitionWithSameName():569] erase partition[62640651] name: p_2019043016000 0_20190430170000 from table[32976073] from db[682022]
2025-09-07 04:16:21,438 INFO (recycle bin|28) [CatalogRecycleBin.erasePartitionWithSameName():569] erase partition[60264769] name: p_2019051721000 0_20190517220000 from table[32976073] from db[682022]
2025-09-07 04:16:21,787 INFO (recycle bin|28) [CatalogRecycleBin.erasePartitionWithSameName():569] erase partition[62651922] name: p_2019051015000 0_20190510160000 from table[32976073] from db[682022]
2025-09-07 04:16:21,893 INFO (recycle bin|28) [CatalogRecycleBin.erasePartitionWithSameName():569] erase partition[59222503] name: p_2019052708000 0_20190527090000 from table[32976073] from db[682022]
2025-09-07 04:16:22,204 INFO (recycle bin|28) [CatalogRecycleBin.erasePartitionWithSameName():569] erase partition[62656398] name: p_2019051109000 0_20190511100000 from table[32976073] from db[682022]
2025-09-07 04:16:22,430 INFO (recycle bin|28) [CatalogRecycleBin.erasePartitionWithSameName():569] erase partition[59228497] name: p_2019051812000 0_20190518130000 from table[32976073] from db[682022]
2025-09-07 04:16:22,493 INFO (recycle bin|28) [CatalogRecycleBin.erasePartitionWithSameName():569] erase partition[62658335] name: p_2019051217000 0_20190512180000 from table[32976073] from db[682022]
...
```

This may lead to the whole FE hanging, because the table lock is used by many threads.

<img width="1230" height="438" alt="Clipboard_Screenshot_1757283600" src="https://github.com/user-attachments/assets/59ec8707-82f8-4daf-8dae-b9ebea2b2959" />

This commit mainly optimizes the logic for recycling metadata with the same name, adding caches to reduce the time complexity.

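The caching idea can be illustrated with a small sketch. This is not the code from this commit: the class `SameNameIndexSketch`, its methods, and the `dbId/tableId/name` key layout are all hypothetical, and the state that the real `CatalogRecycleBin` carries is omitted. It only contrasts a full scan over every recycled partition with a lookup through a name-keyed index that is maintained on each recycle and erase.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of a same-name index; not the actual CatalogRecycleBin code. */
class SameNameIndexSketch {
    /** Hypothetical stand-in for a recycled partition entry. */
    private static final class Entry {
        final long dbId;
        final long tableId;
        final String name;

        Entry(long dbId, long tableId, String name) {
            this.dbId = dbId;
            this.tableId = tableId;
            this.name = name;
        }
    }

    // Primary store: partition id -> entry.
    private final Map<Long, Entry> idToEntry = new HashMap<>();
    // Cache (assumed layout): "dbId/tableId/name" -> ids of recycled partitions with that name.
    private final Map<String, Set<Long>> sameNameIndex = new HashMap<>();

    private static String key(long dbId, long tableId, String name) {
        return dbId + "/" + tableId + "/" + name;
    }

    /** Adds a partition to the bin and keeps the index in sync. */
    public synchronized void recycle(long dbId, long tableId, long partitionId, String name) {
        erase(partitionId); // drop any stale entry for this id first
        idToEntry.put(partitionId, new Entry(dbId, tableId, name));
        sameNameIndex.computeIfAbsent(key(dbId, tableId, name), k -> new HashSet<>())
                .add(partitionId);
    }

    /** Removes a partition from the bin and from the index. */
    public synchronized void erase(long partitionId) {
        Entry e = idToEntry.remove(partitionId);
        if (e == null) {
            return;
        }
        String k = key(e.dbId, e.tableId, e.name);
        Set<Long> ids = sameNameIndex.get(k);
        if (ids != null) {
            ids.remove(partitionId);
            if (ids.isEmpty()) {
                sameNameIndex.remove(k);
            }
        }
    }

    /** Full scan: visits every recycled partition while holding the monitor. */
    public synchronized List<Long> sameNameIdsByScan(long dbId, long tableId, String name) {
        List<Long> result = new ArrayList<>();
        for (Map.Entry<Long, Entry> me : idToEntry.entrySet()) {
            Entry e = me.getValue();
            if (e.dbId == dbId && e.tableId == tableId && e.name.equals(name)) {
                result.add(me.getKey());
            }
        }
        Collections.sort(result);
        return result;
    }

    /** Indexed lookup: touches only the partitions that share the name. */
    public synchronized List<Long> sameNameIdsByIndex(long dbId, long tableId, String name) {
        Set<Long> ids = sameNameIndex.get(key(dbId, tableId, name));
        List<Long> result = (ids == null) ? new ArrayList<>() : new ArrayList<>(ids);
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) {
        SameNameIndexSketch bin = new SameNameIndexSketch();
        bin.recycle(1L, 10L, 100L, "p_a");
        bin.recycle(1L, 10L, 101L, "p_a");
        bin.recycle(1L, 10L, 102L, "p_b");
        System.out.println(bin.sameNameIdsByScan(1L, 10L, "p_a"));
        System.out.println(bin.sameNameIdsByIndex(1L, 10L, "p_a"));
    }
}
```

With an index of this kind, a same-name lookup costs time proportional to the number of matching partitions rather than to the size of the whole bin, so the monitor is held for less time per call. The trade-off is that every recycle and erase path has to update the index as well as the primary map.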
### Release note

None

### Check List (For Author)

- Test <!-- At least one of them must be included. -->
    - [ ] Regression test
    - [ ] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason <!-- Add your reason? -->

- Behavior changed:
    - [x] No.
    - [ ] Yes. <!-- Explain the behavior change -->

- Does this need documentation?
    - [ ] No.
    - [ ] Yes. <!-- Add document PR link here. eg: apache/doris-website#1214 -->

### Check List (For Reviewer who merge this PR)

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR should merge into -->

1 parent 1b81208 commit d535f2b

2 files changed: +1041 -139 lines changed