Commit 3f58c32
Wen Lin
Use ExclusiveLock for accessing the table pg_resgroupcapability when running CREATE/ALTER resource group.
In some scenarios, holding AccessExclusiveLock on the table pg_resgroupcapability can leave database startup or recovery pending. Below is why we need to change AccessExclusiveLock to ExclusiveLock.
The lock on pg_resgroupcapability serializes concurrent updates to this table when a "Create/Alter resource group" statement runs. There is a CPU limit: after modifying one resource group, the statement has to check that the total CPU usage of all resource groups does not exceed 100%.
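The CPU-limit check described above can be sketched roughly as follows. This is only an illustration, not the Greenplum implementation: the function name `check_cpu_limit`, the group names, and the dictionary of per-group percentages are all hypothetical.

```python
def check_cpu_limit(cpu_limits, group, new_limit, max_total=100):
    """Return the updated limits if the total stays within max_total.

    cpu_limits is a hypothetical mapping of resource group name -> CPU percent.
    Raises ValueError when the change would push the total above max_total.
    """
    updated = dict(cpu_limits)
    updated[group] = new_limit
    total = sum(updated.values())
    if total > max_total:
        raise ValueError(f"total CPU limit {total}% exceeds {max_total}%")
    return updated


# Hypothetical groups: raising rg1 from 40 to 60 brings the total to 100.
limits = {"default_group": 30, "admin_group": 10, "rg1": 40}
limits = check_cpu_limit(limits, "rg1", 60)
```

The check reads every row before one row is written, so two concurrent statements that each validate against the old totals could together exceed 100%. That is why the table is locked in a self-conflicting mode for the duration of the statement.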
Before this fix, AccessExclusiveLock was used. Suppose a user runs an "Alter resource group" statement. QD dispatches this statement to all QEs, so it is a two-phase commit (2PC) transaction. When QD dispatches the statement, each QE acquires AccessExclusiveLock on pg_resgroupcapability. A QE cannot release this lock until the 2PC distributed transaction is committed.
In the second phase, QD calls the function doNotifyingCommitPrepared to broadcast the "commit prepared" command to all QEs. Each QE has already finished the prepare phase, so the transaction is a prepared transaction. Suppose that at this point a primary segment goes down and its mirror is promoted to primary.
The mirror gets the "promoted" message from the coordinator and recovers based on the xlog from the primary. To recover the prepared transaction, it reads the prepared transaction log entry and acquires AccessExclusiveLock on pg_resgroupcapability. The callstack is:
#0 lock_twophase_recover (xid=, info=, recdata=, len=) at lock.c:4697
#1 ProcessRecords (callbacks=, xid=2933, bufptr=0x1d575a8 "") at twophase.c:1757
#2 RecoverPreparedTransactions () at twophase.c:2214
#3 StartupXLOG () at xlog.c:8013
#4 StartupProcessMain () at startup.c:231
#5 AuxiliaryProcessMain (argc=argc@entry=2, argv=argv@entry=0x7fff84b94a70) at bootstrap.c:459
#6 StartChildProcess (type=StartupProcess) at postmaster.c:5917
#7 PostmasterMain (argc=argc@entry=7, argv=argv@entry=0x1d555b0) at postmaster.c:1581
#8 main (argc=7, argv=0x1d555b0) at main.c:240
After that, the database instance starts up and all related initialization functions are called. One of them, InitResGroups, acquires AccessShareLock on pg_resgroupcapability and does some initialization work. The callstack is:
#6 WaitOnLock (locallock=locallock@entry=0x1c7f248, owner=owner@entry=0x1ca0a40) at lock.c:1999
#7 LockAcquireExtended (locktag=locktag@entry=0x7ffd15d18d90, lockmode=lockmode@entry=1, sessionLock=sessionLock@entry=false, dontWait=dontWait@entry=false, reportMemoryError=reportMemoryError@entry=true, locallockp=locallockp@entry=0x7ffd15d18d88) at lock.c:1192
#8 LockRelationOid (relid=6439, lockmode=1) at lmgr.c:126
#9 relation_open (relationId=relationId@entry=6439, lockmode=lockmode@entry=1) at relation.c:56
#10 table_open (relationId=relationId@entry=6439, lockmode=lockmode@entry=1) at table.c:47
#11 InitResGroups () at resgroup.c:581
#12 InitResManager () at resource_manager.c:83
#13 initPostgres (in_dbname=, dboid=dboid@entry=0, username=username@entry=0x1c5b730 "linw", useroid=useroid@entry=0, out_dbname=out_dbname@entry=0x0, override_allow_connections=override_allow_connections@entry=false) at postinit.c:1284
#14 PostgresMain (argc=1, argv=argv@entry=0x1c8af78, dbname=0x1c89e70 "postgres", username=0x1c5b730 "linw") at postgres.c:4812
#15 BackendRun (port=, port=) at postmaster.c:4922
#16 BackendStartup (port=0x1c835d0) at postmaster.c:4607
#17 ServerLoop () at postmaster.c:1963
#18 PostmasterMain (argc=argc@entry=7, argv=argv@entry=0x1c595b0) at postmaster.c:1589
#19 in main (argc=7, argv=0x1c595b0) at main.c:240
The AccessExclusiveLock has not been released, and it is not compatible with any other lock mode, so the startup process stays pending on this lock. As a result, the mirror cannot become primary.
Even if users run "gprecoverseg" to recover the primary segment, the result is similar. The primary segment recovers from xlog, recovers the prepared transaction, and acquires AccessExclusiveLock on pg_resgroupcapability. Then the startup process is pending on this lock. Only if users change the resource type to "queue" is InitResGroups not called; nothing blocks in that case, and the primary segment can start up normally.
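To make the conflict concrete, here is a small sketch of the table-level lock conflict rules as documented for PostgreSQL. The mode numbers follow PostgreSQL's numbering, which matches `lockmode=1` for AccessShareLock in the callstack above. The Python constant names and the `conflicts` helper are illustrative only, and Greenplum is assumed to use the same conflict rules for these modes.

```python
# Table-level lock modes, numbered as in PostgreSQL.
ACCESS_SHARE = 1
ROW_SHARE = 2
ROW_EXCLUSIVE = 3
SHARE_UPDATE_EXCLUSIVE = 4
SHARE = 5
SHARE_ROW_EXCLUSIVE = 6
EXCLUSIVE = 7
ACCESS_EXCLUSIVE = 8

# For each held mode, the set of requested modes it conflicts with.
CONFLICTS = {
    ACCESS_SHARE: {ACCESS_EXCLUSIVE},
    ROW_SHARE: {EXCLUSIVE, ACCESS_EXCLUSIVE},
    ROW_EXCLUSIVE: {SHARE, SHARE_ROW_EXCLUSIVE, EXCLUSIVE, ACCESS_EXCLUSIVE},
    SHARE_UPDATE_EXCLUSIVE: {SHARE_UPDATE_EXCLUSIVE, SHARE, SHARE_ROW_EXCLUSIVE,
                             EXCLUSIVE, ACCESS_EXCLUSIVE},
    SHARE: {ROW_EXCLUSIVE, SHARE_UPDATE_EXCLUSIVE, SHARE_ROW_EXCLUSIVE,
            EXCLUSIVE, ACCESS_EXCLUSIVE},
    SHARE_ROW_EXCLUSIVE: {ROW_EXCLUSIVE, SHARE_UPDATE_EXCLUSIVE, SHARE,
                          SHARE_ROW_EXCLUSIVE, EXCLUSIVE, ACCESS_EXCLUSIVE},
    EXCLUSIVE: {ROW_SHARE, ROW_EXCLUSIVE, SHARE_UPDATE_EXCLUSIVE, SHARE,
                SHARE_ROW_EXCLUSIVE, EXCLUSIVE, ACCESS_EXCLUSIVE},
    ACCESS_EXCLUSIVE: set(range(ACCESS_SHARE, ACCESS_EXCLUSIVE + 1)),
}


def conflicts(held, requested):
    """True if a lock held in mode `held` blocks a request for `requested`."""
    return requested in CONFLICTS[held]


print(conflicts(ACCESS_EXCLUSIVE, ACCESS_SHARE))  # True: the reader is blocked
print(conflicts(EXCLUSIVE, ACCESS_SHARE))         # False: the reader proceeds
print(conflicts(EXCLUSIVE, EXCLUSIVE))            # True: writers still serialize
```

ExclusiveLock still conflicts with itself, so two concurrent CREATE/ALTER statements continue to serialize on the table, while plain readers that take AccessShareLock are no longer blocked.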
After this fix, ExclusiveLock is acquired when altering a resource group. In the case above, the startup process acquires AccessShareLock; ExclusiveLock and AccessShareLock are compatible, so the startup process can run successfully. After startup, the QE gets the RECOVERY_COMMIT_PREPARED command from QD, finishes the second phase of the distributed transaction, and releases ExclusiveLock on pg_resgroupcapability. The callstack is:
#0 lock_twophase_postcommit (xid=, info=, recdata=0x3303458, len=) at lock.c:4758
#1 ProcessRecords (callbacks=, xid=, bufptr=0x3303458 "") at twophase.c:1757
#2 FinishPreparedTransaction (gid=gid@entry=0x323caf5 "25", isCommit=isCommit@entry=true, raiseErrorIfNotFound=raiseErrorIfNotFound@entry=false) at twophase.c:1704
#3 in performDtxProtocolCommitPrepared (gid=gid@entry=0x323caf5 "25", raiseErrorIfNotFound=raiseErrorIfNotFound@entry=false) at cdbtm.c:2107
#4 performDtxProtocolCommand (dtxProtocolCommand=dtxProtocolCommand@entry=DTX_PROTOCOL_COMMAND_RECOVERY_COMMIT_PREPARED, gid=gid@entry=0x323caf5 "25", contextInfo=contextInfo@entry=0x10e1820 ) at cdbtm.c:2279
#5 exec_mpp_dtx_protocol_command (contextInfo=0x10e1820 , gid=0x323caf5 "25", loggingStr=0x323cad8 "Recovery Commit Prepared", dtxProtocolCommand=DTX_PROTOCOL_COMMAND_RECOVERY_COMMIT_PREPARED) at postgres.c:1570
#6 PostgresMain (argc=, argv=argv@entry=0x3268f98, dbname=0x3267e90 "postgres", username=) at postgres.c:5482
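The sequence described in this message (recovery re-acquires the prepared transaction's lock, InitResGroups then requests AccessShareLock, and commit prepared finally releases the lock) can be modelled with a toy lock table. This is a simplified sketch, not Greenplum code: `ToyLockTable`, `init_resgroups_can_proceed`, and the holder names are hypothetical, and only the three lock modes discussed here are modelled.

```python
# Unordered pairs of modes that conflict, limited to the modes in this commit.
CONFLICTING_PAIRS = {
    frozenset(["AccessShareLock", "AccessExclusiveLock"]),
    frozenset(["ExclusiveLock", "AccessExclusiveLock"]),
    frozenset(["ExclusiveLock"]),        # ExclusiveLock conflicts with itself
    frozenset(["AccessExclusiveLock"]),  # so does AccessExclusiveLock
}


class ToyLockTable:
    """Toy model of the locks held on a single relation."""

    def __init__(self):
        self.held = {}  # holder name -> lock mode

    def try_acquire(self, holder, mode):
        """Grant the lock unless it conflicts with another holder's lock."""
        for other, other_mode in self.held.items():
            if other != holder and frozenset([mode, other_mode]) in CONFLICTING_PAIRS:
                return False
        self.held[holder] = mode
        return True

    def release(self, holder):
        self.held.pop(holder, None)


def init_resgroups_can_proceed(ddl_lock_mode):
    locks = ToyLockTable()
    # Recovery re-acquires the lock recorded by the prepared transaction
    # (lock_twophase_recover in the first callstack).
    locks.try_acquire("prepared transaction", ddl_lock_mode)
    # InitResGroups then requests AccessShareLock on the same table.
    return locks.try_acquire("InitResGroups", "AccessShareLock")


print(init_resgroups_can_proceed("AccessExclusiveLock"))  # False: startup hangs
print(init_resgroups_can_proceed("ExclusiveLock"))        # True: startup continues
```

In this model the lock is only released by an explicit `release` call, which corresponds to lock_twophase_postcommit running when the commit prepared command arrives.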
The test case added in this commit reproduces this bug.
1 parent 2fa7c06
File tree
4 files changed, +187 -5 lines changed
- src
  - backend/commands
  - test/isolation2
    - expected/resgroup
    - sql/resgroup
Lines changed: 120 additions & 0 deletions
Lines changed: 54 additions & 0 deletions