Replies: 4 comments
-
These logs suggest that although the gpinitsystem process completed, it failed to properly register the mirrors for the specified content IDs.
-
I have some interesting information to share related to what @wutaco is observing. I will share it later today as I gather additional context.
-
gpinitsystem Log Message Issues

This document outlines three issues discovered in

Base Issue: Confusing Mirror Registration Success Messages (Original Reported Issue)

Description: During mirror registration, contradictory log messages are generated that combine "Successfully completed" with "failed to register mirror".

Location:

Log Evidence:

Status: Confirmed. This behavior has been reproduced and observed during testing.

Impact: Creates confusion when reviewing logs, because the messages appear to indicate both success and failure simultaneously. Despite these messages, mirrors register successfully and function correctly.

Issue 1: Misleading WARN Messages for Default Configurations

Description: Normal default configuration usage is logged at WARN instead of INFO level, triggering end-of-installation warnings.

Location:

Code:

if [ x"$ETCD_HOST_CONFIG" = x"" ] || [ x"$CLUSTER_BOOT_MODE" = x"DEMO" ]; then
    LOG_MSG "[WARN]:-No ETCD cluster host config provided, use default configuration."
else
    # ... validation logic ...
fi
if [ x"$FTS_HOST_CONFIG" = x"" ] || [ x"$CLUSTER_BOOT_MODE" = x"DEMO" ]; then
    LOG_MSG "[WARN]:-No FTS cluster host config provided, use default configuration."
else
    # ... validation logic ...
fi

Log Evidence:

Impact:

Fix: Change

Issue 2: Insufficient Timeout for Mirror Synchronization

Description: The

Location:

Code:

FORCE_FTS_PROBE () {
    # loop on gp_segment_configuration to make sure primary/mirror pairs are up and in sync
    for i in {1..60}; do
        if [ $i == 60 ]; then
            # Log segment status and generate warnings
            LOG_MSG "[WARN]:" 1
            LOG_MSG "[WARN]:-Failed to start Cloudberry instance; please review gpinitsystem log to determine failure." 1
            break;
        fi
        RESULT=$( $PSQL -p $GP_PORT -d "$DEFAULTDB" -X -A -t -c "select count(*) > 0 from gp_segment_configuration where (mode = 'n' or status = 'd') and content != -1;" )
        if [ x"$RESULT" == x"f" ]; then
            break
        fi
    done
}

Timeline Evidence:

Testing: Adding a 5-second sleep in each iteration shows mirrors successfully sync after ~12 cycles (60 seconds), confirming that mirrors need more time to transition from "down" to "up" state than the current 60-iteration loop provides.

Impact:

Root Cause: The 60-iteration loop completes in under 1 second, which is not enough time for the normal mirror synchronization process.

Overall Impact

All three issues contribute to a poor user experience where

End-User Experience: At the completion of

This creates unnecessary alarm and confusion, leading users to believe their installation may have failed when it has actually succeeded. This is particularly problematic for:

Recommendation

These logging issues should be addressed to provide accurate feedback about system status and reduce false alarms during normal operations.
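For Issue 2, one possible direction is to sleep between probes, the same idea as the 5-second test described above. The following is only a minimal sketch, not the actual gpinitsystem fix: wait_for_sync and segments_in_sync are hypothetical names, and the attempt count and interval are illustrative. The check reuses the query quoted above and assumes the PSQL, GP_PORT and DEFAULTDB variables that gpinitsystem already sets.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of gpinitsystem): run a check command
# repeatedly, sleeping between attempts, until it succeeds or the
# attempts are used up.
wait_for_sync () {
    local max_attempts=$1   # how many times to run the check
    local interval=$2       # seconds to sleep between attempts
    shift 2                 # remaining arguments form the check command
    local attempt
    for (( attempt = 1; attempt <= max_attempts; attempt++ )); do
        if "$@"; then
            return 0        # check passed
        fi
        # Without a pause, all attempts are spent almost instantly.
        if (( attempt < max_attempts )); then
            sleep "$interval"
        fi
    done
    return 1                # check never passed
}

# Hypothetical check built on the query quoted in this thread.
# Assumes PSQL, GP_PORT and DEFAULTDB are set as in gpinitsystem.
segments_in_sync () {
    local result
    result=$( $PSQL -p "$GP_PORT" -d "$DEFAULTDB" -X -A -t -c "select count(*) > 0 from gp_segment_configuration where (mode = 'n' or status = 'd') and content != -1;" )
    [ x"$result" = x"f" ]
}
```

With these definitions, a call such as wait_for_sync 60 5 segments_in_sync would probe for up to about five minutes and return non-zero only if some primary/mirror pair is still down or not in sync, at which point the existing WARN messages could be logged.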
-
One more note: this same WARN behavior is also observed in the CI runs.
-
gpinitsystem reports a mirror creation error, but after running gprecoverseg the cluster is back to normal.
The gpinitsystem log shows:
20250923:11:53:43:361565 gpinitsystem:cdw:gpadmin-[INFO]:-Start Function ERROR_CHK
20250923:11:53:43:361565 gpinitsystem:cdw:gpadmin-[INFO]:-End Function ERROR_CHK
20250923:11:53:43:361565 gpinitsystem:cdw:gpadmin-[INFO]:-Start Function ERROR_CHK
20250923:11:53:43:361565 gpinitsystem:cdw:gpadmin-[INFO]:-End Function ERROR_CHK
20250923:11:53:43:361565 gpinitsystem:cdw:gpadmin-[INFO]:-Start Function ERROR_CHK
20250923:11:53:43:361565 gpinitsystem:cdw:gpadmin-[INFO]:-Successfully completed failed to register mirror for contentid=0
20250923:11:53:43:361565 gpinitsystem:cdw:gpadmin-[INFO]:-End Function ERROR_CHK
20250923:11:53:44:361565 gpinitsystem:cdw:gpadmin-[INFO]:-Start Function ERROR_CHK
20250923:11:53:44:361565 gpinitsystem:cdw:gpadmin-[INFO]:-Successfully completed failed to register mirror for contentid=1
20250923:11:53:44:361565 gpinitsystem:cdw:gpadmin-[INFO]:-End Function ERROR_CHK
20250923:11:53:44:361565 gpinitsystem:cdw:gpadmin-[INFO]:-Start Function ERROR_CHK
20250923:11:53:44:361565 gpinitsystem:cdw:gpadmin-[INFO]:-Successfully completed failed to register mirror for contentid=2
20250923:11:53:44:361565 gpinitsystem:cdw:gpadmin-[INFO]:-End Function ERROR_CHK
20250923:11:53:44:361565 gpinitsystem:cdw:gpadmin-[INFO]:-Start Function ERROR_CHK
20250923:11:53:44:361565 gpinitsystem:cdw:gpadmin-[INFO]:-Successfully completed failed to register mirror for contentid=3
20250923:11:53:44:361565 gpinitsystem:cdw:gpadmin-[INFO]:-End Function ERROR_CHK
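To see which content IDs are affected in a longer log, the contradictory message can be filtered out with sed. This is a minimal sketch that assumes the message format shown in the log above; contradictory_content_ids is a hypothetical name, not an existing tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a gpinitsystem log on stdin and print the
# content ID from every "Successfully completed failed to register
# mirror" line, one ID per line.
contradictory_content_ids () {
    sed -n 's/.*Successfully completed failed to register mirror for contentid=\([0-9][0-9]*\).*/\1/p'
}
```

It can be fed a log file on stdin, for example contradictory_content_ids < /path/to/gpinitsystem.log, where the path is a placeholder for wherever your installation writes the log.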