@yjhjstz yjhjstz commented Aug 7, 2025

In do_analyze_rel, the function ComputeExtStatisticsRows calculates the minimum number of sample rows needed for extended statistics (e.g., dependencies, ndistinct).

This calculation is only meaningful and required on the Query Dispatcher (QD), since only the QD is responsible for coordinating the final extended statistics generation.

Previously, every node ran this logic, including the Query Executors (QEs) on the segments, which resulted in excessive sampling. For large tables, the QD then received more sample rows than it could handle, failing with the error:

ERROR: too many sample rows received from gp_acquire_sample_rows

Fixes #1293
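
The role check described above can be sketched as follows. This is a minimal, self-contained illustration and not the actual patch: `NodeRole`, `ext_stats_min_rows`, and `choose_target_rows` are hypothetical stand-ins for the project's own role flag, for `ComputeExtStatisticsRows`, and for the relevant part of `do_analyze_rel`. The factor of 300 rows per unit of statistics target mirrors what upstream PostgreSQL uses and is assumed to carry over here.

```c
/* Hypothetical stand-in for the process role (dispatcher vs. executor). */
typedef enum
{
    ROLE_DISPATCH,              /* Query Dispatcher (QD) */
    ROLE_EXECUTE                /* Query Executor (QE) on a segment */
} NodeRole;

/*
 * Hypothetical stand-in for ComputeExtStatisticsRows: the minimum number
 * of sample rows wanted by extended statistics, assumed here to be
 * 300 rows per unit of statistics target.
 */
static int
ext_stats_min_rows(int ext_stats_target)
{
    return 300 * ext_stats_target;
}

/*
 * Pick the sample size for ANALYZE. Only the dispatcher raises the target
 * to satisfy extended statistics; an executor keeps the target it was given,
 * so it never returns more rows than the dispatcher asked for.
 */
static int
choose_target_rows(NodeRole role, int targrows, int ext_stats_target)
{
    if (role == ROLE_DISPATCH)
    {
        int         minrows = ext_stats_min_rows(ext_stats_target);

        if (targrows < minrows)
            targrows = minrows;
    }
    return targrows;
}
```

With a requested target of 30000 rows and an extended statistics target of 1000, the dispatcher in this sketch raises its sample size to 300000, while an executor leaves it at 30000.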


@yjhjstz yjhjstz marked this pull request as ready for review August 7, 2025 00:56
@yjhjstz yjhjstz requested a review from jiaqizho August 7, 2025 00:56
@yjhjstz yjhjstz requested a review from jiaqizho August 8, 2025 13:35
@yjhjstz yjhjstz requested a review from gfphoenix78 August 12, 2025 19:19
…sticsRows to QD

In `do_analyze_rel`, the function `ComputeExtStatisticsRows` calculates the minimum
number of sample rows needed for extended statistics (e.g., dependencies, ndistinct).

This calculation is only meaningful and required on the Query Dispatcher (QD), since
only the QD is responsible for coordinating the final extended statistics generation.

Previously, all segments (including QEs) executed this logic, resulting in excessive
sampling. For large tables, this caused the QD to receive more rows than it can handle,
leading to the error:

    ERROR: too many sample rows received from gp_acquire_sample_rows
@yjhjstz yjhjstz force-pushed the yjh/fix_sample_rows branch from 56b520b to 966bc3c Compare August 13, 2025 14:45
@yjhjstz yjhjstz merged commit 3ead998 into apache:main Aug 13, 2025
27 checks passed
@yjhjstz yjhjstz deleted the yjh/fix_sample_rows branch November 17, 2025 21:10
Development

Successfully merging this pull request may close these issues.

[Bug] ANALYZE fails with extended statistics
