branch-4.0: [fix](test) fix unstable test cases #63548

Open
morningman wants to merge 10 commits into apache:branch-4.0 from morningman:fix-cases-20260522

Conversation

@morningman
Contributor

@morningman morningman commented May 22, 2026

mymeiyi and others added 7 commits May 22, 2026 08:38
Temporary tables will be deleted periodically; this case does not
require it
Partial pick of upstream commit 92f75f3 [feature](memory) Global mem
control on scan nodes apache#61271. Only the test groovy and its .out baseline
are picked here; the rest of that PR is a large BE memory-control feature
we do not want on branch-4.0.

Adds v1..v5 to the ORDER BY in inspectRows so that partial-update
multi-version rows exposed by skip_delete_sign have a deterministic
ordering across runs.
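
A minimal sketch of why the extra sort columns matter, written in Python rather than the suite's Groovy. The row layout `(k, v1, ..., v5)` and the helper names are hypothetical; only the idea of appending `v1..v5` to the ordering comes from the commit message above.

```python
# Hypothetical rows: (k, v1, v2, v3, v4, v5). Several rows share the same key k,
# which is the situation where several versions of one row become visible.
ROWS = [
    (1, 10, 0, 0, 0, 0),
    (1, 11, 0, 0, 0, 0),
    (1, 10, 5, 0, 0, 0),
    (2, 7, 1, 1, 1, 1),
]


def order_by_key_only(rows):
    # Like "ORDER BY k": rows with equal k keep whatever order they arrived in,
    # so the result depends on the scan order.
    return sorted(rows, key=lambda r: r[0])


def order_by_all_columns(rows):
    # Like "ORDER BY k, v1, v2, v3, v4, v5": ties on k are broken by the value
    # columns, so the result no longer depends on the arrival order.
    return sorted(rows)


if __name__ == "__main__":
    arrival_a = list(ROWS)
    arrival_b = list(reversed(ROWS))
    print(order_by_key_only(arrival_a) == order_by_key_only(arrival_b))
    print(order_by_all_columns(arrival_a) == order_by_all_columns(arrival_b))
```

With only the key in the sort, two runs that scan the rows in a different order can produce different outputs; with all columns in the sort, both runs agree.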
…_with_s3data (apache#62488)

The regression test
`datatype_p0/nested_types/base_cases/one_level_nestedtypes_with_s3data`
was failing with a `CHAR result mismatch` on the `order_qt_sql_s3` tag.

Root-cause analysis:

1. The S3 source files (`one_level_array.parquet`, `.orc`, `.csv`) in
`oss://doris-regression-hk/regression/datalake/` have **not changed
since 2024-07-27** (confirmed via `ossutil stat`). The parquet `c_bool`
column is `list<bool>` (verified with pyarrow). The column type
(`array<boolean>`) in the test plugin has also never changed.
2. On **2025-09-12**, commit `074d88b` (PR apache#55896, "fix cases from s3")
added `WHERE k1 IS NOT NULL` to the query and regenerated the `.out`. At
that time, Doris had a bug in reading parquet `list<bool>` values inside
nested arrays, producing incorrect boolean values (e.g. `[0, 0, 1, 0,
...]` instead of the correct `[0, 0, 0, 1, 1, ...]`).
3. On **2025-12-16**, commit `0031179b1e6` (PR apache#58785, "fix parquet topn
lazy mat complex data error result") refactored `ColumnChunkReader` to
use `IN_COLLECTION`/`OFFSET_INDEX` template parameters, giving
nested-array columns a distinct and correct read path. This fix
incidentally corrected the boolean array reading for columns like
`c_bool`.
4. After PR apache#58785 landed, Doris now reads parquet `list<bool>`
correctly (matching pyarrow), but the `.out` file was never updated,
causing the test to fail.

Fix: force-regenerate `one_level_nestedtypes_with_s3data.out` using the
current correct Doris behavior against the unchanged S3 data.
…_v3 (apache#62659)

## Summary
- Add `DROP TABLE IF EXISTS` for tables `t1` and `t2` before `CREATE
TABLE` in `test_inverted_index_v3`
- Without this, the test fails with "Table 't1' already exists" on
subsequent runs within the same pipeline

## Verification
- First run: PASSED
- Second run without fix: FAILED ("Table 't1' already exists")  
- Second run with fix: PASSED
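
The three verification outcomes can be reproduced in miniature. This is a sketch using Python's built-in `sqlite3` as a stand-in for Doris, so the column definition and the exact error text are assumptions; only the table names `t1` and `t2` and the `DROP TABLE IF EXISTS` pattern come from the summary above.

```python
import sqlite3


def setup_tables(conn, drop_first):
    # Mirrors the test setup: optionally drop, then create t1 and t2.
    for name in ("t1", "t2"):
        if drop_first:
            conn.execute(f"DROP TABLE IF EXISTS {name}")
        conn.execute(f"CREATE TABLE {name} (id INTEGER)")


def run_twice(drop_first):
    # Runs the setup twice on one connection, like two test runs that share
    # the same database within a pipeline.
    conn = sqlite3.connect(":memory:")
    try:
        setup_tables(conn, drop_first)  # first run always succeeds
        try:
            setup_tables(conn, drop_first)  # second run
        except sqlite3.OperationalError as exc:
            return f"FAILED ({exc})"
        return "PASSED"
    finally:
        conn.close()


if __name__ == "__main__":
    print("second run without fix:", run_twice(drop_first=False))
    print("second run with fix:", run_twice(drop_first=True))
```

Dropping before creating makes the setup idempotent, so the test no longer depends on a previous run having cleaned up after itself.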

## Test plan
- [x] Verified on live CI environment (build 188671)
- [x] Reproduced failure by running the test twice

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@morningman morningman requested a review from yiguolei as a code owner May 22, 2026 15:45
@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

yiguolei
yiguolei previously approved these changes May 22, 2026
@github-actions
Contributor

Possible file(s) that should be tracked in LFS detected: 🚨

The following file(s) exceeds the file size limit: 1048576 bytes, as set in the .yml configuration files:

  • regression-test/data/datatype_p0/nested_types/base_cases/one_level_nestedtypes_with_s3data.out

Consider using git-lfs to manage large files.

@github-actions github-actions Bot added the `lfs-detected!` and `approved` labels May 22, 2026
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

SHOW CREATE TABLE prints the table's replication_allocation, which is
rewritten by FE's force_olap_table_replication_allocation config and
therefore depends on the actual BE count in the CI cluster. The previous
.out baseline hard-coded "tag.location.default: 1", so any run on a
multi-BE cluster (default 3 replicas) fails the check.

Detect BE count via SHOW BACKENDS once at suite start. For every
SHOW CREATE TABLE check, branch:
  beNum == 1 -> qt_show_1be   expects "tag.location.default: 1"
  beNum  > 1 -> qt_show_multi_be expects "tag.location.default: 3"

The .out file keeps both _1be and _multi_be blocks interleaved per call
site; the regression framework picks the matching tag and drains the
other one.
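
The branching rule above can be sketched as a small function. This is illustrative Python, not the suite's Groovy; the function name is hypothetical, while the tag names and expected strings are the ones listed in the commit message.

```python
from typing import Tuple


def pick_show_create_check(be_num: int) -> Tuple[str, str]:
    """Return (tag name, expected replication fragment) for a given BE count."""
    if be_num < 1:
        # A cluster with no backends cannot run the suite at all.
        raise ValueError(f"expected at least one backend, got {be_num}")
    if be_num == 1:
        return ("qt_show_1be", "tag.location.default: 1")
    # Any multi-BE cluster is expected to report the default of 3 replicas.
    return ("qt_show_multi_be", "tag.location.default: 3")


if __name__ == "__main__":
    for be_num in (1, 3):
        tag, expected = pick_show_create_check(be_num)
        print(be_num, tag, expected)
```

Computing the backend count once and reusing it keeps every `SHOW CREATE TABLE` check in the suite on the same branch.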

The single SHOW CREATE VIEW check is unaffected (no replication info
in its output) and remains qt_show.

@github-actions github-actions Bot removed the `approved` label May 22, 2026
@morningman
Contributor Author

run buildall

morningman and others added 2 commits May 22, 2026 23:30
…ic-partition logic

Pick apache#63551 from master to branch-4.0.

Code logic fixes:
- SchemaChangeHandler: avoid race between ALTER TABLE properties and
  DynamicPartitionScheduler (skip per-partition loop when no partition-level
  property needs to be sent; tolerate concurrent partition drops otherwise).
- LoadManager/LoadProcessor: add diagnostic logs for InsertLoadJob jobId
  mismatch path so SHOW LOAD all-zero JobDetails can be diagnosed.

Unstable case fixes:
- test_manager_interface_1: use `admin set all frontends config` so the
  value is broadcast to all FEs on multi-FE/cloud setups.
- test_temp_table: derive expected SHOW TABLETS row count from
  force_olap_table_replication_allocation.
- scanner_profile: loosen `actualRows=9` to a regex range [1, 9].
- test_backup_restore_colocate: wait for the new-db colocate group to
  become stable before EXPLAIN in test 6.
- check_hash_bucket_table: quote interpolated identifiers in `use`,
  `desc`, `show create table`, `show partitions`, `show replica status`
  and `crc32_internal` so reserved-keyword table names do not crash the
  whole suite.
- test_doris_jdbc_catalog: drop the leaked `order` table at the end.
- pipeline/p0/conf/fe.conf: add max_remote_file_system_cache_num=1000.
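
Two of the test-side fixes in the list above, the loosened `actualRows` check and the quoted identifiers, can be sketched as follows. This is illustrative Python under stated assumptions: the exact regex used in `scanner_profile` is not shown in this PR description, MySQL-style backtick quoting is assumed for identifiers, and all names below are hypothetical.

```python
import re

# Assumed form of the loosened check: accept any single-digit count from 1 to 9
# instead of requiring exactly "actualRows=9". The negative lookahead rejects
# longer numbers such as 10 or 90.
ACTUAL_ROWS_RANGE = re.compile(r"actualRows=[1-9](?!\d)")


def quote_identifier(name):
    # Wrap an identifier in backticks, doubling any embedded backtick, so that
    # reserved words such as "order" can be used as table names.
    return "`" + name.replace("`", "``") + "`"


def show_create_table_sql(table):
    # Build the statement from a quoted identifier rather than raw interpolation.
    return f"show create table {quote_identifier(table)}"


if __name__ == "__main__":
    print(bool(ACTUAL_ROWS_RANGE.search("actualRows=4")))
    print(bool(ACTUAL_ROWS_RANGE.search("actualRows=0")))
    print(show_create_table_sql("order"))
```

Quoting at the point where the SQL string is built means a single oddly named table yields a valid statement instead of a parse error that aborts the suite.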

Note: SchemaChangeHandler is adapted for branch-4.0 — the master version
also threads `verticalCompactionNumColumnsPerGroup` through
`updatePartitionProperties`, which does not exist on this branch, so
both the `needPerPartitionUpdate` check and the call site drop that arg.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@morningman
Contributor Author

run buildall

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 7.69% (2/26) 🎉
Increment coverage report
Complete coverage report


Labels

`lfs-detected!`, `reviewed`

9 participants