
@zxs1633079383

Fixes #36978

Changes proposed in this pull request

This PR fixes the incorrect handling of PostgreSQL custom types (UDT, enum) and VARBIT/BIT VARYING in ShardingSphere-Proxy, especially under the extended query protocol (Parse → Bind → Describe → Execute).
The main issue was that every PostgreSQL column or parameter reported as Types.OTHER was mapped to JSON, causing protocol inconsistencies and incorrect parameter/column parsing.

  1. Type System Improvements

Added explicit support for PostgreSQL VARBIT and VARBIT_ARRAY with a dedicated value parser.
Introduced UDT_GENERIC to represent PostgreSQL user-defined types (enum, domain, composite).
Added detection logic based on both JDBC type and actual columnTypeName.
Removed the incorrect fallback mapping: Types.OTHER → JSON.
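
The detection described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the class PgOtherTypeResolver and its methods are hypothetical, and it assumes the JDBC driver reports json, jsonb, uuid and varbit columns as Types.OTHER. The OIDs are the fixed values from PostgreSQL's pg_type catalog; array types are left out.

```java
import java.sql.Types;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.OptionalInt;

// Hypothetical helper for illustration; not a ShardingSphere class.
final class PgOtherTypeResolver {

    // Fixed OIDs of built-in types, as listed in PostgreSQL's pg_type catalog.
    private static final Map<String, Integer> BUILT_IN_OIDS = new HashMap<>();

    static {
        BUILT_IN_OIDS.put("json", 114);
        BUILT_IN_OIDS.put("jsonb", 3802);
        BUILT_IN_OIDS.put("uuid", 2950);
        BUILT_IN_OIDS.put("varbit", 1562);
    }

    private PgOtherTypeResolver() {
    }

    // Returns the fixed OID when a Types.OTHER column names a known built-in type.
    static OptionalInt builtInOid(final int jdbcType, final String columnTypeName) {
        if (Types.OTHER != jdbcType || null == columnTypeName) {
            return OptionalInt.empty();
        }
        Integer oid = BUILT_IN_OIDS.get(columnTypeName.toLowerCase(Locale.ROOT));
        return null == oid ? OptionalInt.empty() : OptionalInt.of(oid);
    }

    // Treats any other named Types.OTHER column as a user-defined type. Its OID is
    // assigned per database, so the type name has to travel with the parameter.
    static boolean isUserDefined(final int jdbcType, final String columnTypeName) {
        return Types.OTHER == jdbcType && null != columnTypeName && !columnTypeName.isEmpty()
                && !builtInOid(jdbcType, columnTypeName).isPresent();
    }
}
```

With this shape, a column named mood under Types.OTHER is classified as user-defined, while jsonb keeps its own OID instead of falling through to a generic bucket.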

  2. Parse Phase Enhancement

Parse phase now propagates both parameterTypes and parameterTypeNames into PostgreSQLServerPreparedStatement.
Ensures downstream Bind/Describe phases can resolve correct PostgreSQL types instead of JSON.

  3. Describe Phase Fixes

Insert statements now resolve column types (dataType, typeName) from ShardingSphere metadata.
Automatically assigns accurate PostgreSQL column types to each parameter (including UDT + VARBIT).
Saves resolved typeName into the prepared statement for Bind phase use.

  4. Bind Phase Corrections

Construct proper PGobject when binding UDT or JSONB parameters using the resolved typeName.
Prevent JSON fallback for VARBIT and UDT parameters.
Ensures PostgreSQL receives correct parameter type during Bind.
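
A rough sketch of the Bind-side decision, under stated assumptions: in the proxy the typed value would be an org.postgresql.util.PGobject populated through setType and setValue, but the stand-in class TypedValue below keeps the example runnable without the driver. BindParameterSketch, toBindValue, and the choice to fall back to the raw text when no typeName was resolved are all hypothetical; the thread does not say which fallback the PR uses.

```java
// Hypothetical sketch; TypedValue stands in for org.postgresql.util.PGobject.
final class BindParameterSketch {

    // Carries the resolved PostgreSQL type name next to the text value.
    static final class TypedValue {

        private final String type;

        private final String value;

        TypedValue(final String type, final String value) {
            this.type = type;
            this.value = value;
        }

        String getType() {
            return type;
        }

        String getValue() {
            return value;
        }
    }

    private BindParameterSketch() {
    }

    // Builds the value handed to the JDBC statement for a UDT or JSONB parameter.
    static Object toBindValue(final String typeName, final String textValue) {
        if (null == textValue) {
            // SQL NULL needs no type wrapper.
            return null;
        }
        if (null == typeName || typeName.trim().isEmpty()) {
            // Assumed fallback: pass the raw text instead of failing at Bind time.
            return textValue;
        }
        return new TypedValue(typeName, textValue);
    }
}
```

The point of the sketch is that the type name is always set when it is known, and that a missing type name never reaches the wrapper's constructor.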

  5. Metadata Enhancements

Added typeName to ColumnMetaData and ShardingSphereColumn.
PostgreSQLDataTypeOption updated to:
  • Include VARBIT / BIT VARYING in extra types.
  • Load all PostgreSQL UDTs dynamically via pg_type (enum, composite, domain).
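
For readers unfamiliar with the catalog, the pg_type lookup can be pictured as below. This is a sketch under assumptions, not the PR's loader: PgUdtCatalogSketch, LOAD_UDT_SQL and describe are hypothetical names. The typtype codes 'e', 'c' and 'd' are the ones PostgreSQL documents for enum, composite and domain types.

```java
// Hypothetical sketch of a pg_type based UDT lookup; not the PR's actual loader.
final class PgUdtCatalogSketch {

    // Selects enum ('e'), composite ('c') and domain ('d') types from the catalog.
    static final String LOAD_UDT_SQL =
            "SELECT typname, typtype FROM pg_catalog.pg_type WHERE typtype IN ('e', 'c', 'd')";

    private PgUdtCatalogSketch() {
    }

    // Maps a typtype code returned by the query to a readable kind.
    static String describe(final char typtype) {
        switch (typtype) {
            case 'e':
                return "enum";
            case 'c':
                return "composite";
            case 'd':
                return "domain";
            default:
                return "other";
        }
    }
}
```

One caveat worth checking in a real loader: every table also has a composite row type in pg_type, so a filter on 'c' alone returns more than the explicitly created composite types.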

  6. End-to-End Validation

Verified on PostgreSQL 17.5 with real application workloads:
  • INSERT: VARBIT and custom UDT work correctly.
  • SELECT: raw VARBIT bitstring + UDT enum returned correctly.
  • UPDATE: bitwise OR on VARBIT executed successfully.
  • No JSON fallback or protocol errors.
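
The VARBIT checks above rely on bit strings passing through as plain text made of 0 and 1. A small sketch of what a text-format check and the OR semantics look like is given here; VarbitTextSketch and its methods are hypothetical and are not the PR's value parser. The equal-length rule mirrors PostgreSQL, which rejects OR on bit strings of different sizes.

```java
// Hypothetical sketch of VARBIT text handling; not the PR's value parser.
final class VarbitTextSketch {

    private VarbitTextSketch() {
    }

    // Accepts only the characters '0' and '1' and returns the text unchanged.
    static String requireBitString(final String text) {
        if (null == text) {
            throw new IllegalArgumentException("bit string must not be null");
        }
        for (int i = 0; i < text.length(); i++) {
            char each = text.charAt(i);
            if ('0' != each && '1' != each) {
                throw new IllegalArgumentException("invalid bit '" + each + "' at index " + i);
            }
        }
        return text;
    }

    // Bitwise OR of two bit strings of equal length.
    static String or(final String left, final String right) {
        requireBitString(left);
        requireBitString(right);
        if (left.length() != right.length()) {
            throw new IllegalArgumentException("cannot OR bit strings of different sizes");
        }
        StringBuilder result = new StringBuilder(left.length());
        for (int i = 0; i < left.length(); i++) {
            result.append('1' == left.charAt(i) || '1' == right.charAt(i) ? '1' : '0');
        }
        return result.toString();
    }
}
```

For example, or("1010", "0110") yields "1110", while a value such as "10x" is rejected before it could be mistaken for JSON or another text type.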

This fully resolves issue #36978.

Before committing this PR, I'm sure that I have checked the following options:

Checklist

  • I have self-reviewed the commit code.
  • I have added (or requested) the correct labels for this PR.
  • I have passed the Maven check locally:
    ./mvnw clean install -B -T1C -Dmaven.javadoc.skip -Dmaven.jacoco.skip -e
  • I have updated the relevant documentation.
  • I have added unit tests for my changes.
  • I have updated the Release Notes for the current development version.

(https://shardingsphere.apache.org/community/en/involved/contribute/contributor/)


张立超 added 5 commits November 24, 2025 15:10
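… 2. Placeholder ColumnMetaData in the MetaDataLoader of other databases. 3. Added a type_name field to ColumnMetaData and ShardingSphereColumn so that PG can identify custom types
… 2. Placeholder ColumnMetaData in the MetaDataLoader of other databases. 3. Added a type_name field to ColumnMetaData and ShardingSphereColumn so that PG can identify custom types
… statements support recognizing typeName. 4. The ParseExec phase supports reading typeName. final: stable version. Custom UDT types work for update, insert, select, and delete in the local project.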
Member

@terrymanu terrymanu left a comment

• Problem Understanding
Thank you for tackling PostgreSQL custom types/VARBIT, but the current PR is very large (93 files, 878 additions, 470 deletions) and contains several regression risks. We need to narrow the scope and fix key issues before moving forward.

Root Cause

  • Types.OTHER is globally remapped to a fabricated UDT_GENERIC OID (and it absorbs JSON/JSONB), so extended-protocol parameter/column OIDs become wrong; clients may mis-type or fail to bind (JSON must stay at OIDs 114/3802).
  • PostgreSQLColumnType stores typeName by mutating the enum singleton (withTypeName), which risks cross-session/cross-statement leakage and concurrency issues.
  • PostgreSQLUDTValueParser builds PGobject without setType; Bind also assumes typeName is non-null, so null typeName leads to Bind-time exceptions.
  • Multiple metadata loaders/tests inject the Chinese placeholder "占位" as typeName, violating English-only readability and polluting metadata/YAML.

Analysis

  • PostgreSQL extended protocol requires exact type OIDs; sending JSON/JSONB as a fabricated UDT OID breaks driver/client compatibility.
  • Enums are global singletons; writing mutable state (typeName) onto them invites races and cross-request contamination.
  • PGobject must have setType to carry the real type; absent typeName needs a safe fallback to avoid Bind failures.
  • Large, unrelated edits (H2/MySQL/SQLServer/Firebird, etc.) only to add placeholders are unnecessary and enlarge regression surface.
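
The enum concern can be illustrated with a small sketch. All names below (EnumStateSketch, MutableColumnType, ColumnType, ParameterType) are hypothetical and only mirror the shape of the problem, not the PR's classes. The first enum shows how a withTypeName call on a singleton overwrites what an earlier caller stored; the immutable pair shows one way to keep typeName alongside the parameter instead.

```java
// Hypothetical sketch of the shared-state problem and a side-car alternative.
final class EnumStateSketch {

    // Anti-pattern: the enum constant is a JVM-wide singleton, so the field is shared.
    enum MutableColumnType {

        UDT_GENERIC;

        private String typeName;

        MutableColumnType withTypeName(final String typeName) {
            this.typeName = typeName;
            return this;
        }

        String getTypeName() {
            return typeName;
        }
    }

    // Stateless enum: identifies the kind of type only.
    enum ColumnType {
        JSON, VARBIT, UDT
    }

    // Immutable side-car: each parameter owns its own type name.
    static final class ParameterType {

        private final ColumnType columnType;

        private final String typeName;

        ParameterType(final ColumnType columnType, final String typeName) {
            this.columnType = columnType;
            this.typeName = typeName;
        }

        ColumnType getColumnType() {
            return columnType;
        }

        String getTypeName() {
            return typeName;
        }
    }

    private EnumStateSketch() {
    }
}
```

If one statement calls UDT_GENERIC.withTypeName("mood") and another then calls withTypeName("address"), both hold the same instance and both now read "address". Two ParameterType objects do not interfere with each other, which is the property the review asks to be covered by tests.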

Conclusion & Requests (please simplify and fix accordingly)

  1. Narrow changes strictly to PostgreSQL custom types/VARBIT; drop unrelated placeholder/typeName edits across other dialects/tests (avoid a 90+ file change).
  2. Restore correct JSON/JSONB OIDs; only attach column typeName as a side-car for non-JSON/VARBIT/UUID Types.OTHER, without altering enum mappings.
  3. Remove mutable state from the enum; keep typeName alongside parameters, not stored on the enum instance.
  4. In UDT/JSONB binding, always setType(resolvedTypeName) on PGobject and add a safe fallback when typeName is missing to prevent Bind-time NPE/SQLException.
  5. Remove all placeholder/non-English strings; use real column typeNames or leave blank without polluting metadata.
  6. Add/adjust tests to cover: JSON/JSONB OIDs unchanged, VARBIT path, UDT binding with/without typeName, and no shared state across statements/sessions.

Please resubmit with only the necessary PostgreSQL fixes and the strengthened tests so the review can proceed efficiently.

@zxs1633079383
Author

zxs1633079383 commented Nov 29, 2025

Thank you for the detailed review and clarification.

I have now narrowed this PR strictly to the PostgreSQL custom type / VARBIT
scope and reverted all unrelated edits across other dialects and metadata
loaders. The change surface is minimized and focused.

Summary of completed fixes:

  1. Restored correct JSON / JSONB OIDs (114 / 3802).
  2. Removed the mutable state from PostgreSQLColumnType:
    • dropped typeName field
    • removed withTypeName()
    • enum is now pure and stateless again.
  3. Restored the original JDBC type mapping logic, and added explicit
    detection for JSON / JSONB / VARBIT / UUID without altering global
    enum mappings.
  4. Removed placeholder / non-English strings ("占位") from all metadata
    and tests.
  5. Reverted all non-PostgreSQL tests and code paths which were modified
    earlier. Only PostgreSQL code is touched now.
  6. Centralized JSON / JSONB / UDT PGobject construction in the Bind phase
    with safe fallback typeName handling to prevent Bind-time NPE/SQLException.

These changes now address the core regressions described in your review.

Remaining work:
• Tests asserting that no state is shared across statements/sessions.
I have already restored PostgreSQLColumnTypeTest and added JSON/JSONB and
VARBIT assertions.

Regarding mvnw test failures: they are caused only by the reverted non-PG
test files referencing removed placeholder fields. Once the PostgreSQL-only
tests are updated as described above, the build will pass again.

Thank you again for the review. The PR is now narrowly scoped, low-risk,
and focused on the actual protocol correctness issues.

Member

@terrymanu terrymanu left a comment

Your changes are even more extensive than last time, reaching 163 modifications. I don't think anyone can thoroughly review it or fully assess its impact in a short time. Could you please break this PR into smaller ones, keeping the changes within 10 classes?

张立超 added 3 commits November 30, 2025 13:03
…t-wide UDT loading

Enhances PostgreSQL type consumption by adding typeName support across metadata, loaders, and prepared statements.
Introduces loadUDTTypes in dialects and updates PG-specific bindings, parsing, describing, and column swapping logic.
@zxs1633079383 zxs1633079383 force-pushed the fix/postgresql-custom-types branch from d81392b to 33ebc87 Compare November 30, 2025 05:05
@zxs1633079383
Author

@terrymanu Thank you again for the detailed review and guidance.

This is my first time contributing to an open-source project of this size. If there is anything I am not handling correctly, please kindly point it out, and I will strictly follow your suggestions. I really hope to contribute to the PostgreSQL custom UDT support together with the community.

Regarding the PR size:

After removing all unrelated dialect changes and restoring the enum to a
pure/immutable design, the scope has already been reduced significantly.
The current version now touches 18 files, all of which are tightly
related to the PostgreSQL extended-protocol path
(Parse → Describe → Bind → Metadata).

At this point, the PR is coherent and its intention is clear. However, if you would still prefer to split it further into sub-PRs (each under 10 classes), I am absolutely willing to do so.

Before proceeding, could you please confirm whether:

• the current 18-file PR is acceptable for review,
or
• you would like me to break it into several smaller PRs?

If splitting is required, I can reorganize the changes according to clear
module boundaries, for example:

  1. Column metadata layer

    • ColumnMetaData / ShardingSphereColumn adjustments
    • typeName propagation
    • YamlColumnSwapper and metadata conversion logic
  2. PostgreSQL dialect loader

    • PostgreSQLDataTypeLoader
    • loading of enum/domain/composite types
    • VARBIT and BIT VARYING metadata handling
  3. PostgreSQL type system

    • PostgreSQLColumnType enum refinements
    • JSON/JSONB/VARBIT/UUID detection
    • related text/binary parsers
  4. Extended-protocol flow (Parse → Describe → Bind)

    • parameter type collection
    • typeName resolution in Describe
    • PGobject construction and safe fallback in Bind
  5. PostgreSQL-only tests

    • JSON/JSONB OID correctness
    • VARBIT parsing/handling
    • UDT typeName propagation and Bind behavior

Or I can follow any grouping you prefer.

My goal is to keep the changes clear, focused, and low-risk so that the review process becomes easier for everyone.

Thank you again for your patience and guidance. I will update the PR immediately according to your confirmation.

@zxs1633079383
Author

Hi @terrymanu , thanks again for your guidance.

I have now split the original large PR into two smaller and focused PRs, each within the 10-class limit:

Metadata & typeName pipeline – 8 files
#37230

Extended protocol fixes – 10 files
#37228

Both PRs isolate the changes into clear functional boundaries (metadata vs. protocol) and avoid unnecessary modifications outside the PostgreSQL scope. The original parent PR (#37216) has also been updated to reference these two sub-PRs.

If these PRs still do not meet the review expectations, could you please let me know the specific problems or concerns? I take this contribution very seriously and I’m willing to carefully adjust and improve the changes according to your suggestions.

Thank you again for your time and review. 🙏

@terrymanu
Member

The CI is broken, could you fix it first?

@zxs1633079383
Author

@terrymanu Thanks for the feedback. I will fix all CI failures first.

The current failures are mainly caused by the introduction of typeName,
which affects multiple components including metadata, binders, and the
PostgreSQL prepared statement structures. As a result, a number of
existing unit tests and E2E tests need to be updated.

For easier review, I plan to open a new PR that includes:

  • all the functional changes from this PR, and
  • the required updates to unit tests and E2E tests

This new PR will ensure that the entire build passes, including:

./mvnw clean install -B -T1C -Pcheck

However, since many test classes need to be adjusted, that PR will
naturally include a larger amount of file changes.

Thanks again for your patience. I will update the PR soon.

@zxs1633079383
Author

Hi @terrymanu,

Following your feedback, I have created a new PR #37259, which includes:

  1. All functional changes from this PR (#37216)

These include the fixes for:

  • PostgreSQL custom types (UDT/enum/domain/composite)
  • VARBIT / BIT VARYING
  • Extended protocol: Parse → Bind → Describe → Execute
  • typeName propagation across metadata and prepared statements

  2. All required test updates

Because typeName was added to:

  • ColumnMetaData
  • ShardingSphereColumn
  • ColumnReviseEngine
  • binder & extended protocol structures

many unit tests and E2E tests needed adjustments.
The new PR (#37259) includes all these test updates, ensuring that the full CI pipeline passes after running:

./mvnw spotless:apply -Pcheck
./mvnw clean install -B -T1C -Pcheck

📌 Purpose of splitting

To keep this PR (#37216) focused and easy to review, I intentionally did not include the large amount of test modifications here.
The new PR provides the complete, CI-passing version and can be used as the final merge target.

If you prefer, I can also merge the test updates back into this PR, but creating a separate complete PR seemed clearer based on your earlier suggestion of reducing change size.

I’m very open to any further adjustments.
Thank you again for your time and guidance — happy to keep improving ShardingSphere’s PostgreSQL support together!

Member

@terrymanu terrymanu left a comment

Please fix GitHub action first

@terrymanu
Member

Closed because of no response anymore.

@terrymanu terrymanu closed this Jan 25, 2026

Development

Successfully merging this pull request may close these issues.

PostgreSQL Custom Types Parsing Issue in ShardingSphere-Proxy

2 participants