Fix FOR XML RAW('') empty element name handling and all-NULL row output #4680

Open
JapleenKaur3 wants to merge 6 commits into babelfish-for-postgresql:BABEL_5_X_DEV from JapleenKaur3:raw-diff

JapleenKaur3 (Contributor) commented Mar 23, 2026

Description

Fix FOR XML RAW behavior with empty element name and all-NULL rows

Changes are in tsql_row_to_xml_raw() in forxml.c. When element_name is empty, the opening and closing wrapper tags are now skipped in ELEMENTS mode (same approach as PATH mode). For attribute-centric mode (without ELEMENTS), an error is raised since empty row tag names are not valid. The self-closing tag logic for all-NULL rows was also corrected.

Authored-by: Japleen Kaur <amjj@amazon.com>
Signed-off-by: Japleen Kaur <amjj@amazon.com>

Issues Resolved

JIRA : BABEL-6379

  1. FOR XML RAW(''), ELEMENTS: Babelfish produced <><a>1</a><b>2</b></> (invalid XML with empty wrapper tags). SQL Server produces <a>1</a><b>2</b> (no wrapper). Fixed by skipping the wrapper tag when the element name is empty, consistent with how PATH('') handles it.

  2. FOR XML RAW('') (without ELEMENTS): Babelfish produced < a="1" b="2"/> (invalid XML). SQL Server throws: "Row tag omission (empty row tag name) cannot be used with attribute-centric FOR XML serialization." Fixed by raising the same error that PATH mode already throws for this case.

  3. All-NULL rows with ELEMENTS: Babelfish produced <row></row>. SQL Server produces <row/> (self-closing tag). Fixed by replacing the closing > of the opening tag with /> when all columns are NULL.
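
The three cases above can be modeled outside PostgreSQL. The following is a minimal standalone sketch, not the actual tsql_row_to_xml_raw() code: the Column struct, the append() helper, the fixed-size output buffer, and the -1 return value (standing in for the ereport(ERROR, ...) call) are all assumptions made for illustration, and values are not XML-escaped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for one column; value == NULL models SQL NULL. */
typedef struct
{
    const char *name;
    const char *value;
} Column;

/* Append s to the NUL-terminated buffer out (capacity cap), truncating if full. */
static void
append(char *out, size_t cap, const char *s)
{
    size_t len = strlen(out);

    if (len + 1 < cap)
        snprintf(out + len, cap - len, "%s", s);
}

/*
 * Serialize one row. Returns 0 on success, or -1 when an empty element
 * name is combined with attribute-centric mode (the error case).
 */
static int
row_to_xml_raw(const Column *cols, int ncols, const char *element_name,
               bool elements_mode, char *out, size_t cap)
{
    bool omit_wrapper = (element_name[0] == '\0');
    bool allnull = true;

    out[0] = '\0';

    if (!elements_mode)
    {
        /* Attribute-centric: the row tag carries the attributes, so it cannot be omitted. */
        if (omit_wrapper)
            return -1;
        append(out, cap, "<");
        append(out, cap, element_name);
        for (int i = 0; i < ncols; i++)
        {
            if (cols[i].value == NULL)
                continue;       /* NULL columns produce no attribute */
            append(out, cap, " ");
            append(out, cap, cols[i].name);
            append(out, cap, "=\"");
            append(out, cap, cols[i].value);
            append(out, cap, "\"");
        }
        append(out, cap, "/>");
        return 0;
    }

    /* ELEMENTS mode: skip the wrapper entirely when the name is empty. */
    if (!omit_wrapper)
    {
        append(out, cap, "<");
        append(out, cap, element_name);
        append(out, cap, ">");
    }

    for (int i = 0; i < ncols; i++)
    {
        if (cols[i].value == NULL)
            continue;
        allnull = false;
        append(out, cap, "<");
        append(out, cap, cols[i].name);
        append(out, cap, ">");
        append(out, cap, cols[i].value);
        append(out, cap, "</");
        append(out, cap, cols[i].name);
        append(out, cap, ">");
    }

    if (!omit_wrapper)
    {
        if (allnull)
        {
            /* Nothing followed the opening tag: turn "<row>" into "<row/>". */
            size_t len = strlen(out);

            assert(len > 0 && out[len - 1] == '>');
            out[len - 1] = '/';
            append(out, cap, ">");
        }
        else
        {
            append(out, cap, "</");
            append(out, cap, element_name);
            append(out, cap, ">");
        }
    }
    return 0;
}
```

With columns a=1 and b=2, calling this sketch with an empty element name in ELEMENTS mode yields the unwrapped elements, the same call in attribute-centric mode returns -1, and an all-NULL row named row in ELEMENTS mode yields a self-closing tag.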

Test Scenarios Covered

  • Use case based -
    RAW(''), ELEMENTS without a wrapper tag; RAW('') error without ELEMENTS; all-NULL rows producing <row/>.

  • Boundary conditions -
    Empty element name (RAW('')).

  • Arbitrary inputs -
    N/A

  • Negative test cases -
    N/A

  • Minor version upgrade tests -
    N/A

  • Major version upgrade tests -
    N/A

  • Performance tests -
    N/A

  • Tooling impact -
    N/A

  • Client tests -
    N/A

Check List

  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is under the terms of the Apache 2.0 and PostgreSQL licenses, and grant any person obtaining a copy of the contribution permission to relicense all or a portion of my contribution to the PostgreSQL License solely to contribute all or a portion of my contribution to the PostgreSQL open source project.

For more information on following Developer Certificate of Origin and signing off your commits, please check here.

JapleenKaur3 changed the title from "Raw diff" to "Fix FOR XML RAW('') empty element name handling and all-NULL row output" Mar 23, 2026

kuntalghosh (Contributor) left a comment

LLM Code Review:

Based on the provided code changes, here are my feedback and suggestions:

Issue 1 - In forxml.c - Potential Memory Leak:

HeapTupleData   tmptup;
HeapTuple       tuple;
bool            allnull = true;  // New variable

td = DatumGetHeapTupleHeader(record);

/* Extract rowtype info and find a tupdesc */
tupdesc = lookup_rowtype_tupdesc(HeapTupleHeaderGetTypeId(td),
                               HeapTupleHeaderGetTypMod(td));

The code calls lookup_rowtype_tupdesc() but doesn't have a matching ReleaseTupleDesc(). This violates xml_json rule #4: "ALWAYS call ReleaseTupleDesc() after tupdesc() call in FOR XML and FOR JSON ro
s to prevent tuple descriptor leaks."

Suggested fix:

tupdesc = lookup_rowtype_tupdesc(HeapTupleHeaderGetTypeId(td),
                               HeapTupleHeaderGetTypMod(td));
/* Process the tuple */
...
ReleaseTupleDesc(tupdesc);  // Add this before returning

Issue 2 - Test Coverage Gaps:

The PR's test coverage section is incomplete according to testing rule #12. Several required sections are empty:

  • Arbitrary inputs
  • Negative test cases
  • Minor version upgrade tests
  • Major version upgrade tests
  • Performance tests
  • Tooling impact
  • Client tests

The PR should document N/A for non-applicable sections rather than leaving them blank.

Issue 3 - Missing Error Message Standards:

The new error message:

ereport(ERROR,
        (errcode(ERRCODE_INVALID_XML_PROCESSING_INSTRUCTION),
         errmsg("Row tag omission (empty row tag name) cannot be used "
                "with attribute-centric FOR XML serialization.")));

This violates coding_conventions rules #28 and #31:

  • Uses passive voice ("cannot be used") instead of active voice
  • Doesn't follow error message capitalization standards

Suggested fix:

ereport(ERROR,
        (errcode(ERRCODE_INVALID_XML_PROCESSING_INSTRUCTION),
         errmsg("attribute-centric FOR XML serialization requires a row tag name")));

Positive Aspects:

  1. The changes correctly implement xml_json rule #6: "ALWAYS treat an empty string element name in FOR XML RAW('') as a request to omit the row wrapper tag"

  2. The test file additions properly verify both the positive case (empty element name with ELEMENTS) and negative case (error without ELEMENTS)

  3. The code follows the coding style conventions for indentation and brace placement

Test Coverage Gaps:

  1. Missing test for boundary conditions:

    • Element names with only whitespace
    • Element names with special characters
    • Maximum length element names
  2. Missing test for NULL handling:

    • NULL element name parameter
    • All-NULL columns with empty element name

Note: The LLMCodeReviewer is meant to augment, not replace, human review! Use the suggestions with discretion.


#BABEL-AI-REVIEW

JapleenKaur3 (Contributor, Author) commented Mar 24, 2026

LLM Code Review Feedback:

Issue 1 - Potential Memory Leak:
ReleaseTupleDesc(tupdesc) is already called at the end of tsql_row_to_xml_raw().

Issue 2 - Test Coverage Gaps:
Updated the PR description with N/A for non-applicable sections.

Issue 3 - Missing Error Message Standards:
The error message matches SQL Server's exact error text. Kept as-is for compatibility.

Comment on lines +328 to +329
state->data[state->len - 1] = '/';
appendStringInfoChar(state, '>');

Contributor

how is this correct?

Contributor Author

In ELEMENTS mode the opening tag is appended as <row>, so the buffer ends with >. When all columns are NULL, we need to turn <row> into <row/>.

Contributor

I meant: why are we updating the StringInfo data directly with state->data[state->len - 1] = '/';? What guarantees that it won't overwrite existing data?

Contributor Author

This code only runs when allnull is true, so no child elements have been appended after the opening tag. The last character in the buffer at this point is always the > from <row>, so we're only replacing that specific character.
Also, this is the same pattern already used in tsql_row_to_xml_path() for the same all-NULL case.
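
One way to make that invariant explicit is to check the last byte before overwriting it. The sketch below is only an illustration under stated assumptions: Buf is a hypothetical fixed-size stand-in for PostgreSQL's StringInfo (it mirrors only the data and len fields), and buf_append() and buf_self_close() are hypothetical helpers, not functions from forxml.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for StringInfo: a NUL-terminated buffer plus its length. */
typedef struct
{
    char   data[64];
    size_t len;
} Buf;

/* Append a NUL-terminated string, keeping data NUL-terminated. */
static void
buf_append(Buf *b, const char *s)
{
    size_t n = strlen(s);

    assert(b->len + n < sizeof(b->data));
    memcpy(b->data + b->len, s, n + 1);     /* copies the trailing NUL too */
    b->len += n;
}

/*
 * Turn a trailing ">" into "/>". Refuses (and leaves the buffer untouched)
 * when the buffer is empty or does not end with '>'.
 */
static bool
buf_self_close(Buf *b)
{
    if (b->len == 0 || b->data[b->len - 1] != '>')
        return false;
    b->data[b->len - 1] = '/';
    buf_append(b, ">");
    return true;
}
```

The guard only proves that the last byte is >. It cannot tell an opening tag from a child's closing tag, so the caller still has to rely on the allnull flag to know that nothing was appended after the opening tag.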

SELECT 1 AS a, 2 AS b FOR XML RAW(''), ELEMENTS;
GO

-- Empty element name without ELEMENTS (attribute-centric) - should error

Contributor

Where do we have tests for all-NULL rows in attribute mode, and for ELEMENTS mode with an empty element name?

Contributor Author

I've added both tests now.


3 participants