Fix FOR XML RAW('') empty element name handling and all-NULL row output#4680
Fix FOR XML RAW('') empty element name handling and all-NULL row output#4680JapleenKaur3 wants to merge 6 commits intobabelfish-for-postgresql:BABEL_5_X_DEVfrom
Conversation
kuntalghosh
left a comment
There was a problem hiding this comment.
LLM Code Review:
Based on the provided code changes, here are my feedback and suggestions:
Issue 1 - In forxml.c - Potential Memory Leak:
HeapTupleData tmptup;
HeapTuple tuple;
bool allnull = true; // New variable
td = DatumGetHeapTupleHeader(record);
/* Extract rowtype info and find a tupdesc */
tupdesc = lookup_rowtype_tupdesc(HeapTupleHeaderGetTypeId(td),
HeapTupleHeaderGetTypMod(td));The code calls lookup_rowtype_tupdesc() but doesn't have a matching ReleaseTupleDesc(). This violates xml_json rule #4: "ALWAYS call ReleaseTupleDesc() after tupdesc() call in FOR XML and FOR JSON ro
s to prevent tuple descriptor leaks."
Suggested fix:
tupdesc = lookup_rowtype_tupdesc(HeapTupleHeaderGetTypeId(td),
HeapTupleHeaderGetTypMod(td));
/* Process the tuple */
...
ReleaseTupleDesc(tupdesc); // Add this before returningIssue 2 - Test Coverage Gaps:
The PR's test coverage section is incomplete according to testing rule #12. Several required sections are empty:
- Arbitrary inputs
- Negative test cases
- Minor version upgrade tests
- Major version upgrade tests
- Performance tests
- Tooling impact
- Client tests
The PR should document N/A for non-applicable sections rather than leaving them blank.
Issue 3 - Missing Error Message Standards:
The new error message:
ereport(ERROR,
(errcode(ERRCODE_INVALID_XML_PROCESSING_INSTRUCTION),
errmsg("Row tag omission (empty row tag name) cannot be used "
"with attribute-centric FOR XML serialization.")));This violates coding_conventions rule #28 and #31:
- Uses passive voice ("cannot be used") instead of active voice
- Doesn't follow error message capitalization standards
Suggested fix:
ereport(ERROR,
(errcode(ERRCODE_INVALID_XML_PROCESSING_INSTRUCTION),
errmsg("attribute-centric FOR XML serialization requires a row tag name")));Positive Aspects:
-
The changes correctly implement xml_json rule #6: "ALWAYS treat an empty string element name in FOR XML RAW('') as a request to omit the row wrapper tag"
-
The test file additions properly verify both the positive case (empty element name with ELEMENTS) and negative case (error without ELEMENTS)
-
The code follows the coding style conventions for indentation and brace placement
Test Coverage Gaps:
-
Missing test for boundary conditions:
- Element names with only whitespace
- Element names with special characters
- Maximum length element names
-
Missing test for NULL handling:
- NULL element name parameter
- All-NULL columns with empty element name
Note: The LLMCodeReviewer is meant to augment, not replace, human review! Use the suggestions with discretion.
#BABEL-AI-REVIEW
LLM Code Review Feedback:Issue 1 - Potential Memory Leak: Issue 2 - Test Coverage Gaps: Issue 3 - Missing Error Message Standards: |
| state->data[state->len - 1] = '/'; | ||
| appendStringInfoChar(state, '>'); |
There was a problem hiding this comment.
The opening tag in ELEMENTS mode appends <row> so the buffer ends with >. When all columns are NULL, we need to turn <row> into <row/>
There was a problem hiding this comment.
I meant why are we updating stringInfo data directly state->data[state->len - 1] = '/';? what is surety that it wouldn't overwrite existing data?
There was a problem hiding this comment.
This code only runs when allnull is true, so no child elements have been appended after the opening tag. The last character in the buffer at this point is always the > from <row>, so we're only replacing that specific character.
Also, this is the same pattern already used in tsql_row_to_xml_path() for the same all-NULL case
| SELECT 1 AS a, 2 AS b FOR XML RAW(''), ELEMENTS; | ||
| GO | ||
|
|
||
| -- Empty element name without ELEMENTS (attribute-centric) - should error |
There was a problem hiding this comment.
where do we have tests for all nulls under attribute mode? and with element mode but with empty element name?
There was a problem hiding this comment.
I've added both tests now.
Description
Fix FOR XML RAW behavior with empty element name and all-NULL rows
Changes are in
tsql_row_to_xml_raw()in forxml.c. When element_name is empty, the opening and closing wrapper tags are now skipped in ELEMENTS mode (same approach as PATH mode). For attribute-centric mode (without ELEMENTS), an error is raised since empty row tag names are not valid. The self-closing tag logic for all-NULL rows was also corrected.Authored-by: Japleen Kaur amjj@amazon.com
Signed-off-by: Japleen Kaur amjj@amazon.com
Issues Resolved
JIRA : BABEL-6379
FOR XML RAW(''), ELEMENTS — Babelfish produced
<><a>1</a><b>2</b></>(invalid XML with empty wrapper tags). SQL Server produces<a>1</a><b>2</b>(no wrapper). Fixed by skipping the wrapper tag when element name is empty, consistent with how PATH('') handles it.FOR XML RAW('') (without ELEMENTS) — Babelfish produced
< a="1" b="2"/>(invalid XML). SQL Server throws: "Row tag omission (empty row tag name) cannot be used with attribute-centric FOR XML serialization." Fixed by adding the same error that PATH mode already throws for this case.All-NULL rows with ELEMENTS — Babelfish produced
<row></row>. SQL Server produces<row/>(self-closing tag). Fixed by replacing the closing > of the opening tag with /> when all columns are NULL.Test Scenarios Covered
Use case based -
RAW(''), ELEMENTS without wrapper tag, RAW('') error without ELEMENTS, all-NULL rows producing .
Boundary conditions -
Empty element name (RAW('')).
Arbitrary inputs -
N/A
Negative test cases -
N/A
Minor version upgrade tests -
N/A
Major version upgrade tests -
N/A
Performance tests -
N/A
Tooling impact -
N/A
Client tests -
N/A
Check List
By submitting this pull request, I confirm that my contribution is under the terms of the Apache 2.0 and PostgreSQL licenses, and grant any person obtaining a copy of the contribution permission to relicense all or a portion of my contribution to the PostgreSQL License solely to contribute all or a portion of my contribution to the PostgreSQL open source project.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.