Skip to content

fix for char and nchar function correct handling#3855

Merged
shardgupta merged 24 commits intobabelfish-for-postgresql:BABEL_5_X_DEVfrom
pranavJ23:babel-4810
Jul 1, 2025
Merged

fix for char and nchar function correct handling#3855
shardgupta merged 24 commits intobabelfish-for-postgresql:BABEL_5_X_DEVfrom
pranavJ23:babel-4810

Conversation

@pranavJ23
Copy link
Copy Markdown
Contributor

@pranavJ23 pranavJ23 commented Jun 19, 2025

Description

Function char throws error if the input integer is not in range 0-255, where as tsql returns NULL instead of error.
Function nchar should throw NULL if the the input integer is not in range 0-65535
Fix for nchar() function so that if input type is coercible to int then it should execute successfully without throwing error

Issues Resolved

[BABEL-4810] [BABEL-4812][BABEL-5933]

Signed-off-by: Pranav Jain pranavjc@amazon.com

Test Scenarios Covered

  • Use case based -

  • Boundary conditions -

  • Arbitrary inputs -

  • Negative test cases -

  • Minor version upgrade tests -

  • Major version upgrade tests -

  • Performance tests -

  • Tooling impact -

  • Client tests -

Check List

  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is under the terms of the Apache 2.0 and PostgreSQL licenses, and grant any person obtaining a copy of the contribution permission to relicense all or a portion of my contribution to the PostgreSQL License solely to contribute all or a portion of my contribution to the PostgreSQL open source project.

For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@coveralls
Copy link
Copy Markdown
Collaborator

coveralls commented Jun 20, 2025

Pull Request Test Coverage Report for Build 15966586600

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 23 of 23 (100.0%) changed or added relevant lines in 1 file are covered.
  • 35 unchanged lines in 1 file lost coverage.
  • Overall coverage increased (+0.04%) to 75.872%

Files with Coverage Reduction New Missed Lines %
contrib/babelfishpg_tsql/src/cursor.c 35 84.53%
Totals Coverage Status
Change from base Build 15889588684: 0.04%
Covered Lines: 49471
Relevant Lines: 65203

💛 - Coveralls

Signed-off-by: pranav jain <pranav23iitd@gmail.com>
Signed-off-by: pranav jain <pranav23iitd@gmail.com>
Copy link
Copy Markdown
Contributor

@Anikait143 Anikait143 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add following test case:

  1. Tests with schema.function_call
  2. Test with different collation settings
  3. Test for multibyte character input integer value.
  4. Test with edge case values
  5. Test with datatype inputs that can be implicitly casted to int

END IF;
END $$;

create or replace function sys.cht(x in int) returns sys.varchar
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Old function should be deprecated since we have changed function signature

END;
$body$
language plpgsql STABLE;
LANGUAGE plpgsql IMMUTABLE STRICT PARALLEL SAFE;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should mark all the functions as STABLE since the underlying helper function depends on encoding settings

Copy link
Copy Markdown
Contributor Author

@pranavJ23 pranavJ23 Jun 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with stable we are not able to use the CHAR function in computed column, where as in tsql the char can be used in computed column.
Since the encoding in babelfish_db can' t be changed once the db is formed, hence we can mark the function as immutable

END;
$body$
language plpgsql STABLE;
LANGUAGE plpgsql IMMUTABLE STRICT PARALLEL SAFE;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have we verified that it is really PARALLEL SAFE?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the test case is passing in the JDBC parallel query mode, so we can say that it's parallel safe ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the pg_proc catalog too, we have verified that chr() function is parallel safe and since we are using chr() underlying sys.char(), so we can say it's parallel safe

);

create or replace function sys.CHAR(x in int)returns char
create or replace function sys.cht(x in int) returns sys.varchar
Copy link
Copy Markdown
Contributor

@Anikait143 Anikait143 Jun 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Return type should be sys.bpchar

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO varchar might be a better return function since bpchar using fixed length whereas the output will be of length = 1 always, so it might allocate extra space in case of bpchar. Else anything can be used, not an issue.

create or replace function sys.cht(x in int) returns sys.varchar
AS
$body$
BEGIN
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if input is 0? (Same for other functions as well)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, this edge case need to be handled since pg function 'chr' expects the input b/w 1-255 where as the tsql expects input b/w 0-255. Thanks

Comment on lines +3042 to +3047
if (proc->DOUBLE_QUOTE_ID())
stream.setText(startIndex, "\"cht\" ");
else if (proc->SQUARE_BRACKET_ID())
stream.setText(startIndex, "[cht] ");
else
stream.setText(startIndex, " cht");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would there be any case where DOUBLE_QUOTE_ID and SQUARE_BRACKET_ID are valid for function calls?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes we do support the query like :

select [char](255)

adding tests for it

else if (proc->SQUARE_BRACKET_ID())
stream.setText(startIndex, "[cht] ");
else
stream.setText(startIndex, " cht");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's better to have proper text replacement maybe use underscore instead of space for new function name.

AS
$body$
BEGIN
--- 1114111 is 0x10FFFF - max value permitted as specified by documentation
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's also correct the comment here

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this function be dependant upon the input collation ?

language plpgsql STABLE;
LANGUAGE plpgsql IMMUTABLE STRICT PARALLEL SAFE;

CREATE OR REPLACE FUNCTION sys.nchar(IN x INTEGER) RETURNS sys.nvarchar
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Return type should be nchar

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO nvarchar might be a better return function since nchar using fixed length whereas the output will be of length = 1 always, so it might allocate extra space in case of nchar. Else anything can be used, not an issue.

AS
$body$
BEGIN
--- 1114111 is 0x10FFFF - max value permitted as specified by documentation
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this function be dependant upon the input collation ?

Signed-off-by: pranav jain <pranav23iitd@gmail.com>
Signed-off-by: pranav jain <pranav23iitd@gmail.com>
Signed-off-by: pranav jain <pranav23iitd@gmail.com>
DECLARE
exception_message text;
BEGIN
ALTER FUNCTION sys.char(x in int) RENAME TO chr_deprecated_5_2_0;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deprecated version 5.3.0, please correct the name everywhere.

DECLARE
exception_message text;
BEGIN
ALTER FUNCTION sys.nchar(In x INTEGER) RENAME TO nchar_int_deprecated_5_2_0;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also drop the deprecated functions. Refer:

CALL sys.babelfish_drop_deprecated_object('function', 'sys', 'stgeomfromtext_helper_deprecated_5_3_0');

Signed-off-by: pranav jain <pranav23iitd@gmail.com>
Signed-off-by: pranav jain <pranav23iitd@gmail.com>
Signed-off-by: pranav jain <pranav23iitd@gmail.com>
… char & nchar function handling

Signed-off-by: pranav jain <pranav23iitd@gmail.com>
Signed-off-by: pranav jain <pranav23iitd@gmail.com>
Signed-off-by: pranav jain <pranav23iitd@gmail.com>
Comment on lines +3228 to +3244
if (pg_strcasecmp(procNameStr.c_str(), "nchar") == 0)
{
int startIndex = proc->start->getStartIndex();
std::string schNameStr;

if (schema)
schNameStr = stripQuoteFromId(schema);

if(schema && !schNameStr.empty() && pg_strcasecmp(schNameStr.c_str(), "sys") != 0)
return;

if (proc->DOUBLE_QUOTE_ID())
stream.setText(startIndex, "\"ncht_\"");
else if (proc->SQUARE_BRACKET_ID())
stream.setText(startIndex, "[ncht_]");
else
stream.setText(startIndex, "ncht_");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this block is duplicated 4 times. Can we create one helper function for this ?

Signed-off-by: pranav jain <pranav23iitd@gmail.com>
Signed-off-by: pranav jain <pranav23iitd@gmail.com>
@pranavJ23 pranavJ23 requested a review from tanscorpio7 June 30, 2025 15:29
@shardgupta shardgupta merged commit 36085f6 into babelfish-for-postgresql:BABEL_5_X_DEV Jul 1, 2025
52 of 55 checks passed
pranavJ23 added a commit to pranavJ23/babelfish_extensions that referenced this pull request Jul 1, 2025
…resql#3855)

Function char throws error if the input integer is not in range 0-255, where as tsql returns NULL instead of error.
Function nchar should throw NULL if the the input integer is not in range 0-65535
Fix for nchar() function so that if input type is coercible to int then it should execute successfully without throwing error

Task: [BABEL-4810] [BABEL-4812][BABEL-5933]

Signed-off-by: pranav jain <pranav23iitd@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants