Support BCP on TSQL #temp table #4632

Open
Ayush-061 wants to merge 15 commits into babelfish-for-postgresql:BABEL_5_X_DEV from Ayush-061:bcp_on_temp

Contributor

@Ayush-061 Ayush-061 commented Mar 6, 2026

Description

With these changes, users can perform BCP on a #temp table through .NET's SqlBulkCopy.

  1. Currently, BCP is not supported for #temp tables because the existing procedure sp_tablecollations_100 does not handle temp tables.
  2. With the modification to sp_tablecollations_100, BCP is now supported for #temp tables.
  3. Also added the function babelfish_get_temp_table_attributes to get the attributes of both ENR and non-ENR temp tables.

Issues Resolved

BABEL-5264

Test Scenarios Covered

  • Use case based -

  • Boundary conditions -

  • Arbitrary inputs -

  • Negative test cases -

  • Minor version upgrade tests -

  • Major version upgrade tests -

  • Performance tests -

  • Tooling impact -

  • Client tests -

Check List

  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is under the terms of the Apache 2.0 and PostgreSQL licenses, and grant any person obtaining a copy of the contribution permission to relicense all or a portion of my contribution to the PostgreSQL License solely to contribute all or a portion of my contribution to the PostgreSQL open source project.

For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Comment on lines +369 to +379
-- Create sp_tablecollations_100 in tempdb for BCP temp table support
-- This procedure enables SqlBulkCopy to work with temp tables by providing
-- column collation metadata that BCP needs to encode string data correctly.
CREATE OR REPLACE PROCEDURE sys.create_sp_tablecollations_100_in_tempdb_dbo()
LANGUAGE C
AS 'babelfishpg_tsql', 'create_sp_tablecollations_100_in_tempdb_dbo_internal';

CALL sys.create_sp_tablecollations_100_in_tempdb_dbo();
ALTER PROCEDURE tempdb_dbo.sp_tablecollations_100 OWNER TO sysadmin;
GRANT EXECUTE ON PROCEDURE tempdb_dbo.sp_tablecollations_100 TO PUBLIC;
DROP PROCEDURE sys.create_sp_tablecollations_100_in_tempdb_dbo;

Contributor

@ayushdsh ayushdsh Mar 11, 2026

Upgrades are actually simple because we know the tempdb_dbo schema will always exist, so we don't have to do this. Instead, we should create the tempdb_dbo.sp_tablecollations_100 procedure here like a normal procedure.

#endif /* USE_LIBXML */
}

/*

Contributor

I think it's about time Babelfish's temp table code had a separate file in the extension :)

Contributor

I couldn't agree more!

Contributor

@Ayush-061 let's start with this one. Could you create a new file (maybe called temp-table-hooks) and write this function there instead? We'll move the rest gradually later.

Contributor Author

Yeah, Fine

Contributor Author

@Ayush-061 Ayush-061 Mar 19, 2026

Only the function get_tsql_temp_table_attributes, or the other one, is_temp_table_name, too?

Contributor

both of them

* (reads from actual pg_attribute catalog).
*/
Datum
get_enr_attributes(PG_FUNCTION_ARGS)

Contributor

Can you please check whether the function ENRMetadataGetTupDesc would simplify your job here?

Contributor Author

cattups[ENR_CATTUP_ATTRIBUTE] already has the HeapTuples ready, whereas with ENRMetadataGetTupDesc we would need to build HeapTuples manually from Form_pg_attribute. Both will work.

Comment on lines +6137 to +6151
temp_name_ptr = strchr(table_name_input, '#');
if (!temp_name_ptr)
    PG_RETURN_NULL();

/* Lowercase the table name and strip trailing "]" */
table_name = pstrdup(temp_name_ptr);
for (i = 0; table_name[i]; i++)
{
    if (table_name[i] == ']')
    {
        table_name[i] = '\0';
        break;
    }
    table_name[i] = tolower((unsigned char) table_name[i]);
}

Contributor

Maybe use object_id to get the OID of the table and then use get_ENR_withoid(). That would let us skip these checks and the string handling.

Comment on lines +6196 to +6201
foreach(lc, enr->md.cattups[ENR_CATTUP_ATTRIBUTE])
{
    HeapTuple tup = (HeapTuple) lfirst(lc);

    tuplestore_puttuple(tupstore, tup);
}

Contributor

Check whether this can simply be replaced by the function ENRMetadataGetTupDesc.

@Ayush-061 Ayush-061 changed the title from "Bcp on temp" to "Support BCP on TSQL #temp table" Mar 18, 2026

Comment on lines +6144 to +6150
if (object_name[0] != '#')
{
    pfree(db_name);
    pfree(schema_name);
    pfree(object_name);
    PG_RETURN_NULL();
}

Contributor

Haven't we already verified that this is a temp table before calling this function?

Contributor Author

This is for the case where a user calls the function babelfish_get_temp_table_attributes directly. For the procedure, the check is done by the other function, is_temp_table_name.

Comment on lines +6236 to +6239
/*
* is_temp_table_name - Check if a name refers to a temp table
* Returns true if the object name part starts with #
*/

Contributor

Nit: please add more comments mentioning that the name can be a four-part qualified name, and that each part can be enclosed in quoted identifiers or even in square brackets.
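
To make the name forms mentioned above concrete, here is a minimal standalone sketch in plain C. It is not the PR's implementation (which relies on downcase_truncate_split_object_name); the function name last_part_starts_with_hash is hypothetical, and the sketch assumes that doubled delimiters (]] or "") do not need to be handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only: return true if the last
 * dot-separated part of a name starts with '#'.
 *
 * Accepted forms include:
 *   #t
 *   dbo.#t
 *   tempdb.dbo.#t
 *   server.tempdb.dbo.#t          (four-part name)
 *   [tempdb].[dbo].[#t]           (square brackets)
 *   "tempdb"."dbo"."#t"           (quoted identifiers)
 *
 * A dot inside brackets or quotes is part of the identifier, not a
 * separator. Escaped delimiters are NOT handled in this sketch.
 */
bool
last_part_starts_with_hash(const char *name)
{
    const char *part = name;    /* start of the current name part */
    char        closer = '\0';  /* closing delimiter we are inside, if any */
    const char *p;

    if (name == NULL)
        return false;

    for (p = name; *p != '\0'; p++)
    {
        if (closer != '\0')
        {
            /* Inside a delimited part: only the closer ends it. */
            if (*p == closer)
                closer = '\0';
        }
        else if (*p == '[')
            closer = ']';
        else if (*p == '"')
            closer = '"';
        else if (*p == '.')
            part = p + 1;       /* a new part starts after the dot */
    }

    /* Skip the opening delimiter of the last part, if present. */
    if (*part == '[' || *part == '"')
        part++;

    return *part == '#';
}
```

A comment listing these forms next to is_temp_table_name would document the same cases without changing how the function parses its input.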

Contributor

@ayushdsh ayushdsh left a comment

I am okay with almost all of the pieces. I just want one more pass on the get_tsql_temp_table_attributes function, which I will try to conclude today.

Contributor

@Deepesh125 Deepesh125 left a comment

The approach is fine with me, but we need to prove that it won't cause any interoperability issues or unexpected behaviour.

GRANT SELECT ON sys.spt_tablecollations_view TO PUBLIC;

-- Function to get pg_attribute rows for #temp tables (ENR and non-ENR)
CREATE OR REPLACE FUNCTION sys.babelfish_get_temp_table_attributes(IN table_name sys.varchar(4000))

Contributor

To be on the safe side, it should be sys.nvarchar(128).

GRANT EXECUTE ON FUNCTION sys.babelfish_get_temp_table_attributes(IN sys.varchar(4000)) TO PUBLIC;

-- Function to check if a name refers to a #temp table
CREATE OR REPLACE FUNCTION sys.is_temp_table_name(IN name sys.varchar(4000))

Contributor

Same as above: use sys.nvarchar.


PG_RETURN_TEXT_P(cstring_to_text(xpath_expr));
} No newline at end of file
}

Contributor

Revert this file back to the original.

pfree(input);

/* Must be a temp table (starts with #) */
if (object_name[0] != '#')

Contributor

Could object_name be empty or NULL? Is there any chance this could be called from the PG dialect?
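
As a small illustration of the empty-versus-NULL question in plain C: indexing an empty string at position 0 is safe (it reads the terminator), while indexing a NULL pointer is not. The helper name starts_with_hash is hypothetical, and whether object_name can actually be NULL at this point in the PR is not established here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical guard, for illustration only.
 * - NULL input: rejected explicitly, since name[0] would dereference NULL.
 * - Empty input: name[0] is '\0', which simply fails the '#' comparison.
 */
bool
starts_with_hash(const char *name)
{
    if (name == NULL)
        return false;

    return name[0] == '#';
}
```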

downcase_truncate_split_object_name(input, NULL, &db_name, &schema_name, &object_name);
pfree(input);

/* Must be a temp table (starts with #) */

Contributor

To avoid confusion, let's say "tsql temp tables" here.

input = text_to_cstring(PG_GETARG_VARCHAR_PP(0));

/* Parse the table name */
downcase_truncate_split_object_name(input, NULL, &db_name, &schema_name, &object_name);

Contributor

Why do we need db_name? We can pass NULL instead.

/* Try ENR first */
if (currentQueryEnv != NULL)
{
    enr = get_ENR(currentQueryEnv, object_name, true);

Contributor

Have we tested this part of the code for the following chains of commands?

TSQL Dialect -> TSQL Proc -> calls get_tsql_temp_table_attributes
TSQL Dialect -> TSQL Proc -> PG proc -> call get_tsql_temp_table_attributes
TSQL Dialect -> PG proc -> call get_tsql_temp_table_attributes
TSQL Dialect -> PG proc -> TSQL proc -> call get_tsql_temp_table_attributes

Remember that a TSQL temp table can be created at any stage in between.

Comment on lines +103 to +115
/* Setup return - use pg_attribute's tuple descriptor */
per_query_ctx = rsinfo->econtext->ecxt_per_query_memory;
oldcontext = MemoryContextSwitchTo(per_query_ctx);

pg_attribute_rel = table_open(AttributeRelationId, AccessShareLock);
tupdesc = CreateTupleDescCopy(RelationGetDescr(pg_attribute_rel));

tupstore = tuplestore_begin_heap(rsinfo->allowedModes & SFRM_Materialize_Random,
                                 false, work_mem);

rsinfo->returnMode = SFRM_Materialize;
rsinfo->setResult = tupstore;
rsinfo->setDesc = BlessTupleDesc(tupdesc);

Contributor

Is there any reason we prefer this style of SRF over what the PG docs describe? https://www.postgresql.org/docs/current/xfunc-c.html#XFUNC-C-RETURN-SET

Comment on lines +126 to +128
HeapTuple tup = (HeapTuple) lfirst(lc);

tuplestore_puttuple(tupstore, tup);

Contributor

I would suggest making a copy before puttuple.

Comment on lines +188 to +190
/* Check if object_name starts with # */
if (object_name[0] == '#')
    result = true;

Contributor

Is it fine to make such a decision based on the name alone? We should look into currentQueryEnv.

There could be an interoperability issue here, since a name containing # is valid in PG and can refer to a regular table. And sys.sp_tablecollations_100 could be called for those PG relations, right? How are we making sure that it works fine in those scenarios?
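
The difference between a name-only check and a check against session state can be sketched in standalone C as follows. Everything here is hypothetical: TempRegistry is a toy stand-in for whatever the session actually tracks (the role currentQueryEnv plays in the PR) and is not a Babelfish or PostgreSQL API. A real check would also have to cover non-ENR temp tables.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for the session's list of known tsql temp tables. */
typedef struct TempRegistry
{
    const char **names;     /* registered temp table names */
    size_t       count;     /* number of entries in names */
} TempRegistry;

/* Return true if name is present in the registry. */
bool
registry_contains(const TempRegistry *reg, const char *name)
{
    size_t      i;

    if (reg == NULL || name == NULL)
        return false;

    for (i = 0; i < reg->count; i++)
    {
        if (strcmp(reg->names[i], name) == 0)
            return true;
    }
    return false;
}

/*
 * Treat a name as a tsql temp table only if it both starts with '#'
 * and is known to the session. A regular PG table that happens to be
 * named "#something" fails the second condition.
 */
bool
is_registered_temp_table(const TempRegistry *reg, const char *name)
{
    return name != NULL && name[0] == '#' && registry_contains(reg, name);
}
```

With a registry holding only "#orders", the name "#pg_regular" is rejected even though it starts with #, which is the case the name-only check cannot distinguish.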

4 participants