Skip to content
This repository was archived by the owner on Sep 27, 2019. It is now read-only.

Conversation

mengranwo
Copy link
Contributor

@mengranwo mengranwo commented Apr 16, 2018

1. catalog refactoring

  • Get rid of catalog table singleton. Make pg_table, pg_index, pg_attribute, pg_stats_tables, pg_trigger under per database. All of them are now stored in a per-database object in catalog_map (under catalog.h)
  • Add version_id column into pg_table
  • Add update(with index scan) helper function inside AbstractcCatalog.cpp

2. Namespace(Schema) support

default_database=#CREATE TABLE foo(a int);
create table under default namespace "public"

default_database=#CREATE SCHEMA private;
default_database=#CREATE TABLE private.foo(a int);
create table with same name "foo" under different namespace "private"
default_database=#DROP SCHEMA private;
delete record from pg_namespace catalog table and drop table private.foo
  • Add pg_namespace catalog table
  • Add namespace(schema) information in parser, binder, planner and executor
  • Need to pass in database name, schema name and table name to locate one unique table object
DataTable *table = catalog::Catalog::GetInstance()->GetTableWithName(
    database_name, schema_name, table_name, txn);

TableCatalogObject *table_object = catalog::Catalog::GetInstance()->GetTableObject(
    database_name, schema_name, table_name, txn);

3. To-Do

@mengranwo mengranwo changed the title catalog refactoring + Namespace(Schema) support Namespace(Schema) support + Catalog refactoring Apr 16, 2018
@coveralls
Copy link

coveralls commented Apr 16, 2018

Coverage Status

Coverage increased (+0.2%) to 77.481% when pulling 781b6d7 on camellyx:master into ff5b583 on cmu-db:master.

Copy link
Contributor

@poojanilangekar poojanilangekar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @mengranwo, @camellyx!
Nice job! I think this PR looks good. I just had a few questions (mostly present in the review).

  1. Why is there a change the visibility of the catalog contractors (eg. ColumnCatalog, IndexCatalog, etc)?
  2. Can you please add CATALOG_TABLES_COUNT or something similar in the catalog_defaults.h . And make sure the test cases check equality of this number instead of 8.
  3. Would it be possible to remove the Bootstrap calls from the first test to the constructor in the PelotonTest class? We seem to be calling it in most functions or you can derive a CatalogTestUtil class and have this done inside it.
  4. Why did most of the SQL tests require explicitly appending the public schema to the query. The binder should be taking care of it right?
  5. Can you add a test which creates two tables with the same name in two different schemas? Also another test which lets a query select from two tables in two different schemas?
  6. Does this PR include support for TempSchema? If not, could you write up an issue about it? I think it can be given to a beginner.

@@ -137,7 +136,7 @@ TEST_F(UDFTest, IfElseExpressionTest) {
auto &txn_manager = concurrency::TransactionManagerFactory::GetInstance();
auto txn = txn_manager.BeginTransaction();
catalog::Catalog::GetInstance()->CreateDatabase(DEFAULT_DB_NAME, txn);
catalog::Catalog::GetInstance()->Bootstrap();
// catalog::Catalog::GetInstance()->Bootstrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this line is commented out, please delete it.

@@ -196,7 +195,7 @@ TEST_F(UDFTest, RecursiveFunctionTest) {
auto &txn_manager = concurrency::TransactionManagerFactory::GetInstance();
auto txn = txn_manager.BeginTransaction();
catalog::Catalog::GetInstance()->CreateDatabase(DEFAULT_DB_NAME, txn);
catalog::Catalog::GetInstance()->Bootstrap();
// catalog::Catalog::GetInstance()->Bootstrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this line is commented out, please delete it.

catalog::Catalog::GetInstance()->Bootstrap();
// NOTE: Catalog::GetInstance()->Bootstrap() has been called in previous tests
// you can only call it once!
// catalog::Catalog::GetInstance()->Bootstrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this line is commented out, please delete it.

catalog::Catalog::GetInstance()->Bootstrap();
// NOTE: Catalog::GetInstance()->Bootstrap() has been called in previous tests
// you can only call it once!
// catalog::Catalog::GetInstance()->Bootstrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this line is commented out, please delete it.

catalog::Catalog::GetInstance()->Bootstrap();
// NOTE: Catalog::GetInstance()->Bootstrap() has been called in previous tests
// you can only call it once!
// catalog::Catalog::GetInstance()->Bootstrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this line is commented out, please delete it.

static ColumnCatalog *GetInstance(storage::Database *pg_catalog = nullptr,
type::AbstractPool *pool = nullptr,
concurrency::TransactionContext *txn = nullptr);
ColumnCatalog(storage::Database *pg_catalog, type::AbstractPool *pool,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to other classes. Why are we making this public?

@@ -83,10 +81,11 @@ class ColumnCatalog : public AbstractCatalog {
// write Related API
//===--------------------------------------------------------------------===//
bool InsertColumn(oid_t table_oid, const std::string &column_name,
oid_t column_id, oid_t column_offset,
uint32_t column_id, uint32_t column_offset,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please revert. internal_types.h typedefs uint32_t to oid_t

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

std::vector<oid_t> key_column_offsets =
index->GetMetadata()->GetKeySchema()->GetIndexedColumns();
PELOTON_ASSERT(scan_values.size() == key_column_offsets.size());
std::vector<ExpressionType> expr_types(scan_values.size(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we only allow equal keys, can you please add a line in the documentation? So the next person using it would know.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

* The system catalog tables must be created in certain order to make sure
* all tuples are indexed (actually this might be fine now after Paulo's fix)
*/
void Catalog::BootstrapSystemCatalogs(storage::Database *database,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add java doc style documentation for this function.

Copy link
Contributor Author

@mengranwo mengranwo Apr 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

@@ -541,5 +561,29 @@ TableCatalog::GetTableObjects(oid_t database_oid,
return database_object->GetTableObjects();
}

bool TableCatalog::UpdateVersionId(oid_t update_val, oid_t table_oid,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please add java doc style documentation for this function?

Copy link
Contributor Author

@mengranwo mengranwo Apr 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

@mengranwo
Copy link
Contributor Author

mengranwo commented Apr 28, 2018

Hey @poojanilangekar!
Thank you for your reviews and I've made changes according to your suggestion. The following is a summary to what I've modified and answers to some of your question.

  • delete all the commented out line
  • Add java doc style documentation if necessary
  • Add CATALOG_TABLES_COUNT variable into catalog_defaults.h to get rid of the magic number 8
  • The reason why we change the visibility of the catalog contractors from private to public is that: before this PR each catalog table is a global singleton (thus the constructor must be private), but now each database has their own set of catalog tables (a.k.a system_catalogs.h), when creating a new system catalog object, we need to create multiple catalog tables(pg_table, pg_index, etc), thus the constructors must be public now
  • In all the test cases, we require you to use lowercase database_name. Cause many test cases invoke catalog::CreateDatabase(database_name) directly, and we use create sql to build secondary catalog tables(e.g CREATE TABLE database_name.pg_trigger(a int);) which means that this sql statement will pass through parser--> binder-->optimizer-->planner-->executor.And binder/optimizer will convert whatever name to be lowercase. We need to keep both name in sync, otherwise "can't find database" exception will be thrown.
  • PR [15721] Temporary Table #1286 is adding TempSchema
  • I'll push the schema test cases during weekend.

Copy link
Contributor

@poojanilangekar poojanilangekar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @mengranwo for addressing all the comments.
LGTM!

@poojanilangekar
Copy link
Contributor

@tcm-marcel, The same issue again. Travis build is not triggered. Do you think the bug is because of some error in our configuration or a bug on their end?

Copy link
Contributor

@poojanilangekar poojanilangekar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@camellyx Thanks for adding the test!
Overall this PR is ready to be merged in!
If you or @mengranwo can resolve the merge conflicts and address minor issues in the review, I think we can merge this today or tomorrow. Thanks!

@@ -116,40 +117,6 @@ TEST_F(CatalogTests, CreatingTable) {
->GetColumnObject(1)
->GetColumnName());
txn_manager.CommitTransaction(txn);
// EXPECT_EQ(5, time_stamp);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible for you to check the table count? Currently it should be 8, correct? Similarly to the index as well?
So this way we can check that people always account for the additional catalog tables/indexes they create.

@@ -248,7 +249,9 @@ void TestingSQLUtil::ExecuteSQLQueryAndCheckResult(
// Execute query
TestingSQLUtil::ExecuteSQLQuery(std::move(query), result, tuple_descriptor,
rows_changed, error_message);
unsigned int rows = result.size() / tuple_descriptor.size();
unsigned int rows = tuple_descriptor.size() == 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add a PELOTON_ASSERT instead? I mean, ideally we should only have an emptyresult if the tuple_descriptor.size() is 0, right? Or am I missing a corner case?

@@ -23,6 +23,8 @@
#include "executor/update_executor.h"
#include "expression/abstract_expression.h"
#include "expression/operator_expression.h"

#define private public
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please change this as well.

@camellyx
Copy link
Contributor

@poojanilangekar Thanks for your comments. We are working on adding more tests, if possible, once we have time.

@poojanilangekar
Copy link
Contributor

@camellyx Can this PR be merged in for now?
You can add more tests in the subsequent PR. If this gets merged in and I merge in the layout PR #1327, then we can start work on schema change. Or do you want to wait? Let me know what works for you.

@mengranwo
Copy link
Contributor Author

@poojanilangekar I think we are ready to go.

@poojanilangekar
Copy link
Contributor

Thanks @mengranwo!
@pervazea @apavlo Can you please merge this in?

@pervazea pervazea merged commit d68ab71 into cmu-db:master Apr 30, 2018
Dingshilun pushed a commit to sxzh93/peloton that referenced this pull request May 2, 2018
* codegen insert catalog tuple

* fix segfault caused by bad insert plan

* consider cached catalog queries

* add update method using old engine

* add update method using old engine

* add version id column into pg_table

* column id and offset use uint32

* system catalogs

* using system catalog instead of singleton

* remove default constructer

* fix catalog map

* system catalog return value change to shared pointer

* refactoring system catalog, seems to compile now

* refactor fix

* refactor, peloton can start and finish bootstrap. TODO: refactor global unique oid

* fixed pg_database index, still cannot find system catalog in other databases

* fix two misuse of CATALOG_DATABASE_OID

* seems to work now

* fix two tests

* fix drop index bug, need information about database

* change test case, mainly gettablecount

* fix query cache test

* half way fixing trigger test

* trigger catalog

* move trigger_catalog into per database

* can pass trigger test now, also drop trigger when drop table

* fix udf test, bootstrap only once

* fix db names

* initialize nullptr

* TODO: fix GetInstance, fix AddBuiltInFunctions

* roll back pg_proc for now... require refactoring later

* query metrics catalog refactor

* change metrics tables access method

* stats_test still has problem

* fix query metrics bugs, pass test cases in stats_test

* merge local changes

* fix all the test cases

* fix double initialize

* add full functionality of create/drop schema(namespace)

* rebase to latest master branch

* add more comments

* fix test case errors, able to compile, one failing test

* resolve conflicts

* fix iso compile bug

* make changes according to pooja's reviews

* added namespace sql test

* addressing pooja's comments

* fix plan util test
@nwang57
Copy link
Contributor

nwang57 commented May 13, 2018

@mengranwo I think there are typos in src/include/catalog/catalog_defaults.h and other places. DEFUALT_SCHEMA_OID and DEFUALT_SCHEMA_NAME which should be DEFAULT_SCHEMA_OID and DEFAULT_SCHEMA_NAME

mtunique pushed a commit to mtunique/peloton that referenced this pull request Apr 16, 2019
* codegen insert catalog tuple

* fix segfault caused by bad insert plan

* consider cached catalog queries

* add update method using old engine

* add update method using old engine

* add version id column into pg_table

* column id and offset use uint32

* system catalogs

* using system catalog instead of singleton

* remove default constructer

* fix catalog map

* system catalog return value change to shared pointer

* refactoring system catalog, seems to compile now

* refactor fix

* refactor, peloton can start and finish bootstrap. TODO: refactor global unique oid

* fixed pg_database index, still cannot find system catalog in other databases

* fix two misuse of CATALOG_DATABASE_OID

* seems to work now

* fix two tests

* fix drop index bug, need information about database

* change test case, mainly gettablecount

* fix query cache test

* half way fixing trigger test

* trigger catalog

* move trigger_catalog into per database

* can pass trigger test now, also drop trigger when drop table

* fix udf test, bootstrap only once

* fix db names

* initialize nullptr

* TODO: fix GetInstance, fix AddBuiltInFunctions

* roll back pg_proc for now... require refactoring later

* query metrics catalog refactor

* change metrics tables access method

* stats_test still has problem

* fix query metrics bugs, pass test cases in stats_test

* merge local changes

* fix all the test cases

* fix double initialize

* add full functionality of create/drop schema(namespace)

* rebase to latest master branch

* add more comments

* fix test case errors, able to compile, one failing test

* resolve conflicts

* fix iso compile bug

* make changes according to pooja's reviews

* added namespace sql test

* addressing pooja's comments

* fix plan util test
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants