Commit eb34902

AUTO: Sync ScalarDB docs in English to docs site repo (#1508)
Co-authored-by: josh-wong <[email protected]>
1 parent 41a4beb commit eb34902

1 file changed: +163 -6 lines changed

docs/scalardb-analytics/reference-data-source.mdx

Lines changed: 163 additions & 6 deletions
@@ -23,7 +23,7 @@ Data sources are registered to catalogs using the CLI with data source registrat
 {
   "catalog": "<catalog-name>", // The catalog to register the data source in
   "name": "<data-source-name>", // A unique name for this data source
-  "type": "<database-type>", // Database type: postgres, mysql, scalardb, sqlserver, oracle, dynamodb
+  "type": "<database-type>", // Database type: postgres, mysql, scalardb, sqlserver, oracle, dynamodb, databricks, snowflake
   "provider": {
     // Type-specific connection configuration
     // Configuration varies by database type
@@ -268,6 +268,106 @@ The following configurations are for SQL Server.
 ```

 </TabItem>
+<TabItem value="databricks" label="Databricks">
+
+<h3>Configuration</h3>
+
+The following configurations are for Databricks (Databricks SQL/JDBC).
+
+<h4>`host`</h4>
+
+- **Field:** `host`
+- **Description:** Databricks workspace hostname (for example, `adb-1234567890123.4.azuredatabricks.net`).
+
+<h4>`port`</h4>
+
+- **Field:** `port`
+- **Description:** Port number.
+- **Default value:** Driver default. (Optional)
+
+<h4>`httpPath`</h4>
+
+- **Field:** `httpPath`
+- **Description:** HTTP path of your SQL warehouse or cluster (for example, `/sql/1.0/warehouses/xxxxxxxxxxxxxx`).
+
+<h4>`oAuthClientId`</h4>
+
+- **Field:** `oAuthClientId`
+- **Description:** OAuth client ID for Databricks SQL/JDBC authentication.
+
+<h4>`oAuthSecret`</h4>
+
+- **Field:** `oAuthSecret`
+- **Description:** OAuth client secret for Databricks SQL/JDBC authentication.
+
+<h4>`catalog`</h4>
+
+- **Field:** `catalog`
+- **Description:** Default catalog to use. (Optional)
+
+<h3>Example</h3>
+
+```json
+{
+  "catalog": "production",
+  "name": "databricks_analytics",
+  "type": "databricks",
+  "provider": {
+    "host": "adb-1234567890123.4.azuredatabricks.net",
+    "port": 443,
+    "httpPath": "/sql/1.0/warehouses/xxxxxxxxxxxxxx",
+    "oAuthClientId": "YOUR_CLIENT_ID",
+    "oAuthSecret": "YOUR_CLIENT_SECRET",
+    "catalog": "main"
+  }
+}
+```
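
Reviewer note: the field list added above can be checked mechanically before a definition is handed to the CLI. The following is a minimal sketch, not part of the commit. It assumes that every provider field not marked "(Optional)" in the new reference text is required; `build_databricks_source` is a hypothetical helper, not a ScalarDB Analytics API.

```python
import json

# Field lists are read off the "(Optional)" markers in the added reference text.
# Treating every unmarked field as required is an assumption of this sketch.
DATABRICKS_REQUIRED = ("host", "httpPath", "oAuthClientId", "oAuthSecret")
DATABRICKS_OPTIONAL = ("port", "catalog")


def build_databricks_source(catalog, name, provider):
    """Return a registration document for a Databricks data source (hypothetical helper)."""
    missing = [field for field in DATABRICKS_REQUIRED if not provider.get(field)]
    if missing:
        raise ValueError("missing provider fields: " + ", ".join(missing))
    allowed = set(DATABRICKS_REQUIRED) | set(DATABRICKS_OPTIONAL)
    unknown = sorted(set(provider) - allowed)
    if unknown:
        raise ValueError("unknown provider fields: " + ", ".join(unknown))
    return {
        "catalog": catalog,
        "name": name,
        "type": "databricks",
        "provider": dict(provider),
    }


# "port" and the provider-level "catalog" are omitted here because the
# reference marks both as optional.
definition = build_databricks_source(
    "production",
    "databricks_analytics",
    {
        "host": "adb-1234567890123.4.azuredatabricks.net",
        "httpPath": "/sql/1.0/warehouses/xxxxxxxxxxxxxx",
        "oAuthClientId": "YOUR_CLIENT_ID",
        "oAuthSecret": "YOUR_CLIENT_SECRET",
    },
)
print(json.dumps(definition, indent=2))
```

Note that the top-level `catalog` (the ScalarDB Analytics catalog to register in) and the provider-level `catalog` (the Databricks default catalog) are different settings that happen to share a key name.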
+
+</TabItem>
+<TabItem value="snowflake" label="Snowflake">
+
+<h3>Configuration</h3>
+
+The following configurations are for Snowflake.
+
+<h4>`account`</h4>
+
+- **Field:** `account`
+- **Description:** Snowflake account identifier (for example, `xy12345.ap-northeast-1`).
+
+<h4>`username`</h4>
+
+- **Field:** `username`
+- **Description:** Database user.
+
+<h4>`password`</h4>
+
+- **Field:** `password`
+- **Description:** Database password.
+
+<h4>`database`</h4>
+
+- **Field:** `database`
+- **Description:** Default database to resolve/import. (Optional)
+
+<h3>Example</h3>
+
+```json
+{
+  "catalog": "production",
+  "name": "snowflake_dwh",
+  "type": "snowflake",
+  "provider": {
+    "account": "YOUR-ACCOUNT",
+    "username": "analytics_user",
+    "password": "secure_password",
+    "database": "ANALYTICS"
+  }
+}
+```
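
Reviewer note: the Snowflake example above embeds the password directly in the JSON. One way to keep it out of a checked-in file is to assemble the definition at registration time. The sketch below is illustrative only: `snowflake_source_from_env` and the `SNOWFLAKE_PASSWORD` variable name are hypothetical, and the commit does not say whether the CLI supports any secret substitution of its own.

```python
import json
import os


def snowflake_source_from_env(catalog, name, account, username, database=None, env=None):
    """Build a Snowflake registration document, reading the password from the environment."""
    env = os.environ if env is None else env
    # SNOWFLAKE_PASSWORD is a variable name invented for this sketch.
    password = env.get("SNOWFLAKE_PASSWORD")
    if not password:
        raise RuntimeError("SNOWFLAKE_PASSWORD is not set")
    provider = {"account": account, "username": username, "password": password}
    # "database" is marked "(Optional)" in the reference, so it is added only when given.
    if database is not None:
        provider["database"] = database
    return {"catalog": catalog, "name": name, "type": "snowflake", "provider": provider}


# An explicit mapping stands in for os.environ so that the example is self-contained.
definition = snowflake_source_from_env(
    "production",
    "snowflake_dwh",
    account="YOUR-ACCOUNT",
    username="analytics_user",
    database="ANALYTICS",
    env={"SNOWFLAKE_PASSWORD": "example-only"},
)
print(json.dumps(definition, indent=2))
```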
+
+</TabItem>
+
 <TabItem value="dynamodb" label="DynamoDB">

 <h3>Configuration</h3>
@@ -397,7 +497,7 @@ When registering a data source to ScalarDB Analytics, the catalog structure of t

 The catalog-level mappings are the mappings of the namespace names, table names, and column names from the data sources to the universal data catalog. To see the catalog-level mappings in each data source, select a data source.

-<Tabs groupId="data-source" queryString>
+<Tabs groupId="data-source-type" queryString>
 <TabItem value="scalardb" label="ScalarDB" default>
 The catalog structure of ScalarDB is automatically resolved by ScalarDB Analytics. The catalog-level objects are mapped as follows:

@@ -406,8 +506,7 @@ The catalog-level mappings are the mappings of the namespace names, table names,
 - The ScalarDB column is mapped to the column.

 </TabItem>
-
-<TabItem value="postgresql" label="PostgreSQL" default>
+<TabItem value="postgresql" label="PostgreSQL">
 The catalog structure of PostgreSQL is automatically resolved by ScalarDB Analytics. The catalog-level objects are mapped as follows:

 - The PostgreSQL schema is mapped to the namespace. Therefore, the namespace of the PostgreSQL data source is always single level, consisting of only the schema name.
@@ -477,7 +576,7 @@ The catalog-level mappings are the mappings of the namespace names, table names,
 <TabItem value="sql-server" label="SQL Server">
 The catalog structure of SQL Server is automatically resolved by ScalarDB Analytics. The catalog-level objects are mapped as follows:

-- The SQL Server database and schema are mapped to the namespace together. Therefore, the namespace of the SQL Server data source is always two-level, consisting of the database name and the schema name.
+- Each SQL Server database-schema pair is mapped to a namespace in ScalarDB Analytics. Therefore, the namespace of the SQL Server data source is always two-level, consisting of the database name and the schema name.
 - Only user-defined databases are mapped to namespaces. The following system databases are ignored:
   - `sys`
   - `guest`
@@ -499,6 +598,28 @@ The catalog-level mappings are the mappings of the namespace names, table names,
 - The SQL Server table is mapped to the table.
 - The SQL Server column is mapped to the column.

+</TabItem>
+<TabItem value="databricks" label="Databricks">
+The catalog structure of Databricks is automatically resolved by ScalarDB Analytics. The catalog-level objects are mapped as follows:
+
+- Each Databricks catalog-schema pair is mapped to a namespace in ScalarDB Analytics. Therefore, the namespace of the Databricks data source always has two levels, consisting of the catalog name and the schema name.
+- The following system catalogs/schemas are ignored:
+  - **Catalogs:** `system`
+  - **Schemas:** `information_schema`, `global_temp`, `sys`, `routines`
+- The Databricks table is mapped to the table.
+- The Databricks column is mapped to the column.
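
Reviewer note: the Databricks mapping rules above can be read as a simple filter over catalog-schema pairs. The sketch below illustrates that reading only and is not the ScalarDB Analytics implementation. It assumes that a pair is skipped when either its catalog or its schema appears in the ignored lists, and that names are compared exactly; the added text does not spell out either point.

```python
# Ignored names are taken from the list added in this commit.
SYSTEM_CATALOGS = {"system"}
SYSTEM_SCHEMAS = {"information_schema", "global_temp", "sys", "routines"}


def to_namespaces(catalog_schema_pairs):
    """Map Databricks (catalog, schema) pairs to two-level namespaces."""
    namespaces = []
    for catalog, schema in catalog_schema_pairs:
        # Assumption: a pair is ignored if either part is a system object.
        if catalog in SYSTEM_CATALOGS or schema in SYSTEM_SCHEMAS:
            continue
        namespaces.append((catalog, schema))
    return namespaces


pairs = [
    ("main", "sales"),
    ("main", "information_schema"),
    ("system", "billing"),
]
print(to_namespaces(pairs))  # [('main', 'sales')]
```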
+
+</TabItem>
+<TabItem value="snowflake" label="Snowflake">
+The catalog structure of Snowflake is automatically resolved by ScalarDB Analytics. The catalog-level objects are mapped as follows:
+
+- Each Snowflake database-schema pair is mapped to a namespace in ScalarDB Analytics. Therefore, the namespace of the Snowflake data source always has two levels, consisting of the database name and the schema name.
+- The following system databases/schemas are ignored:
+  - **Databases:** `SNOWFLAKE`
+  - **Schemas:** `INFORMATION_SCHEMA`
+- The Snowflake table is mapped to the table.
+- The Snowflake column is mapped to the column.
+
 </TabItem>
 <TabItem value="dynamodb" label="DynamoDB">
 Since DynamoDB is schema-less, you need to specify the catalog structure explicitly when registering a DynamoDB data source by using the following format JSON:
@@ -670,6 +791,43 @@ Columns with data types that are not included in the mapping tables below will b
 | `smalldatetime` | `TIMESTAMP` |
 | `datetimeoffset` | `TIMESTAMPTZ` |

+</TabItem>
+<TabItem value="databricks" label="Databricks">
+
+| **Databricks SQL Data Type** | **ScalarDB Analytics Data Type** |
+| :--------------------------- | :---------------------------------------------------------------------------------- |
+| `TINYINT` | `SMALLINT` |
+| `SMALLINT` | `SMALLINT` |
+| `INT` / `INTEGER` | `INT` |
+| `BIGINT` | `BIGINT` |
+| `FLOAT` | `FLOAT` |
+| `DOUBLE` | `DOUBLE` |
+| `DECIMAL(p,0)` | `BYTE` (p ≤ 2), `SMALLINT` (3–4), `INT` (5–9), `BIGINT` (10–18), `DECIMAL` (p > 18) |
+| `STRING` / `VARCHAR` | `TEXT` |
+| `BINARY` | `BLOB` |
+| `BOOLEAN` | `BOOLEAN` |
+| `DATE` | `DATE` |
+| `TIMESTAMP` | `TIMESTAMPTZ` |
+| `TIMESTAMP_NTZ` | `TIMESTAMP` |
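
Reviewer note: the `DECIMAL(p,0)` row packs five precision ranges into one cell. Written out as a function, the thresholds in that row look like the sketch below. It covers only the scale-zero case shown in the table, and `map_decimal_scale_zero` is a hypothetical name, not a ScalarDB Analytics function.

```python
def map_decimal_scale_zero(precision):
    """Return the ScalarDB Analytics type listed for DECIMAL(p,0) at the given precision."""
    if precision < 1:
        raise ValueError("precision must be a positive integer")
    if precision <= 2:
        return "BYTE"
    if precision <= 4:
        return "SMALLINT"
    if precision <= 9:
        return "INT"
    if precision <= 18:
        return "BIGINT"
    # Precisions above 18 are listed as DECIMAL in the table.
    return "DECIMAL"


# Print the mapping at each boundary listed in the table row.
for p in (2, 3, 4, 5, 9, 10, 18, 19):
    print(p, map_decimal_scale_zero(p))
```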
+
+</TabItem>
+<TabItem value="snowflake" label="Snowflake">
+
+| **Snowflake Data Type** | **ScalarDB Analytics Data Type** |
+| :--------------------------------------------------------------------------------------------------------------------------- | :---------------------------------------------------------------------------------- |
+| `NUMBER(p,0)` | `BYTE` (p ≤ 2), `SMALLINT` (3–4), `INT` (5–9), `BIGINT` (10–18), `DECIMAL` (p > 18) |
+| `NUMBER` / `NUMERIC` | `DECIMAL` |
+| `INT` / `INTEGER` / `BIGINT` / `SMALLINT` / `TINYINT` / `BYTEINT` | `DECIMAL` |
+| `FLOAT` / `FLOAT4` / `FLOAT8` / `DOUBLE` / `DOUBLE PRECISION` / `REAL` | `DOUBLE` |
+| `VARCHAR` / `STRING` / `TEXT` / `NVARCHAR` / `NVARCHAR2` / `CHAR VARYING` / `NCHAR VARYING` / `CHAR` / `CHARACTER` / `NCHAR` | `TEXT` |
+| `BINARY` / `VARBINARY` | `BLOB` |
+| `BOOLEAN` | `BOOLEAN` |
+| `DATE` | `DATE` |
+| `TIME` | `TIME` |
+| `TIMESTAMP_NTZ` / `DATETIME` | `TIMESTAMP` |
+| `TIMESTAMP_LTZ` | `TIMESTAMPTZ` |
+| `TIMESTAMP_TZ` | `TIMESTAMPTZ` |
+
 </TabItem>
 <TabItem value="dynamodb" label="DynamoDB">

@@ -694,4 +852,3 @@ DynamoDB complex data types (String Set, Number Set, Binary Set, List, Map) are

 </TabItem>
 </Tabs>
-