docs/scalardb-analytics/create-scalardb-analytics-catalog.mdx (7 additions, 3 deletions)
@@ -18,7 +18,7 @@ Catalog information is managed by a component called a ScalarDB Analytics server
### Prerequisites
-The ScalarDB Analytics server requires a database to store catalog information. We refer to this database as the **metadata database** throughout this documentation. ScalarDB Analytics supports the following databases for the metadata database:
+The ScalarDB Analytics server requires a database to store catalog information. This database is referred to as the **metadata database** throughout this documentation. ScalarDB Analytics supports the following databases for the metadata database:
-For production deployments, configure metering storage to use object storage (for example, Amazon S3, Google Cloud Storage, or Azure Blob Storage) instead of the local filesystem. For detailed configuration options, see the [Configuration reference](./configurations.mdx).
+For production deployments, configure metering storage to use object storage (for example, Amazon S3, Google Cloud Storage, or Azure Blob Storage) instead of the local filesystem. For detailed configuration options, see [ScalarDB Analytics Configurations](./configurations.mdx).
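As an aside for readers wiring this up, a minimal sketch of what such a configuration could look like in the server's properties file is shown below. The property keys and values are assumptions for illustration only; the linked ScalarDB Analytics Configurations page is the authoritative reference.

```properties
# Hypothetical property names -- verify the actual keys and accepted values
# in the ScalarDB Analytics Configurations reference for your version.
scalar.db.analytics.server.metering.storage.provider=aws-s3
scalar.db.analytics.server.metering.storage.containerName=my-metering-bucket
scalar.db.analytics.server.metering.storage.accessKeyId=<ACCESS_KEY_ID>
scalar.db.analytics.server.metering.storage.secretAccessKey=<SECRET_ACCESS_KEY>
```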
ScalarDB Analytics CLI is a command-line tool that communicates with the ScalarDB Analytics server to manage catalogs, register data sources, and perform administrative tasks.
+For details, see the [ScalarDB Analytics CLI Command Reference](./reference-cli-command.mdx).
### Install the CLI
The `scalardb-analytics-cli` tool is available as a container image:
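The image reference that follows this line in the docs is not included in the diff, but as a rough sketch, running the containerized CLI typically looks something like the command below. The image path, config mount, `-c` flag, and `catalog list` subcommand are assumptions; check the CLI command reference linked above for the real invocation.

```console
# Illustrative invocation -- image name, config path, flag, and subcommand
# are assumptions; consult the CLI command reference for the actual usage.
docker run --rm \
  -v "$PWD/client.properties:/config/client.properties:ro" \
  ghcr.io/scalar-labs/scalardb-analytics-cli:<VERSION> \
  -c /config/client.properties catalog list
```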
docs/scalardb-analytics/deployment.mdx (5 additions, 4 deletions)
@@ -10,13 +10,14 @@ import TabItem from "@theme/TabItem";
# Deploy ScalarDB Analytics in Public Cloud Environments
This guide explains how to deploy ScalarDB Analytics in a public cloud environment. ScalarDB Analytics consists of two main components: a ScalarDB Analytics server and Apache Spark. In this guide, you can choose either Amazon EMR or Databricks for the Spark environment.
For details about ScalarDB Analytics, refer to [ScalarDB Analytics Design](./design.mdx).
-## Deploy ScalarDB Analytics catalog server
+## Deploy ScalarDB Analytics server
-ScalarDB Analytics requires a catalog server to manage metadata and data source connections. The catalog server should be deployed by using Helm charts on a Kubernetes cluster.
+ScalarDB Analytics requires a catalog server to manage metadata and data source connections. The catalog server should be deployed by using Helm Charts on a Kubernetes cluster.
-For detailed deployment instructions, see [TBD - Helm chart deployment guide].
+For detailed deployment instructions, see [How to install Scalar products through AWS Marketplace](../scalar-kubernetes/AwsMarketplaceGuide?products=scalardb-analytics-server).
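For orientation, deploying the server with Helm generally follows the usual chart workflow sketched below. The repository URL is the public Scalar Helm Charts repo, but the chart name and values file are assumptions; the linked installation guide has the exact steps.

```console
# Sketch only -- the chart name and values file are assumptions; follow the
# linked installation guide for the exact chart and required configuration.
helm repo add scalar-labs https://scalar-labs.github.io/helm-charts
helm repo update
helm install scalardb-analytics-server scalar-labs/scalardb-analytics-server \
  -f scalardb-analytics-server-values.yaml
```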
After deploying the catalog server, note the following information for Spark configuration:
- `<CATALOG_NAME>`: The name of the catalog. This must match a catalog created on the ScalarDB Analytics server.
-- `<CATALOG_SERVER_HOST>`: The host address of your ScalarDB Analytics catalog server.
+- `<CATALOG_SERVER_HOST>`: The host address of your ScalarDB Analytics server.
4. Add the library of ScalarDB Analytics to the launched cluster as a Maven dependency. For details on how to add the library, refer to the [Databricks cluster libraries documentation](https://docs.databricks.com/en/libraries/cluster-libraries.html).
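To show where `<CATALOG_NAME>` and `<CATALOG_SERVER_HOST>` end up, a hedged sketch of the corresponding Spark configuration is given below. The catalog implementation class, property keys, and port numbers are assumptions for illustration; the ScalarDB Analytics configuration and Spark setup docs define the real ones.

```properties
# Illustrative spark-defaults.conf entries. The class name, property keys,
# and ports are assumptions -- confirm them in the official documentation.
spark.sql.catalog.<CATALOG_NAME>                       com.scalar.db.analytics.spark.ScalarCatalog
spark.sql.catalog.<CATALOG_NAME>.server.host           <CATALOG_SERVER_HOST>
spark.sql.catalog.<CATALOG_NAME>.server.catalog.port   11051
spark.sql.catalog.<CATALOG_NAME>.server.metering.port  11052
```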
docs/scalardb-analytics/design.mdx (3 additions, 3 deletions)
@@ -50,12 +50,12 @@ graph TD
The following are definitions for those levels:
- **Catalog** is a folder that contains all your data source information. For example, you might have one catalog called `analytics_catalog` for your analytics data and another called `operational_catalog` for your day-to-day operations.
-- **Data source** represents each data source you connect to. For each data source, we store important information like:
+- **Data source** represents each data source you connect to. For each data source, ScalarDB Analytics stores important information like:
- What kind of data source it is (PostgreSQL, Cassandra, etc.)
- How to connect to it (connection details and passwords)
- Special features the data source supports (like transactions)
- **Namespace** is like a subfolder within your data source that groups related tables together. In PostgreSQL these are called schemas, in Cassandra they're called keyspaces. You can have multiple levels of namespaces, similar to having folders within folders.
-- **Table** is where your actual data lives. For each table, we keep track of:
+- **Table** is where your actual data lives. For each table, ScalarDB Analytics keeps track of:
- What columns it has
- What type of data each column can store
- Whether columns can be empty (null)
@@ -95,7 +95,7 @@ When registering a data source to ScalarDB Analytics, two types of mappings occu
1. **Catalog structure mapping**: The data source's catalog information (namespaces, tables, and columns) is resolved and mapped to the universal data catalog structure
2. **Data type mapping**: Native data types from each data source are mapped to the universal data types listed above
-These mappings ensure compatibility and consistency across different database systems. For detailed information about how specific databases are mapped, see [Catalog information mappings by data source](./design.mdx#catalog-information-mappings-by-data-source).
+These mappings ensure compatibility and consistency across different database systems. For detailed information about how specific databases are mapped, see [Catalog structure mappings by data source](./reference-data-source.mdx#catalog-structure-mappings-by-data-source).
Please replace `<path-to-json>` with the file path to your data source registration file.
-The `register` command requires a data source registration file. The file format is described in the [Data source configuration](#data-source-configuration) section below.
+The `register` command requires a data source registration file. The file format is described in the [Data source registration file format](reference-data-source.mdx#data-source-registration-file-format) section below.
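To make the shape of that file concrete, the sketch below shows one plausible registration file for a PostgreSQL data source. The field names, values, and nesting are assumptions for illustration; the linked data source registration file format reference is authoritative.

```json
{
  "catalog": "my_catalog",
  "name": "my_postgres_source",
  "type": "postgresql",
  "provider": {
    "host": "postgres.example.com",
    "port": 5432,
    "username": "analytics_user",
    "password": "<PASSWORD>",
    "database": "sampledb"
  }
}
```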
docs/scalardb-analytics/run-analytical-queries.mdx (3 additions, 9 deletions)
@@ -20,13 +20,7 @@ This section describes the prerequisites, setting up ScalarDB Analytics in the S
### Prerequisites
- **ScalarDB Analytics server:** A running instance that manages catalog information and connects to your data sources. The server must be set up with at least one data source registered. For registering data sources, see [Create a ScalarDB Analytics Catalog](./create-scalardb-analytics-catalog.mdx).
-- **Apache Spark:** A compatible version of Apache Spark. For supported versions, see [Version compatibility](#version-compatibility). If you don't have Spark installed yet, please download the Spark distribution from [Apache's website](https://spark.apache.org/downloads.html).
-:::note
-Apache Spark is built with either Scala 2.12 or Scala 2.13. ScalarDB Analytics supports both versions. You need to be sure which version you are using so that you can select the correct version of ScalarDB Analytics later. For more details, see [Version compatibility](#version-compatibility).
-:::
+- **Apache Spark:** A compatible version of Apache Spark. For supported versions, see [Spark](../requirements.mdx#spark). If you don't have Spark installed yet, please download the Spark distribution from [Apache's website](https://spark.apache.org/downloads.html).
### Set up ScalarDB Analytics in the Spark configuration
@@ -116,7 +110,7 @@ Depending on your environment, you may not be able to use all the methods mentio
:::
-With all these methods, you can refer to tables in ScalarDB Analytics by using the same table identifier format. For details about how ScalarDB Analytics maps catalog information from data sources, see [Catalog information reference](./reference-data-source.mdx#catalog-information-reference).
+With all these methods, you can refer to tables in ScalarDB Analytics by using the same table identifier format. For details about how ScalarDB Analytics maps catalog information from data sources, see [Catalog structure mappings by data source](./reference-data-source.mdx#catalog-structure-mappings-by-data-source).
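As a concrete illustration of that identifier format, a Spark SQL query would look roughly like the sketch below. The four-part `catalog.data_source.namespace.table` layout is inferred from the catalog hierarchy in the design document, and all names are placeholders.

```sql
-- Placeholder names; the catalog.data_source.namespace.table layout is
-- inferred from the catalog hierarchy described in the design document.
SELECT *
FROM my_catalog.my_postgres_source.public.orders
LIMIT 10;
```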