docs/install/cluster.md
46 additions & 24 deletions
@@ -2,9 +2,9 @@
**Note**: *This deployment configuration is currently **experimental** and subject to future updates.*

- This document offers step-by-step instructions for deploying **Graphistry** in a multinode environment using Docker Compose. In this architecture, a **leader** node handles dataset ingestion and manages the single PostgreSQL instance, while **follower** nodes can visualize graphs too using the shared datasets. Currently, only the leader node has permission to upload datasets and files (data ingestion), but future updates will allow follower nodes to also perform dataset and file uploads (data ingestion).
+ This document provides step-by-step instructions for deploying **Graphistry** in a multinode environment using Docker Compose. In this architecture, both the **Leader** and **Follower** nodes can ingest datasets and files, with all nodes accessing the same **PostgreSQL** instance on the **Leader** node. As a result, **Follower** nodes can also perform data uploads, ensuring that both **Leader** and **Follower** nodes have equal access to dataset ingestion and visualization.

- The leader and followers will share datasets using a **Distributed File System**, for example, using the Network File System (NFS) protocol. This setup allows all nodes to access the same dataset directory. This configuration ensures that **Graphistry** can be deployed across multiple machines, each with different GPU configuration profiles (some with more powerful GPUs, enabling multi-GPU on multinode setups), while keeping the dataset storage centralized and synchronized.
+ The leader and followers will share datasets using a **Distributed File System**, for example, using the **Network File System (NFS)** protocol. This setup allows all nodes to access the same dataset directory. This configuration ensures that **Graphistry** can be deployed across multiple machines, each with different **GPU** configuration profiles (some with more powerful GPUs, enabling **multi-GPU** on multinode setups), while keeping the dataset storage centralized and synchronized.

This deployment mode is flexible and can be used both in **on-premises** clusters and in the **cloud**. For example, it should be possible to use **Amazon Machine Images (AMIs)** from the [Graphistry AWS Marketplace](https://aws.amazon.com/marketplace/pp/prodview-ppbjy2nny7xzk?sr=0-1&ref_=beagle&applicationId=AWSMPContessa), assigning Amazon VMs created from those images to the **leader** and **follower** roles. This allows for scalable and customizable cloud-based deployments with the same multinode architecture.
@@ -13,7 +13,7 @@ This deployment mode is flexible and can be used both in **on-premises** cluster
1. **Leader Node**: Handles the ingestion of datasets, PostgreSQL write operations, and exposes the required PostgreSQL ports.
2. **Follower Nodes**: Connect to the PostgreSQL instance on the leader and can visualize graphs using the shared datasets. However, they do not have their own attached PostgreSQL instance.
3. **Shared Dataset**: All nodes will access the dataset directory using a **Distributed File System**. This ensures that the leader and followers use the same dataset, maintaining consistency across all nodes.
- 4. **PostgreSQL**: The PostgreSQL instance on the leader node is used by all nodes in the cluster for querying. The **Nexus** service, which provides the main dashboard for Graphistry, on the **Leader** node is responsible for managing access to the PostgreSQL database. The **Nexus** services on the **follower** nodes will use the PostgreSQL instance of the **Leader**.
+ 4. **PostgreSQL**: The PostgreSQL instance on the **Leader** node is used by all nodes for querying. The **Nexus** service on the **Leader** manages access to the database, while **Follower** nodes also use the **Leader’s** PostgreSQL instance. Both **Leader** and **Follower** nodes can perform actions like user sign-ups and settings modifications through their own **Nexus** dashboards, with changes applied system-wide for consistency across all nodes.

## Configuration File: `cluster.env`
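The contents of `cluster.env` are not shown in this excerpt, and the variable names below are entirely hypothetical rather than Graphistry's actual keys. Purely as a sketch of the idea — one env file telling each machine its role, where the leader is, and where the shared data lives — it might look like:

```bash
# Hypothetical sketch only — consult the cluster.env shipped with your Graphistry release
# for the real variable names and values.
NODE_ROLE=leader                     # "leader" on the main machine, "follower" on the others
LEADER_HOST=<leader-ip-or-hostname>  # address followers use to reach the leader's services
SHARED_DATA_DIR=/mnt/data/shared     # NFS-shared dataset directory available on every node
```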
@@ -64,26 +64,44 @@ NFS will be used to share the dataset directory between nodes. Follow the steps
#### On the Leader Node (Main Machine)

- 1. **Create directories for PostgreSQL data and backups**:
+ 1. **Install NFS server**:
+
+ On the leader node, install the NFS server software:

```bash
+ sudo apt install nfs-kernel-server
+ ```
+
+ This will install the necessary software for serving NFS shares to the follower nodes.
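As an optional sanity check that is not part of the documented steps, you can confirm on the leader that the NFS server was installed and is running; this assumes a systemd-based Ubuntu/Debian host:

```bash
# Optional verification — assumes systemd (Ubuntu/Debian)
sudo systemctl status nfs-kernel-server

# Lists currently exported directories (empty until /etc/exports is configured below)
sudo exportfs -v
```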
+ 2. **Create directories for PostgreSQL and shared data**:
+
+ ```bash
+ # These directories will store PostgreSQL data and backups
mkdir -p /mnt/data/shared/postgresql_data
mkdir -p /mnt/data/shared/postgres_backups
- ```

- These directories will hold the PostgreSQL data and backups, which will be shared with follower nodes.
+ 3. **Set appropriate permissions on the shared directory**:

- On the leader node, install the NFS server software:
+ To ensure the shared directory has the correct permissions and can be written to by NFS clients, it’s important to verify and configure access properly. The user is responsible for ensuring that the shared directory has the necessary permissions to allow remote follower nodes to read, write, and modify files as needed. For instance, you may need to apply the following changes to make sure the shared directory is accessible by NFS clients:

```bash
- sudo apt install nfs-kernel-server
+ # Set permissions to allow full access (read, write, execute) for all users
+ sudo chmod -R 777 /mnt/data/shared/
+
+ # Change ownership to 'nobody:nogroup' for NFS access
+ sudo chown -R nobody:nogroup /mnt/data/shared/
```

- This will install the necessary software for serving NFS shares to the follower nodes.
+ This will allow all users and processes (including the remote follower instances) to read and write to the shared directory, ensuring they can ingest datasets and files. You can adjust these permissions later based on your security requirements.

- 3. **Configure NFS exports**:
+ *Notice: The preceding shared directory permissions are provided as an example. Please ensure the settings align with your security policies.*
+
+ 4. **Configure NFS exports**:

Edit the `/etc/exports` file to specify which directories should be shared and with what permissions. The following configuration allows the follower node (with IP `192.168.0.20`) to mount the shared directory with read/write permissions.
@@ -94,14 +112,17 @@ NFS will be used to share the dataset directory between nodes. Follow the steps
Add the following line to export the shared dataset directory:

- `sync`: Ensures that changes are written to disk before responding to the client.
- `no_subtree_check`: Disables subtree checking to improve performance.
+ - `no_root_squash`: Retains root access for the client’s root user on the shared directory, which can be necessary for certain tasks but should be used with caution due to the elevated permissions.
+
+ *Notice: The preceding NFS configuration is provided as an example. Please ensure the settings align with your security policies.*
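The actual line added to `/etc/exports` falls outside the lines shown in this excerpt. Purely as an illustration — assuming the shared path `/mnt/data/shared` and the follower IP `192.168.0.20` used elsewhere in this guide — an entry combining the options described above might look like this:

```bash
# Hypothetical /etc/exports entry — adjust the path, the client IP (or subnet), and the
# option list to match your environment and security policies
/mnt/data/shared 192.168.0.20(rw,sync,no_subtree_check,no_root_squash)
```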

- 4. **Export the NFS share** and restart the NFS server to apply the changes:
+ 5. **Export the NFS share** and restart the NFS server to apply the changes:

```bash
sudo exportfs -a
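# NOTE: the remainder of the original code block falls outside the lines shown in this diff.
# As a hedged completion (assuming an Ubuntu/Debian host with systemd), restarting the
# NFS server so the new export takes effect typically looks like:
sudo systemctl restart nfs-kernel-server
```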
@@ -110,22 +131,22 @@ NFS will be used to share the dataset directory between nodes. Follow the steps

#### On the Follower Node (Secondary Machine)

- 1. **Create a directory to mount the NFS share**:
+ 1. **Install NFS client**:
+
+ On the follower node, install the NFS client software to mount the NFS share:

```bash
- mkdir -p /home/user1/mnt/data/shared/
+ sudo apt install nfs-common
```

- This is where the shared dataset will be mounted on the follower node.
-
- 2. **Install NFS client**:
-
- On the follower node, install the NFS client software to mount the NFS share:
+ 2. **Create a directory to mount the NFS share**:

```bash
- sudo apt install nfs-common
+ mkdir -p /home/user1/mnt/data/shared/
```

+ This is where the shared dataset will be mounted on the follower node.
3. **Mount the shared NFS directory**:
130
151
131
152
Mount the directory shared by the leader node to the local directory on the follower node:
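The mount command itself is outside the lines shown in this diff. As a sketch only — assuming the leader exports `/mnt/data/shared` and is reachable at `192.168.0.20`, as in the export example above, and that the mount point created in step 2 is used — it might look like:

```bash
# Illustrative only — replace the leader address and paths with the values from your setup
sudo mount -t nfs 192.168.0.20:/mnt/data/shared /home/user1/mnt/data/shared/
```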
@@ -222,12 +243,13 @@ Once the deployment is complete, you can use the leader node to upload datasets,