Prototype to explore serving/viewing zarr data.
This repository contains a simple server application made using Express.
The application has 3 endpoints:
- the endpoint /files/{path-to-zarr}, that serves the content of Zarr files checking user authorization;
- the endpoint /alive, that returns the status of the service;
- the optional endpoint /vizarr, that serves vizarr static files when the VIZARR_STATIC_FILES_PATH environment variable is set.
To run fractal-data you need an active instance of fractal-server and an active instance of fractal-web. You need to log in to fractal-web from the browser using a user that has been authorized to see the vizarr files. Details about authorization are explained in the next section.
When a user logs in to fractal-web, the browser receives a cookie that is generated by fractal-server. The same cookie is sent by the browser to other services on the same domain. The fractal-data service extracts the token contained in the cookie and forwards it back to fractal-server in order to obtain the allowed viewer paths for the user, and then decides whether the user is authorized to retrieve the requested file.
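The final decision in this flow can be pictured as a prefix check of the requested file against the allowed viewer paths. The following is only a minimal sketch under that assumption: the helper names normalizePosixPath and isPathAllowed are hypothetical, and the actual service may normalize or compare paths differently.

```typescript
// Hypothetical helpers: not taken from the fractal-data source code.

// Resolves "." and ".." segments so that a path such as
// "/data/zarr/../secret" cannot be used to escape an allowed folder.
function normalizePosixPath(path: string): string {
  const segments: string[] = [];
  for (const segment of path.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") {
      segments.pop();
    } else {
      segments.push(segment);
    }
  }
  return "/" + segments.join("/");
}

// Returns true when the requested path is one of the allowed paths
// or is located inside one of them. Empty allowed entries are ignored.
function isPathAllowed(requestedPath: string, allowedPaths: string[]): boolean {
  const requested = normalizePosixPath(requestedPath);
  return allowedPaths.some((allowed) => {
    if (allowed.trim() === "") return false;
    const base = normalizePosixPath(allowed);
    if (base === "/") return true;
    // Appending "/" prevents "/data/zarr" from matching "/data/zarr-other".
    return requested === base || requested.startsWith(base + "/");
  });
}
```

Comparing against the allowed path followed by a slash, rather than the bare prefix, avoids granting access to sibling folders that merely share the same leading characters.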
Currently we support 3 different kinds of authorization checks, which can be specified using the AUTHORIZATION_SCHEME environment variable. The service retrieves the user details from the cookie by calling fractal-server and then applies the configured authorization logic. See the environment variables section below for details about the supported authorization schemes.
While in the browser the authentication relies on cookies, which are automatically shared by the browser across fractal services, for command line and desktop applications it is more appropriate to use bearer tokens. For this reason, files exposed by fractal-data can be retrieved using either cookies or tokens. The token can be downloaded from the fractal-web user profile page. The following example shows its usage with curl:
curl -H "Authorization: Bearer $(cat /path/to/fractal-token.txt)" http://localhost:3000/files/path/to/file
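Accepting both mechanisms amounts to looking for the token in two places. The sketch below is an assumption about how that lookup could work, not the service's actual code: the function name extractToken, the RequestHeaders type and the cookie name passed by the caller are all hypothetical.

```typescript
// Hypothetical sketch: names are illustrative assumptions.
interface RequestHeaders {
  authorization?: string;
  cookie?: string;
}

// Looks for a bearer token in the Authorization header first, then falls
// back to the cookie with the given name. Returns undefined if neither is found.
function extractToken(headers: RequestHeaders, cookieName: string): string | undefined {
  const auth = headers.authorization;
  if (auth !== undefined) {
    const match = /^Bearer\s+(\S+)$/i.exec(auth.trim());
    if (match) return match[1];
  }
  const cookieHeader = headers.cookie;
  if (cookieHeader !== undefined) {
    // A Cookie header has the form "name1=value1; name2=value2".
    for (const pair of cookieHeader.split(";")) {
      const index = pair.indexOf("=");
      if (index === -1) continue;
      const name = pair.slice(0, index).trim();
      if (name === cookieName) {
        return pair.slice(index + 1).trim();
      }
    }
  }
  return undefined;
}
```

In this sketch the header takes precedence over the cookie; that ordering is a design choice of the example, not a documented behavior of fractal-data.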
This cookie-based technique can be used only if fractal-server and fractal-data are reachable from the same domain (or different subdomains of the same main domain). The individual applications can be located on different servers, but a common reverse proxy must be used to expose them on the same domain.
If different subdomains are used for fractal-web and fractal-data, the fractal-web environment variable AUTH_COOKIE_DOMAIN must contain the common parent domain.
Example: if fractal-data is served on fractal-data.mydomain.net and fractal-web is served on fractal-web.mydomain.net, then AUTH_COOKIE_DOMAIN must be set to mydomain.net.
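The subdomain rule follows from how browsers match a cookie's Domain attribute against the host of each request. As a rough illustration, the following sketch implements a simplified version of that matching rule; the function name cookieDomainMatches is hypothetical, and real browsers apply additional checks (IP addresses, public suffixes) that are omitted here.

```typescript
// Simplified illustration of cookie domain matching (see RFC 6265, section 5.1.3).
// It ignores the IP-address and public-suffix checks that browsers also apply.
function cookieDomainMatches(host: string, cookieDomain: string): boolean {
  const h = host.toLowerCase();
  // A leading dot in the Domain attribute is ignored.
  const d = cookieDomain.toLowerCase().replace(/^\./, "");
  if (d === "") return false;
  // The host matches if it equals the domain or is a subdomain of it.
  return h === d || h.endsWith("." + d);
}
```

With the example above, a cookie set for mydomain.net matches both fractal-web.mydomain.net and fractal-data.mydomain.net, while a cookie set for fractal-web.mydomain.net would not match fractal-data.mydomain.net.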
If we need to serve these services on different domains a different authentication strategy has to be chosen, for example something token-based. That results in a more complicated setup, possibly involving some extra changes on the vizarr code.
The release packages include the Node.js server and the Vizarr static files. Starting from version 0.1.3, fractal-data releases provide tar.gz files containing built Vizarr static files and a package of built files for each supported node version.
Vizarr static files can be served using any web server, such as Apache or Nginx.
Create a dedicated folder for vizarr on your server. For Apache, it could be /var/www/html/vizarr.
Navigate to the directory and extract the Vizarr static files:
FRACTAL_DATA_VERSION=0.4.0 && wget -qO- "https://github.com/fractal-analytics-platform/fractal-data/releases/download/v${FRACTAL_DATA_VERSION}/fractal-vizarr-v${FRACTAL_DATA_VERSION}.tar.gz" | tar -xz
Note: this will unpack the vizarr dist folder in the current working directory.
Create a folder for the server files, navigate into it, and extract the fractal-data server files:
FRACTAL_DATA_VERSION=0.1.3a0 && NODE_MAJOR_VERSION=20 && wget -qO- "https://github.com/fractal-analytics-platform/fractal-data/releases/download/v${FRACTAL_DATA_VERSION}/node-${NODE_MAJOR_VERSION}-fractal-data-v${FRACTAL_DATA_VERSION}.tar.gz" | tar -xz
Note: this will unpack the file package.json and the folders dist and node_modules in the current working directory.
To start the application installed in this way see the section Run fractal-data from the build folder below.
- PORT: the port where the fractal-data app is served;
- BIND_ADDRESS: specifies the IP address for the server to bind to; use 0.0.0.0 (IPv4) or :: (IPv6) to listen on all interfaces, 127.0.0.1 (IPv4) or ::1 (IPv6) for localhost only; the default value is 0.0.0.0;
- FRACTAL_SERVER_URL: the base URL of fractal-server;
- VIZARR_STATIC_FILES_PATH: path to the files generated by running npm run build in the Vizarr source folder; this variable is optional and, if present, it will be used to serve Vizarr static files from the /vizarr endpoint;
- BASE_PATH: base path of the fractal-data application;
- AUTHORIZATION_SCHEME: defines how the service verifies user authorization. The following options are available:
  - fractal-server: the paths that can be accessed by each user are retrieved by calling the fractal-server API.
  - testing-basic-auth: enables Basic Authentication for testing purposes. The credentials are specified through two additional environment variables: TESTING_USERNAME and TESTING_PASSWORD. This option should not be used in production environments.
  - none: no authorization checks are performed, allowing access to all users, including anonymous ones. This option is useful for demonstrations and testing but should not be used in production environments.
- CACHE_EXPIRATION_TIME: cookie cache TTL in seconds; when user info is retrieved from a cookie by calling the current user endpoint on fractal-server, the information is cached for the specified number of seconds, to reduce the number of calls to fractal-server;
- LOG_LEVEL_CONSOLE: the log level of logs that will be written to the console; the default value is info;
- LOG_FILE: the path of the file where logs will be written; by default it is unset and no file will be created;
- LOG_LEVEL_FILE: the log level of logs that will be written to the file; the default value is info.
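To illustrate the effect of CACHE_EXPIRATION_TIME, here is a minimal sketch of a time-based cache. It shows the general technique only and is not the service's implementation; the class name TtlCache and the injectable clock are hypothetical.

```typescript
// Hypothetical sketch of a TTL cache; not the actual fractal-data code.
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // ttlSeconds plays the role of CACHE_EXPIRATION_TIME.
  // The clock is injectable so that expiry can be tested without waiting.
  constructor(ttlSeconds: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlSeconds * 1000;
    this.now = now;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  // Returns the cached value, or undefined if it is missing or expired.
  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }
}
```

A cache of this kind would be keyed by the cookie or token value, so that repeated requests for the many small files of a Zarr dataset do not each trigger a call to fractal-server. A longer TTL reduces load but delays the effect of permission changes.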
You can create a script with the following content to run fractal-data installed from a release package:
#!/bin/sh
export PORT=3000
export BIND_ADDRESS=0.0.0.0
export FRACTAL_SERVER_URL=http://localhost:8000
export AUTHORIZATION_SCHEME=fractal-server
# default values for logging levels (uncomment if needed)
# export LOG_LEVEL_CONSOLE=info
# export LOG_FILE=/path/to/log
# export LOG_LEVEL_FILE=info
# default values are usually fine for the following variables; remove comments if needed
# export BASE_PATH=/data
# export CACHE_EXPIRATION_TIME=60
node dist/app.js
Note: starting from Node 20 you can also load the environment variables from a file using the --env-file flag:
node --env-file=.env dist/app.js
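However the variables are supplied, the application has to turn them into typed settings. The sketch below shows one way this could be done. The names loadSettings, Settings and parseNonNegativeInt are hypothetical; the defaults for BIND_ADDRESS and BASE_PATH come from this document, while the defaults for PORT and CACHE_EXPIRATION_TIME and the choice of which variables are mandatory are assumptions based on the sample script.

```typescript
// Hypothetical sketch: not the actual fractal-data configuration code.
type AuthorizationScheme = "fractal-server" | "testing-basic-auth" | "none";

interface Settings {
  port: number;
  bindAddress: string;
  fractalServerUrl: string;
  basePath: string;
  authorizationScheme: AuthorizationScheme;
  cacheExpirationTime: number;
}

function isScheme(value: string): value is AuthorizationScheme {
  return value === "fractal-server" || value === "testing-basic-auth" || value === "none";
}

function parseNonNegativeInt(name: string, raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value < 0) {
    throw new Error(`${name} must be a non-negative integer, got "${raw}"`);
  }
  return value;
}

// Builds the settings from a map of environment variables.
function loadSettings(env: Record<string, string | undefined>): Settings {
  const scheme = env["AUTHORIZATION_SCHEME"];
  if (scheme === undefined || !isScheme(scheme)) {
    throw new Error("AUTHORIZATION_SCHEME must be fractal-server, testing-basic-auth or none");
  }
  const fractalServerUrl = env["FRACTAL_SERVER_URL"];
  if (fractalServerUrl === undefined || fractalServerUrl === "") {
    throw new Error("FRACTAL_SERVER_URL is required"); // assumption: treated as mandatory
  }
  return {
    port: parseNonNegativeInt("PORT", env["PORT"], 3000), // assumed default
    bindAddress: env["BIND_ADDRESS"] ?? "0.0.0.0",
    fractalServerUrl,
    basePath: env["BASE_PATH"] ?? "/data",
    authorizationScheme: scheme,
    cacheExpirationTime: parseNonNegativeInt(
      "CACHE_EXPIRATION_TIME", env["CACHE_EXPIRATION_TIME"], 60), // assumed default
  };
}
```

In a Node.js process the function would be called as loadSettings(process.env). Taking the map as a parameter keeps the parsing logic easy to test.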
Create a folder (e.g. zarr-files) that will contain the zarr files served by fractal-data. This folder has to be added to the allowed viewer paths exposed by the fractal-server API, for example by setting it as the project_dir for a given user.
You can fill the folder with some test data using the following commands:
mkdir -p zarr-files
cd zarr-files
wget -O 20200812-CardiomyocyteDifferentiation14-Cycle1_mip.zarr.zip "https://zenodo.org/records/10424292/files/20200812-CardiomyocyteDifferentiation14-Cycle1_mip.zarr.zip?download=1"
unzip 20200812-CardiomyocyteDifferentiation14-Cycle1_mip.zarr.zip
Log in to fractal-web and then, in another tab, open the following URL to display the example dataset:
Add an Apache configuration to expose the fractal-data service on a given path of the public server. The specified location must have the same value as the fractal-data BASE_PATH environment variable (the default value is /data).
<Location /data>
ProxyPass http://127.0.0.1:3000/data
ProxyPassReverse http://127.0.0.1:3000/data
</Location>
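The reason for this constraint is that the proxy forwards the path unchanged, so the application must expect the same prefix in front of its endpoints. Assuming the endpoints are mounted under BASE_PATH, as the Apache example suggests, the resulting URL paths could be computed as in the following sketch; withBasePath is a hypothetical helper, not part of fractal-data.

```typescript
// Hypothetical sketch: joins BASE_PATH with an endpoint path,
// avoiding duplicated or missing slashes.
function withBasePath(basePath: string, endpoint: string): string {
  const base = basePath.replace(/\/+$/, ""); // drop trailing slashes
  const rest = endpoint.replace(/^\/+/, ""); // drop leading slashes
  const prefix = base === "" || base.startsWith("/") ? base : "/" + base;
  return `${prefix}/${rest}`;
}
```

Under that assumption, withBasePath("/data", "/alive") gives /data/alive, which is the path that Apache forwards when the Location is /data. If the Location and BASE_PATH differ, the forwarded path does not correspond to any endpoint.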
Add a systemd unit file in /etc/systemd/system/fractal-data.service:
[Unit]
Description=Fractal Data service
After=syslog.target
[Service]
User=fractal
Environment="PORT=3000"
Environment="BIND_ADDRESS=0.0.0.0"
Environment="FRACTAL_SERVER_URL=https://fractal-server.example.com/"
Environment="BASE_PATH=/data"
Environment="AUTHORIZATION_SCHEME=fractal-server"
Environment="CACHE_EXPIRATION_TIME=60"
Environment="LOG_FILE=/path/to/log"
Environment="LOG_LEVEL_FILE=info"
ExecStart=/path/to/node /path/to/fractal-data/dist/app.js
Restart=on-failure
RestartSec=5s
[Install]
WantedBy=multi-user.target
Enable the service and start it:
sudo systemctl enable fractal-data
sudo systemctl start fractal-data
Get and install the fractal-data application:
git clone https://github.com/fractal-analytics-platform/fractal-data.git
cd fractal-data
npm install
Copy the file .env.example to .env and customize values for the environment variables.
In order to display a proper error message related to the missing authorization it is necessary to use a modified version of vizarr.
Note: for simplicity, we assume that fractal-data and vizarr are subfolders of the same folder:
git clone https://github.com/hms-dbmi/vizarr.git
cd vizarr
git checkout eb2b77fed92a08c78c5770144bc7ccf19e9c7658
npx pnpm install
npx pnpm run build
The output is located in the dist folder.
Then go back to the fractal-data folder and run npm run start to start the project. The server will start on port 3000. Remember to set the VIZARR_STATIC_FILES_PATH environment variable to serve Vizarr static files from the /vizarr endpoint. Vizarr static files need to be served from the same port and domain as the fractal-data service, otherwise you will encounter CORS issues.
It is possible to use the /alive endpoint to check if the service is up and running and retrieve its version.
The following script can be used to build and start a docker image for testing:
#!/bin/sh
COMMIT_HASH=$(git rev-parse HEAD)
IMAGE_NAME="fractal-data-$COMMIT_HASH"
docker build . -t "$IMAGE_NAME"
docker run --network host \
-v /tmp/zarr-files:/zarr-files \
-e FRACTAL_SERVER_URL=http://localhost:8000 \
-e AUTHORIZATION_SCHEME=fractal-server \
"$IMAGE_NAME"
For production, replace the --network host option with a proper published port (-p 3000:3000) and set FRACTAL_SERVER_URL to a URL that uses a public domain.