fractal-data

Prototype to explore serving/viewing zarr data.

This repository contains a simple server application made using Express.

The application has 3 endpoints:

  • the endpoint /files/{path-to-zarr}, which serves the content of Zarr files after checking user authorization.
  • the endpoint /alive, which returns the status of the service.
  • the optional endpoint /vizarr, which serves vizarr static files when the VIZARR_STATIC_FILES_PATH environment variable is set.

To run fractal-data you need an active instance of fractal-server and an active instance of fractal-web. You must log in to fractal-web from the browser as a user who has been authorized to see the vizarr files. Authorization is explained in the next section.

How it works

When a user logs in to fractal-web, the browser receives a cookie generated by fractal-server. The browser sends the same cookie to other services on the same domain. The fractal-data service extracts the token contained in the cookie and forwards it to fractal-server to obtain the allowed viewer paths for the user, then decides whether the user is authorized to retrieve the requested file:

[Figure: Fractal Data cookie flow]

Currently we support 3 kinds of authorization checks, selected through the AUTHORIZATION_SCHEME environment variable. The service uses the cookie to retrieve the user details from fractal-server and then applies the configured authorization logic. See the environment variables section below for details about the supported authorization schemes.
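The final decision step of the fractal-server scheme can be sketched as a path-containment check. This is only an illustration: the names normalizePath and isPathAllowed and the normalization rules are assumptions, not code taken from fractal-data.

```typescript
// Hypothetical helpers: NOT taken from the fractal-data source code.

// Collapses repeated slashes and resolves "." and ".." segments, so that
// "/zarr-files/../secret" cannot slip past a simple prefix comparison.
function normalizePath(path: string): string {
  const segments: string[] = [];
  for (const segment of path.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") {
      segments.pop();
    } else {
      segments.push(segment);
    }
  }
  return "/" + segments.join("/");
}

// Returns true when the requested path is one of the allowed viewer paths
// or lies inside one of them.
function isPathAllowed(requested: string, allowedPaths: string[]): boolean {
  const target = normalizePath(requested);
  return allowedPaths.some((allowed) => {
    const base = normalizePath(allowed);
    if (base === "/") return true;
    // The trailing "/" prevents "/zarr-files-private" from matching "/zarr-files".
    return target === base || target.startsWith(base + "/");
  });
}
```

Comparing against the allowed path plus a trailing slash, rather than the bare prefix, is the design choice that keeps sibling folders with a similar name out of reach.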

Accessing files using the token

In the browser, authentication relies on cookies, which are automatically shared across fractal services. For command line and desktop applications, bearer tokens are more appropriate. For this reason, files exposed by fractal-data can be retrieved using either cookies or tokens. The token can be downloaded from the fractal-web user profile page. The following example shows its usage with curl:

curl -H "Authorization: Bearer $(cat /path/to/fractal-token.txt)" http://localhost:3000/files/path/to/file
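On the receiving side, accepting both mechanisms amounts to looking for the token in two places. The sketch below is a guess at how that could look, not the actual fractal-data code: extractToken and RequestHeaders are hypothetical names, and the cookie name fastapiusersauth is an assumption that should be checked against your fractal-server deployment.

```typescript
// ASSUMPTION: name of the authentication cookie set by fractal-server.
const AUTH_COOKIE_NAME = "fastapiusersauth";

// Hypothetical minimal view of the incoming request headers.
type RequestHeaders = { authorization?: string; cookie?: string };

// Returns the token from "Authorization: Bearer <token>" if present,
// otherwise from the authentication cookie, otherwise undefined.
function extractToken(headers: RequestHeaders): string | undefined {
  const auth = headers.authorization;
  if (auth !== undefined) {
    const match = /^Bearer\s+(\S+)$/i.exec(auth.trim());
    if (match) return match[1];
  }
  // The Cookie header is a "; "-separated list of name=value pairs.
  for (const pair of (headers.cookie ?? "").split(";")) {
    const index = pair.indexOf("=");
    if (index === -1) continue;
    const name = pair.slice(0, index).trim();
    if (name === AUTH_COOKIE_NAME) {
      return pair.slice(index + 1).trim();
    }
  }
  return undefined;
}
```

In this sketch the Authorization header takes precedence over the cookie; the real service may order the two differently.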

Note about the domain constraint

This cookie-based technique can be used only if fractal-server and fractal-data are reachable from the same domain (or from different subdomains of the same main domain). The individual applications can be located on different servers, but a common reverse proxy must be used to expose them on the same domain.

If different subdomains are used for fractal-web and fractal-data, the fractal-web environment variable AUTH_COOKIE_DOMAIN must contain the common parent domain.

Example: if fractal-data is served on fractal-data.mydomain.net and fractal-web is served on fractal-web.mydomain.net, then AUTH_COOKIE_DOMAIN must be set to mydomain.net.
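The reason the parent domain works is the domain-matching rule that browsers apply before attaching a cookie to a request. The function below is a simplified sketch of that rule (RFC 6265, section 5.1.3); cookieDomainMatches is an illustrative name, and IP addresses and public-suffix checks are deliberately left out.

```typescript
// Simplified cookie domain matching: the request host must be the cookie
// domain itself or one of its subdomains. Not a full RFC 6265 implementation.
function cookieDomainMatches(cookieDomain: string, host: string): boolean {
  // A leading dot in the Domain attribute is ignored by browsers.
  const domain = cookieDomain.replace(/^\./, "").toLowerCase();
  const hostname = host.toLowerCase();
  return hostname === domain || hostname.endsWith("." + domain);
}
```

With AUTH_COOKIE_DOMAIN set to mydomain.net, both fractal-web.mydomain.net and fractal-data.mydomain.net match, whereas a cookie scoped to fractal-web.mydomain.net alone would never be sent to fractal-data.mydomain.net.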

If these services have to be served on different domains, a different authentication strategy must be chosen, for example a token-based one. That results in a more complicated setup, possibly involving extra changes to the vizarr code.

Install fractal-data from release packages

The release packages include the Node.js server and the Vizarr static files. Starting from version 0.1.3, fractal-data releases provide tar.gz files containing the built Vizarr static files and a package of built files for each supported Node.js version.

Install Vizarr static files

Vizarr static files can be served using any web server, such as Apache or Nginx.

Create a dedicated folder for vizarr on your server. For Apache, it could be /var/www/html/vizarr.

Navigate to the directory and extract the Vizarr static files:

FRACTAL_DATA_VERSION=0.4.0 && wget -qO- "https://github.com/fractal-analytics-platform/fractal-data/releases/download/v${FRACTAL_DATA_VERSION}/fractal-vizarr-v${FRACTAL_DATA_VERSION}.tar.gz" | tar -xz

Note: this unpacks the vizarr dist folder into the current working directory.

Install fractal-data Node.js server files

Create a folder for the server files, navigate into it, and extract the fractal-data server files:

FRACTAL_DATA_VERSION=0.1.3a0 && NODE_MAJOR_VERSION=20 && wget -qO- "https://github.com/fractal-analytics-platform/fractal-data/releases/download/v${FRACTAL_DATA_VERSION}/node-${NODE_MAJOR_VERSION}-fractal-data-v${FRACTAL_DATA_VERSION}.tar.gz" | tar -xz

Note: this unpacks the file package.json and the folders dist and node_modules into the current working directory.

To start the application installed in this way see the section Run fractal-data from the build folder below.

Environment variables

  • PORT: the port where fractal-data app is served;
  • BIND_ADDRESS: specifies the IP address for the server to bind to; use 0.0.0.0 (IPv4) or :: (IPv6) to listen on all interfaces, 127.0.0.1 (IPv4) or ::1 (IPv6) for localhost only; the default value is 0.0.0.0;
  • FRACTAL_SERVER_URL: the base URL of fractal-server;
  • VIZARR_STATIC_FILES_PATH: path to the files generated running npm run build in Vizarr source folder; this variable is optional and, if present, it will be used to serve Vizarr static files from the /vizarr endpoint;
  • BASE_PATH: base path of the fractal-data application; the default value is /data;
  • AUTHORIZATION_SCHEME: defines how the service verifies user authorization. The following options are available:
    • fractal-server: the paths that can be accessed by each user are retrieved calling fractal-server API.
    • testing-basic-auth: enables Basic Authentication for testing purposes. The credentials are specified through two additional environment variables: TESTING_USERNAME and TESTING_PASSWORD. This option should not be used in production environments.
    • none: no authorization checks are performed, allowing access to all users, including anonymous ones. This option is useful for demonstrations and testing but should not be used in production environments.
  • CACHE_EXPIRATION_TIME: cookie cache TTL in seconds; when user info is retrieved from a cookie by calling the current user endpoint on fractal-server, the information is cached for the specified number of seconds, to reduce the number of calls to fractal-server;
  • LOG_LEVEL_CONSOLE: the log level of logs that will be written to the console; the default value is info;
  • LOG_FILE: the path of the file where logs will be written; by default it is unset and no file is created;
  • LOG_LEVEL_FILE: the log level of logs that will be written to the file; the default value is info.
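The caching behavior controlled by CACHE_EXPIRATION_TIME can be pictured as a small time-to-live cache keyed by the token. The class below is a hypothetical sketch, not the cache used by fractal-data; TtlCache and its methods are illustrative names, and the clock is injectable only so that expiry can be exercised without waiting.

```typescript
// Hypothetical TTL cache: NOT the fractal-data implementation.
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // ttlSeconds plays the role of CACHE_EXPIRATION_TIME.
  constructor(ttlSeconds: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlSeconds * 1000;
    this.now = now;
  }

  // Returns the cached value, or undefined if it is missing or expired.
  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // expired: drop it and report a miss
      return undefined;
    }
    return entry.value;
  }

  // Stores a value that stays valid for ttlSeconds from now.
  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

A cache miss would trigger a call to fractal-server followed by set; a hit within the TTL avoids the call entirely. The trade-off is that a change in a user's permissions can take up to the TTL to become visible.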

Run fractal-data from the build folder

You can create a script with the following content to run fractal-data installed from a release package:

#!/bin/sh

export PORT=3000
export BIND_ADDRESS=0.0.0.0
export FRACTAL_SERVER_URL=http://localhost:8000
export AUTHORIZATION_SCHEME=fractal-server
# default values for logging levels (uncomment if needed)
# export LOG_LEVEL_CONSOLE=info
# export LOG_FILE=/path/to/log
# export LOG_LEVEL_FILE=info

# default values are usually fine for the following variables; remove comments if needed
# export BASE_PATH=/data
# export CACHE_EXPIRATION_TIME=60

node dist/app.js

Note: starting from Node 20.6 you can also load the environment variables from a file using the --env-file flag:

node --env-file=.env dist/app.js
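However the variables are supplied, validating them once at startup makes misconfiguration fail early. The following is a sketch under stated assumptions, not the configuration code of fractal-data: loadConfig and Config are hypothetical names, the fallback values for PORT and CACHE_EXPIRATION_TIME are taken from the example script above rather than from documented defaults, and treating FRACTAL_SERVER_URL as required only for the fractal-server scheme is a guess.

```typescript
type AuthorizationScheme = "fractal-server" | "testing-basic-auth" | "none";

// Hypothetical configuration object; field names are illustrative.
interface Config {
  port: number;
  bindAddress: string;
  fractalServerUrl: string | undefined;
  basePath: string;
  authorizationScheme: AuthorizationScheme;
  cacheExpirationTime: number;
}

const SCHEMES: string[] = ["fractal-server", "testing-basic-auth", "none"];

// Builds a Config from an environment-like object (for example process.env).
function loadConfig(env: Record<string, string | undefined>): Config {
  const scheme = env["AUTHORIZATION_SCHEME"];
  if (scheme === undefined || SCHEMES.indexOf(scheme) === -1) {
    throw new Error("AUTHORIZATION_SCHEME must be one of: " + SCHEMES.join(", "));
  }
  const url = env["FRACTAL_SERVER_URL"];
  // ASSUMPTION: the URL is only needed when fractal-server is consulted.
  if (scheme === "fractal-server" && !url) {
    throw new Error("FRACTAL_SERVER_URL is required");
  }
  // ASSUMPTION: 3000 and 60 mirror the example script, not documented defaults.
  const port = Number(env["PORT"] ?? "3000");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error("PORT must be an integer between 1 and 65535");
  }
  const ttl = Number(env["CACHE_EXPIRATION_TIME"] ?? "60");
  if (!Number.isFinite(ttl) || ttl < 0) {
    throw new Error("CACHE_EXPIRATION_TIME must be a non-negative number");
  }
  return {
    port,
    bindAddress: env["BIND_ADDRESS"] ?? "0.0.0.0",
    // Trailing slashes are dropped so that paths can be appended uniformly.
    fractalServerUrl: url ? url.replace(/\/+$/, "") : undefined,
    basePath: env["BASE_PATH"] ?? "/data",
    authorizationScheme: scheme as AuthorizationScheme,
    cacheExpirationTime: ttl,
  };
}
```

In a Node.js process this would be called as loadConfig(process.env), whether the values come from export statements or from --env-file.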

Create some test data

Create a folder (e.g. zarr-files) that will contain the zarr files served by fractal-data. This folder has to be added to the allowed viewer paths exposed by the fractal-server API, for example by setting it as the project_dir for a given user.

You can fill the folder with some test data using the following commands:

mkdir zarr-files
cd zarr-files
wget -O 20200812-CardiomyocyteDifferentiation14-Cycle1_mip.zarr.zip "https://zenodo.org/records/10424292/files/20200812-CardiomyocyteDifferentiation14-Cycle1_mip.zarr.zip?download=1"
unzip 20200812-CardiomyocyteDifferentiation14-Cycle1_mip.zarr.zip

See the test data

Log in to fractal-web, then open the following URL in another tab to display the example dataset:

http://localhost:3000/data?source=http://localhost:3000/data/files/path/to/20200812-CardiomyocyteDifferentiation14-Cycle1_mip.zarr/B/03/0
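The URL has two parts: the viewer location under the base path, and a source parameter pointing at the /files endpoint followed by the path of the Zarr group. The helper below simply reproduces that format; buildViewerUrl is a hypothetical name, and the source value is left unencoded to match the example above, which may not be suitable for paths containing characters such as & or #.

```typescript
// Hypothetical helper that reproduces the URL format shown above.
// origin:   scheme, host and port where fractal-data is reachable
// basePath: value of BASE_PATH (for example "/data")
// zarrPath: absolute filesystem path of the Zarr group to display
function buildViewerUrl(origin: string, basePath: string, zarrPath: string): string {
  const trimmedOrigin = origin.replace(/\/+$/, "");
  const trimmedBase = basePath.replace(/^\/+|\/+$/g, "");
  const base = trimmedBase === "" ? trimmedOrigin : trimmedOrigin + "/" + trimmedBase;
  const source = base + "/files/" + zarrPath.replace(/^\/+/, "");
  return base + "?source=" + source;
}
```

Calling it with "http://localhost:3000", "/data" and the path of the unzipped dataset yields the URL shown above.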

Production setup

Add an Apache configuration to expose the fractal-data service on a given path of the public server. The specified location must match the value of the fractal-data BASE_PATH environment variable (the default value is /data).

<Location /data>
    ProxyPass http://127.0.0.1:3000/data
    ProxyPassReverse http://127.0.0.1:3000/data
</Location>

Add a systemd unit file in /etc/systemd/system/fractal-data.service:

[Unit]
Description=Fractal Data service
After=syslog.target

[Service]
User=fractal
Environment="PORT=3000"
Environment="BIND_ADDRESS=0.0.0.0"
Environment="FRACTAL_SERVER_URL=https://fractal-server.example.com/"
Environment="BASE_PATH=/data"
Environment="AUTHORIZATION_SCHEME=fractal-server"
Environment="CACHE_EXPIRATION_TIME=60"
Environment="LOG_FILE=/path/to/log"
Environment="LOG_LEVEL_FILE=info"
ExecStart=/path/to/node /path/to/fractal-data/dist/app.js
Restart=on-failure
RestartSec=5s

[Install]
WantedBy=multi-user.target

Enable the service and start it:

sudo systemctl enable fractal-data
sudo systemctl start fractal-data

Build fractal-data manually

fractal-data setup

Get and install the fractal-data application:

git clone https://github.com/fractal-analytics-platform/fractal-data.git
cd fractal-data
npm install

Copy the file .env.example to .env and customize values for the environment variables.

Vizarr setup

In order to display a proper error message related to the missing authorization it is necessary to use a modified version of vizarr.

Note: for simplicity, we assume that fractal-data and vizarr are subfolders of the same folder:

git clone https://github.com/hms-dbmi/vizarr.git
cd vizarr
git checkout eb2b77fed92a08c78c5770144bc7ccf19e9c7658
npx pnpm install
npx pnpm run build

The output is located in the dist folder.

Run fractal-data

Then go back to the fractal-data folder and run npm run start to start the project. The server will start on port 3000. Remember to set the VIZARR_STATIC_FILES_PATH environment variable to serve Vizarr static files from the /vizarr endpoint. Vizarr static files need to be served from the same port and domain as the fractal-data service, otherwise you will encounter CORS issues.
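The CORS constraint follows from the browser's same-origin policy: two URLs share an origin only when scheme, host and port all agree. The sketch below illustrates that comparison with a deliberately simplified parser (no IPv6 literals, no userinfo); originOf and isSameOrigin are illustrative names, not part of fractal-data.

```typescript
// Default ports implied by each scheme when the URL does not name one.
const DEFAULT_PORTS: Record<string, string> = { http: "80", https: "443" };

// Returns "scheme://host:port" for an http(s) URL, or undefined if the
// string does not look like one. Simplified on purpose.
function originOf(url: string): string | undefined {
  const match = /^(https?):\/\/([^\/:?#]+)(?::(\d+))?(?:[\/?#]|$)/i.exec(url);
  if (!match) return undefined;
  const scheme = match[1].toLowerCase();
  const host = match[2].toLowerCase();
  const port = match[3] ?? DEFAULT_PORTS[scheme];
  return `${scheme}://${host}:${port}`;
}

// True when both URLs are valid http(s) URLs with the same origin.
function isSameOrigin(a: string, b: string): boolean {
  const originA = originOf(a);
  return originA !== undefined && originA === originOf(b);
}
```

For example, a viewer page loaded from http://localhost:3000/vizarr can fetch http://localhost:3000/files/... without any CORS configuration, while the same page served from a different port, such as a separate development server, cannot.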

Alive endpoint

It is possible to use the /alive endpoint to check if the service is up and running and retrieve its version.

Docker setup

The following script can be used to build and start a docker image for testing:

#!/bin/sh

COMMIT_HASH=$(git rev-parse HEAD)
IMAGE_NAME="fractal-data-$COMMIT_HASH"

docker build . -t "$IMAGE_NAME"

docker run --network host \
  -v /tmp/zarr-files:/zarr-files \
  -e FRACTAL_SERVER_URL=http://localhost:8000 \
  -e AUTHORIZATION_SCHEME=fractal-server \
  "$IMAGE_NAME"

For production, replace the --network host option with a proper published port (-p 3000:3000) and set FRACTAL_SERVER_URL to a URL that uses a public domain.
