Commit 6b3719c

Merge pull request #10 from ctlab/migrate-converters-update-ui
Massive feature updates
2 parents 142d484 + 27297ec commit 6b3719c

84 files changed

+3585
-235
lines changed

.gitignore

Lines changed: 2 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -191,3 +191,5 @@ nbdist/
191191
.github/**
192192
.gitignore
193193
/src/main/resources/webui/**
194+
data/**
195+
start.sh

LICENSE

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
11
MIT License
22

3-
Copyright (c) 2021-2024 Aleksandr Serdiukov, Anton Zamyatin, Aleksandr Sinitsyn, Vitalii Dravgelis and Computer Technologies Laboratory ITMO University team.
3+
Copyright (c) 2021-2026 Aleksandr Serdiukov, Anton Zamyatin, Aleksandr Sinitsyn, Vitalii Dravgelis and Computer Technologies Laboratory ITMO University team.
44

55
Permission is hereby granted, free of charge, to any person obtaining a copy
66
of this software and associated documentation files (the "Software"), to deal

README.adoc

Lines changed: 288 additions & 31 deletions
@@ -4,69 +4,326 @@ image:https://github.com/AxisAlexNT/HiCT_JVM/actions/workflows/autobuild-release
44

55
== Launching pre-built version
66

7-
**NOTE: currently only Windows (tested on 10 and 11) and Linux (with `glibc`, common Debain/Ubuntu are OK, Alpine users are out of luck) are supported, native libraries for MacOS are not bundled in these builds. Only AMD64 platform is supported. On Windows you might need to install https://learn.microsoft.com/en-us/cpp/windows/latest-supported-vc-redist?view=msvc-170[additional libraries]**.
87

9-
1. Install Java 19 or newer (older versions won't be able to launch this code);
10-
1. Make sure that `JAVA_HOME` variable points to the correct installation path (if you have multiple JREs or JDKs);
11-
1. Download latest "fat" JAR from the https://github.com/ctlab/HiCT_JVM/releases[*Releases* page] in *Assets* section. Latest build will usually be on top, however the most stable implementation is in the build from `master` branch (called "Latest autogenerated build (branch master)"). You can rename it to `hict.jar` for convenience;
12-
1. Open a terminal and change directory to where the downloaded `hict.jar` is located;
13-
1. Issue `java -jar hict.jar` command and wait until message `Starting WebUI server on port 8080 ... WebUI Server started` appears;
14-
1. Open your browser and navigate to the `http://localhost:8080` where HiCT WebUI should now be available.
8+
== For users of `.jar` distribution
159

16-
=== Startup options
10+
This section is intended for bioinformatics users who download a ready-to-run fat JAR from GitHub Releases.
11+
You need to install Java 21+ (this project is built for Java 21 bytecode).
12+
Download the latest fat JAR from the https://github.com/ctlab/HiCT_JVM/releases[Releases page] (Assets section).
13+
**NOTE:** prebuilt native bundles are currently provided for *Windows* (tested on 10/11) and *Linux with glibc* (common Debian/Ubuntu-like distributions). Alpine/musl is not supported by these bundled binaries. Current prebuilt artifacts are AMD64-only. On Windows you might need to install https://learn.microsoft.com/en-us/cpp/windows/latest-supported-vc-redist?view=msvc-170[Microsoft Visual C++ Redistributable].
1714

18-
Currently, there are multiple environment variables that could be set prior to launching HiCT.
15+
=== Quick start
1916

20-
* `DATA_DIR` -- should be a path to the directory containing `.hict.hdf5`, `.agp` and `fasta` files. These files could be anywhere in subtree of this directory, it is scanned recursively.
21-
* `VXPORT` -- should be an integer between `1` and `65535` denoting port number which will be served by HiCT API. Note that listening on ports below `4096` usually requires some kind of administrative privileges. If not provided, the default value is `5000`. Startup might fail if the port is already occupied by another service. Be sure to set correct port in Connection -> API Gateway field in HiCT WebUI if changed.
22-
* `WEBUI_PORT` -- should be an integer between `1` and `65535` denoting port number which will be served by HiCT WebUI. Note that listening on ports below `4096` usually requires some kind of administrative privileges. If not provided, the default value is `8080`. Startup might fail if the port is already occupied by another service.
23-
* `SERVE_WEBUI` -- should either be `true` or `false` telling whether to start serving HiCT WebUI on the desired port or not. Might be useful during debugging or when WebUI is served by another process. Default is `true`. This option does not have any effect in case WebUI is not packed into the jar file.
24-
* `TILE_SIZE` -- should be an integer greater than one. Defines the default tile size for visualization. Experimental setting, currently might break WebUI renderer. Default is `256`. The greater the tile size is, the less tiles are shown on screen and therefore less requests are sent to the server, but each request could potentially take longer to process.
17+
1. Download the latest `-fat.jar` from the Releases page (Assets) and rename it to `hict-fat.jar`.
18+
2. Place your `.hict.hdf5`, `.mcool`, `.cool`, `.agp`, and `.fasta` files under a single directory.
19+
3. Run:
20+
+
21+
```bash
22+
java -jar hict-fat.jar start-server
23+
```
24+
+
25+
The data directory is set with the `DATA_DIR` environment variable; by default, the server scans the subtree of the directory from which `hict-fat.jar` is launched.
26+
On Linux, you can set it as follows:
27+
+
28+
```bash
29+
DATA_DIR=/path/to/data/ java -jar hict-fat.jar start-server
30+
```
31+
+
32+
4. Open WebUI at `http://localhost:8080`.
33+
34+
=== CLI commands (summary)
35+
36+
```bash
37+
# API + WebUI (default mode, includes converters in WebUI as described below)
38+
java -jar hict-fat.jar start-server
39+
40+
# API only (no WebUI)
41+
java -jar hict-fat.jar start-api-server
42+
43+
# Convert .mcool -> .hict.hdf5 (CLI mode)
44+
java -jar hict-fat.jar convert mcool-to-hict \
45+
--input /data/sample.mcool \
46+
--output /data/sample.hict.hdf5
47+
48+
# Convert .hict.hdf5 -> .mcool (CLI mode)
49+
java -jar hict-fat.jar convert hict-to-mcool \
50+
--input /data/sample.hict.hdf5 \
51+
--output /data/sample.mcool
52+
```
53+
54+
Get full CLI help:
55+
56+
```bash
57+
java -jar hict-fat.jar --help
58+
java -jar hict-fat.jar start-server --help
59+
java -jar hict-fat.jar start-api-server --help
60+
java -jar hict-fat.jar convert --help
61+
java -jar hict-fat.jar convert mcool-to-hict --help
62+
java -jar hict-fat.jar convert hict-to-mcool --help
63+
```
64+
65+
=== WebUI conversion (Experimental / W.I.P.)
66+
67+
WARNING: WebUI conversion is experimental and may be slower or less stable than the CLI.
68+
69+
1. Open the WebUI.
70+
2. Use *File → Convert Coolers*.
71+
3. Track progress in the conversion window.
72+
73+
=== API access (Experimental / W.I.P.)
74+
75+
WARNING: The API is still evolving. Endpoints, parameters, and response formats may change.
76+
77+
Example (Python) for fetching a submatrix tile as an image:
78+
79+
```python
80+
import requests
81+
82+
host = "http://localhost:5000"
83+
params = {
84+
"version": 0,
85+
"bpResolution": 10000,
86+
"format": "PNG_BY_PIXELS",
87+
"row": 0,
88+
"col": 0,
89+
"rows": 512,
90+
"cols": 512,
91+
}
2592

26-
An example of launching HiCT with parameters:
93+
r = requests.get(f"{host}/get_tile", params=params)
94+
r.raise_for_status()
95+
data = r.json()
96+
png_data_url = data["image"]
97+
print(png_data_url[:64])
98+
```
99+
100+
To apply visualization/normalization settings before fetching tiles:
101+
102+
* POST `/set_visualization_options` with visualization parameters.
103+
* POST `/set_normalization` with normalization settings.
104+
* Then call `/get_tile` as shown above.
105+
106+
=== Supported platforms / JDK details
107+
108+
* *OS/CPU (prebuilt libs):* Linux (glibc) and Windows, AMD64.
109+
* *Not bundled by default:* macOS variants and Linux ARM variants.
110+
* *JDK:* Java 21 or newer is required for running/building this repository.
111+
112+
== Startup options and CLI
113+
114+
The fat JAR is runnable and exposes a CLI with subcommands:
115+
116+
* `start-server` -- API + WebUI (default when no args are given)
117+
* `start-api-server` -- API only (no WebUI)
118+
* `convert` -- conversion tools
119+
** `convert mcool-to-hict`
120+
** `convert hict-to-mcool`
121+
122+
Help:
27123

28-
==== *Linux, bash:*
29124
```bash
30-
DATA_DIR=/home/${USER}/hict/data SERVE_WEBUI=false java -jar hict.jar
125+
java -jar hict.jar --help
126+
java -jar hict.jar convert --help
127+
java -jar hict.jar convert mcool-to-hict --help
31128
```
32129

33-
==== *Windows, cmd:*
130+
Environment variables supported by the server startup:
131+
132+
* `DATA_DIR` -- directory that is scanned recursively for `.hict.hdf5`, `.agp`, `.fasta`, `.cool`, and `.mcool` files.
133+
* `VXPORT` -- API gateway port, default `5000`.
134+
* `WEBUI_PORT` -- WebUI port, default `8080`.
135+
* `SERVE_WEBUI` -- `true`/`false`, default `true`.
136+
* `TILE_SIZE` -- default visualization tile size, default `256`.
137+
* `MIN_DS_POOL` / `MAX_DS_POOL` -- min/max pool sizes used when opening chunked datasets.
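
When HiCT is launched from a script rather than a shell, the variables above can be assembled programmatically. The following is a minimal sketch, not part of HiCT itself: the helper name `build_server_launch` and its parameters are hypothetical, the defaults are copied from the list above, and `MIN_DS_POOL` / `MAX_DS_POOL` are left out because their expected values are not documented here.

```python
import os


def build_server_launch(jar_path, data_dir, subcommand="start-server",
                        api_port=5000, webui_port=8080, serve_webui=True,
                        tile_size=256, base_env=None):
    """Return (command, environment) for launching the HiCT server.

    Hypothetical helper: it only assembles values, it does not start Java.
    """
    if subcommand not in ("start-server", "start-api-server"):
        raise ValueError(f"unknown subcommand: {subcommand}")
    for name, port in (("VXPORT", api_port), ("WEBUI_PORT", webui_port)):
        if not 1 <= port <= 65535:
            raise ValueError(f"{name} must be in 1..65535, got {port}")
    # Start from the caller's environment so PATH and JAVA_HOME stay intact.
    env = dict(os.environ if base_env is None else base_env)
    env.update({
        "DATA_DIR": str(data_dir),
        "VXPORT": str(api_port),
        "WEBUI_PORT": str(webui_port),
        "SERVE_WEBUI": "true" if serve_webui else "false",
        "TILE_SIZE": str(tile_size),
    })
    return ["java", "-jar", str(jar_path), subcommand], env
```

The returned pair can be handed to `subprocess.run(command, env=env, check=True)`. Keeping the assembly separate from the launch makes the configuration easy to inspect or log before the JVM starts.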
138+
139+
=== Launch examples (fat JAR)
140+
141+
==== Linux (bash)
142+
143+
```bash
144+
DATA_DIR=/home/${USER}/hict/data java -jar hict.jar
145+
146+
# API only
147+
DATA_DIR=/home/${USER}/hict/data java -jar hict.jar start-api-server
148+
149+
# Explicit server (API + WebUI)
150+
DATA_DIR=/home/${USER}/hict/data java -jar hict.jar start-server
151+
```
152+
153+
==== Windows (cmd)
154+
34155
```cmd
35156
set "DATA_DIR=D:\hict\data"
36157
set "WEBUI_PORT=8888"
37-
java -jar hict.jar
158+
java -jar hict.jar start-server
38159
```
39160

40-
==== *Windows, PowerShell:*
161+
==== Windows (PowerShell)
162+
41163
```powershell
42164
$env:DATA_DIR = "D:\hict\data"
43165
$env:WEBUI_PORT = "8888"
44-
java -jar hict.jar
166+
java -jar hict.jar start-server
167+
```
168+
169+
==== Custom JVM options
170+
171+
```bash
172+
DATA_DIR=/home/${USER}/hict/data java -ea -Xms512M -Xmx16G -jar hict.jar start-api-server
173+
```
174+
175+
=== Launch examples (Gradle, from source)
176+
177+
```bash
178+
# Default: runs HiCT CLI (equivalent to `java -jar ...`)
179+
./gradlew clean run
180+
181+
# Explicit modes
182+
./gradlew run --args="start-server"
183+
./gradlew run --args="start-api-server"
45184
```
46185

47-
==== Custom JVM Options
186+
== Converter workflows (`.mcool` ↔ `.hict.hdf5`)
48187

49-
Of course, you can also pass JVM parameters like this:
188+
=== CLI commands
189+
190+
Use the JVM CLI for both directions:
50191

51192
```bash
52-
DATA_DIR=/home/${USER}/hict/data SERVE_WEBUI=false java -ea -Xms512M -Xmx16G -jar hict.jar
193+
# mcool -> hict
194+
java -jar hict.jar convert mcool-to-hict \
195+
--input /data/sample.mcool \
196+
--output /data/sample.hict.hdf5
197+
198+
# hict -> mcool
199+
java -jar hict.jar convert hict-to-mcool \
200+
--input /data/sample.hict.hdf5 \
201+
--output /data/sample.roundtrip.mcool
53202
```
54203

55-
=== Startup errors
204+
=== Web conversion API flow
56205

57-
Since library naming conventions are different for different platform and libraries, there is currently a mechanism to try and load each library under a different name. This CAN produce errors on server startup, you can ignore them if `Starting WebUI server on port 8080 ... WebUI Server started` message appeared in console.
206+
Typical asynchronous conversion sequence used by WebUI/integrations:
58207

59-
If, however, server works but maps are not displayed in WebUI and an error sign displays at the bottom right corner of WebUI, you should check console for error output.
208+
1. *Upload*: `POST /api/convert/upload`
209+
* Upload source file and target format metadata.
210+
* Response returns a `jobId`.
211+
2. *Status polling*: `GET /api/convert/status/{jobId}`
212+
* Poll until state becomes `DONE` or `FAILED`.
213+
3. *Download*: `GET /api/convert/download/{jobId}`
214+
* Download converted artifact when status is `DONE`.
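
The status-polling step above can be sketched as a small loop. This is an illustration under assumptions, not the WebUI's actual client code: `wait_for_conversion` and its `fetch_state` callback are hypothetical names, and because the status response format is not documented here, extracting the state string from the `GET /api/convert/status/{jobId}` response is left to the caller. Only the terminal states `DONE` and `FAILED` come from the text above.

```python
import time


def wait_for_conversion(fetch_state, poll_interval=2.0, timeout=3600.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll until the job reaches DONE or FAILED and return that state.

    fetch_state is a caller-supplied function returning the current state
    string (hypothetical; it would wrap the status endpoint).
    """
    deadline = clock() + timeout
    while True:
        state = fetch_state()
        if state in ("DONE", "FAILED"):
            return state
        if clock() >= deadline:
            raise TimeoutError(f"conversion still in state {state!r} after {timeout} s")
        # Wait between polls so the server is not flooded with requests.
        sleep(poll_interval)
```

The download step should only be attempted when the returned state is `DONE`. Passing `sleep` and `clock` as parameters keeps the loop testable without real waiting.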
60215

61-
== Obtaining `.hict.hdf5` files
216+
Recommended size limits:
62217

63-
Currently, it's necessary to use https://github.com/ctlab/HiCT_Utils[`HiCT_Utils` package] for the file format conversion, there are plans to simplify this process.
218+
* Keep upload limits explicit at ingress/proxy and app gateway.
219+
* For JVM safety, avoid unbounded request bodies in production; set max request size and timeouts.
220+
* For very large matrices, prefer direct local file conversion (CLI) and then load resulting artifacts through `DATA_DIR`.
64221

65-
== Building `HiCT_JVM` from source
222+
== Scaffolding API behavior notes
223+
224+
Scaffolding operations are served as POST endpoints and return updated assembly information:
225+
226+
* `/reverse_selection_range`
227+
* `/move_selection_range`
228+
* `/split_contig_at_bin`
229+
* `/group_contigs_into_scaffold`
230+
* `/ungroup_contigs_from_scaffold`
231+
* `/move_selection_to_debris`
232+
233+
Important tile-version expectation:
66234

67-
To start building from source, you can run:
235+
* Tile requests use `GET /get_tile?...&version=<n>`.
236+
* If the requested version is *older* than the server-side tile version, the server returns HTTP `204` (no tile body) to force client invalidation.
237+
* If the requested version is newer, the server advances its internal version counter.
238+
* Practical client rule: after each scaffolding mutation, increment your tile version and refresh visible tile requests.
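
The version rules above can be modelled in a few lines. This sketch is a model of the described behaviour, not the server's implementation: `resolve_tile_request` and `TileVersionTracker` are hypothetical names, and the `200` status for a served tile is an assumption (only `204` is stated above).

```python
def resolve_tile_request(requested_version, server_version):
    """Model the documented rule; returns (http_status, new_server_version)."""
    if requested_version < server_version:
        # Stale client: no tile body, the client should invalidate its tiles.
        return 204, server_version
    # Equal or newer: the tile is served and the counter catches up if needed.
    return 200, max(requested_version, server_version)


class TileVersionTracker:
    """Client-side counter that follows the 'practical client rule'."""

    def __init__(self):
        self.version = 0

    def after_mutation(self):
        """Call after each scaffolding POST; returns the new tile version."""
        self.version += 1
        return self.version

    def tile_params(self, **params):
        """Attach the current version to a set of /get_tile query parameters."""
        return {**params, "version": self.version}
```

A client would call `after_mutation()` once per successful scaffolding request and rebuild the parameters of every visible tile with `tile_params(...)` before re-requesting them.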
239+
240+
== Startup errors and JHDF5 native library troubleshooting
241+
242+
During startup, you may see several native-library load attempts with warnings/errors. This can be expected because different platform-specific library names are tried.
243+
244+
If startup completes and the API/WebUI respond normally, these warnings are non-fatal and can be ignored.
245+
246+
When native loading actually fails:
247+
248+
1. Confirm architecture match (AMD64 JVM + AMD64 native bundle).
249+
2. Confirm OS compatibility (Linux glibc; not Alpine/musl).
250+
3. On Linux, ensure native/plugin paths are discoverable, for example:
251+
+
252+
```bash
253+
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/path/to/hdf5/lib:/path/to/hdf5/lib/plugin"
254+
export HDF5_PLUGIN_PATH="/path/to/hdf5/lib/plugin"
68255
```
256+
4. On Windows, install/update Visual C++ runtime redistributables.
257+
5. Verify Java version (`java -version`) is 21+.
258+
6. If tiles fail to render but server starts, inspect logs for `UnsatisfiedLinkError` and HDF5 plugin load failures.
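
The Java version check in the list above can be automated. The following is a minimal sketch with a hypothetical helper name, `java_major_version`; it assumes the usual `version "..."` line printed by `java -version` and handles the legacy `1.x` numbering used up to Java 8. The required minimum is deliberately not hard-coded, so the caller compares the result against whatever version the build targets.

```python
import re


def java_major_version(version_output):
    """Extract the major Java version from `java -version` output.

    Returns None when no version string is found.
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    if major == 1 and match.group(2) is not None:
        # Legacy scheme: "1.8.0_292" means Java 8.
        return int(match.group(2))
    return major
```

Note that `java -version` writes to standard error, so a caller would typically read `subprocess.run(["java", "-version"], capture_output=True, text=True).stderr` and pass that string in.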
259+
260+
== Production checklist (short)
261+
262+
Before deploying to production, verify:
263+
264+
* Logging: structured logs, retention, and centralized collection.
265+
* Metrics/health: request latency/error metrics and liveness/readiness checks.
266+
* Limits: request body size, timeouts, and JVM heap sizing are set explicitly.
267+
* Graceful shutdown: stop accepting traffic, finish in-flight requests, then terminate.
268+
* Backup/cleanup: regular backup strategy for source/converted files and periodic cleanup of temporary/intermediate artifacts.
269+
270+
== Building `HiCT_JVM` from source
271+
272+
To build from source:
273+
274+
```bash
69275
./gradlew clean build
70276
```
71277

278+
=== Dependency management workflow
279+
280+
This project uses Gradle dependency locking (`gradle.lockfile`) to keep transitive dependency resolution reproducible.
281+
282+
* Refresh lock state after dependency changes:
283+
+
284+
```bash
285+
./gradlew dependencies --write-locks
286+
```
287+
* Inspect the resolved version for a specific dependency before/after updates:
288+
+
289+
```bash
290+
./gradlew dependencyInsight --dependency org.slf4j:slf4j-api --configuration runtimeClasspath
291+
./gradlew dependencyInsight --dependency ch.qos.logback:logback-classic --configuration runtimeClasspath
292+
./gradlew dependencyInsight --dependency org.jetbrains:annotations --configuration compileClasspath
293+
```
294+
295+
Commit both `build.gradle.kts` and `gradle.lockfile` together whenever lock state changes.
296+
72297
Current progress on modifying HDF5 and JHDF5 configuration resides in https://github.com/AxisAlexNT/jhdf5-with-plugins-configuration-snapshot[my personal repository]. Modified configuration is necessary to rebuild native libraries (HDF5, HDF5 plugins and JHDF5 should all be built as dynamic libraries). However, prebuilt native libraries for AMD64 Windows and Linux platforms are already present in the `HiCT_JVM` repository. Missing platforms are Linux on `armv7` and `aarch64`, and macOS (both `amd64` and `aarch64` variants).
298+
299+
== Conversion tools (CLI + API)
300+
301+
A native converter module is now available in the JVM codebase with two services:
302+
303+
* `McoolToHictConverter` (`mcool-to-hict`)
304+
* `HictToMcoolConverter` (`hict-to-mcool`)
305+
306+
CLI launcher:
307+
308+
```bash
309+
./gradlew runConversionCli --args="convert hict-to-mcool --input=/data/sample.hict.hdf5 --output=/data/sample.mcool --resolutions=10000,50000 --compression=4 --chunk-size=8192"
310+
./gradlew runConversionCli --args="convert mcool-to-hict --input=/data/sample.mcool --output=/data/sample.hict.hdf5 --resolutions=10000,50000 --parallelism=16"
311+
```
312+
313+
Arguments:
314+
315+
* `--input=<path>` source file path
316+
* `--output=<path>` destination file path
317+
* `--resolutions=<comma-separated>` optional resolution filter
318+
* `--compression=<0..9>` deflate level (`0` means chunked/no deflate)
319+
* `--chunk-size=<N>` chunk size for streaming traversal
320+
* `--agp=<file.agp> --apply-agp` apply AGP before `hict-to-mcool` export
321+
* `--parallelism=<N>` max worker threads (default: available CPU cores)
322+
323+
Web API endpoints:
324+
325+
* `POST /convert/upload` (multipart + query params: `direction`, `resolutions`, `compression`, `chunkSize`, `applyAgp`, `agpPath`, `parallelism`)
326+
* `GET /convert/jobs/:jobId`
327+
* `GET /convert/download/:jobId`
328+
329+
Conversion jobs are asynchronous, expose streaming logs and error details, enforce an upload size limit, and clean up temporary files after a TTL expires.
