
Commit b495530

Merge pull request #184 from NYU-RTS/torch-hardware
spec sheet Greene -> Torch, add announcements
2 parents 11e1761 + 9a74dee commit b495530

10 files changed: +36 -87 lines changed

docs/hpc/01_getting_started/01_intro.md

Lines changed: 0 additions & 5 deletions
This file was deleted.

docs/hpc/01_getting_started/01_intro.mdx

Lines changed: 9 additions & 0 deletions

@@ -0,0 +1,9 @@
+# Start here!
+
+Welcome to the Torch HPC documentation! If you do not have an HPC account, please proceed to the next section that explains how you may be able to get one.
+
+If you are an active user, you can proceed to one of the categories on the left.
+
+## Announcements
+
+<iframe src="https://docs.google.com/presentation/d/e/2PACX-1vR8sCeFahadhOL_PRaMl8_dBT6dXEVkLqWznRvEEzNbF3-aUH7AQT1KiZEiwznsy8cMYrvzZoTHORVx/pubembed?start=false&loop=false&delayms=3000" frameborder="0" width="960" height="569" allowfullscreen="true" mozallowfullscreen="true" webkitallowfullscreen="true"></iframe>

docs/hpc/03_storage/01_intro_and_data_management.mdx

Lines changed: 1 addition & 1 deletion
@@ -105,7 +105,7 @@ The HPC team makes available a number of public datasets that are commonly used
 
 For some of the datasets users must provide a signed usage agreement before accessing.
 
-Public datasets available on the HPC clusters can be viewed on the [Datasets page](../01_getting_started/01_intro.md).
+Public datasets available on the HPC clusters can be viewed on the [Datasets page](../04_datasets/01_intro.md).
 
 #### HPC Archive
 Once the Analysis stage of the research data lifecycle has completed, <ins>_HPC users should **tar** their data and code into a single tar.gz file and then copy the file to their archive directory (**`/archive/$USER`**_).</ins> The HPC Archive file system is not accessible by running jobs; it is suitable for long-term data storage. Each user has access to a default disk quota of **2TB** and ***20,000 inode (files) limit***. The rather low limit on the number of inodes per user is intentional. The archive file system is available only ***on login nodes*** of Greene. The archive file system is backed up daily.
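
The archiving workflow described in the HPC Archive paragraph above (tar first, then copy to `/archive/$USER`) might look like the following minimal sketch; the project directory name is hypothetical.

```bash
# Bundle finished data and code into a single compressed tarball,
# keeping the archive inode count low (the quota is ~20,000 files).
tar -czvf my_project.tar.gz /scratch/$USER/my_project

# Copy the tarball to the archive file system (reachable from login nodes only).
cp my_project.tar.gz /archive/$USER/

# Verify the copy before removing the working copy from scratch.
ls -lh /archive/$USER/my_project.tar.gz
```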

docs/hpc/03_storage/03_data_transfers.md

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ Globus is the recommended tool to use for large-volume data transfers due to the
 :::
 
 ## Data-Transfer nodes
-Attached to the NYU HPC cluster Greene, the Greene Data Transfer Node (gDTN) are nodes optimized for transferring data between cluster file systems (e.g. scratch) and other endpoints outside the NYU HPC clusters, including user laptops and desktops. The gDTNs have 100-Gb/s Ethernet connections to the High Speed Research Network (HSRN) and are connected to the HDR Infiniband fabric of the HPC clusters. More information on the hardware characteristics is available at [Greene spec sheet](../10_spec_sheet.mdx).
+Attached to the NYU HPC cluster Greene, the Greene Data Transfer Node (gDTN) are nodes optimized for transferring data between cluster file systems (e.g. scratch) and other endpoints outside the NYU HPC clusters, including user laptops and desktops. The gDTNs have 100-Gb/s Ethernet connections to the High Speed Research Network (HSRN) and are connected to the HDR Infiniband fabric of the HPC clusters. More information on the hardware characteristics is available at [Greene spec sheet](../10_spec_sheet.md).
 
 ### Data Transfer Node Access
 The HPC cluster filesystems include `/home`, `/scratch`, `/archive` and the [HPC Research Project Space](./05_research_project_space.mdx) are available on the gDTN. The Data-Transfer Node (DTN) can be accessed in a variety of ways
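
As a rough illustration of the DTN workflow this section describes, a transfer from a laptop to cluster scratch might look like the sketch below; the DTN hostname here is a placeholder, not the real address.

```bash
# Push a local dataset to /scratch through a data-transfer node.
# Replace <netid> and the hostname with the values from the HPC docs.
rsync -avP ./dataset/ <netid>@dtn.hpc.example.edu:/scratch/<netid>/dataset/
```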

docs/hpc/03_storage/08_transferring_cloud_storage_data_with_rclone.md

Lines changed: 1 addition & 1 deletion
@@ -344,7 +344,7 @@ Please enter 'q' and we're done with configuration.
 
 ### Step 4: Transfer
 :::warning
-Please be sure to perform data transters on a data transfer node (DTN). It can degrade performance for other users to perform transfers on other types of nodes. For more information please see [Data Transfers](./03_data_transfers.md)
+Please be sure to perform data transfers on a data transfer node (DTN). It can degrade performance for other users to perform transfers on other types of nodes. For more information please see [Data Transfers](./03_data_transfers.md)
 :::
 
 Sample commands:
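
The sample commands themselves fall outside this hunk; a typical rclone session on a DTN might look like the following sketch, where `mygdrive` stands in for whatever remote name was chosen during configuration.

```bash
# List the top-level directories of the configured remote.
rclone lsd mygdrive:

# Pull a folder from cloud storage into scratch; --progress reports status.
rclone copy mygdrive:research-data /scratch/$USER/research-data --progress

# Push results back up to cloud storage.
rclone copy /scratch/$USER/results mygdrive:results --progress
```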

docs/hpc/05_submitting_jobs/03_slurm_tutorial.md

Lines changed: 1 addition & 1 deletion
@@ -16,7 +16,7 @@ The Slurm software system is a resource manager and a job scheduler, which is de
 
 - It also assumes you are comfortable with Linux command-line environment. To learn about linux please read our [Linux Tutorial](../12_tutorial_intro_shell_hpc/01_intro.mdx).
 
-- Please review the [Hardware Specs page](../10_spec_sheet.mdx) for more information on Greene's hardware specifications.
+- Please review the [Hardware Specs page](../10_spec_sheet.md) for more information on Greene's hardware specifications.
 
 ## Slurm Commands
 
docs/hpc/06_tools_and_software/01_intro.md

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@
 We encourage you to setup your own computational environment on Torch and to assist you in doing so, we allow you to run [Apptainer](../07_containers/01_intro.md) (formerly known as Singularity) containers, we manage licensed software suites and offer extensive documentation, training and support.
 
 :::tip
-We stongly advise that you setup your own computational enviromments via Apptainer containers and overlay files. Detailed documentation is available in the [containers section](../07_containers/01_intro.md).
+We strongly advise that you setup your own computational enviromments via Apptainer containers and overlay files. Detailed documentation is available in the [containers section](../07_containers/01_intro.md).
 :::
 
 ## Package Management for R, Python, & Julia, and Conda in general
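
The container-plus-overlay setup the tip recommends can be sketched as below; the image path is hypothetical and the overlay size is only an example.

```bash
# Create a writable 500 MiB overlay image for persistent package installs.
apptainer overlay create --size 500 overlay.img

# Start a shell inside a container with the overlay attached;
# anything written to the container filesystem lands in overlay.img.
apptainer shell --overlay overlay.img /path/to/image.sif
```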

docs/hpc/10_spec_sheet.md

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+
+# Torch Spec Sheet
+
+The Torch cluster has 518 [Intel "Xeon Platinum 8592+ 64C"](https://www.intel.com/content/www/us/en/products/sku/237261/intel-xeon-platinum-8592-processor-320m-cache-1-90-ghz/specifications.html) CPUs, 29 NVIDIA [H200](https://nvdam.widen.net/s/nb5zzzsjdf/hpc-datasheet-sc23-h200-datasheet-3002446) GPU nodes & 68 NVIDIA [L40S](https://resources.nvidia.com/en-us-l40s/l40s-datasheet-28413) GPU nodes connected via an InfiniBand NDR400 interconnect. Further details on each kind of node are provided in the table below.
+
+| Type | Nodes | CPU Cores | GPUs | Memory (GB) |
+|---|---|---|---|---|
+| Standard Memory | 186 | 23,808 | N/A | 95,232 |
+| Large Memory | 7 | 896 | N/A | 21,504 |
+| H200 GPU | 29 | 3,712 | 232 | 59,392 |
+| L40S GPU | 68 | 8,704 | 272 | 34,816 |
+| Login | 4 | 512 | N/A | 1,024 |
+| Data Transfer | 2 | 64 | N/A | 512 |
+| Provisioning | 4 | 320 | N/A | 1,024 |
+| Scheduler | 2 | 64 | N/A | 1,024 |
+| Total | N/A | 38,080 | 504 | 209.5 TB |
+
+
+Torch was tested in June 2025 using the [LINPACK benchmark](https://top500.org/project/linpack/), which is the basis for all HPC systems ranked on the Top500 list. It had a theoretical peak performance of 12.25 PF/s thanks to its powerful GPU resources, of which LINPACK was able to sustain 10.79 PF/s, placing it at [#133 on the list](https://top500.org/system/180363/).
+
+Torch was recently ranked [#40 on the Green500 list](https://top500.org/lists/green500/list/2025/06/), a global list of the most energy-efficient supercomputers in the world, thanks to its advanced liquid cooling system.
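
From the figures above, the measured-to-peak ratio (the usual LINPACK efficiency) works out to roughly 88%:

$$\text{efficiency} = \frac{R_{\max}}{R_{\text{peak}}} = \frac{10.79\ \text{PF/s}}{12.25\ \text{PF/s}} \approx 0.88$$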

docs/hpc/10_spec_sheet.mdx

Lines changed: 0 additions & 76 deletions
This file was deleted.

docs/srde/02_faq/01_basics.mdx

Lines changed: 1 addition & 1 deletion
@@ -25,7 +25,7 @@ Fill out our intake form to provide more information about your project: [Secure
 The cost is dependent on the needs of the project, such as size of the data and the type of machine needed for the analysis. Google Cloud has a calculator that will help estimate the costs: https://cloud.google.com/products/calculator
 
 ## My data is not high risk, are there other options available?
-There are several options available depending on the data risk classification and the needs of the project. There are resources provided by the university such as research project space, cloud computing, etc. You can check out many of these services on the [HPC Support Site](../../hpc/01_getting_started/01_intro.md). If you are unsure on how to proceed, a consultation with the SRDE team will help determine the best path forward.
+There are several options available depending on the data risk classification and the needs of the project. There are resources provided by the university such as research project space, cloud computing, etc. You can check out many of these services on the [HPC Support Site](../../hpc/01_getting_started/01_intro.mdx). If you are unsure on how to proceed, a consultation with the SRDE team will help determine the best path forward.
 
 ## What does the SRDE team need from me for the consultation?
 To help get things started it would be beneficial to submit an [intake form](https://nyu.qualtrics.com/jfe/form/SV_3Vok9ax87Bxxdsy) with any related data governance documentation (files can be attached to the form), including, but not limited to, data use agreement, OSP/IRB documents, and project information.
