
Commit 2f86f06

Fixes for hpc_foundations.md

    modified: docs/navigating_the_cluster/hpc_foundations.md

1 parent: be9e1f5

1 file changed: docs/navigating_the_cluster/hpc_foundations.md (5 additions, 6 deletions)
````diff
@@ -7,7 +7,7 @@ sidebar_position: 2
 
 The goal of this exercise is to help you understand the fundamentals **_A to Z_** on effecively navigating the cluster for your research or academic projects.
 
-Before we begin this exercise please make sure you have access to the NYU HPC cluster, if not please review the \[Accessing HPC page].
+Before we begin this exercise please make sure you have access to the NYU HPC cluster, if not please review the [Accessing HPC page](../accessing_hpc/accessing_hpc.md).
 
 Login to the Greene cluster with ssh at :
 > Accessible under NYU Net ( either via VPN or within campus network )
````
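For reference, the login step this hunk touches is an ordinary `ssh` session; a minimal sketch, where the host name `greene.hpc.nyu.edu` and the `NetID` placeholder are assumptions based on NYU HPC conventions and are not part of this diff:

```sh
# Log in to a Greene login node; replace NetID with your NYU NetID.
# Works from the campus network or over the NYU VPN, per the note above.
ssh NetID@greene.hpc.nyu.edu
```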
````diff
@@ -120,7 +120,7 @@ Similar to `/home`, users have access to multiple filesystems that are :
 | /scratch | /scratch/**Net_ID**/ | General Storage | $SCRATCH
 | /archive | /archive/**Net_ID**/ | Cold Storage | $ARCHIVE
 
-You will find more details about these filesystems at \[Greene Storage Types page].
+You will find more details about these filesystems at [Greene Storage Types page](../storage_specs.mdx).
 
 You can jump to your `/scratch` directory at `/scratch/Net_ID/` with the `cd` command as `cd /scratch/Net_ID`, Or you could simple use the `$SCRATCH` environment variable as:
 
````
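The `$SCRATCH` usage the changed paragraph leads into looks roughly like this; a minimal sketch based on the table shown in the hunk:

```sh
# Jump to your scratch space via the environment variable from the table above.
cd $SCRATCH
pwd    # expected to print /scratch/<Net_ID>
```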

````diff
@@ -204,6 +204,7 @@ print("hello, world")
 [pp2959@log-3 ~]$
 ```
 
+
 > Here `os.execute()` executes a shell command, in this example the command `hostname` to print the name of the host on which the script is being executed. Followed by printing the message `hello, world`
 
 Now if you try to run the script as `lua hello.lua`, you may get an error like:
````
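The `hello.lua` referred to here is not shown in full by this hunk; a minimal sketch reconstructed from the quoted `os.execute()` / `print("hello, world")` snippet (the exact file contents are an assumption):

```sh
# Recreate the two-line script described above, then try running it
# (per the surrounding text, this may fail until the right module is loaded).
cat > hello.lua <<'EOF'
os.execute("hostname")  -- run a shell command: print the executing host's name
print("hello, world")   -- then print the greeting
EOF
lua hello.lua
```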
````diff
@@ -423,8 +424,6 @@ hello, world
 >
 > 4. Based on your output, you may notice the name of the compute node that this program runs on, the node `cm001.hpc.nyu.ed` in this example is a CPU only node, you may notice a different node. You can find more details about the \[specific nodes here].
 
-<br/>
-
 **_Now how do we determine Or specify the amount of resources needed to run our `hello.lua` script ?_**
 
 By defualt slurm schedules just **_1 CPU_** and **_1 GB_** memory to run your programs.
````
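Since the default allocation is 1 CPU and 1 GB of memory, the flags already used elsewhere in this file can raise it; an illustrative sketch (the values are arbitrary, not taken from this diff):

```sh
# Ask slurm for more than the 1 CPU / 1 GB default when launching the script.
srun --cpus-per-task=4 --mem=4GB --time=05:00 lua hello.lua
```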
````diff
@@ -759,6 +758,7 @@ srun: job 56142474 has been allocated resources
 >
 > - `--label` labels standard output of tasks based on task ID from 0 to N.
 
+
 So far we understood that slurm chooses '_on which nodes our programs should run on_' based on it's scheduling decisions, however it also provides more control like specifying explicitly on which `partition` we can run our programs on.
 
 Here partitions are similar nodes grouped together as a list. For example H100 nodes are grouped together as a partition named `H100_Partition`. Whenever we submit a job request for H100s then nodes sequentially along this partition are reserved and our job is scheduled on them.
````
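To see which partitions (groups of similar nodes) actually exist before picking one, `sinfo` is the standard slurm query; a minimal sketch, where the `cs` partition name is borrowed from the example in the next hunk:

```sh
# List all partitions and the nodes grouped into each.
sinfo
# Restrict the listing to a single partition.
sinfo --partition=cs
```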
````diff
@@ -775,8 +775,6 @@ To specify a particular partition, you can use the `--partition` option as shown
 srun --partition=cs --nodes=2 --tasks-per-node=1 --cpus-per-task=4 --mem=4GB --time=05:00 lua hello.lua
 ```
 
-
-
 > **_(A) SLURM OVERVIEW_**
 >
 > - Users submit jobs on the cluster.
````
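Following the overview this hunk introduces, a submitted job's state and placement can be checked with `squeue`; a minimal sketch:

```sh
# Show your own pending and running jobs, with the node(s) each was placed on.
squeue -u $USER
```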
````diff
@@ -970,6 +968,7 @@ srun --time=02:00 /bin/bash -c "echo '(step 1): hello, world'; "
 
 Every `srun` declared in the `batch script` is called a `job step` that will get it's own `step id` from 0 to N.
 
+
 Modify `hello.sbatch` file with the above code and submit the batch job:
 
 ```sh
````
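The `hello.sbatch` this hunk modifies is not shown in full; a minimal sketch of a batch script with multiple job steps, assuming the layout the text describes (the `#SBATCH` directives are illustrative, not part of this diff):

```sh
#!/bin/bash
#SBATCH --job-name=hello   # illustrative directives; not taken from this diff
#SBATCH --time=05:00

# Each srun below becomes its own job step, numbered from 0.
srun --time=02:00 /bin/bash -c "echo '(step 1): hello, world'"  # first step
srun --time=02:00 lua hello.lua                                 # second step
```

Submitted as `sbatch hello.sbatch`, each `srun` line then shows up under its own step id in the scheduler's accounting output.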
