Commit 5ec9b39

doc: add the example of expanse (#188)
* add more examples
* write doc
* fix grammar
1 parent 1cbc434 commit 5ec9b39

8 files changed: +88 -2 lines changed

doc/conf.py

Lines changed: 1 addition & 1 deletion
@@ -31,7 +31,7 @@
 # ones.
 extensions = [
     'deepmodeling_sphinx',
-    'recommonmark',
+    'myst_parser',
     "sphinx_rtd_theme",
     'sphinx.ext.viewcode',
     'sphinx.ext.intersphinx',

doc/examples/expanse.md

Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+# Running the DeePMD-kit on the Expanse cluster
+
+[Expanse](https://www.sdsc.edu/support/user_guides/expanse.html) is a cluster operated by the San Diego Supercomputer Center. Here we provide an example of running jobs on Expanse.
+
+The machine parameters are provided below. Expanse uses the SLURM workload manager for job scheduling. `remote_root` has been created in advance. Note that we do not recommend password authentication, so [SSH keys](https://www.ssh.com/academy/ssh/key) are used instead to improve security.
+
+```{literalinclude} ../../examples/machine/expanse.json
+:language: json
+:linenos:
+```
+
+Expanse's standard compute nodes are each powered by two 64-core AMD EPYC 7742 processors and contain 256 GB of DDR4 memory. Here, we request one node with 32 cores and 16 GB of memory from the `shared` partition. Expanse does not support the `--gres=gpu:0` option, so we use `custom_gpu_line` to customize the statement.
+
+```{literalinclude} ../../examples/resources/expanse_cpu.json
+:language: json
+:linenos:
+```
+
+The following task parameters run a DeePMD-kit task, forwarding (uploading) an input file and backwarding (downloading) the graph files. Here, the data set is shared among all the tasks, so it is not included in `forward_files`. Instead, it should be included in the submission's `forward_common_files`.
+
+```{literalinclude} ../../examples/task/deepmd-kit.json
+:language: json
+:linenos:
+```

doc/index.rst

Lines changed: 5 additions & 0 deletions
@@ -22,6 +22,11 @@ DPDispatcher will monitor (poke) until these jobs finish and download the result
    task
    api/api

+.. toctree::
+   :caption: Examples
+   :glob:
+
+   examples/expanse

 
 Indices and tables
 ==================

examples/machine/expanse.json

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+{
+  "batch_type": "Slurm",
+  "local_root": "./",
+  "remote_root": "/expanse/lustre/scratch/njzjz/temp_project/dpgen_workdir",
+  "clean_asynchronously": true,
+  "context_type": "SSHContext",
+  "remote_profile": {
+    "hostname": "login.expanse.sdsc.edu",
+    "username": "njzjz",
+    "port": "22"
+  }
+}
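
The documentation added in this commit recommends SSH keys over passwords. A minimal Python sketch of a check for that recommendation follows. It uses only the standard library; the helper `uses_password` is hypothetical, and treating `password` as a possible `remote_profile` key is an assumption about DPDispatcher's SSH context rather than something shown in this commit.

```python
import json

# Contents of examples/machine/expanse.json, copied from the diff above.
machine_json = """
{
  "batch_type": "Slurm",
  "local_root": "./",
  "remote_root": "/expanse/lustre/scratch/njzjz/temp_project/dpgen_workdir",
  "clean_asynchronously": true,
  "context_type": "SSHContext",
  "remote_profile": {
    "hostname": "login.expanse.sdsc.edu",
    "username": "njzjz",
    "port": "22"
  }
}
"""


def uses_password(machine):
    """Return True if the remote profile carries a non-empty password.

    Assumption: a password, if any, is stored under remote_profile["password"].
    """
    profile = machine.get("remote_profile", {})
    return bool(profile.get("password"))


machine = json.loads(machine_json)
print("password in machine file:", uses_password(machine))
```

With no password in the file, authentication is presumably left to keys already available to the SSH client on the submitting host.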

examples/machine/lazy_local.json

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+{
+  "batch_type": "Shell",
+  "local_root": "./",
+  "context_type": "LazyLocalContext"
+}

examples/resources/expanse_cpu.json

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+{
+  "number_node": "1",
+  "cpu_per_node": "1",
+  "gpu_per_node": "0",
+  "queue_name": "shared",
+  "group_size": "1",
+  "custom_flags": [
+    "#SBATCH -c 32",
+    "#SBATCH --mem=16G",
+    "#SBATCH --time=48:00:00",
+    "#SBATCH --account=rut149",
+    "#SBATCH --requeue"
+  ],
+  "source_list": [
+    "activate /home/njzjz/deepmd-kit"
+  ],
+  "envs": {
+    "OMP_NUM_THREADS": 4,
+    "TF_INTRA_OP_PARALLELISM_THREADS": 4,
+    "TF_INTER_OP_PARALLELISM_THREADS": 8,
+    "DP_AUTO_PARALLELIZATION": 1
+  },
+  "batch_type": "Slurm",
+  "kwargs": {
+    "custom_gpu_line": "#SBATCH --gpus=0"
+  }
+}
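
The thread settings in `envs` look as if they were chosen to fill the 32 cores requested by `#SBATCH -c 32`: 8 inter-op threads times 4 intra-op threads gives 32. That reading is an assumption, since the commit does not state it. A minimal Python sketch of such a consistency check, using only the standard library (the helper `requested_cores` is hypothetical and not part of DPDispatcher):

```python
import re

# Subset of examples/resources/expanse_cpu.json, copied from the diff above.
resources = {
    "custom_flags": [
        "#SBATCH -c 32",
        "#SBATCH --mem=16G",
        "#SBATCH --time=48:00:00",
    ],
    "envs": {
        "OMP_NUM_THREADS": 4,
        "TF_INTRA_OP_PARALLELISM_THREADS": 4,
        "TF_INTER_OP_PARALLELISM_THREADS": 8,
    },
}


def requested_cores(custom_flags):
    """Return N from a '#SBATCH -c N' or '--cpus-per-task=N' flag, else None."""
    pattern = re.compile(r"^#SBATCH\s+(?:-c\s*|--cpus-per-task[=\s])(\d+)\s*$")
    for flag in custom_flags:
        match = pattern.match(flag)
        if match:
            return int(match.group(1))
    return None


cores = requested_cores(resources["custom_flags"])
envs = resources["envs"]
# Assumed thread budget: inter-op threads times intra-op threads.
threads = (envs["TF_INTER_OP_PARALLELISM_THREADS"]
           * envs["TF_INTRA_OP_PARALLELISM_THREADS"])
fits = threads <= cores
print(f"cores requested: {cores}, threads configured: {threads}, fits: {fits}")
```

A check of this kind can catch a resources file whose thread counts were edited without updating the `-c` flag, or the reverse.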

examples/task/deepmd-kit.json

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+{
+  "command": "dp train input.json && dp freeze && dp compress",
+  "task_work_path": "model1/",
+  "forward_files": [
+    "input.json"
+  ],
+  "backward_files": [
+    "frozen_model.pb",
+    "frozen_model_compressed.pb"
+  ],
+  "outlog": "log",
+  "errlog": "err"
+}
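
The split between per-task `forward_files` and submission-level `forward_common_files` described in `doc/examples/expanse.md` can be illustrated with a short Python sketch. It is standard-library code, not DPDispatcher code: the helper `split_forward_files`, the second task directory `model2/`, and the data set name `data` are hypothetical. It assumes that per-task files are given relative to the task work path and common files relative to the submission's work base.

```python
from pathlib import PurePosixPath


def split_forward_files(files_per_task):
    """Separate files shared by all tasks from task-specific ones.

    files_per_task maps each task work path to the files that task needs,
    given relative to the submission's work base. Task-specific files must
    live under their task work path, otherwise relative_to raises ValueError.
    """
    needed = [set(files) for files in files_per_task.values()]
    common = set.intersection(*needed) if needed else set()
    forward_files = {}
    for work_path, files in files_per_task.items():
        forward_files[work_path] = [
            # Per-task files are expressed relative to the task work path.
            str(PurePosixPath(name).relative_to(work_path))
            for name in files
            if name not in common
        ]
    return sorted(common), forward_files


files_per_task = {
    "model1/": ["model1/input.json", "data"],
    "model2/": ["model2/input.json", "data"],
}
common, forward = split_forward_files(files_per_task)
print("forward_common_files:", common)
print("forward_files per task:", forward)
```

Here `data` is needed by both tasks, so it ends up in the common list, while each task keeps only its own `input.json`, matching the `forward_files` entry in the task file above.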

setup.py

Lines changed: 1 addition & 1 deletion
@@ -40,7 +40,7 @@
     keywords='deep potential generator active learning deepmd-kit',
     install_requires=install_requires,
     extras_require={
-        'docs': ['sphinx', 'recommonmark', 'sphinx_rtd_theme>=1.0.0rc1', 'numpydoc', 'deepmodeling_sphinx'],
+        'docs': ['sphinx', 'myst-parser', 'sphinx_rtd_theme>=1.0.0rc1', 'numpydoc', 'deepmodeling_sphinx'],
         "cloudserver": ["oss2", "tqdm"],
         ":python_version<'3.7'": ["typing_extensions"],
     },