Merged
87 changes: 80 additions & 7 deletions .github/actions/spelling/allow.txt
Original file line number Diff line number Diff line change
@@ -1,9 +1,9 @@
ACLs
ACR
AMD
AWS
Alpstein
Balfrin
Besard
Broyden
CFLAGS
CHARMM
@@ -17,17 +17,16 @@ Ceph
Containerfile
DNS
Dockerfiles
EDF
EDFs
Dufourspitze
EMPA
ETHZ
Ehrenfest
Errigal
FFT
Fawzi
Fock
Foket
GAPW
GCC
GGA
GPFS
GPG
@@ -39,29 +38,41 @@ GTL
Gaussian
Google
HDD
HDDs
HPC
HPCP
HPE
HSN
Hartree
Invernizzi
Jax
Jira
Keycloak
Kwasniewski
LAMMPS
LAPACK
LDA
LLM
LLMs
LOCALID
LUMI
Libc
Linaro
Linux
MDS
MDSs
MFA
MLP
MNDO
MPICH
Malvoisin
MeteoSwiss
NAMD
NICs
NVMe
Nordend
OSS
OSSs
OTP
OTPs
PASC
@@ -71,8 +82,10 @@ PID
PMPI
POSIX
Parrinello
Pintarelli
Piz
Plesset
Podladchikov
Pulay
RCCL
RDMA
@@ -83,22 +96,25 @@ Roothaan
SSHService
STMV
Scopi
Signalkuppe
TOTP
UANs
UserLab
VASP
Waldur
Wannier
XDG
Zumsteinspitz
aarch
aarch64
acl
artifactory
autodetection
aws
baremetal
biomolecular
bristen
bytecode
capstor
chatbot
clariden
concretise
concretizer
@@ -109,39 +125,82 @@ cuda
customised
dcomex
diagonalisation
dimms
dockerhub
dotenv
dropbear
edf
edfs
eiger
epyc
fftw
filesystems
fontawesome
gcc
gdrcopy
github
gitlab
gpt
gpu
groundstate
gsl
hdf
huggingface
hwloc
iframe
ijulia
inodes
iopsstor
jfrog
jobreport
juhpc
julia
juliaup
jupyter
kokkos
lexer
libfabric
linalg
linux
matlab
meteo
miniconda
mkl
mpi
mps
multitenancy
nanotron
nccl
netlib
netrc
nsight
numa
nvcr
nvdashboard
nvidia
nwp
octicons
ofi
omlin
omp
oom
osts
osu
papi
pme
pmi
podman
preinstalled
prerelease
prereleases
prgenv
prioritisation
prioritise
prioritised
proactively
pyfirecrest
pytorch
quantumespresso
quickstart
rocm
runtime
@@ -151,6 +210,7 @@ sbatch
screenshot
slurm
smartphone
sourced
sphericart
squashfs
srun
@@ -177,23 +237,36 @@ torchaudio
torchvision
treesitter
trilinos
trl
uarch
uenv
uenvs
uids
utkin
vCluster
vClusters
valgrind
vasp
vboost
venv
versioned
versioning
waldur
wandb
webhooks
webinar
webpage
website
wikipedia
wikitext
wlcg
workaround
workflows
xattr
xattrs
xcb
xfer
xname
xpmem
youtube
zstd
8 changes: 8 additions & 0 deletions .github/actions/spelling/block-delimiters.list
@@ -5,3 +5,11 @@
# ignore code blocks
```
```

# ignore indented code blocks
```
```

# ignore embedded iframes
<iframe
</iframe>
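The delimiter pairs above tell the checker to skip everything between a start marker and its matching end marker. A minimal sketch of that masking behaviour (my reading of the mechanism, not the action's actual implementation; it assumes delimiters always come in ordered, non-nested pairs):

```python
def masked_lines(lines, delimiters=(("```", "```"), ("<iframe", "</iframe>"))):
    """Yield (line, inside_block) pairs; inside_block lines are skipped by the checker."""
    open_end = None  # end marker of the block we are currently inside, if any
    for line in lines:
        stripped = line.strip()
        if open_end is None:
            # Not inside a block: check whether this line opens one.
            for start, end in delimiters:
                if stripped.startswith(start):
                    open_end = end
                    break
            yield line, open_end is not None
        else:
            # Inside a block: the line is masked, and may close the block.
            yield line, True
            if stripped.startswith(open_end):
                open_end = None

doc = ["text", "```", "code_tok", "```", "more text"]
print([flag for _, flag in masked_lines(doc)])
```

Note that with fenced code blocks the start and end markers are identical, which is why the list above repeats the same ` ``` ` line for each pair.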
19 changes: 15 additions & 4 deletions .github/actions/spelling/patterns.txt
@@ -1,18 +1,29 @@
# Recognized as "Firec" and "REST" with the regular rules, so in patterns.txt
# instead of allow.txt
# Recognized as separate words (e.g. "Firec" and "REST") with the regular rules,
# so in patterns.txt instead of allow.txt
FirecREST
RESTful
IPyParallel
\`ENV\`ironment

# markdown figure
^!\[.*\]\(.*\)$

# Most obvious URLs
https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)

# Markdown references (definition and use)
# Markdown references and URLs (definition and use)
^\[\]\(\){#[a-z-]+}$
\]\(#[a-z-]+\)
\]\([^\s]+\)
\]\[[a-z-]+\]

# Markdown URLs

# Inline code
\`[^\`]+\`

# kebab-case and snake_case words
[a-z]+-[a-z-]+
[a-z]+_[a-z_]+

# versions
[0-9]+\.[.0-9]+(\+[0-9a-z]+)?
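The new patterns can be sanity-checked line by line: any span matching one of them is excluded from spell checking. A quick sketch (pattern strings copied from the hunk above; the test line is illustrative):

```python
import re

# A subset of the patterns.txt rules shown in this diff.
patterns = [
    r"\`[^\`]+\`",                     # inline code
    r"[a-z]+-[a-z-]+",                 # kebab-case words
    r"[a-z]+_[a-z_]+",                 # snake_case words
    r"[0-9]+\.[.0-9]+(\+[0-9a-z]+)?",  # versions
    r"\]\([^\s]+\)",                   # markdown link targets
]

def excluded_spans(line):
    """Return the (start, end) spans a spell checker should skip on this line."""
    spans = []
    for p in patterns:
        for m in re.finditer(p, line):
            spans.append(m.span())
    return spans

line = "Install `uenv` v2.1.0+rc1 via the [docs](ref-uenv) quick-start."
print(excluded_spans(line))
```

The version pattern `[0-9]+\.[.0-9]+(\+[0-9a-z]+)?` covers dotted versions with an optional local suffix such as `2.1.0+rc1`, which would otherwise be flagged as several misspelled tokens.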
4 changes: 2 additions & 2 deletions .github/workflows/spelling.yaml
@@ -33,8 +33,8 @@ jobs:
suppress_push_for_open_pull_request: ${{ github.actor != 'dependabot[bot]' && 1 }}
checkout: true
check_file_names: 0
only_check_changed_files: 1
post_comment: 1
only_check_changed_files: ${{ github.event.pull_request && 1 }}
post_comment: ${{ github.event.pull_request && 1 }}
use_magic_file: 1
warnings: bad-regex,binary-file,deprecated-feature,large-file,limited-references,no-newline-at-eof,noisy-file,token-is-substring,unexpected-line-ending,whitespace-in-dictionary,minified-file,unsupported-configuration,no-files-to-check
use_sarif: ${{ (!github.event.pull_request || (github.event.pull_request.head.repo.full_name == github.repository)) && 1 }}
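The `${{ … && 1 }}` expressions rely on short-circuit evaluation: `a && b` yields `b` when `a` is truthy and `a`'s (falsy) value otherwise, so these options become `1` only on pull-request events. A small model of that behaviour (this is my reading of the expression semantics, not GitHub's implementation):

```python
def gh_and(a, b):
    """Model the expression `a && b`: returns b if a is truthy, else a."""
    return b if a else a

# On a pull_request event, github.event.pull_request is a non-empty object,
# so the option evaluates to 1 (enabled).
print(gh_and({"number": 42}, 1))
# On a push event the field is absent, so the option stays falsy (disabled).
print(gh_and(None, 1))
```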
2 changes: 1 addition & 1 deletion docs/accounts/account-create.md
@@ -16,7 +16,7 @@ Clicking the "Create a new account" button will lead the user to the second step

After submitting personal information, users have to wait for CSCS to review and approve the submission.

Once accepted, you will recieve an email with a link to set your password.
Once accepted, you will receive an email with a link to set your password.

```title="Acceptance email"
Dear John Doe,
2 changes: 1 addition & 1 deletion docs/alps/hardware.md
@@ -65,7 +65,7 @@ There are 24 cabinets, in 4 rows with 6 cabinets per row, and each cabinet conta
!!! info "Why 7 blades per chassis?"
A chassis can contain up to 8 blades, however Alps' gh200 chassis are underpopulated so that we can increase the amount of power delivered to each GPU.

Each node contains four Grace-Hopper modules and four corresponding network interface cards (NICS) per blade, as illustrated below:
Each node contains four Grace-Hopper modules and four corresponding network interface cards (NICs) per blade, as illustrated below:

![](../images/alps/gh200-schematic.svg)

2 changes: 1 addition & 1 deletion docs/clusters/eiger.md
@@ -37,7 +37,7 @@ Eiger is an Alps cluster that provides compute nodes and file systems designed t
Eiger consists of multicore [AMD Epyc Rome][ref-alps-zen2-node] compute nodes: please note that the total number of available compute nodes on the system might vary over time.
See the [Slurm documentation][ref-slurm-partitions-nodecount] for information on how to check the number of nodes.

Additionally, there are four login nodes with hostnames `eiger-ln00[1-4]`.
Additionally, there are four login nodes with host names `eiger-ln00[1-4]`.

### Storage and file systems

2 changes: 1 addition & 1 deletion docs/guides/mlp_tutorials/index.md
@@ -4,7 +4,7 @@
These tutorials solve simple MLP tasks using the [Container Engine][ref-container-engine] on the ML Platform.

1. [LLM Inference][ref-mlp-llm-inference-tutorial]
2. [LLM Finetuning][ref-mlp-llm-finetuning-tutorial]
2. [LLM Fine-tuning][ref-mlp-llm-finetuning-tutorial]
3. [Nanotron Training][ref-mlp-llm-nanotron-tutorial]


10 changes: 5 additions & 5 deletions docs/guides/mlp_tutorials/llm-finetuning.md
@@ -1,8 +1,8 @@
[](){#ref-mlp-llm-finetuning-tutorial}

# LLM Finetuning Tutorial
# LLM Fine-tuning Tutorial

This tutorial will take the model from the [LLM Inference][ref-mlp-llm-inference-tutorial] tutorial and show you how to perform finetuning.
This tutorial will take the model from the [LLM Inference][ref-mlp-llm-inference-tutorial] tutorial and show you how to perform fine-tuning.
This means that we take the model and train it on some new custom data to change its behavior.

To complete the tutorial, we set up some extra libraries that will help us to update the state of the machine learning model.
@@ -38,10 +38,10 @@ $ pip install -e ./trl # install in editable mode

When this step is complete, you can exit the shell by typing `exit`.

### Finetune Gemma-7B
### Fine-tune Gemma-7B

At this point, we can set up a fine-tuning script and start training Gemma-7B.
Use your favorite text editor to create the file `fine-tune-gemma.sh` just outside the trl and gemma-venv directories:
Use your favorite text editor to create the file `fine-tune-gemma.sh` just outside the `trl` and `gemma-venv` directories:

```bash title="fine-tune-gemma.sh"
#!/bin/bash
@@ -119,7 +119,7 @@ It should take about 10-15 minutes to fine-tune Gemma:
$ sbatch --nodes=1 fine-tune-sft.sbatch
```

### Compare finetuned Gemma against default Gemma
### Compare fine-tuned Gemma against default Gemma

We can reuse our python script from the first tutorial to do inference on the Gemma model that we just fine-tuned.
Let's try out a different prompt in `gemma-inference.py`: