
Commit 5aabe4c (parent: 88908c1)

[Tutorials] Lazy import GPU modules in the Llama Nemotron tutorial (#831)

Signed-off-by: Mehran Maghoumi <Maghoumi@users.noreply.github.com>
Co-authored-by: Sarah Yurick <53962159+sarahyurick@users.noreply.github.com>

File tree: 2 files changed, +18 −3 lines


tutorials/llama-nemotron-data-curation/README.md (3 additions, 1 deletion)

```diff
@@ -39,7 +39,9 @@ This tutorial demonstrates how a user can process a subset of the Llama Nemotron dataset
 
 Setup requirements:
 
-- Hardware: CPU is sufficient, GPU is recommended for enhanced performance
+- Hardware:
+  - This tutorial can be run entirely on a CPU with 4 workers and 64 GB of RAM.
+  - This tutorial can also be run on a single H100 GPU.
 - Recommended environment: This tutorial was developed and tested with a Conda environment
 
 Please refer to NeMo Curator's [README](https://github.com/NVIDIA/NeMo-Curator?tab=readme-ov-file#get-started) for instructions on how to download NeMo Curator via PyPI, source, or Docker.
```

tutorials/llama-nemotron-data-curation/main.py (15 additions, 2 deletions)

```diff
@@ -12,19 +12,24 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
+from __future__ import annotations
+
 import argparse
 import os
 import time
 from itertools import zip_longest
+from typing import TYPE_CHECKING
 
-import cudf
 import dask.dataframe as dd
-import dask_cudf
 import fasttext
 import pandas as pd
 from dask.delayed import delayed
 from transformers import AutoTokenizer
 
+if TYPE_CHECKING:
+    import cudf
+    import dask_cudf
+
 from nemo_curator import ScoreFilter, Sequential
 from nemo_curator.datasets import DocumentDataset
 from nemo_curator.filters import DocumentFilter
```
from nemo_curator.filters import DocumentFilter
```diff
@@ -366,6 +371,8 @@ def interleave_partitions(
         merged_parts.append(p2)
 
     if gpu:
+        import dask_cudf
+
         return dask_cudf.from_delayed(merged_parts, meta=df1._meta)  # noqa: SLF001
     else:
         return dd.from_delayed(merged_parts, meta=df1._meta)  # noqa: SLF001
```
```diff
@@ -386,6 +393,8 @@ def _interleave_rows(
         rows.append(df2.iloc[i])
 
     if gpu:
+        import cudf
+
         return cudf.DataFrame(rows)
     else:
         return pd.DataFrame(rows)
```
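The row-interleaving logic this hunk touches can be exercised on the CPU path alone. A simplified pandas-only sketch using the same `zip_longest` idea (`interleave_rows_cpu` is a hypothetical name, not the tutorial's function):

```python
from itertools import zip_longest

import pandas as pd


def interleave_rows_cpu(df1: pd.DataFrame, df2: pd.DataFrame) -> pd.DataFrame:
    """Alternate rows from two frames; the longer frame's tail is appended."""
    rows = []
    for r1, r2 in zip_longest(df1.to_dict("records"), df2.to_dict("records")):
        if r1 is not None:
            rows.append(r1)
        if r2 is not None:
            rows.append(r2)
    return pd.DataFrame(rows)


a = pd.DataFrame({"v": [1, 3]})
b = pd.DataFrame({"v": [2, 4, 5]})
print(interleave_rows_cpu(a, b)["v"].tolist())  # [1, 2, 3, 4, 5]
```

On the GPU path the commit builds the same row list but hands it to `cudf.DataFrame`, which is why the import can wait until that branch runs.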
```diff
@@ -408,6 +417,8 @@ def interleave_rows(
         interleaved_parts.append(interleaved)
 
     if gpu:
+        import dask_cudf
+
         return dask_cudf.from_delayed(interleaved_parts, meta=df1._meta)  # noqa: SLF001
     else:
         return dd.from_delayed(interleaved_parts, meta=df1._meta)  # noqa: SLF001
```
```diff
@@ -505,6 +516,8 @@ def main(args: argparse.Namespace) -> None:  # noqa: C901, PLR0915
 
     # Convert to GPU if requested
     if args.device == "gpu":
+        import cudf
+
         print("Converting to GPU")
         dataset_df = dataset_df.map_partitions(lambda partition: cudf.from_pandas(partition))
```
