2 changes: 1 addition & 1 deletion README.md
@@ -37,7 +37,7 @@ The tutorial build is very large and requires a GPU. If your machine does not ha

1. Install required dependencies by running: `pip install -r requirements.txt`.

> Typically, you would run either in `conda` or `virtualenv`. If you want to use `virtualenv`, in the root of the repo, run: `virtualenv venv`, then `source venv/bin/activate`.
> To use `virtualenv`, in the root of the repo, run: `virtualenv venv`, then `source venv/bin/activate`.

- If you have a GPU-powered laptop, you can build using `make docs`. This will download the data, execute the tutorials and build the documentation to `docs/` directory. This might take about 60-120 min for systems with GPUs. If you do not have a GPU installed on your system, then see next step.
- You can skip the computationally intensive graph generation by running `make html-noplot` to build basic html documentation to `_build/html`. This way, you can quickly preview your tutorial.
2 changes: 1 addition & 1 deletion _templates/layout.html
@@ -22,7 +22,7 @@
<script>
if((window.location.href.indexOf("/prototype/")!= -1) && (window.location.href.indexOf("/prototype/prototype_index")< 1))
{
var div = '<div class="admonition note"><p class="admonition-title">Note</p><p><i class="fa fa-flask" aria-hidden="true">&nbsp</i> This tutorial describes a prototype feature. Prototype features are typically not available as part of binary distributions like PyPI or Conda, except sometimes behind run-time flags, and are at an early stage for feedback and testing.</p></div>'
var div = '<div class="admonition note"><p class="admonition-title">Note</p><p><i class="fa fa-flask" aria-hidden="true">&nbsp</i> This tutorial describes a prototype feature. Prototype features are typically not available as part of binary distributions like PyPI, except sometimes behind run-time flags, and are at an early stage for feedback and testing.</p></div>'
document.getElementById("pytorch-article").insertAdjacentHTML('afterBegin', div)
}
</script>
33 changes: 16 additions & 17 deletions advanced_source/sharding.rst
@@ -9,52 +9,52 @@ tables by explicitly configuring them.
Installation
------------

Requirements: - python >= 3.7
Requirements: - Python >= 3.7

We highly recommend CUDA when using torchRec. If using CUDA: - cuda >=
11.0

.. code:: python

# install conda to make installying pytorch with cudatoolkit 11.3 easier.
# TODO: replace these
Contributor Author commented: Needs update

# install Conda to make installing PyTorch with cudatoolkit 11.3 easier.
!sudo rm Miniconda3-py37_4.9.2-Linux-x86_64.sh Miniconda3-py37_4.9.2-Linux-x86_64.sh.*
!sudo wget https://repo.anaconda.com/miniconda/Miniconda3-py37_4.9.2-Linux-x86_64.sh
!sudo chmod +x Miniconda3-py37_4.9.2-Linux-x86_64.sh
!sudo bash ./Miniconda3-py37_4.9.2-Linux-x86_64.sh -b -f -p /usr/local

.. code:: python

# install pytorch with cudatoolkit 11.3
# install PyTorch with cudatoolkit 11.3
!sudo conda install pytorch cudatoolkit=11.3 -c pytorch-nightly -y

Installing torchRec will also install
Installing TorchRec will also install
`FBGEMM <https://github.com/pytorch/fbgemm>`__, a collection of CUDA
kernels and GPU enabled operations to run
kernels and GPU-enabled operations.

.. code:: python

# install torchrec
!pip3 install torchrec-nightly
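
A quick way to confirm the install (an illustrative check, not part of the
original notebook; it assumes a GPU-backed runtime for the CUDA check):

.. code:: python

# Both packages should import once the cell above finishes; fbgemm_gpu is
# pulled in automatically as a TorchRec dependency.
import torchrec
import fbgemm_gpu

import torch
print(torch.cuda.is_available())  # expect True on a GPU-backed Colab runtime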

Install multiprocess which works with ipython to for multi-processing
programming within colab
Install ``multiprocess``, which works with IPython, for multi-processing
programming within Colab:

.. code:: python

!pip3 install multiprocess
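
The snippet below is an illustrative sketch (not part of the original
notebook) of the same ``Process`` API the sharding simulation uses later.
``multiprocess`` mirrors the standard-library ``multiprocessing`` interface
but can pickle functions defined interactively, which is why it is preferred
inside Colab:

.. code:: python

import multiprocess

def greet(rank):
    print(f"hello from simulated rank {rank}")

# Same start/join pattern the sharding simulation below relies on.
p = multiprocess.Process(target=greet, args=(0,))
p.start()
p.join()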

The following steps are needed for the Colab runtime to detect the added
shared libraries. The runtime searches for shared libraries in /usr/lib,
so we copy over the libraries which were installed in /usr/local/lib/.
**This is a very necessary step, only in the colab runtime**.
shared libraries. The runtime searches for shared libraries in ``/usr/lib``,
so we copy over the libraries that were installed in ``/usr/local/lib/``.
**This step is required only in the Colab runtime**.

.. code:: python

!sudo cp /usr/local/lib/lib* /usr/lib/

**Restart your runtime at this point for the newly installed packages
to be seen.** Run the step below immediately after restarting so that
python knows where to look for packages. **Always run this step after
Python knows where to look for packages. **Always run this step after
restarting the runtime.**

.. code:: python
@@ -71,7 +71,7 @@ Due to the notebook environment, we cannot run
can do multiprocessing inside the notebook to mimic the setup. Users
should be responsible for setting up their own
`SPMD <https://en.wikipedia.org/wiki/SPMD>`_ launcher when using
Torchrec. We setup our environment so that torch distributed based
TorchRec. We set up our environment so that the torch.distributed-based
communication backend can work.

.. code:: python
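
# NOTE: the original notebook cell is collapsed in this diff view. The lines
# below are only a hedged sketch of a typical single-process setup for a
# torch.distributed backend; the values are illustrative, not the tutorial's
# exact code.
import os
import torch
import torch.distributed as dist

os.environ["RANK"] = "0"
os.environ["WORLD_SIZE"] = "1"
os.environ["MASTER_ADDR"] = "localhost"
os.environ["MASTER_PORT"] = "29500"

# Prefer NCCL when a GPU is available; fall back to Gloo on CPU-only hosts.
backend = "nccl" if torch.cuda.is_available() else "gloo"
dist.init_process_group(backend=backend)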
@@ -213,7 +213,7 @@ embedding table placement using planner and generate sharded model using
)
sharders = [cast(ModuleSharder[torch.nn.Module], EmbeddingBagCollectionSharder())]
plan: ShardingPlan = planner.collective_plan(module, sharders, pg)
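# collective_plan() produces one sharding plan that every rank in the process
# group agrees on; DistributedModelParallel below applies it to shard the tables.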

sharded_model = DistributedModelParallel(
module,
env=ShardingEnv.from_process_group(pg),
@@ -234,7 +234,7 @@ ranks.
.. code:: python

import multiprocess

def spmd_sharing_simulation(
sharding_type: ShardingType = ShardingType.TABLE_WISE,
world_size = 2,
@@ -254,7 +254,7 @@ ranks.
)
p.start()
processes.append(p)

for p in processes:
p.join()
assert 0 == p.exitcode
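# Example invocation (illustrative -- the collapsed cells below make similar
# calls for each ShardingType):
# spmd_sharing_simulation(ShardingType.DATA_PARALLEL)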
@@ -333,4 +333,3 @@ With data parallel, we will repeat the tables for all devices.

rank:0,sharding plan: {'': {'large_table_0': ParameterSharding(sharding_type='data_parallel', compute_kernel='batched_dense', ranks=[0, 1], sharding_spec=None), 'large_table_1': ParameterSharding(sharding_type='data_parallel', compute_kernel='batched_dense', ranks=[0, 1], sharding_spec=None), 'small_table_0': ParameterSharding(sharding_type='data_parallel', compute_kernel='batched_dense', ranks=[0, 1], sharding_spec=None), 'small_table_1': ParameterSharding(sharding_type='data_parallel', compute_kernel='batched_dense', ranks=[0, 1], sharding_spec=None)}}
rank:1,sharding plan: {'': {'large_table_0': ParameterSharding(sharding_type='data_parallel', compute_kernel='batched_dense', ranks=[0, 1], sharding_spec=None), 'large_table_1': ParameterSharding(sharding_type='data_parallel', compute_kernel='batched_dense', ranks=[0, 1], sharding_spec=None), 'small_table_0': ParameterSharding(sharding_type='data_parallel', compute_kernel='batched_dense', ranks=[0, 1], sharding_spec=None), 'small_table_1': ParameterSharding(sharding_type='data_parallel', compute_kernel='batched_dense', ranks=[0, 1], sharding_spec=None)}}

1 change: 1 addition & 0 deletions advanced_source/torch_script_custom_ops.rst
@@ -189,6 +189,7 @@ Environment setup

We need an installation of PyTorch and OpenCV. The easiest and most platform
independent way to get both is via Conda::
.. # TODO: replace these
Contributor Author commented: needs update


conda install -c pytorch pytorch
conda install opencv
5 changes: 2 additions & 3 deletions beginner_source/hta_intro_tutorial.rst
@@ -9,7 +9,6 @@ below.

Installing HTA
~~~~~~~~~~~~~~

We recommend using a Conda environment to install HTA. To install Anaconda, see
`the official Anaconda documentation <https://docs.anaconda.com/anaconda/install/index.html>`_.

@@ -130,12 +129,12 @@ on each rank.

.. image:: ../_static/img/hta/idle_time_summary.png
:scale: 100%

.. tip::

By default, the idle time breakdown presents the percentage of each of the
idle time categories. Setting the ``visualize_pctg`` argument to ``False``,
the function renders with absolute time on the y-axis.
the function renders with absolute time on the y-axis.
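
For reference, a minimal sketch (assuming a ``TraceAnalysis`` object built from
a trace directory; the path below is hypothetical) of how to request the
absolute-time variant:

.. code:: python

from hta.trace_analysis import TraceAnalysis

analyzer = TraceAnalysis(trace_dir="~/traces/my_job")  # illustrative path

# Plot absolute idle time on the y-axis instead of percentages.
idle_time_breakdown = analyzer.get_idle_time_breakdown(visualize_pctg=False)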


Kernel Breakdown