Commit 729da27

Build docs for stable branch and make default (#1116)

* Update Merlin links from main to stable branch
* Match only release tags for docs builds
* Set up local branches for docs multi build
* Set up docs redirect page to link to stable branch
* Remove stable reference from docs link in README (rely on redirect)

Co-authored-by: edknv <109497216+edknv@users.noreply.github.com>
1 parent cba6697 commit 729da27

21 files changed (+78, -72 lines)

.github/workflows/docs-sched-rebuild.yaml

Lines changed: 5 additions & 1 deletion
@@ -26,6 +26,10 @@ jobs:
       - name: Install dependencies
         run: |
           python -m pip install --upgrade pip tox
+      - name: Setup local branches for docs build
+        run: |
+          git branch --track main origin/main || true
+          git branch --track stable origin/stable || true
       - name: Building docs (multiversion)
         run: |
           tox -vv -e docs-multi

@@ -83,7 +87,7 @@ jobs:
             exit 0
           fi
           # If any of these commands fail, fail the build.
-          def_branch=$(gh api "repos/${GITHUB_REPOSITORY}" --jq ".default_branch")
+          def_branch="stable"
           html_url=$(gh api "repos/${GITHUB_REPOSITORY}/pages" --jq ".html_url")
           cat > index.html << EOF
           <!DOCTYPE html>
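This change hard-codes `stable` as the branch the GitHub Pages landing page redirects to, instead of querying the repository's default branch via the API. The generated `index.html` is truncated in this diff, so the template below is an assumption, not the committed content: a minimal stdlib-only sketch of how such a redirect page could be assembled (the `make_redirect_page` helper name and the meta-refresh approach are both hypothetical).

```python
def make_redirect_page(html_url: str, def_branch: str) -> str:
    """Build a minimal page that redirects to the docs for one branch.

    Hypothetical sketch: the real workflow writes its own (truncated here)
    template via a shell heredoc; this only illustrates the idea.
    """
    target = f"{html_url.rstrip('/')}/{def_branch}/"
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        f'  <head><meta http-equiv="refresh" content="0; url={target}"></head>\n'
        f'  <body><a href="{target}">Redirecting to the {def_branch} docs</a></body>\n'
        "</html>\n"
    )

# With def_branch pinned to "stable", every visit to the docs root lands on
# the stable build rather than whatever the repository default branch is.
page = make_redirect_page("https://nvidia-merlin.github.io/models/", "stable")
```

Pinning the branch name avoids an extra `gh api` call (and a failure mode if the API call errors) at the cost of having to update the workflow if the docs branch is ever renamed.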

CONTRIBUTING.md

Lines changed: 2 additions & 2 deletions
@@ -23,7 +23,7 @@ into three categories:

 ### Your first issue

-1. Read the project's [README.md](https://github.com/NVIDIA-Merlin/models/blob/main/README.md)
+1. Read the project's [README.md](https://github.com/NVIDIA-Merlin/models/blob/stable/README.md)
    to learn how to setup the development environment.
 2. Find an issue to work on. The best way is to look for the [good first issue](https://github.com/NVIDIA-Merlin/models/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22)
    or [help wanted](https://github.com/NVIDIA-Merlin/models/issues?q=is%3Aissue+is%3Aopen+label%3A%22help+wanted%22) labels.

@@ -116,7 +116,7 @@ deep_block: Block
 ```

 The [Intersphinx](https://docs.readthedocs.io/en/stable/guides/intersphinx.html)
-extension truncates the text to [Schema](https://nvidia-merlin.github.io/core/main/api/merlin.schema.html)
+extension truncates the text to [Schema](https://nvidia-merlin.github.io/core/stable/api/merlin.schema.html)
 and makes it a link.

 ## Attribution

README.md

Lines changed: 8 additions & 8 deletions
@@ -2,7 +2,7 @@

 [![PyPI version shields.io](https://img.shields.io/pypi/v/merlin-models.svg)](https://pypi.python.org/pypi/merlin-models/)
 ![GitHub License](https://img.shields.io/github/license/NVIDIA-Merlin/models)
-[![Documentation](https://img.shields.io/badge/documentation-blue.svg)](https://nvidia-merlin.github.io/models/main/)
+[![Documentation](https://img.shields.io/badge/documentation-blue.svg)](https://nvidia-merlin.github.io/models/)

 The Merlin Models library provides standard models for recommender systems with an aim for high-quality implementations
 that range from classic machine learning models to highly-advanced deep learning models.

@@ -17,7 +17,7 @@ In our initial releases, Merlin Models features a TensorFlow API. The PyTorch AP

 ### Benefits of Merlin Models

-**[RecSys model implementations](https://nvidia-merlin.github.io/models/main/models_overview.html)** - The library provides a high-level API for classic and state-of-the-art deep learning architectures for recommender models.
+**[RecSys model implementations](https://nvidia-merlin.github.io/models/stable/models_overview.html)** - The library provides a high-level API for classic and state-of-the-art deep learning architectures for recommender models.
 These models include both retrieval (e.g. Matrix Factorization, Two tower, YouTube DNN, ..) and ranking (e.g. DLRM, DCN-v2, DeepFM, ...) models.

 **Building blocks** - Within Merlin Models, recommender models are built on reusable building blocks.

@@ -28,7 +28,7 @@ The library provides model definition blocks (MLP layers, factorization layers,
 For example, models depend on NVTabular for pre-processing and integrate easily with Merlin Systems for inference.
 The thoughtfully-designed integration makes it straightforward to build performant end-to-end RecSys pipelines.

-**[Merlin Models DataLoaders](https://nvidia-merlin.github.io/models/main/api.html#loader-utility-functions)** - Merlin provides seamless integration with common deep learning frameworks, such as TensorFlow, PyTorch, and HugeCTR.
+**[Merlin Models DataLoaders](https://nvidia-merlin.github.io/models/stable/api.html#loader-utility-functions)** - Merlin provides seamless integration with common deep learning frameworks, such as TensorFlow, PyTorch, and HugeCTR.
 When training deep learning recommender system models, data loading can be a bottleneck.
 To address the challenge, Merlin has custom, highly-optimized dataloaders to accelerate existing TensorFlow and PyTorch training pipelines.
 The Merlin dataloaders can lead to a speedup that is nine times faster than the same training pipeline used with the GPU.

@@ -40,7 +40,7 @@ With the Merlin dataloaders, you can:
 - Prepare batches asynchronously into the GPU to avoid CPU-to-GPU communication.
 - Integrate easily into existing TensorFlow or PyTorch training pipelines by using a similar API.

-To learn about the core features of Merlin Models, see the [Models Overview](https://nvidia-merlin.github.io/models/main/models_overview.html) page.
+To learn about the core features of Merlin Models, see the [Models Overview](https://nvidia-merlin.github.io/models/stable/models_overview.html) page.

 ### Installation

@@ -59,7 +59,7 @@ pip install merlin-models

 Merlin Models is included in the Merlin Containers.

-Refer to the [Merlin Containers](https://nvidia-merlin.github.io/Merlin/main/containers.html) documentation page for information about the Merlin container names, URLs to the container images on the NVIDIA GPU Cloud catalog, and key Merlin components.
+Refer to the [Merlin Containers](https://nvidia-merlin.github.io/Merlin/stable/containers.html) documentation page for information about the Merlin container names, URLs to the container images on the NVIDIA GPU Cloud catalog, and key Merlin components.

 #### Installing Merlin Models from Source

@@ -75,7 +75,7 @@ cd models && pip install -e .
 Merlin Models makes it straightforward to define architectures that adapt to different input features.
 This adaptability is provided by building on a core feature of the NVTabular library.
 When you use NVTabular for feature engineering, NVTabular creates a schema that identifies the input features.
-You can see the `Schema` object in action by looking at the [From ETL to Training RecSys models - NVTabular and Merlin Models integrated example](https://nvidia-merlin.github.io/models/main/examples/02-Merlin-Models-and-NVTabular-integration.html) example notebook.
+You can see the `Schema` object in action by looking at the [From ETL to Training RecSys models - NVTabular and Merlin Models integrated example](https://nvidia-merlin.github.io/models/stable/examples/02-Merlin-Models-and-NVTabular-integration.html) example notebook.

 You can easily build popular RecSys architectures like [DLRM](http://arxiv.org/abs/1906.00091), as shown in the following code sample.
 After you define the model, you can train and evaluate it with a typical Keras model.

@@ -107,11 +107,11 @@ eval_metrics = model.evaluate(valid, batch_size=1024, return_dict=True)
 The target binary feature is also inferred from the schema (i.e., tagged as 'TARGET').

 You can find more details and information about a low-level API in our overview of the
-[Deep Learning Recommender Model](https://nvidia-merlin.github.io/models/main/models_overview.html#deep-learning-recommender-model).
+[Deep Learning Recommender Model](https://nvidia-merlin.github.io/models/stable/models_overview.html#deep-learning-recommender-model).

 ### Notebook Examples and Tutorials

-View the example notebooks in the [documentation](https://nvidia-merlin.github.io/models/main/examples/README.html) to help you become familiar with Merlin Models.
+View the example notebooks in the [documentation](https://nvidia-merlin.github.io/models/stable/examples/README.html) to help you become familiar with Merlin Models.

 The same notebooks are available in the `examples` directory from the [Merlin Models](https://github.com/NVIDIA-Merlin/models) GitHub repository.

docs/README.md

Lines changed: 2 additions & 2 deletions
@@ -102,7 +102,7 @@ the link is to the repository:

 ```markdown
 Refer to the sample Python programs in the
-[examples/blah](https://github.com/NVIDIA-Merlin/models/tree/main/examples/blah)
+[examples/blah](https://github.com/NVIDIA-Merlin/models/tree/stable/examples/blah)
 directory of the repository.
 ```

@@ -139,7 +139,7 @@ a relative path works both in the HTML docs page and in the repository browsing
 Use a link to the HTML page like the following:

 ```markdown
-<https://nvidia-merlin.github.io/NVTabular/main/Introduction.html>
+<https://nvidia-merlin.github.io/NVTabular/stable/Introduction.html>
 ```

 > I'd like to change this in the future. My preference would be to use a relative

docs/source/conf.py

Lines changed: 5 additions & 3 deletions
@@ -27,6 +27,7 @@
 # documentation root, use os.path.abspath to make it absolute, like shown here.
 #
 import os
+import re
 import subprocess
 import sys

@@ -115,24 +116,25 @@

 if os.path.exists(gitdir):
     tag_refs = subprocess.check_output(["git", "tag", "-l", "v*"]).decode("utf-8").split()
+    tag_refs = [tag for tag in tag_refs if re.match(r"^v[0-9]+.[0-9]+.[0-9]+$", tag)]
     tag_refs = natsorted(tag_refs)[-6:]
     smv_tag_whitelist = r"^(" + r"|".join(tag_refs) + r")$"
 else:
     smv_tag_whitelist = r"^v.*$"

-smv_branch_whitelist = r"^main$"
+smv_branch_whitelist = r"^(main|stable)$"

 smv_refs_override_suffix = r"-docs"

 html_sidebars = {"**": ["versions.html"]}
-html_baseurl = "https://nvidia-merlin.github.io/models/main"
+html_baseurl = "https://nvidia-merlin.github.io/models/stable/"

 intersphinx_mapping = {
     "python": ("https://docs.python.org/3", None),
     "cudf": ("https://docs.rapids.ai/api/cudf/stable/", None),
     "distributed": ("https://distributed.dask.org/en/latest/", None),
     "torch": ("https://pytorch.org/docs/stable/", None),
-    "merlin-core": ("https://nvidia-merlin.github.io/core/main/", None),
+    "merlin-core": ("https://nvidia-merlin.github.io/core/stable/", None),
 }

 autodoc_inherit_docstrings = False
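The conf.py change above restricts the sphinx-multiversion tag whitelist to plain release tags (`vX.Y.Z`), so pre-release or oddly named tags no longer get doc builds. The following stdlib-only sketch isolates that filtering logic; the sample tag names are hypothetical, and `natsort.natsorted` from conf.py is replaced here with a plain version-tuple sort to avoid the extra dependency. Note that the committed pattern leaves the dots unescaped, so `.` matches any character; `r"^v[0-9]+\.[0-9]+\.[0-9]+$"` would be the stricter equivalent.

```python
import re

# Hypothetical tag list; conf.py reads the real one from `git tag -l "v*"`.
tags = ["v0.9.0", "v1.0.0", "v1.0.0rc1", "v23.02.00", "nightly-v1.1.0"]

# Same pattern as the commit: only vX.Y.Z survives, pre-releases are dropped.
release = [t for t in tags if re.match(r"^v[0-9]+.[0-9]+.[0-9]+$", t)]

# conf.py uses natsorted(...)[-6:]; for plain vX.Y.Z tags a numeric
# version-tuple key produces the same "last six releases" selection.
release = sorted(release, key=lambda t: tuple(int(p) for p in t[1:].split(".")))[-6:]

# sphinx-multiversion then builds exactly these tags.
smv_tag_whitelist = r"^(" + r"|".join(release) + r")$"
```

Filtering before building the whitelist keeps the regex short and prevents a stray tag like `v1.0.0rc1` from triggering an unwanted docs build.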

docs/source/index.rst

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@ Merlin Models GitHub Repository

 About Merlin
 Merlin is the overarching project that brings together the Merlin projects.
-See the `documentation <https://nvidia-merlin.github.io/Merlin/main/README.html>`_
+See the `documentation <https://nvidia-merlin.github.io/Merlin/stable/README.html>`_
 or the `repository <https://github.com/NVIDIA-Merlin/Merlin>`_ on GitHub.

 Developer website for Merlin

examples/02-Merlin-Models-and-NVTabular-integration.ipynb

Lines changed: 1 addition & 1 deletion
@@ -1409,7 +1409,7 @@
 "\n",
 "In the next notebooks, we will explore multiple ranking models with Merlin Models.\n",
 "\n",
-"You can learn more about NVTabular, its functionality and supported ops by visiting our [github repository](https://github.com/NVIDIA-Merlin/NVTabular/) or exploring the [examples](https://github.com/NVIDIA-Merlin/NVTabular/tree/main/examples), such as [`Getting Started MovieLens`](https://github.com/NVIDIA-Merlin/NVTabular/blob/main/examples/getting-started-movielens/02-ETL-with-NVTabular.ipynb) or [`Scaling Criteo`](https://github.com/NVIDIA-Merlin/NVTabular/tree/main/examples/scaling-criteo)."
+"You can learn more about NVTabular, its functionality and supported ops by visiting our [github repository](https://github.com/NVIDIA-Merlin/NVTabular/) or exploring the [examples](https://github.com/NVIDIA-Merlin/NVTabular/tree/stable/examples), such as [`Getting Started MovieLens`](https://github.com/NVIDIA-Merlin/NVTabular/blob/stable/examples/getting-started-movielens/02-ETL-with-NVTabular.ipynb) or [`Scaling Criteo`](https://github.com/NVIDIA-Merlin/NVTabular/tree/stable/examples/scaling-criteo)."
 ]
 }
 ],

examples/03-Exploring-different-models.ipynb

Lines changed: 2 additions & 2 deletions
@@ -47,7 +47,7 @@
 "\n",
 "In this example, we'll demonstrate how to build and train several popular deep learning-based ranking model architectures. Merlin Models provides a high-level API to define those architectures, but allows for customization as they are composed by reusable building blocks.\n",
 "\n",
-"In this example notebook, we use for training and evaluation synthetic data that mimics the schema (features and cardinalities) of [Ali-CCP dataset](https://tianchi.aliyun.com/dataset/dataDetail?dataId=408#1): Alibaba Click and Conversion Prediction dataset. The Ali-CCP is a dataset gathered from real-world traffic logs of the recommender system in Taobao, the largest online retail platform in the world. To download the raw Ali-CCP training and test datasets visit [tianchi.aliyun.com](https://tianchi.aliyun.com/dataset/dataDetail?dataId=408#1). You can get the raw dataset via this [get_aliccp() function](https://github.com/NVIDIA-Merlin/models/blob/main/merlin/datasets/ecommerce/aliccp/dataset.py#L43) and generate the parquet files from it to be used in this example.\n",
+"In this example notebook, we use for training and evaluation synthetic data that mimics the schema (features and cardinalities) of [Ali-CCP dataset](https://tianchi.aliyun.com/dataset/dataDetail?dataId=408#1): Alibaba Click and Conversion Prediction dataset. The Ali-CCP is a dataset gathered from real-world traffic logs of the recommender system in Taobao, the largest online retail platform in the world. To download the raw Ali-CCP training and test datasets visit [tianchi.aliyun.com](https://tianchi.aliyun.com/dataset/dataDetail?dataId=408#1). You can get the raw dataset via this [get_aliccp() function](https://github.com/NVIDIA-Merlin/models/blob/stable/merlin/datasets/ecommerce/aliccp/dataset.py#L43) and generate the parquet files from it to be used in this example.\n",
 "\n",
 "### Learning objectives\n",
 "- Preparing the data with NVTabular\n",

@@ -432,7 +432,7 @@
 }
 },
 "source": [
-"We're ready to start training, for that, we create our dataset objects, and under the hood we use Merlin `Loader` class for reading chunks of parquet files. `Loader` asynchronously iterate through CSV or Parquet dataframes on GPU by leveraging an NVTabular `Dataset`. To read more about Merlin optimized dataloaders visit [here](https://github.com/NVIDIA-Merlin/models/blob/main/merlin/models/tf/dataset.py#L141)."
+"We're ready to start training, for that, we create our dataset objects, and under the hood we use Merlin `Loader` class for reading chunks of parquet files. `Loader` asynchronously iterate through CSV or Parquet dataframes on GPU by leveraging an NVTabular `Dataset`. To read more about Merlin optimized dataloaders visit [here](https://github.com/NVIDIA-Merlin/models/blob/stable/merlin/models/tf/dataset.py#L141)."
 ]
 },
 {

examples/04-Exporting-ranking-models.ipynb

Lines changed: 3 additions & 3 deletions
@@ -141,7 +141,7 @@
 "source": [
 "We use the synthetic train and test datasets generated by mimicking the real [Ali-CCP: Alibaba Click and Conversion Prediction](https://tianchi.aliyun.com/dataset/dataDetail?dataId=408#1) dataset to build our recommender system ranking models. \n",
 "\n",
-"If you would like to use real Ali-CCP dataset instead, you can download the training and test datasets on [tianchi.aliyun.com](https://tianchi.aliyun.com/dataset/dataDetail?dataId=408#1). You can then use [get_aliccp()](https://github.com/NVIDIA-Merlin/models/blob/main/merlin/datasets/ecommerce/aliccp/dataset.py#L43) function to curate the raw csv files and save them as parquet files."
+"If you would like to use real Ali-CCP dataset instead, you can download the training and test datasets on [tianchi.aliyun.com](https://tianchi.aliyun.com/dataset/dataDetail?dataId=408#1). You can then use [get_aliccp()](https://github.com/NVIDIA-Merlin/models/blob/stable/merlin/datasets/ecommerce/aliccp/dataset.py#L43) function to curate the raw csv files and save them as parquet files."
 ]
 },
 {

@@ -459,7 +459,7 @@
 }
 },
 "source": [
-"In this example, we build, train, and export a Deep Learning Recommendation Model [(DLRM)](https://arxiv.org/abs/1906.00091) architecture. To learn more about how to train different deep learning models, how easily transition from one model to another and the seamless integration between data preparation and model training visit [03-Exploring-different-models.ipynb](https://github.com/NVIDIA-Merlin/models/blob/main/examples/03-Exploring-different-models.ipynb) notebook."
+"In this example, we build, train, and export a Deep Learning Recommendation Model [(DLRM)](https://arxiv.org/abs/1906.00091) architecture. To learn more about how to train different deep learning models, how easily transition from one model to another and the seamless integration between data preparation and model training visit [03-Exploring-different-models.ipynb](https://github.com/NVIDIA-Merlin/models/blob/stable/examples/03-Exploring-different-models.ipynb) notebook."
 ]
 },
 {

@@ -693,7 +693,7 @@
 "source": [
 "We trained and exported our ranking model and NVTabular workflow. In the next step, we will learn how to deploy our trained DLRM model into [Triton Inference Server](https://github.com/triton-inference-server/server) with [Merlin Systems](https://github.com/NVIDIA-Merlin/systems) library. NVIDIA Triton Inference Server (TIS) simplifies the deployment of AI models at scale in production. TIS provides a cloud and edge inferencing solution optimized for both CPUs and GPUs. It supports a number of different machine learning frameworks such as TensorFlow and PyTorch.\n",
 "\n",
-"For the next step, visit [Merlin Systems](https://github.com/NVIDIA-Merlin/systems) library and execute [Serving-Ranking-Models-With-Merlin-Systems](https://github.com/NVIDIA-Merlin/systems/blob/main/examples/Serving-Ranking-Models-With-Merlin-Systems.ipynb) notebook to deploy our saved DLRM and NVTabular workflow models as an ensemble to TIS and obtain prediction results for a qiven request. In doing so, you need to mount the saved DLRM and NVTabular workflow to the inference container following the instructions in the [README.md](https://github.com/NVIDIA-Merlin/systems/blob/main/examples/README.md)."
+"For the next step, visit [Merlin Systems](https://github.com/NVIDIA-Merlin/systems) library and execute [Serving-Ranking-Models-With-Merlin-Systems](https://github.com/NVIDIA-Merlin/systems/blob/stable/examples/Serving-Ranking-Models-With-Merlin-Systems.ipynb) notebook to deploy our saved DLRM and NVTabular workflow models as an ensemble to TIS and obtain prediction results for a qiven request. In doing so, you need to mount the saved DLRM and NVTabular workflow to the inference container following the instructions in the [README.md](https://github.com/NVIDIA-Merlin/systems/blob/stable/examples/README.md)."
 ]
 }
 ],

examples/05-Retrieval-Model.ipynb

Lines changed: 1 addition & 1 deletion
@@ -997,7 +997,7 @@
 "id": "155af447-97c4-4875-97ad-84e678fd7b40",
 "metadata": {},
 "source": [
-"Note that above when we set `validation_data=valid` in the `model.fit()`, we compute evaluation metrics on validation set using the negative sampling strategy used for training. To determine the exact accuracy of our trained retrieval model, we need to compute the similarity score between a given query and all possible candidates. The higher the score of the positive candidate (the one that is already interacted with, i.e. target item_id returned by dataloader), the more accurate the model is. We can do this using the `topk_model` model that we create below via `to_top_k_encoder` method, and the following section shows how to instantiate it. The `to_top_k_encoder()` is a method of the [RetrievalModelV2](https://github.com/NVIDIA-Merlin/models/blob/main/merlin/models/tf/models/base.py) class. \n",
+"Note that above when we set `validation_data=valid` in the `model.fit()`, we compute evaluation metrics on validation set using the negative sampling strategy used for training. To determine the exact accuracy of our trained retrieval model, we need to compute the similarity score between a given query and all possible candidates. The higher the score of the positive candidate (the one that is already interacted with, i.e. target item_id returned by dataloader), the more accurate the model is. We can do this using the `topk_model` model that we create below via `to_top_k_encoder` method, and the following section shows how to instantiate it. The `to_top_k_encoder()` is a method of the [RetrievalModelV2](https://github.com/NVIDIA-Merlin/models/blob/stable/merlin/models/tf/models/base.py) class. \n",
 "\n",
 "`unique_rows_by_features` : A utility function allows extracting both unique user and item features tables as Merlin Dataset object that can easily be converted to a cuDF data frame. The function extracts unique rows from a specified dataset (transformed train set) based on a specified id-column tags (`ITEM` and `ITEM_ID`)."
 ]

0 commit comments