Update quick_start.rst

alilevy · alilevy · commit 5675ea7256e9 · 2024-02-13T23:41:10.000+08:00
diff --git a/docs/source/get_started/quick_start.rst b/docs/source/get_started/quick_start.rst
@@ -9,10 +9,46 @@ We use the [Taxi]_ dataset as an example to show how to use ``EasyTPP`` to train
 Download Dataset
 ===================
 
-The Taxi dataset we used is preprocessed by `HYPRO <https://github.com/iLampard/hypro_tpp>`_ . You can download this dataset `here <https://drive.google.com/drive/folders/1vNX2gFuGfhoh-vngoebaQlj2-ZIZMiBo>`_.
 
 
-Create the dir to save the pkl files.
+The Taxi dataset we use was preprocessed by `HYPRO <https://github.com/iLampard/hypro_tpp>`_. You can download it either as pickle files from Google Drive `here <https://drive.google.com/drive/folders/1vNX2gFuGfhoh-vngoebaQlj2-ZIZMiBo>`_ or as JSON files from `HuggingFace <https://huggingface.co/easytpp>`_.
+
+
+If the data source is pickle files, write the data config (in `Example Config <https://github.com/ant-research/EasyTemporalPointProcess/blob/main/examples/configs/experiment_config.yaml>`_) as follows:
+
+.. code-block:: yaml
+
+    data:
+      taxi:
+        data_format: pickle
+        train_dir: ./data/taxi/train.pkl
+        valid_dir: ./data/taxi/dev.pkl
+        test_dir: ./data/taxi/test.pkl
+
+To load the data directly from HuggingFace, set all three entries to the HuggingFace dataset name:
+
+.. code-block:: yaml
+
+    data:
+      taxi:
+        data_format: json
+        train_dir: easytpp/taxi
+        valid_dir: easytpp/taxi
+        test_dir: easytpp/taxi
+
+
+Alternatively, the config can point to local copies of the JSON files downloaded from HuggingFace:
+
+.. code-block:: yaml
+
+    data:
+      taxi:
+        data_format: json
+        train_dir: ./data/taxi/train.json
+        valid_dir: ./data/taxi/dev.json
+        test_dir: ./data/taxi/test.json
+
+
 
 
 Setup the configuration file
@@ -21,12 +57,14 @@ Setup the configuration file
 We provide a preset config file in `Example Config <https://github.com/ant-research/EasyTemporalPointProcess/blob/main/examples/configs/experiment_config.yaml>`_. The details of the configuration can be found in `Training Pipeline <../user_guide/run_train_pipeline.html>`_.
 
 
+
+
 Train the Model
 =========================
 
 At this stage we need to write a script to run the training pipeline. There is a preset script `train_nhp.py <https://github.com/ant-research/EasyTemporalPointProcess/blob/main/examples/train_nhp.py>`_ and one can simply copy it.
 
-After the setup of data, config and running script, the directory structure is as follows:
+Taking the pickle data source as an example, after setting up the data, config, and running script, the directory structure is as follows:
 
 .. code-block:: bash