Improve grammar and readability in documentation

Rogelio Melo · Rogelio Melo · commit e1ff369f059a · 2025-12-12T18:17:49.000-06:00
diff --git a/nbs/docs/capabilities/cross_validation.ipynb b/nbs/docs/capabilities/cross_validation.ipynb
@@ -78,7 +78,7 @@
    "source": [
     "## 2. Read the data\n",
     "\n",
-    "For this tutorial, we use part of the hourly M4 dataset. It is stored in a parquet file for efficiency. You can use ordinary pandas operations to read your data in other formats likes `.csv`. \n",
+    "For this tutorial, we use part of the hourly M4 dataset. It is stored in a parquet file for efficiency. However, you can use ordinary pandas operations to read your data in other formats like `.csv`. \n",
     "\n",
     "The input to `NeuralForecast` is always a data frame in [long format](https://www.theanalysisfactor.com/wide-and-long-data/) with three columns: `unique_id`, `ds` and `y`:\n",
     "\n",
@@ -180,7 +180,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "For simplicity, we use only a single series to explore in detail the cross-validation functionality. Also, let's use the first 700 time steps, such that we work with round numbers, making it easier to visualize and understand cross-validation."
+    "For simplicity, we focus on a single time series to explore the cross-validation functionality in detail. We also use only the first 700 time steps, which allows us to work with round numbers and makes the cross-validation process easier to visualize and understand."
    ]
   },
   {
@@ -449,7 +449,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "In the figure above, we see that we have 4 cutoff points, which correspond to our four cross-validation windows. Of course, notice that the windows are set from the end of the dataset. That way, the model trains on past data to predict future data. \n",
+    "In the figure above, we observe four cutoff points, each corresponding to a cross-validation window. Note that these windows are defined from the end of the dataset, ensuring that the model is trained on past data to predict future data.\n",
     "\n",
     ":::{.callout-warning collapse=\"true\"}\n",
     "## Important note\n",
@@ -655,11 +655,17 @@
    "metadata": {},
    "source": [
     "In the figure above, we see that our two folds overlap between time steps 601 and 650, since the step size is 50. This happens because:\n",
+    "\n",
     "- fold 1: model is trained using time steps 0 to 550 and predicts 551 to 650 (h=100)\n",
     "- fold 2: model is trained using time steps 0 to 600 (`step_size=50`) and predicts 601 to 700\n",
     "\n",
     "Be aware that when evaluating a model trained with overlapping cross-validation windows, some time steps have more than one prediction. This may bias your evaluation metric, as the repeated time steps are taken into account in the metric multiple times."
    ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": []
   }
  ],
  "metadata": {
diff --git a/neuralforecast/core.py b/neuralforecast/core.py
@@ -225,7 +225,7 @@ def __init__(
         local_static_scaler_type: Optional[str] = None,
     ):
         """The `core.StatsForecast` class allows you to efficiently fit multiple `NeuralForecast` models
-        for large sets of time series. It operates with pandas DataFrame `df` that identifies series
+        for large sets of time series. It operates with a pandas DataFrame `df` that identifies series
         and datestamps with the `unique_id` and `ds` columns. The `y` column denotes the target
         time series variable.
 
@@ -234,9 +234,9 @@ def __init__(
                 see [collection here](./models).
             freq (str or int): Frequency of the data. Must be a valid pandas or polars offset alias, or an integer.
             local_scaler_type (str, optional): Scaler to apply per-serie to temporal features before fitting, which is inverted after predicting.
-                Can be 'standard', 'robust', 'robust-iqr', 'minmax' or 'boxcox'. Defaults to None.
+                Can be 'standard', 'robust', 'robust-iqr', 'minmax' or 'boxcox'.
             local_static_scaler_type (str, optional): Scaler to apply to static exogenous features before fitting.
-                Can be 'standard', 'robust', 'robust-iqr', 'minmax' or 'boxcox'. Defaults to None.
+                Can be 'standard', 'robust', 'robust-iqr', 'minmax' or 'boxcox'.
 
         Returns:
             NeuralForecast: Returns instantiated `NeuralForecast` class.
@@ -449,23 +449,23 @@ def fit(
         distributed_config: Optional[DistributedConfig] = None,
         prediction_intervals: Optional[PredictionIntervals] = None,
     ) -> None:
-        """Fit the core.NeuralForecast.
+        """Fit the core.NeuralForecast
 
-        Fit `models` to a large set of time series from DataFrame `df`.
+        Fit `models` to a large set of time series from DataFrame `df`
         and store fitted models for later inspection.
 
         Args:
             df (pandas, polars or spark DataFrame, or a list of parquet files containing the series, optional): DataFrame with columns [`unique_id`, `ds`, `y`] and exogenous variables.
-                If None, a previously stored dataset is required. Defaults to None.
-            static_df (pandas, polars or spark DataFrame, optional): DataFrame with columns [`unique_id`] and static exogenous. Defaults to None.
-            val_size (int, optional): Size of validation set. Defaults to 0.
-            use_init_models (bool, optional): Use initial model passed when NeuralForecast object was instantiated. Defaults to False.
-            verbose (bool): Print processing steps. Defaults to False.
-            id_col (str): Column that identifies each serie. Defaults to 'unique_id'.
-            time_col (str): Column that identifies each timestep, its values can be timestamps or integers. Defaults to 'ds'.
-            target_col (str): Column that contains the target. Defaults to 'y'.
+                If None, a previously stored dataset is required.
+            static_df (pandas, polars or spark DataFrame, optional): DataFrame with columns [`unique_id`] and static exogenous.
+            val_size (int, optional): Size of validation set.
+            use_init_models (bool, optional): Use initial model passed when NeuralForecast object was instantiated.
+            verbose (bool): Print processing steps.
+            id_col (str): Column that identifies each serie.
+            time_col (str): Column that identifies each timestep, its values can be timestamps or integers.
+            target_col (str): Column that contains the target.
             distributed_config (neuralforecast.DistributedConfig): Configuration to use for DDP training. Currently only spark is supported.
-            prediction_intervals (PredictionIntervals, optional): Configuration to calibrate prediction intervals (Conformal Prediction). Defaults to None.
+            prediction_intervals (PredictionIntervals, optional): Configuration to calibrate prediction intervals (Conformal Prediction).
 
         Returns:
             NeuralForecast: Returns `NeuralForecast` class with fitted `models`.
@@ -580,7 +580,7 @@ def make_future_dataframe(
 
         Args:
             df (pandas or polars DataFrame, optional): DataFrame with columns [`unique_id`, `ds`, `y`] and exogenous variables.
-                Only required if this is different than the one used in the fit step. Defaults to None.
+                Only required if this is different than the one used in the fit step.
         """
         if not self._fitted:
             raise Exception("You must fit the model first.")
@@ -821,14 +821,14 @@ def predict(
 
         Args:
             df (pandas, polars or spark DataFrame, optional): DataFrame with columns [`unique_id`, `ds`, `y`] and exogenous variables.
-                If a DataFrame is passed, it is used to generate forecasts. Defaults to None.
-            static_df (pandas, polars or spark DataFrame, optional): DataFrame with columns [`unique_id`] and static exogenous. Defaults to None.
-            futr_df (pandas, polars or spark DataFrame, optional): DataFrame with [`unique_id`, `ds`] columns and `df`'s future exogenous. Defaults to None.
-            verbose (bool): Print processing steps. Defaults to False.
+                If a DataFrame is passed, it is used to generate forecasts.
+            static_df (pandas, polars or spark DataFrame, optional): DataFrame with columns [`unique_id`] and static exogenous.
+            futr_df (pandas, polars or spark DataFrame, optional): DataFrame with [`unique_id`, `ds`] columns and `df`'s future exogenous.
+            verbose (bool): Print processing steps.
             engine (spark session): Distributed engine for inference. Only used if df is a spark dataframe or if fit was called on a spark dataframe.
-            level (list of ints or floats, optional): Confidence levels between 0 and 100. Defaults to None.
-            quantiles (list of floats, optional): Alternative to level, target quantiles to predict. Defaults to None.
-            h (int, optional): Forecasting horizon. If None, uses the horizon of the fitted models. Defaults to None.
+            level (list of ints or floats, optional): Confidence levels between 0 and 100.
+            quantiles (list of floats, optional): Alternative to level, target quantiles to predict.
+            h (int, optional): Forecasting horizon. If None, uses the horizon of the fitted models.
             data_kwargs (kwargs): Extra arguments to be passed to the dataset within each model.
 
         Returns:
@@ -1031,14 +1031,14 @@ def explain(
             outputs (list of int, optional): List of outputs to explain for models with multiple outputs. Defaults to [0] (first output).
             explainer (str): Name of the explainer to use. Options are 'IntegratedGradients', 'ShapleyValueSampling', 'InputXGradient'. Defaults to 'IntegratedGradients'.
             df (pandas, polars or spark DataFrame, optional): DataFrame with columns [`unique_id`, `ds`, `y`] and exogenous variables.
-            If a DataFrame is passed, it is used to generate forecasts. Defaults to None.
-            static_df (pandas, polars or spark DataFrame, optional): DataFrame with columns [`unique_id`] and static exogenous. Defaults to None.
-            futr_df (pandas, polars or spark DataFrame, optional): DataFrame with [`unique_id`, `ds`] columns and `df`'s future exogenous. Defaults to None.
+            If a DataFrame is passed, it is used to generate forecasts.
+            static_df (pandas, polars or spark DataFrame, optional): DataFrame with columns [`unique_id`] and static exogenous.
+            futr_df (pandas, polars or spark DataFrame, optional): DataFrame with [`unique_id`, `ds`] columns and `df`'s future exogenous.
             h (int): The forecast horizon. Can be larger than the horizon set during training.
-            verbose (bool): Print processing steps. Defaults to False.
+            verbose (bool): Print processing steps.
             engine (spark session): Distributed engine for inference. Only used if df is a spark dataframe or if fit was called on a spark dataframe.
-            level (list of ints or floats, optional): Confidence levels between 0 and 100. Defaults to None.
-            quantiles (list of floats, optional): Alternative to level, target quantiles to predict. Defaults to None.
+            level (list of ints or floats, optional): Confidence levels between 0 and 100.
+            quantiles (list of floats, optional): Alternative to level, target quantiles to predict.
             data_kwargs (kwargs): Extra arguments to be passed to the dataset within each model.
 
         Returns:
@@ -1360,24 +1360,24 @@ def cross_validation(
 
         Args:
             df (pandas or polars DataFrame, optional): DataFrame with columns [`unique_id`, `ds`, `y`] and exogenous variables.
-                If None, a previously stored dataset is required. Defaults to None.
+                If None, a previously stored dataset is required.
             static_df (pandas or polars DataFrame, optional): DataFrame with columns [`unique_id`] and static exogenous. Defaults to None.
-            n_windows (int): Number of windows used for cross validation. Defaults to 1.
-            step_size (int): Step size between each window. Defaults to 1.
+            n_windows (int): Number of windows used for cross validation.
+            step_size (int): Step size between each window.
             val_size (int, optional): Length of validation size. If passed, set `n_windows=None`. Defaults to 0.
-            test_size (int, optional): Length of test size. If passed, set `n_windows=None`. Defaults to None.
-            use_init_models (bool, optional): Use initial model passed when object was instantiated. Defaults to False.
-            verbose (bool): Print processing steps. Defaults to False.
+            test_size (int, optional): Length of test size. If passed, set `n_windows=None`.
+            use_init_models (bool, optional): Use initial model passed when object was instantiated.
+            verbose (bool): Print processing steps.
             refit (bool or int): Retrain model for each cross validation window.
                 If False, the models are trained at the beginning and then used to predict each window.
-                If positive int, the models are retrained every `refit` windows. Defaults to False.
-            id_col (str): Column that identifies each serie. Defaults to 'unique_id'.
+                If positive int, the models are retrained every `refit` windows.
+            id_col (str): Column that identifies each serie.
             time_col (str): Column that identifies each timestep, its values can be timestamps or integers. Defaults to 'ds'.
-            target_col (str): Column that contains the target. Defaults to 'y'.
+            target_col (str): Column that contains the target.
             prediction_intervals (PredictionIntervals, optional): Configuration to calibrate prediction intervals (Conformal Prediction). Defaults to None.
-            level (list of ints or floats, optional): Confidence levels between 0 and 100. Defaults to None.
-            quantiles (list of floats, optional): Alternative to level, target quantiles to predict. Defaults to None.
-            h (int, optional): Forecasting horizon. If None, uses the horizon of the fitted models. Defaults to None.
+            level (list of ints or floats, optional): Confidence levels between 0 and 100.
+            quantiles (list of floats, optional): Alternative to level, target quantiles to predict.
+            h (int, optional): Forecasting horizon. If None, uses the horizon of the fitted models.
             data_kwargs (kwargs): Extra arguments to be passed to the dataset within each model.
 
         Returns:
@@ -1533,9 +1533,9 @@ def predict_insample(
         to predict historic values of a time series from the stored dataframe.
 
         Args:
-            step_size (int): Step size between each window. Defaults to 1.
-            level (list of ints or floats, optional): Confidence levels between 0 and 100. Defaults to None.
-            quantiles (list of floats, optional): Alternative to level, target quantiles to predict. Defaults to None.
+            step_size (int): Step size between each window.
+            level (list of ints or floats, optional): Confidence levels between 0 and 100.
+            quantiles (list of floats, optional): Alternative to level, target quantiles to predict.
 
         Returns:
             fcsts_df (pandas.DataFrame): DataFrame with insample predictions for all fitted `models`.
@@ -1705,9 +1705,9 @@ def save(
 
         Args:
             path (str): Directory to save current status.
-            model_index (list, optional): List to specify which models from list of self.models to save. Defaults to None.
-            save_dataset (bool): Whether to save dataset or not. Defaults to True.
-            overwrite (bool): Whether to overwrite files or not. Defaults to False.
+            model_index (list, optional): List to specify which models from list of self.models to save.
+            save_dataset (bool): Whether to save dataset or not.
+            overwrite (bool): Whether to overwrite files or not.
         """
         # Standarize path without '/'
         if path[-1] == "/":