Docstring Best Practice

Weiyuan Wu edited this page Apr 15, 2020 · 5 revisions

Dataprep uses a few Sphinx packages to speed up docstring writing, which brings in some additional best practices. They are all listed here; please give them a read.

  • Automatic parameter type inference.

    Dataprep strongly enforces typing for all functions, classes and variables. The NumPy docstring convention says to write a parameter's type after a colon (:). Here we don't, as long as the type is annotated correctly in the function signature. Take dataprep.eda.basic.plot as an example, whose signature is fully typed:

      def plot(
          df: Union[pd.DataFrame, dd.DataFrame],
          x: Optional[str] = None,
          y: Optional[str] = None,
          *,
          bins: int = 10,
          ngroups: int = 10,
          largest: bool = True,
          nsubgroups: int = 5,
          bandwidth: float = 1.5,
          sample_size: int = 1000,
          value_range: Optional[Tuple[float, float]] = None,
          yscale: str = "linear",
          tile_size: Optional[float] = None,
      ) -> Figure:
          ...

    1. No Type for Function Parameters

      In the docstring, you don't need to write the type of a parameter:

      Parameters
      ----------
      df
        Dataframe from which plots are to be generated
      

      The type of df is already known from the signature, so the documentation is still generated correctly:

      [Image: generated parameter df]

    2. Give the Type for Default Values

      Alternatively, you can still write the parameter type to override the auto-generated one. A good use case is documenting default values:

      Parameters
      ----------
      x: Optional[str], default None
          A valid column name from the dataframe.
      

      In the generated documentation, the parameter type changes from bold to italic. This is the sign of an overridden parameter type.

    3. No Returns Unless for Comments

      The function's return type can also be inferred from the signature. This means there is no need for a Returns section like this:

      Returns
      -------
      Figure
        An object of figure
      

      The exception is when you want to write a meaningful comment about the return value:

      Returns
      -------
      Figure
        A meaningful message!!!
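
The three rules above can be put together in one docstring. The following is a minimal sketch, assuming a hypothetical helper named bin_edges that is not part of Dataprep: low and high carry no type in the docstring, bins overrides its type in order to document the default, and the Returns section exists only because it says something the signature cannot. The last two lines merely illustrate that the omitted information is still available from the signature at runtime; they are not a description of how the Sphinx packages work internally.

```python
import inspect
from typing import List, Optional, get_type_hints


def bin_edges(
    low: float,
    high: float,
    *,
    bins: Optional[int] = None,
) -> List[float]:
    """Compute evenly spaced bin edges (hypothetical helper, not part of Dataprep).

    Parameters
    ----------
    low
        Lower bound of the range.
    high
        Upper bound of the range.
    bins: Optional[int], default None
        Number of bins. ``None`` falls back to 10.

    Returns
    -------
    List[float]
        ``bins + 1`` edges, starting at ``low`` and ending at ``high``.
    """
    count = 10 if bins is None else bins
    step = (high - low) / count
    # Append ``high`` explicitly so the last edge is exact.
    return [low + i * step for i in range(count)] + [high]


# The types and defaults left out of the docstring live in the signature.
hints = get_type_hints(bin_edges)
default_bins = inspect.signature(bin_edges).parameters["bins"].default
```

Writing the type for bins by hand is a judgment call: it duplicates the annotation, but it is the only place in the docstring where the default value can be stated next to the type.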
      
  • Make class members private with a leading underscore (_).

    Remember that all members without a leading underscore will be shown in the documentation!
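
As a rough illustration of this naming rule, the sketch below uses a hypothetical Histogram class (not part of Dataprep). The final filter only mimics the rule stated above by selecting names without a leading underscore; it is an assumption for demonstration and not the documentation tool's actual implementation.

```python
from typing import List


class Histogram:
    """Count items per bin (hypothetical example class)."""

    def __init__(self, bins: int = 10) -> None:
        self.bins = bins
        # Internal state: the leading underscore keeps it out of the docs.
        self._counts: List[int] = [0] * bins

    def add(self, index: int) -> None:
        """Increment the count of one bin."""
        self._bump(index)

    def total(self) -> int:
        """Return the number of items added so far."""
        return sum(self._counts)

    def _bump(self, index: int) -> None:
        # Helper that callers should not rely on, hence private.
        self._counts[index] += 1


# Names defined on the class that do not start with an underscore.
public_names = sorted(name for name in vars(Histogram) if not name.startswith("_"))
```

Here public_names contains only add and total, which are the members a reader of the documentation would be expected to use.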
