[Docs] PR #84: Improve Tutorials (#82)

drQuesadaUPM · Pabloo22 · gemini-code-assist[bot] · web-flow · commit 8c8b75d579cc · 2025-11-23T16:05:29.000Z
* Improvement: update the initial installation tutorial

* [Docs] Fix typo in tutorial 01

Suggestion by gemini

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

* Improvement: update tutorial

* [Chore] Ignore `.DS_Store` files

Remove unnecessary .DS_Store files from the repository

---------

Co-authored-by: Pablo Ariño <72697714+Pabloo22@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Pabloo22 <pablete.arino@gmail.com>
diff --git a/.gitignore b/.gitignore
@@ -1,3 +1,6 @@
+# macOS file system metadata
+.DS_Store  
+
 # Development files
 .vscode/
 tests/notebooks
diff --git a/docs/source/tutorial/00-Getting-Started.ipynb b/docs/source/tutorial/00-Getting-Started.ipynb
@@ -6,11 +6,13 @@
    "source": [
     "# Getting Started with Job Shop Lib\n",
     "\n",
+    "Recall that the Job Shop Scheduling Problem consists **of determining the optimal sequence of operations for a set of jobs to be processed on a set of machines**, where each job is composed of a series of operations that must be performed in a specific order, and each operation requires a specific machine for a given processing time. The objective is typically to minimize a performance criterion such as the total completion time (makespan) while respecting constraints such as machine capacity (only one job can be processed on a machine at a time) and job precedence relationships.\n",
+    "\n",
     "The main class of the library is the `JobShopInstance` class, which stores a list of jobs and its operations.\n",
     "\n",
     "Each operation is also a class, which stores the machine(s) in which the operation can be processed and its duration (also known as processing time). Let's see an example of how to use the `JobShopInstance` class to model a JSSP instance.\n",
     "\n",
-    "In this example, we model a simple Job Shop Scheduling Problem using the `JobShopInstance` class. We define three types of machines: CPU, GPU, and Data Center, each represented by a unique identifier."
+    "In this example, we model a simple Job Shop Scheduling Problem using the `JobShopInstance` class. We define three types of machines: CPU, GPU, and Data Center, each represented by a unique identifier. This only defines the problem; later on, we will find solutions and, finally, try to optimize them."
    ]
   },
   {
diff --git a/docs/source/tutorial/01-How-Solutions-are-Represented.ipynb b/docs/source/tutorial/01-How-Solutions-are-Represented.ipynb
@@ -14,7 +14,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "First, the operations from the previous example are organized by their respective machines."
+    "Recall the previous example:"
    ]
   },
   {
@@ -44,6 +44,13 @@
     ")"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "First, the operations are organized by their respective machines."
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 2,
@@ -135,6 +142,13 @@
     "print(\"Makespan:\", schedule.makespan())"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Now that the Job Shop Scheduling Problem has a solution, it can be represented as follows:"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 6,
@@ -157,6 +171,13 @@
     "_ = plot_gantt_chart(schedule, job_labels=[\"Job 1\", \"Job 2\", \"Job 3\"])"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Even without an algorithm, a simple visual inspection shows that this solution is easy to improve. We can manually rearrange the job sequences to obtain a better solution."
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 7,
diff --git a/docs/source/tutorial/02-Solving-the-Problem.ipynb b/docs/source/tutorial/02-Solving-the-Problem.ipynb
@@ -7,7 +7,7 @@
     "## Solving the Problem\n",
     "As you can see, manually creating solutions is a tedious task and requires taking into account each constraint carefully. This is the reason the `Dispatcher` class was created. This class allow us to just define the order in which operations are sequenced and the machines in which they are processed. The `Dispatcher` class will take care of the rest.\n",
     "\n",
-    "Let's see an example of how to use the `Dispatcher` class to solve the previous instance. In this case, a reasonable solution is to process the operations in the order they are defined in the instance. We can do this as follows:"
+    "Let's see an example of how to use the `Dispatcher` class to solve the previous instance. In this case, a reasonable solution is to process, for each job, the operations in the order they are defined in the instance. We can do this as follows:"
    ]
   },
   {
@@ -47,6 +47,13 @@
     "    dispatcher.dispatch(job_3[i], job_3[i].machine_id)"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The dispatcher took care of any possible overlaps, providing the expected solution, as we can see below:"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 3,
@@ -76,8 +83,9 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
+    "Even though the dispatcher creates solutions easily, they are built manually or procedurally and will therefore usually be far from optimal. This motivates introducing solvers, which search for solutions according to an optimality criterion.\n",
     "\n",
-    "A solver is any `Callable` object that takes as input a `JobShopInstance` class and returns a `Schedule` with a complete solution of the instance.\n",
+    "A **solver** is any `Callable` object that takes a `JobShopInstance` object as input and returns a `Schedule` with a complete solution of the instance.\n",
     "\n",
     "In this example, we are going to use the `CPSolver` class, contained inside `job_shop_lib.solvers` package, which uses [CP-SAT solver from Google OR-Tools](https://developers.google.com/optimization/cp/cp_solver)."
    ]
@@ -118,7 +126,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "This class returns a `Schedule` object with a complete solution of the instance. It also set some metadata of the solution, such as the time it took to solve the instance and the status of the solution:"
+    "This class returns a `Schedule` object with a complete solution of the instance. It also includes some metadata about the solution, such as the time it took to solve the instance and the status of the solution:"
    ]
   },
   {
@@ -146,7 +154,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Finally, we can plot the gantt chart of the solution using the `plot_gantt_chart` method."
+    "We can now plot the Gantt chart of the solution provided by the solver, which is not only valid and complete, but also clearly better in terms of makespan."
    ]
   },
   {
diff --git a/docs/source/tutorial/03-Generating-New-Problems.ipynb b/docs/source/tutorial/03-Generating-New-Problems.ipynb
@@ -7,7 +7,9 @@
    "source": [
     "# Tutorial 03 - Generating New Problems (Random Instance Generation)\n",
     "\n",
-    "This notebook shows how to generate random **JobShopInstance** objects using the function `modular_instance_generator`.\n",
+    "This notebook shows how to generate random **JobShopInstance** objects using the function `modular_instance_generator`. It’s important to note that this process is not about solving a given problem, but rather about creating entirely new problems. Each JobShopInstance represents a unique problem configuration with its own jobs, machines, and processing times.\n",
+    "\n",
+    "Generating problem instances is a crucial step in machine learning workflows, as it provides the data required for training and evaluating algorithms under a variety of conditions.\n",
     "\n",
     "> **Deprecated:** The classes `InstanceGenerator` and `GeneralInstanceGenerator` are deprecated. Use `modular_instance_generator` instead."
    ]
diff --git a/docs/source/tutorial/04-Simulated-Annealing.ipynb b/docs/source/tutorial/04-Simulated-Annealing.ipynb
@@ -16,7 +16,7 @@
     "3.  **Iteratively explore neighbors:** The algorithm then iteratively explores \"neighboring\" solutions. A neighbor is a new schedule created by making a small change to the current one, for example, by swapping the order of two operations on a single machine.\n",
     "4.  **Accept or reject new solutions:**\n",
     "      * If the neighbor solution has a lower energy (is better), it is always accepted as the new current solution.\n",
-    "      * If the neighbor solution has a higher energy (is worse), it might still be accepted with a certain probability. This probability is higher at the beginning (when the \"temperature\" is high) and decreases over time. This allows the algorithm to escape local optima and explore a wider range of solutions.\n",
+    "      * If the neighbor solution has a higher energy (is worse), it might still be accepted with a certain probability. This probability is higher at the beginning (when a control parameter called \"temperature\" is high) and decreases over time. This allows the algorithm to escape local optima and explore a wider range of solutions.\n",
     "5.  **Cooling down:** The \"temperature\" gradually decreases, reducing the probability of accepting worse solutions. The process stops when the system has \"cooled down\" (the temperature is low) or after a certain number of iterations.\n",
     "\n",
     "## Core Components\n",
@@ -28,7 +28,7 @@
     "  * **Neighbor Generators:** These are functions that define how to create a \"neighbor\" schedule from a current one.\n",
     "  * **Objective Functions:** The `energy` function calculates the objective value of a schedule, which is typically the makespan plus any penalties for constraint violations.\n",
     "\n",
-    "## The `SimulatedAnnealingSolver`\n",
+    "### The `SimulatedAnnealingSolver`\n",
     "\n",
     "This is the main entry point for using the solver. When you create an instance of this class, you can configure various parameters of the annealing process:\n",
     "\n",
@@ -38,7 +38,7 @@
     "  * `neighbor_generator`: The function used to generate neighboring solutions. The default is `swap_in_critical_path`.\n",
     "  * `seed`: A random seed for reproducibility.\n",
     "\n",
-    "## Neighbor Generation Strategies\n",
+    "### Neighbor Generation Strategies\n",
     "\n",
     "A key part of the simulated annealing process is how you explore the solution space by moving from one solution to a \"neighbor\". This implementation provides three different neighbor generation strategies in `_neighbor_generators.py`:\n",
     "\n",
@@ -48,7 +48,7 @@
     "\n",
     "3.  **`swap_in_critical_path` (Default):** This is the most sophisticated of the three. It identifies the critical path of the current schedule (the sequence of operations that determines the makespan) and looks for consecutive operations on that path that are on the same machine. It then swaps one of these pairs. The idea is that modifying the critical path is the most direct way to try to reduce the makespan. If no such pair exists, it falls back to a standard adjacent swap.\n",
     "\n",
-    "## The Objective Function\n",
+    "### The Objective Function\n",
     "\n",
     "The objective function is what the simulated annealing algorithm tries to minimize. In the context of job shop scheduling, this is typically the makespan (the total time to complete all jobs) plus any penalties for violating constraints (like deadlines).\n",
     "\n",
@@ -58,7 +58,7 @@
     "\n",
     "### Basic Usage\n",
     "\n",
-    "This example shows how to solve a benchmark instance (\"ft06\") with a specific seed to get a reproducible result."
+    "This example shows how to solve a concrete JSSP instance, the benchmark \"ft06\", with a fixed seed to get a reproducible result."
    ]
   },
   {
@@ -109,7 +109,7 @@
    "source": [
     "### Using a Different Neighbor Generator\n",
     "\n",
-    "Although it's not recommended, you can easily plug in a different neighbor generation strategy by passing it to the `SimulatedAnnealingSolver`'s constructor. Here's how to use `swap_adjacent_operations`:\n"
+    "You can specify a different neighbor generator by passing it to the `SimulatedAnnealingSolver` constructor. Here, we use `swap_adjacent_operations` as an example, although its usage is not recommended: it produces very local changes and often generates infeasible neighbors that require repeated retries, which slows the search and reduces effectiveness. However, for the sake of experimentation:\n"
    ]
   },
   {
@@ -313,7 +313,6 @@
     "# Durations between 2 and 15 to have some variability\n",
     "duration_creator = get_default_duration_matrix_creator((2, 15))\n",
     "\n",
-    "\n",
     "def deadlines_creator(duration_matrix, rng):\n",
     "    deadlines: list[list[int]] = []\n",
     "    for job_row in duration_matrix:\n",
@@ -326,7 +325,6 @@
     "        deadlines.append(row)\n",
     "    return deadlines\n",
     "\n",
-    "\n",
     "instance_gen = modular_instance_generator(\n",
     "    machine_matrix_creator=machine_creator,\n",
     "    duration_matrix_creator=duration_creator,\n",
@@ -350,8 +348,6 @@
     "baseline_schedule = baseline_solver.solve(instance)\n",
     "\n",
     "# Helper: count deadline violations\n",
-    "\n",
-    "\n",
     "def count_deadline_violations(schedule):\n",
     "    violations = 0\n",
     "    for machine_sched in schedule.schedule:\n",
@@ -438,17 +434,11 @@
    "source": [
     "Note that, in this case, we needed to use a higher initial temperature to effectively explore the solution space and reduce deadline violations. Otherwise, because of the high penalty, very few solutions would be accepted, hindering the search process. In general, the more violations we expect, the higher the initial temperature should be to allow the algorithm to explore a wider range of solutions."
    ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "51f0604d",
-   "metadata": {},
-   "source": []
   }
  ],
  "metadata": {
   "kernelspec": {
-   "display_name": "job-shop-lib-gOF0HMZJ-py3.12",
+   "display_name": "Python 3",
    "language": "python",
    "name": "python3"
   },
@@ -462,7 +452,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.12.3"
+   "version": "3.12.11"
   }
  },
  "nbformat": 4,

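Note on tutorial 04: the accept/reject step described in the notebook can be illustrated with the classic Metropolis criterion. The sketch below is a generic, standalone illustration and is not taken from `SimulatedAnnealingSolver`; the names `accept_neighbor` and `acceptance_rate` and their parameters are hypothetical, and the library's actual rule may differ in detail.

```python
import math
import random


def accept_neighbor(
    current_energy: float,
    neighbor_energy: float,
    temperature: float,
    rng: random.Random,
) -> bool:
    """Generic Metropolis acceptance rule (hypothetical helper)."""
    delta = neighbor_energy - current_energy
    if delta <= 0:
        # Better (or equal) neighbors are always accepted.
        return True
    if temperature <= 0:
        # A fully "cooled" system never accepts worse neighbors.
        return False
    # Worse neighbors are accepted with probability exp(-delta / T),
    # which shrinks as the temperature decreases.
    return rng.random() < math.exp(-delta / temperature)


def acceptance_rate(
    delta: float, temperature: float, trials: int = 10_000, seed: int = 0
) -> float:
    """Empirical fraction of worse neighbors accepted at a given temperature."""
    rng = random.Random(seed)
    accepted = sum(
        accept_neighbor(0.0, delta, temperature, rng) for _ in range(trials)
    )
    return accepted / trials


if __name__ == "__main__":
    # The same energy increase is accepted far more often when "hot".
    print("hot :", acceptance_rate(delta=5.0, temperature=100.0))
    print("cold:", acceptance_rate(delta=5.0, temperature=1.0))
```

Under this rule, a large energy increase (for example, one caused by a heavy deadline penalty) is almost never accepted at a low temperature, which is consistent with the notebook's advice to raise the initial temperature when many violations are expected.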
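Note on tutorial 02: the idea that a dispatcher only needs the order of operations, and derives start times from the constraints, can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the `Dispatcher` class from the library: the function `dispatch_in_order`, the `(machine_id, duration)` tuple encoding, and the two-job example data are all hypothetical.

```python
def dispatch_in_order(jobs, order):
    """Greedy dispatching sketch (hypothetical helper, not the library API).

    jobs:  list of jobs; each job is a list of (machine_id, duration) tuples
           that must run in the listed order.
    order: sequence of job indices; each occurrence dispatches the next
           pending operation of that job.
    Returns (schedule, makespan), where schedule holds tuples
    (job_id, op_index, machine_id, start, end).
    """
    machine_ready = {}               # time at which each machine becomes free
    job_ready = [0] * len(jobs)      # time at which each job's previous op ends
    next_op = [0] * len(jobs)        # index of the next operation per job
    schedule = []

    for job_id in order:
        op_index = next_op[job_id]
        machine_id, duration = jobs[job_id][op_index]
        # An operation starts once both its machine and its job are free.
        start = max(job_ready[job_id], machine_ready.get(machine_id, 0))
        end = start + duration
        schedule.append((job_id, op_index, machine_id, start, end))
        job_ready[job_id] = end
        machine_ready[machine_id] = end
        next_op[job_id] += 1

    makespan = max((entry[4] for entry in schedule), default=0)
    return schedule, makespan


# Hypothetical two-job, two-machine instance.
jobs = [
    [(0, 3), (1, 2)],  # job 0: machine 0 for 3, then machine 1 for 2
    [(1, 4), (0, 1)],  # job 1: machine 1 for 4, then machine 0 for 1
]

_, job_by_job = dispatch_in_order(jobs, order=[0, 0, 1, 1])
_, interleaved = dispatch_in_order(jobs, order=[0, 1, 0, 1])
print(job_by_job, interleaved)  # 10 6
```

Both orders yield valid schedules, but the makespan depends on the dispatching order, which is why hand-picked orders are usually far from optimal and a solver is worth introducing.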