See leovan/quarto-pseudocode#11

fradav · fradav · commit c5076382f0ef · 2026-01-20T03:28:34.000+01:00
diff --git a/published-paper-tsne.qmd b/published-paper-tsne.qmd
@@ -148,7 +148,7 @@ Second, although t-SNE introduces strong repulsions between dissimilar datapoint
 Taken together, t-SNE puts emphasis on (1) modeling dissimilar datapoints by means of large pairwise distances, and (2) modeling similar datapoints by means of small pairwise distances. Moreover, as a result of these characteristics of the t-SNE cost function (and as a result of the approximate scale invariance of the Student t-distribution), the optimization of the t-SNE cost function is much easier than the optimization of the cost functions of SNE and UNI-SNE. Specifically, t-SNE introduces long-range forces in the low-dimensional map that can pull back together two (clusters of) similar points that get separated early on in the optimization. SNE and UNI-SNE do not have such long-range forces, as a result of which SNE and UNI-SNE need to use simulated annealing to obtain reasonable solutions. Instead, the long-range forces in t-SNE facilitate the identification of good local optima without resorting to simulated annealing
 
 ```pseudocode
-#| label: alg-tsne
+#| label: algo-tsne
 #| html-indent-size: "1.2em"
 #| html-comment-delimiter: "//"
 #| html-line-number: true
@@ -180,7 +180,7 @@ Taken together, t-SNE puts emphasis on (1) modeling dissimilar datapoints by mea
 
 ## Optimization methods for t-SNE {#sec-optimization_methods_for_tsne}
 
-We start by presenting a relatively simple, gradient descent procedure for optimizing the t-SNE cost function. This simple procedure uses a momentum term to reduce the number of iterations required and it works best if the momentum term is small until the map points have become moderately well organized. Pseudocode for this simple algorithm is presented in @alg-tsne. The simple algorithm can be sped up using the adaptive learning rate scheme that is described by @jacobs:rates, which gradually increases the learning rate in directions in which the gradient is stable.
+We start by presenting a relatively simple, gradient descent procedure for optimizing the t-SNE cost function. This simple procedure uses a momentum term to reduce the number of iterations required and it works best if the momentum term is small until the map points have become moderately well organized. Pseudocode for this simple algorithm is presented in @algo-tsne. The simple algorithm can be sped up using the adaptive learning rate scheme that is described by @jacobs:rates, which gradually increases the learning rate in directions in which the gradient is stable.
 
 Although the simple algorithm produces visualizations that are often much better than those produced by other non-parametric dimensionality reduction techniques, the results can be improved further by using either of two tricks. The first trick, which we call "early compression," is to force the map points to stay close together at the start of the optimization. When the distances between map points are small, it is easy for clusters to move through one another so it is much easier to explore the space of possible global organizations of the data. Early compression is implemented by adding an additional L2-penalty to the cost function that is proportional to the sum of squared distances of the map points from the origin. The magnitude of this penalty term and the iteration at which it is removed are set by hand, but the behavior is fairly robust across variations in these two additional optimization parameters.
 
diff --git a/scripts/plotutils.py b/scripts/plotutils.py
@@ -6,7 +6,7 @@
 import os
 
 pio.templates.default = "plotly_white"
-if os.environ["QUARTO_FIG_FORMAT"] == "pdf":
+if os.environ.get("QUARTO_FIG_FORMAT") == "pdf":
     pio.renderers.default = "pdf"
 
 def plot2d(X, y, manifold_method):
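
The `scripts/plotutils.py` hunk above swaps a subscript lookup on `os.environ` for `os.environ.get`. A minimal sketch of the difference follows, assuming the motivation is that `QUARTO_FIG_FORMAT` may be unset when the script runs outside a Quarto render (the commit does not say so). The helper names `pick_renderer` and `pick_renderer_strict` are hypothetical and not part of the repository; they take the environment as a plain mapping so the behaviour can be checked without touching the real process environment.

```python
import os


def pick_renderer_strict(environ):
    """Old behaviour: subscript access raises KeyError if the variable is unset."""
    if environ["QUARTO_FIG_FORMAT"] == "pdf":
        return "pdf"
    return None


def pick_renderer(environ=os.environ):
    """New behaviour: .get() returns None for a missing key, so the comparison is simply False."""
    if environ.get("QUARTO_FIG_FORMAT") == "pdf":
        return "pdf"
    return None


if __name__ == "__main__":
    # Variable unset: the strict lookup fails, the .get() lookup falls through.
    try:
        pick_renderer_strict({})
    except KeyError as exc:
        print("strict lookup raised KeyError:", exc)
    print("lenient lookup returned:", pick_renderer({}))

    # Variable set to "pdf": both variants agree.
    print(pick_renderer({"QUARTO_FIG_FORMAT": "pdf"}))
```

With the variable set, both variants behave identically; the only observable change is that a missing variable no longer aborts the import of `plotutils`.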