We have seen the Generative Adversarial Nets (GAN) model in the previous post. We have also seen the arch nemesis of GAN, the VAE, and its conditional variation: Conditional VAE (CVAE). Hence, it is only proper for us to study the conditional variation of GAN, called Conditional GAN or CGAN for short.
## CGAN: Formulation and Architecture
Recall, in GAN, we have two neural nets: the generator $G(z)$ and the discriminator $D(X)$. Now, as we want to condition those networks on some vector $y$, the easiest way to do it is to feed $y$ into both networks. Hence, our generator and discriminator are now $G(z, y)$ and $D(X, y)$ respectively.
We can see this from a probabilistic point of view. $G(z, y)$ models the distribution of our data given $z$ and $y$; that is, our data is generated with the scheme $X \sim G(X \, \vert \, z, y)$.
Likewise, the discriminator now tries to find the discriminating label for $X$ and $X_G$, modeled as $d \sim D(d \, \vert \, X, y)$.
Hence, we can see that both $D$ and $G$ are jointly conditioned on two variables: $z$ or $X$, and $y$.

In contrast with the original GAN architecture, we now have an additional input layer in both the discriminator net and the generator net.
## CGAN: Implementation in TensorFlow
Implementing CGAN is so simple that we just need to add a handful of lines to the original GAN implementation. So, here we will only look at those modifications.
The first additional code for CGAN is here:
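A minimal sketch of that addition, assuming a TensorFlow placeholder for a one-hot label of width `y_dim` (matching the weight shapes modified below):

```python
# New input: the conditioning variable y, e.g. a one-hot MNIST label of width y_dim
y = tf.placeholder(tf.float32, shape=[None, y_dim])
```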
We are adding a new input to hold the variable we are conditioning our CGAN on.
Next, we add it to both our generator net and discriminator net:
```python
def generator(z, y):
    # Concatenate z and y
    inputs = tf.concat(concat_dim=1, values=[z, y])

    G_h1 = tf.nn.relu(tf.matmul(inputs, G_W1) + G_b1)
    G_log_prob = tf.matmul(G_h1, G_W2) + G_b2
    G_prob = tf.nn.sigmoid(G_log_prob)

    return G_prob


def discriminator(x, y):
    # Concatenate x and y
    inputs = tf.concat(concat_dim=1, values=[x, y])

    D_h1 = tf.nn.relu(tf.matmul(inputs, D_W1) + D_b1)
    D_logit = tf.matmul(D_h1, D_W2) + D_b2
    D_prob = tf.nn.sigmoid(D_logit)

    return D_prob, D_logit
```
The problem we have here is how to incorporate the new variable $y$ into $D(X)$ and $G(z)$. As we are trying to model the joint conditional, the simplest way to do it is to just concatenate both variables. Hence, in $G(z, y)$, we concatenate $z$ and $y$ before feeding them into the network. The same procedure is applied to $D(X, y)$.
Of course, as our inputs for $D(X, y)$ and $G(z, y)$ are now different from those of the original GAN, we need to modify our weights:
```python
# Modify input to hidden weights for discriminator
# (xavier_init is assumed to be the weight-initialization helper from the original GAN code)
D_W1 = tf.Variable(xavier_init([X_dim + y_dim, h_dim]))

# Modify input to hidden weights for generator
G_W1 = tf.Variable(xavier_init([Z_dim + y_dim, h_dim]))
```
That is, we just adjust the dimensionality of our weights.
Next, we just use our new networks:
```python
# Add additional parameter y into all networks
G_sample = generator(Z, y)
D_real, D_logit_real = discriminator(X, y)
D_fake, D_logit_fake = discriminator(G_sample, y)
```
And finally, when training, we also feed the value of $y$ into the networks:
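The training loop is only sketched here; it reuses names from the original GAN implementation (`mnist`, `sample_Z`, `D_solver`, `G_solver`, `sess`), so treat the exact identifiers as assumptions:

```python
X_mb, y_mb = mnist.train.next_batch(mb_size)  # y_mb holds the one-hot labels
Z_sample = sample_Z(mb_size, Z_dim)

# The only change from vanilla GAN: y is added to both feed_dicts
_, D_loss_curr = sess.run([D_solver, D_loss],
                          feed_dict={X: X_mb, Z: Z_sample, y: y_mb})
_, G_loss_curr = sess.run([G_solver, G_loss],
                          feed_dict={Z: Z_sample, y: y_mb})
```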
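Likewise, generating new samples only needs the extra conditional input. A sketch with illustrative names (the one-hot index matches the description below):

```python
import numpy as np

n_sample = 16  # illustrative number of images to generate
Z_sample = sample_Z(n_sample, Z_dim)

# Conditional variables: one-hot vectors with a 1 in the 5th index
y_sample = np.zeros(shape=[n_sample, y_dim])
y_sample[:, 5] = 1.

samples = sess.run(G_sample, feed_dict={Z: Z_sample, y: y_sample})
```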
Above, we just sample $z$, and then construct the conditional variables. In our example case, the conditional variables are a collection of one-hot vectors with the value 1 in the 5th index. The last thing we need to do is to run the network with those variables as inputs.
`src/content/post/conditional-vae.mdx` (publishDate: 2016-12-17 11:04, tags: [programming, python, neuralnet])
Conditional Variational Autoencoder (CVAE) is an extension of Variational Autoencoder (VAE), a generative model that we have studied in the last post. We've seen that by formulating the problem of data generation as a Bayesian model, we could optimize its variational lower bound to learn the model.
However, we have no control over the data generation process in VAE. This could be problematic if we want to generate some specific data. As an example, suppose we want to convert a unicode character to handwriting. In vanilla VAE, there is no way to generate the handwriting based on the character that the user inputted. Concretely, suppose the user inputs the character '2': how do we generate a handwriting image of the character '2'? We couldn't.
Recall, in VAE, the objective is:
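Restated here in its standard variational lower bound form:

$$
\log P(X) - D_{KL}\left[ Q(z \vert X) \,\|\, P(z \vert X) \right] = E_{z \sim Q(z \vert X)}\left[ \log P(X \vert z) \right] - D_{KL}\left[ Q(z \vert X) \,\|\, P(z) \right]
$$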
that is, we want to optimize the log likelihood of our data $P(X)$ under some "encoding" error. The original VAE model has two parts: the encoder $Q(z \vert X)$ and the decoder $P(X \vert z)$.
Looking closely at the model, we can see why VAE can't generate specific data, as per our example above. It's because the encoder models the latent variable $z$ directly based on $X$; it doesn't care about the different types of $X$. For example, it doesn't take into account the label of $X$.
Similarly, in the decoder part, it only models $X$ directly based on the latent variable $z$.
We could improve VAE by conditioning the encoder and decoder on another variable (or variables). Let's say that other variable is $c$, so the encoder is now conditioned on two variables $X$ and $c$: $Q(z \vert X, c)$. The same goes for the decoder: it's now conditioned on two variables $z$ and $c$: $P(X \vert z, c)$.
Hence, our variational lower bound objective now takes the following form:
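Conditioning every distribution on $c$, the standard form of that bound is:

$$
\log P(X \vert c) - D_{KL}\left[ Q(z \vert X, c) \,\|\, P(z \vert X, c) \right] = E_{z \sim Q(z \vert X, c)}\left[ \log P(X \vert z, c) \right] - D_{KL}\left[ Q(z \vert X, c) \,\|\, P(z \vert c) \right]
$$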
i.e., we just condition all of the distributions on a variable $c$.
Now, the real latent variable is distributed under $P(z \vert c )$. That is, it's now a conditional probability distribution (CPD). Think about it like this: for each possible value of $c$, we would have a $P(z)$. We could also use this form of thinking for the decoder.
## CVAE: Implementation
The conditional variable $c$ could be anything. We could assume it comes from a categorical distribution expressing the label of our data, a Gaussian expressing some regression target, or even the same distribution as the data (e.g. for image inpainting: conditioning the model on an incomplete image).
Let's use MNIST for example. We could use the label as our conditional variable $c$. In this case, $c$ is categorically distributed, or in other words, it takes the form of a one-hot vector of the label:
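A minimal sketch of that setup, mirroring the concatenation trick from the CGAN code above (the dimension names `X_dim` and `y_dim` are assumptions, e.g. 784 and 10 for MNIST):

```python
# The data X and the one-hot label c are both network inputs
X = tf.placeholder(tf.float32, shape=[None, X_dim])
c = tf.placeholder(tf.float32, shape=[None, y_dim])

# Conditioning is again just concatenation, e.g. at the input of the encoder Q(z | X, c)
inputs = tf.concat(concat_dim=1, values=[X, c])
```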
For the full explanation of the code, please refer to my original VAE post. The full code can be found in my GitHub repo: https://github.com/wiseodd/generative-models.
## Conditional MNIST
We will test our CVAE model by generating MNIST data conditioned on its label. With the above model, we could specify which digit we want to generate, as the model is conditioned on the label!
First things first, let's visualize $Q(z \vert X, c)$:
Things are messy here, in contrast to VAE's $Q(z \vert X)$, which nicely clusters $z$. But if we look at it closely, we could see that given a specific value of $c = y$, $Q(z \vert X, c=y)$ is roughly $N(0, 1)$! It's because, if we look at our objective above, we are now modeling $P(z \vert c)$, which we infer variationally with a $N(0, 1)$.
Subjectively, we could say the reconstruction results are way better than the original VAE's! We could argue that because the data under each specific label has its own distribution, it is easy to sample data with a specific label. If we look back at the results of the original VAE, the reconstructions suffer at the edge cases, e.g. when the model is not sure whether it's a 3, 8, or 5, as they look very similar. No such problem here!
Now the interesting part: we could generate new data under our specific condition. Above, for example, we generate new data which has the label '5', i.e. $c = [0, 0, 0, 0, 0, 1, 0, 0, 0, 0]$. CVAE makes it possible for us to do that.
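A sketch of how such conditional samples might be drawn (the decoder output tensor `X_samples` and the placeholders `z` and `c` are illustrative names, not the exact ones from the repo):

```python
import numpy as np

n_samples = 16
z_sample = np.random.randn(n_samples, Z_dim)  # z ~ N(0, 1)

c_sample = np.zeros([n_samples, y_dim])
c_sample[:, 5] = 1.  # condition on the label '5'

samples = sess.run(X_samples, feed_dict={z: z_sample, c: c_sample})
```

Setting the 1 at a different index of `c_sample` would generate the corresponding digit instead.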