Information Theory & LMs

Reichman University

Students: Omri Drori & Topaz Freizeit

Guided by Dr. Alon Kipnis

Variational Autoencoders (VAEs) are powerful generative models that compress high-dimensional data into a structured, lower-dimensional latent space. The architecture pairs a probabilistic encoder, which maps data into the latent space, with a probabilistic decoder, which reconstructs the data from it. Training maximizes the Evidence Lower Bound (ELBO), which trades reconstruction accuracy against a regularization term that organizes the latent space. In Natural Language Processing, VAEs can model holistic sentence properties like topic and style, with modern systems using pretrained language models such as BERT as the encoder and GPT-2 as the decoder.
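The two ELBO terms described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the notebook's implementation: it assumes a diagonal Gaussian posterior and a standard normal prior, and the function names (`gaussian_kl`, `reparameterize`, `negative_elbo`) are hypothetical. The reconstruction term is passed in as a number, since in practice it would come from the decoder's negative log-likelihood.

```python
import math
import random


def gaussian_kl(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) in nats (closed form)."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var))


def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1), one value per dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]


def negative_elbo(reconstruction_nll, mu, log_var):
    """Loss to minimize: reconstruction term plus the KL regularizer."""
    return reconstruction_nll + gaussian_kl(mu, log_var)


if __name__ == "__main__":
    rng = random.Random(0)
    mu, log_var = [0.5, -1.0], [0.0, -0.5]  # illustrative encoder outputs
    z = reparameterize(mu, log_var, rng)    # latent sample fed to the decoder
    print("z =", z)
    print("loss =", negative_elbo(reconstruction_nll=3.2, mu=mu, log_var=log_var))
```

Sampling through `reparameterize` keeps the randomness in `eps`, which is what allows gradients to reach `mu` and `log_var` when the same computation is written in an autodiff framework.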

The VAE's training objective can be framed by Rate-Distortion theory, which formalizes the trade-off between compression (Rate) and reconstruction quality (Distortion). A major challenge is "posterior collapse," where the model prioritizes compression so heavily that the approximate posterior matches the prior; the latent code then carries no information about the input, and the decoder effectively ignores it. The same trade-off acts as an "information bottleneck," which can encourage VAEs to learn disentangled representations. Advanced VAEs aim to improve this disentanglement by directly minimizing the statistical dependence between latent dimensions, resulting in a more interpretable and controllable generative process.
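As a rough illustration of the rate-distortion view, the sketch below weights the rate (KL) term by a coefficient `beta` and flags latent dimensions whose average KL is close to zero, a common symptom of posterior collapse. It assumes a diagonal Gaussian posterior with a standard normal prior; the function names, the `beta` parameter, and the `threshold` value are illustrative assumptions rather than anything taken from the notebook.

```python
import math


def kl_per_dimension(mu, log_var):
    """Per-dimension KL( N(mu_i, sigma_i^2) || N(0, 1) ) in nats."""
    return [0.5 * (m * m + math.exp(lv) - lv - 1.0) for m, lv in zip(mu, log_var)]


def rate_distortion_loss(distortion, rate, beta=1.0):
    """Distortion (reconstruction cost) plus beta times rate (KL cost)."""
    return distortion + beta * rate


def collapsed_dimensions(mean_kl_per_dim, threshold=0.01):
    """Indices of latent dimensions whose average KL falls below the threshold."""
    return [i for i, kl in enumerate(mean_kl_per_dim) if kl < threshold]


if __name__ == "__main__":
    # Illustrative encoder outputs: dimension 0 matches the prior exactly.
    mu, log_var = [0.0, 2.0], [0.0, 0.0]
    kls = kl_per_dimension(mu, log_var)
    print("per-dimension KL:", kls)
    print("loss:", rate_distortion_loss(distortion=3.0, rate=sum(kls), beta=0.5))
    print("possibly collapsed:", collapsed_dimensions(kls))
```

Larger values of `beta` push the model toward lower rate (stronger compression), while smaller values favor lower distortion; in a real experiment the per-dimension KL would be averaged over a batch before thresholding.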

The attached slides present the theory of VAEs, and the notebook presents the connection between VAEs and text generation.

