Author: Nikolaos Bermparis
University of the Aegean – Integrated Master’s Thesis 2025
Semantic communication has become an increasingly active area of information transmission. Building on advances in deep learning such as Long Short-Term Memory (LSTM) networks, this thesis develops an end-to-end secure semantic communication framework that encodes natural language into compact semantic vectors. These vectors are modulated using Binary Phase Shift Keying (BPSK) and transmitted through a simulated Additive White Gaussian Noise (AWGN) channel implemented in MATLAB. The received signals are demodulated and decoded by an LSTM-based semantic decoder in Python.
Beyond the baseline communication pipeline, this thesis also explores the vulnerabilities of semantic communication systems under adversarial conditions. In particular, we investigate gradient-based adversarial attacks such as the Fast Gradient Sign Method (FGSM) and Projected Gradient Descent (PGD), in which carefully crafted perturbations disrupt the semantic integrity of transmitted messages. Experiments demonstrate that even small perturbations can induce significant semantic distortion, highlighting the need for robustness in practical deployments. The results provide insight into the susceptibility of LSTM-based autoencoders to adversarial manipulation, bridging the gap between classical communication noise models and modern adversarial threat landscapes.
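To make the attack concrete, here is a minimal sketch of a single FGSM step on a toy linear decoder with a squared-error loss. It is not the thesis's LSTM autoencoder: the weight matrix `W`, the target `y`, the step size `eps`, and all function names are illustrative assumptions, and the gradient is computed analytically rather than by automatic differentiation.

```python
# Toy FGSM sketch. Assumption: a linear "decoder" W with squared-error loss
# stands in for the LSTM decoder; all names and values are hypothetical.

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def loss(W, x, y):
    """Squared-error loss 0.5 * ||W x - y||^2."""
    return 0.5 * sum((p - t) ** 2 for p, t in zip(matvec(W, x), y))

def grad_wrt_input(W, x, y):
    """Analytic gradient of the loss with respect to x: W^T (W x - y)."""
    residual = [p - t for p, t in zip(matvec(W, x), y)]
    return [sum(W[i][j] * residual[i] for i in range(len(W)))
            for j in range(len(x))]

def sign(v):
    """Return -1, 0, or +1 according to the sign of v."""
    return (v > 0) - (v < 0)

def fgsm(W, x, y, eps):
    """One FGSM step: x_adv = x + eps * sign(grad_x loss)."""
    g = grad_wrt_input(W, x, y)
    return [xi + eps * sign(gi) for xi, gi in zip(x, g)]

W = [[1.0, 2.0], [0.5, -1.0]]   # hypothetical decoder weights
x = [0.2, -0.1]                 # hypothetical latent vector
y = [0.1, 0.3]                  # hypothetical clean target
eps = 0.05
x_adv = fgsm(W, x, y, eps)
```

Each coordinate of `x_adv` differs from `x` by at most `eps`, yet the loss at `x_adv` is higher than at `x`. PGD repeats this kind of step several times and projects back into the `eps`-ball after each one.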
To defend against these attacks, the thesis integrates cryptographic mechanisms into the communication pipeline: a hybrid security layer that combines Quantum Key Distribution (QKD), using the BB84 protocol, with AES-based encryption. These measures provide confidentiality and forward secrecy during transmission, while adversarial training and semantic correction methods strengthen resilience against gradient-based attacks. The findings test the security of semantic communication and show what can be achieved by unifying advances in deep learning, cryptography, and adversarial defense. They also pave the way for future information systems in which meaning is transmitted both reliably and securely across noisy and adversarial environments.
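The key-agreement part of such a hybrid layer can be sketched as an idealized BB84 simulation: Alice and Bob pick random bases, keep only the positions where their bases match, and hash the sifted bits into a 256-bit key. This sketch assumes a noiseless channel with no eavesdropper, uses SHA-256 as a simple stand-in for error reconciliation and privacy amplification, and does not perform the AES encryption itself. The function names are hypothetical and are not taken from `step3_qkd_encryption.py`.

```python
import hashlib
import random

def bb84_sift(n_qubits, rng):
    """Simulate ideal BB84 (no noise, no eavesdropper); return both sifted keys."""
    alice_bits = [rng.randint(0, 1) for _ in range(n_qubits)]
    alice_bases = [rng.choice("+x") for _ in range(n_qubits)]
    bob_bases = [rng.choice("+x") for _ in range(n_qubits)]

    # Matching basis: Bob measures Alice's bit. Mismatched basis: random outcome.
    bob_bits = [a if ab == bb else rng.randint(0, 1)
                for a, ab, bb in zip(alice_bits, alice_bases, bob_bases)]

    # Public basis comparison: keep only positions where the bases agree.
    keep = [i for i in range(n_qubits) if alice_bases[i] == bob_bases[i]]
    return [alice_bits[i] for i in keep], [bob_bits[i] for i in keep]

def derive_aes_key(sifted_bits):
    """Hash the sifted bits into 32 bytes (assumed key-derivation step)."""
    bitstring = "".join(str(b) for b in sifted_bits).encode("ascii")
    return hashlib.sha256(bitstring).digest()

# A seeded PRNG keeps the sketch reproducible; it is not cryptographically secure.
rng = random.Random(0)
alice_key, bob_key = bb84_sift(512, rng)
aes_key = derive_aes_key(alice_key)
```

In this ideal setting the two sifted keys are identical, so both sides derive the same 32-byte key, which could then be handed to an AES implementation.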
This repository implements an end-to-end semantic communication system integrating Deep Learning (LSTM Autoencoders), classical modulation (BPSK), and Quantum-Cryptographic (QKD + AES-CBC) encryption.
The system encodes natural language into latent vectors, transmits them through a simulated noisy (AWGN) channel in MATLAB, reconstructs the text, and evaluates the result with semantic fidelity metrics (BLEU, WER, and cosine similarity, CosSim).
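Two of these metrics are simple enough to sketch in a few lines. The following is a minimal, dependency-free illustration of WER (word-level edit distance divided by reference length) and cosine similarity; BLEU is omitted. The function names are assumptions and need not match those in `step5_eval_bleuwer.py`.

```python
import math

def wer(reference, hypothesis):
    """Word error rate: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] is the edit distance between the ref words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / max(len(ref), 1)

def cosine_similarity(u, v):
    """Cosine of the angle between vectors u and v (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0
```

For example, `wer("to be or not to be", "to be or to be")` counts one deleted word out of six reference words.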
Thesis
│
├── data/
│ ├── hamlet.txt
│ ├── macbeth.txt
│ └── othello.txt
│
├── models/
│ ├── encoder.pt
│ └── decoder.pt
│
├── processed/
├── results/
├── tmp/
│
├── src/
│ ├── python/
│ │ ├── step1_kb_preparation.py
│ │ ├── step2_autoencoder.py
│ │ ├── step2_baseline_decode.py
│ │ ├── step3_export_latent.py
│ │ ├── step3_qkd_encryption.py
│ │ ├── step4_channel.py
│ │ ├── step4_channel_ecc.py
│ │ ├── step4_channel_latent.py
│ │ ├── step4_import_decode.py
│ │ ├── step4_import_decode_ecc.py
│ │ ├── step4_import_decode_latent.py
│ │ ├── step5_eval_bleuwer.py
│ │ ├── step5_poison_attack.py
│ │ ├── step5_poison_test.py
│ │ └── training_progress.py
│ │
│ └── matlab/
│   ├── step_4b_matlab.m
│   ├── step4_channel_latent.m
│   ├── step4b_channel_matlab_ecc.m
│   ├── step4b_channel_latent.m
│   └── step4c_latent_reconstruction.m
│
├── README.md
├── requirements.txt
└── LICENSE
| Stage | Description | Tool |
|---|---|---|
| Step 1 | Knowledge base creation (Shakespeare corpus) | Python |
| Step 2 | LSTM Autoencoder training | PyTorch |
| Step 3 | Latent export + AES/QKD encryption | Python |
| Step 4 | BPSK modulation + AWGN simulation | MATLAB |
| Step 5 | Latent decoding + BLEU/WER evaluation | Python |
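Step 4 runs in MATLAB in this repository, but the idea can be illustrated in Python. The sketch below maps bits to BPSK symbols, adds real Gaussian noise, and makes hard decisions. It assumes unit-energy symbols and interprets the SNR as Es/N0; the convention used by the MATLAB scripts is not stated here, so the BER from this sketch need not match the reported results. All function names are hypothetical.

```python
import math
import random

def bpsk_modulate(bits):
    """Map bit 0 to +1.0 and bit 1 to -1.0 (unit-energy symbols)."""
    return [1.0 - 2.0 * b for b in bits]

def awgn(symbols, snr_db, rng):
    """Add real Gaussian noise; SNR is taken as Es/N0 with Es = 1 (assumption)."""
    n0 = 10 ** (-snr_db / 10)
    sigma = math.sqrt(n0 / 2)   # noise variance is N0/2 on the real axis
    return [s + rng.gauss(0.0, sigma) for s in symbols]

def bpsk_demodulate(received):
    """Hard decision: a negative sample decodes to bit 1, otherwise bit 0."""
    return [1 if r < 0 else 0 for r in received]

def bit_error_rate(sent, decoded):
    """Fraction of positions where the decoded bit differs from the sent bit."""
    return sum(s != d for s, d in zip(sent, decoded)) / len(sent)

rng = random.Random(42)   # fixed seed so the sketch is reproducible
bits = [rng.randint(0, 1) for _ in range(10_000)]
rx = awgn(bpsk_modulate(bits), snr_db=20, rng=rng)
ber = bit_error_rate(bits, bpsk_demodulate(rx))
```

Lowering `snr_db` toward 0 dB makes the noise comparable to the symbol amplitude, and the bit error rate rises accordingly.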
| Quantization (q_bits) | SNR (dB) | BLEU ↑ | WER ↓ | CosSim ↑ | BER ↓ | MSE ↓ |
|---|---|---|---|---|---|---|
| 2 | 20 | 0.78 | 0.24 | 0.96 | 0.010 | 0.045 |
| 4 | 20 | 0.91 | 0.12 | 0.99 | 0.008 | 0.023 |
| 6 | 20 | 0.93 | 0.10 | 0.99 | 0.006 | 0.018 |
| 8 | 20 | 0.94 | 0.09 | 0.99 | 0.007 | 0.019 |
Best trade-off: 4–6 bit quantization at SNR ≥ 20 dB, which gives high semantic accuracy with efficient encoding.
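The role of `q_bits` can be illustrated with a uniform quantizer: each latent value is clipped to a fixed range, mapped to one of `2**q_bits` levels, and expanded into bits for modulation. This is a sketch under assumptions: the range `[-1, 1]`, the function names, and the example latent vector are hypothetical, and the repository's own quantizer may differ.

```python
def quantize(values, q_bits, lo=-1.0, hi=1.0):
    """Clip values to [lo, hi] and map them to integers in [0, 2**q_bits - 1]."""
    levels = (1 << q_bits) - 1
    out = []
    for v in values:
        clipped = min(max(v, lo), hi)
        out.append(round((clipped - lo) / (hi - lo) * levels))
    return out

def dequantize(indices, q_bits, lo=-1.0, hi=1.0):
    """Map integer indices back to values in [lo, hi]."""
    levels = (1 << q_bits) - 1
    return [lo + i / levels * (hi - lo) for i in indices]

def to_bits(indices, q_bits):
    """Expand each index into q_bits bits, most significant bit first."""
    return [(i >> k) & 1 for i in indices for k in range(q_bits - 1, -1, -1)]

def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

latent = [-0.83, -0.2, 0.0, 0.41, 0.97]   # hypothetical latent vector
errors = {q: mse(latent, dequantize(quantize(latent, q), q))
          for q in (2, 4, 6, 8)}
```

Each extra bit roughly halves the quantization step, so the reconstruction error shrinks quickly at first, while the number of transmitted bits per latent value grows linearly with `q_bits`.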
I would like to express my sincere gratitude to my supervisor, Konstantinos Maliatsos, for his invaluable guidance, encouragement, and support throughout the development of this thesis. His insights and expertise were instrumental in shaping the research direction and ensuring the successful completion of this work. This work was conducted at the University of the Aegean, Department of Information and Communication Systems Engineering.




