MLOps and LLMOps for dissertation analysis workflow
- Download/clone the repository and save to your desired folder
- Create a new virtual environment
- Web Scraping (Bot)
- Cleaning, Preparing, and Monitoring
- Identify security and ethical risks in the data and storage
Instagram Account
Training and fine-tuning on a subset of data to track performance, identify errors, and optimize models.
Clean post comments: lemmatize, translate emojis, rename hashtags, lowercase sentences, etc.
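A minimal sketch of the cleaning step, using only the Python standard library. The emoji table `EMOJI_TO_TEXT` and the function names `split_hashtag` and `clean_comment` are hypothetical and not taken from the repository. Lemmatization is left out because it needs an NLP library (for example NLTK or spaCy), and the choice of tool is not stated here.

```python
import re

# Hypothetical emoji-to-text table; a real pipeline would likely use a much
# fuller lookup or a dedicated library.
EMOJI_TO_TEXT = {
    "\U0001F600": "grinning face",
    "\u2764\ufe0f": "red heart",
    "\U0001F525": "fire",
}


def split_hashtag(match):
    """Turn the tag text 'ClimateAction' into 'Climate Action'."""
    tag = match.group(1)
    # Insert a space wherever a capital letter follows a lowercase letter or digit.
    words = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", tag)
    return words.replace("_", " ")


def clean_comment(text):
    """Translate known emojis, rewrite hashtags as words, lowercase, and trim."""
    for emoji, description in EMOJI_TO_TEXT.items():
        text = text.replace(emoji, f" {description} ")
    # Drop the '#' and split CamelCase or snake_case tags into words.
    text = re.sub(r"#(\w+)", split_hashtag, text)
    text = text.lower()
    # Collapse the extra whitespace introduced by the replacements above.
    return re.sub(r"\s+", " ", text).strip()
```

Example use: `clean_comment("Love this \U0001F525 #ClimateAction")` gives `"love this fire climate action"`. Emojis missing from the table pass through unchanged.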
Multilingual sentence transformers
- Hugging Face sentence transformer
- LaBSE (Language-Agnostic BERT Sentence Embedding)
- GCN?
Evaluation Metrics:
- Accuracy, Precision, Recall, F1-Score, Hamming Loss
- Confusion Matrix (Visualization)
- BERTScore, Sentence Transformer Cosine Similarities
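The classification metrics and the cosine similarity listed above can be sketched in plain Python. This assumes a multi-label setup (suggested by the use of Hamming loss) with labels encoded as 0/1 vectors and micro-averaged precision, recall, and F1. The averaging mode and the function names are assumptions, not the workflow's confirmed choices. BERTScore is not covered because it needs a pretrained model.

```python
import math


def multilabel_metrics(y_true, y_pred):
    """Score predictions given as lists of equal-length 0/1 label vectors."""
    tp = fp = fn = mismatches = exact = n_labels = 0
    for true_row, pred_row in zip(y_true, y_pred):
        exact += int(true_row == pred_row)  # every label in the row matches
        for t, p in zip(true_row, pred_row):
            n_labels += 1
            mismatches += int(t != p)
            tp += int(t == 1 and p == 1)
            fp += int(t == 0 and p == 1)
            fn += int(t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "subset_accuracy": exact / len(y_true),
        "precision_micro": precision,
        "recall_micro": recall,
        "f1_micro": f1,
        # Fraction of individual label slots predicted incorrectly.
        "hamming_loss": mismatches / n_labels,
    }


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

In practice, scikit-learn provides equivalent metric functions. The hand-written version here only makes the definitions explicit.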
Ongoing monitoring of security and ethical risks
Hugging Face Access, Access to LLM (I used VERDE), GPU Access
Create Structural graph (content-based knowledge representation)
- Viz_weights + generated label_weights
- Viz_weights & generated label_weights + generated captions
- Viz_weights & generated label_weights + generated captions & original post comments
Community Detection (Evaluation of Network Structure):
- Centrality Measures
- ERGM (exponential random graph model)
- Leiden algorithm
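As a rough illustration of two of the quantities involved, the sketch below computes degree centrality and the modularity of a given partition for an undirected, unweighted graph stored as an edge list. It does not implement the Leiden algorithm or fit an ERGM. Modularity is shown only because it is one of the quality functions Leiden can optimize. The function names and the edge-list representation are assumptions.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Degree of each node divided by (n - 1), for an undirected simple graph."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    n = len(degree)
    return {node: d / (n - 1) for node, d in degree.items()}


def modularity(edges, community_of):
    """Newman modularity Q of a partition given as a node -> community mapping."""
    m = len(edges)
    intra = defaultdict(int)       # edges with both endpoints in the community
    degree_sum = defaultdict(int)  # total degree of the community's nodes
    for u, v in edges:
        degree_sum[community_of[u]] += 1
        degree_sum[community_of[v]] += 1
        if community_of[u] == community_of[v]:
            intra[community_of[u]] += 1
    # Q = sum over communities of (intra-edge share - expected share).
    return sum(intra[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)
```

For two triangles joined by a single bridge edge, with each triangle treated as one community, `modularity` returns 5/14 (about 0.357).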
Graph Analysis
- Multipartite, Bipartite
- Multiplex Graphs?
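One common operation on a bipartite graph is projecting it onto one of its node sets. The sketch below assumes a hypothetical user-to-post bipartite edge list and builds a user-to-user graph weighted by the number of shared posts. The node types and the function name `project_bipartite` are illustrative only and are not confirmed by this workflow.

```python
from collections import defaultdict
from itertools import combinations


def project_bipartite(edges):
    """Project (top, bottom) edges onto top nodes, weighted by shared bottoms."""
    members = defaultdict(set)
    for top, bottom in edges:
        members[bottom].add(top)
    weights = defaultdict(int)
    for tops in members.values():
        # Each pair of top nodes sharing this bottom node gains one unit of weight.
        for u, v in combinations(sorted(tops), 2):
            weights[(u, v)] += 1
    return dict(weights)
```

Each key in the result is a sorted pair of top nodes, so every undirected edge appears once.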