Unified Notes for All Zoomcamp Courses #89

@kavaivaleri

This project aims to build a full knowledge layer on top of all Zoomcamp courses. The goal is to generate structured notes for every video, create a shared vocabulary of key concepts, and introduce an internal linking system.

Together, these components improve navigation, learning experience, SEO, and long-term maintainability.


1. Purpose of the Project

We want every Zoomcamp video to have a clear, consistent written summary. These notes will help learners orient themselves quickly, revisit material without rewatching videos, and navigate across related concepts. At the same time, we want search engines to better understand and index the content.

To achieve this, we combine:

  • high-quality written notes for each lesson,
  • a global glossary of technical terms,
  • automatic cross-linking between related notes,
  • and structured data so search engines can interpret the material.

2. Notes Repository

We will create a dedicated repository containing notes for every video across the ML, Data Engineering, MLOps, LLM, and AI Dev Tools Zoomcamps.

Each note will follow a unified Markdown structure with a summary, key concepts, explanations, and links to related notes and vocabulary entries.

The repository will be organized by course → module → video.
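As a rough illustration of the layout and note structure described above, the sketch below builds a `course/module/video` path and renders a note skeleton. The `notes/` root directory, the slug rules, the section headings, and all function names are assumptions made for this example, not decisions the project has taken.

```python
import re
from pathlib import PurePosixPath

# Hypothetical section layout; the final template may differ.
NOTE_TEMPLATE = """# {title}

## Summary

{summary}

## Key Concepts

{concepts}

## Explanations

{explanations}

## Related Notes

{related}
"""


def slugify(text: str) -> str:
    """Lowercase the text and replace runs of non-alphanumerics with one hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def note_path(course: str, module: str, video: str) -> PurePosixPath:
    """Build the course -> module -> video path for a note (assumed 'notes/' root)."""
    return PurePosixPath("notes") / slugify(course) / slugify(module) / f"{slugify(video)}.md"


def render_note(title, summary, concepts, explanations, related):
    """Fill the template; `related` is a list of (link text, relative path) pairs."""
    return NOTE_TEMPLATE.format(
        title=title,
        summary=summary,
        concepts="\n".join(f"- {c}" for c in concepts),
        explanations=explanations,
        related="\n".join(f"- [{name}]({path})" for name, path in related),
    )
```

Keeping the path derivation in one function means the generator, the glossary, and the link checker can all agree on where a given note lives.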

Automation (via transcripts + LLM) will generate initial drafts, which contributors can later refine.


3. Automated Pipeline for Notes Generation

The project includes an automated process that fetches YouTube transcripts, preprocesses them, and uses an LLM to produce structured notes. These drafts will be reviewed manually, but automation ensures consistency and reduces repetitive work.

Later, this pipeline will run as part of a GitHub Actions workflow that refreshes notes when courses are updated or new videos appear.
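A minimal sketch of the preprocessing and drafting steps might look like the following. It assumes the transcript has already been fetched as a list of caption strings, and it takes the LLM as a plain callable so that no specific provider or client library is implied. The prompt wording, the chunk size, and every function name are illustrative only.

```python
import re
from typing import Callable


def clean_transcript(segments: list[str]) -> str:
    """Join caption segments, drop bracketed cues such as [Music], collapse whitespace."""
    text = " ".join(segments)
    text = re.sub(r"\[[^\]]*\]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def chunk_words(text: str, max_words: int) -> list[str]:
    """Split text into consecutive chunks of at most `max_words` words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def build_prompt(title: str, chunk: str) -> str:
    """Hypothetical prompt; the real wording would be tuned by the maintainers."""
    return (
        f"Write structured Markdown notes for the lesson '{title}'.\n"
        "Include a summary and a list of key concepts.\n\n"
        f"Transcript:\n{chunk}"
    )


def draft_notes(
    title: str,
    segments: list[str],
    llm: Callable[[str], str],
    max_words: int = 1500,
) -> str:
    """Produce a first draft by sending each transcript chunk to the LLM callable."""
    cleaned = clean_transcript(segments)
    parts = [llm(build_prompt(title, chunk)) for chunk in chunk_words(cleaned, max_words)]
    return "\n\n".join(parts)
```

Passing the model in as a function also makes the pipeline easy to test with a stub before any API keys are involved.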


4. Vocabulary (Glossary)

Alongside the notes, we will maintain a central vocabulary of the most important terms used across all courses.
Each term will have:

  • a definition,
  • explanations if needed,
  • and references to all notes where the term appears.

Notes will also link back to vocabulary entries, forming a two-way relationship.

This allows learners to quickly understand terminology and see how concepts connect across multiple Zoomcamps.
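The two-way relationship between terms and notes can be sketched as a pair of indexes. The example below assumes notes are available as plain text keyed by a note ID and uses simple case-insensitive whole-word matching, which is only one possible detection strategy. It would miss terms that end in punctuation (for example "C++") and does not handle plurals or synonyms.

```python
import re
from collections import defaultdict


def index_terms(notes: dict[str, str], terms: list[str]) -> dict[str, list[str]]:
    """Map each glossary term to the sorted IDs of the notes that mention it."""
    index: dict[str, list[str]] = defaultdict(list)
    for term in terms:
        pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
        for note_id, text in sorted(notes.items()):
            if pattern.search(text):
                index[term].append(note_id)
    return dict(index)


def invert_index(term_index: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each note ID back to the sorted glossary terms it should link to."""
    by_note: dict[str, list[str]] = defaultdict(list)
    for term, note_ids in term_index.items():
        for note_id in note_ids:
            by_note[note_id].append(term)
    return {note_id: sorted(found) for note_id, found in by_note.items()}
```

The first index would feed the "appears in" list on each vocabulary page, and the inverted one would feed the vocabulary links inside each note.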


5. Internal Linking System

To make navigation easier, we will automatically detect related notes using concept extraction, keyword matching, or embeddings.

Each note will include a “Related Notes” section.

This creates a lightweight knowledge graph inside the repository and helps both learners and search engines understand how lessons relate to one another.
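One simple way to populate a "Related Notes" section is to compare the sets of concepts extracted from each note. The sketch below uses Jaccard similarity over keyword sets. This is a stand-in for the keyword-matching option; an embedding-based approach would replace the scoring function. The `top_k` and `min_score` defaults are arbitrary assumptions.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Size of the intersection divided by the size of the union (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def related_notes(
    concepts: dict[str, set[str]],
    note_id: str,
    top_k: int = 3,
    min_score: float = 0.1,
) -> list[tuple[str, float]]:
    """Return up to `top_k` other notes ranked by concept overlap with `note_id`."""
    target = concepts[note_id]
    scored = [
        (other, jaccard(target, keywords))
        for other, keywords in concepts.items()
        if other != note_id
    ]
    scored = [(other, score) for other, score in scored if score >= min_score]
    # Highest score first; ties broken alphabetically so output is deterministic.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_k]
```

Deterministic ordering matters here because the output is committed to the repository, and unstable ordering would create noisy diffs on every regeneration.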


6. SEO & Structured Data

Every notes page and vocabulary term will include JSON-LD structured data.

This helps search engines interpret pages as educational resources, definitions, or course materials.
In addition, a sitemap and clear metadata (titles, descriptions, alt text) will be generated to improve indexing and visibility.
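For illustration, JSON-LD for a vocabulary term and a notes page could be generated along these lines. The choice of the schema.org types `DefinedTerm` and `LearningResource`, the selected properties, and the function names are assumptions for this sketch; the project may settle on different types.

```python
import json


def term_jsonld(name: str, definition: str, glossary_url: str) -> dict:
    """Describe a vocabulary entry as a schema.org DefinedTerm."""
    return {
        "@context": "https://schema.org",
        "@type": "DefinedTerm",
        "name": name,
        "description": definition,
        "inDefinedTermSet": glossary_url,
    }


def note_jsonld(title: str, summary: str, url: str, course: str) -> dict:
    """Describe a notes page as a schema.org LearningResource within a Course."""
    return {
        "@context": "https://schema.org",
        "@type": "LearningResource",
        "name": title,
        "description": summary,
        "url": url,
        "isPartOf": {"@type": "Course", "name": course},
    }


def to_script_tag(data: dict) -> str:
    """Serialize to a JSON-LD script tag, escaping '</' so text cannot close the tag early."""
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'
```

The resulting tag would be embedded in the rendered HTML for each page by whatever static site generator the project ends up using.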


7. Community Contributions

The project will be open for contributors.

We will provide a clear CONTRIBUTING.md, issue templates for adding or editing notes, and a PR template.


8. Long-Term Automation & Maintenance

GitHub Actions will handle recurring tasks:

  • updating transcripts,
  • triggering notes regeneration,
  • refreshing the vocabulary and link graph,
  • validating structured data,
  • and opening PRs with updates automatically.

This ensures the system remains up to date with minimal manual effort.
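To avoid regenerating every note on each scheduled run, a workflow step could compare transcript fingerprints against those recorded on the previous run. The sketch below assumes transcripts are available as text keyed by video ID and that the earlier fingerprints are stored somewhere in the repository (for example in a JSON file). Both the storage format and the function names are hypothetical.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Return a stable SHA-256 hex digest of the transcript text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def stale_videos(transcripts: dict[str, str], recorded: dict[str, str]) -> list[str]:
    """List video IDs whose transcript is new or differs from the recorded fingerprint."""
    return sorted(
        video_id
        for video_id, text in transcripts.items()
        if recorded.get(video_id) != fingerprint(text)
    )
```

Only the IDs returned by `stale_videos` would then be passed to the notes generation step, which keeps automated PRs small and LLM usage proportional to what actually changed.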


Outcome

Once complete, this project will be a comprehensive, interconnected, and SEO-friendly knowledge base for all Zoomcamp courses. It will help learners navigate material easily, provide clear definitions, strengthen search discoverability, and offer a structured environment for community contributions.
