This is a Jekyll site that contains frequently asked questions and answers from DataTalks.Club courses.
- _questions/ - Individual question markdown files with Jekyll frontmatter
- _layouts/ - Jekyll layout templates
- images/ - Extracted images from the original FAQ documents
- assets/css/ - Custom CSS styles
- _config.yml - Jekyll configuration
- index.md - Main landing page
- [course].md - Course index pages
Choose your preferred method:
PowerShell (Recommended for Windows):
# Initial setup
.\faq.ps1 setup
# Process FAQ documents and serve site
.\faq.ps1 dev
Batch file (Windows Command Prompt):
# Initial setup
faq setup
# Process FAQ documents and serve site
faq dev
Shell script (Linux/Mac/WSL):
# Initial setup
./faq.sh setup
# Process FAQ documents and serve site
./faq.sh dev
Makefile (Linux/Mac - has Windows compatibility issues):
# Initial setup
make setup
# Process FAQ documents and serve site
make dev
Command | Description |
---|---|
setup | Initial setup (install Python deps + Jekyll) |
process | Process FAQ documents and generate Jekyll site |
serve | Serve Jekyll site locally (http://localhost:4000) |
build | Build Jekyll site for production |
dev | Process + serve (development workflow) |
clean | Clean generated files |
install | Install Jekyll dependencies only |
stats | Show site statistics |
If you prefer to run commands manually:
- Install Jekyll and dependencies:
  gem install jekyll bundler
  bundle install
- Process FAQ documents (with automatic cleanup):
  # Clean questions directory first
  python clean_questions.py
  # Extract FAQ data
  uv run python faq_processor.py
  # Validate compatibility
  python validate_questions.py
  # Generate static site
  python generate.py
- Serve the site:
  bundle exec jekyll serve
- Open your browser to http://localhost:4000
The site contains FAQ content from the following courses:
- Data Engineering Zoomcamp
- Machine Learning Zoomcamp
- MLOps Zoomcamp
Each question is stored as an individual markdown file with Jekyll frontmatter containing:
- question - The question text
- section - The section/category the question belongs to
- course - The course the question is from
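As a rough illustration of that file layout, the sketch below splits a question file into its frontmatter fields and its body. It assumes the usual Jekyll convention of a block delimited by `---` lines at the top of the file and simple one-line `key: value` entries. The function name `split_frontmatter` and the sample text are hypothetical and are not taken from the repository's scripts.

```python
def split_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Split a Jekyll file into (frontmatter fields, body).

    Minimal sketch: handles only one-line `key: value` entries,
    not nested or multi-line YAML.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no frontmatter block at all
    fields: dict[str, str] = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Everything after the closing delimiter is the answer body.
            return fields, "\n".join(lines[i + 1:])
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip().strip('"')
    return {}, text  # closing delimiter missing


# Hypothetical sample file, for illustration only.
SAMPLE = """---
question: "How do I join the course?"
section: "General"
course: "Data Engineering Zoomcamp"
---
Answer text goes here.
"""

fields, body = split_frontmatter(SAMPLE)
print(fields["course"], "|", body)
```

A real YAML parser would be the safer choice for anything beyond flat string fields; the hand-rolled split is only meant to make the file structure visible.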
The content was processed from the original Google Docs FAQ documents using the faq_processor.py script, which:
- Downloads and caches DOCX files
- Extracts embedded images
- Converts content to individual Jekyll question files
- Generates course index pages
- Creates the Jekyll site structure
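The image extraction step can be sketched with the standard library alone, since a DOCX file is a ZIP archive that keeps its embedded media under `word/media/`. This is an assumption-laden illustration, not the code in faq_processor.py, which may use a different approach or library; the function name `extract_docx_images` is hypothetical.

```python
import zipfile
from pathlib import Path


def extract_docx_images(docx_path, out_dir):
    """Copy every file under word/media/ in a DOCX archive into out_dir.

    Returns the list of written paths. Hypothetical helper, shown only
    to illustrate where embedded images live inside a DOCX file.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    with zipfile.ZipFile(docx_path) as archive:
        for name in archive.namelist():
            # Skip directory entries; keep only actual media files.
            if name.startswith("word/media/") and not name.endswith("/"):
                target = out / Path(name).name
                target.write_bytes(archive.read(name))
                saved.append(target)
    return saved
```

Writing into a per-course directory (for example a folder under images/) would match the layout described elsewhere in this README, but the exact naming scheme is left open here.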
To ensure compatibility between the FAQ processor and the static site generator, use the following Make targets:
make help # Show available commands
make clean_questions # Remove all files in _questions/ directory
make extract # Clean _questions/ and extract FAQ data from Google Docs
make validate # Validate all question files are compatible with generate.py
make website # Generate static website from markdown files
- Clean and Extract: Always start by cleaning the questions directory to prevent leftover files:
  make extract # This automatically runs clean_questions first
- Validate: Check that all generated files are compatible:
  make validate
- Generate Site: Create the static HTML site:
  make website
If you prefer to run commands individually:
# 1. Clean questions directory
python clean_questions.py
# 2. Extract FAQ data
python faq_processor.py
# 3. Validate compatibility
python validate_questions.py
# 4. Generate static site
python generate.py
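The four steps above can also be chained from a small Python wrapper that stops at the first failing step, so the site is never generated from unvalidated files. This is a minimal sketch: the script names come from the steps above, while `run_pipeline` and its `run` parameter are hypothetical and exist only to keep the example testable.

```python
import subprocess
import sys

# Script names taken from the manual steps; the order matters.
PIPELINE = [
    "clean_questions.py",
    "faq_processor.py",
    "validate_questions.py",
    "generate.py",
]


def run_pipeline(scripts=PIPELINE, run=subprocess.run):
    """Run each script in order with the current Python interpreter.

    Returns the name of the first script that exits with a non-zero
    status, or None if every step succeeded.
    """
    for script in scripts:
        result = run([sys.executable, script])
        if result.returncode != 0:
            return script  # stop here; later steps depend on this one
    return None
```

Calling `run_pipeline()` from the repository root and exiting with an error when it returns a script name gives roughly the same behavior as running the commands by hand.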
The FAQ processor now generates markdown files with properly formatted YAML frontmatter that includes:
- id - Unique identifier for the question
- question - The question text (properly quoted for YAML)
- section - The section/category (properly quoted for YAML)
- course - The course name
- sort_order - Numerical sort order
All string values containing special characters (like colons) are automatically quoted to ensure YAML compatibility.
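One simple way to get this kind of quoting is to emit each string as a double-quoted scalar with JSON-style escaping, which YAML's double-quoted style accepts. The sketch below is an illustration under that assumption and quotes every value unconditionally; it is not necessarily how faq_processor.py decides what to quote, and the names `yaml_quote` and `frontmatter_line` are hypothetical.

```python
import json


def yaml_quote(value: str) -> str:
    """Return value as a double-quoted scalar with quotes and backslashes escaped."""
    # ensure_ascii=False keeps non-ASCII characters readable in the output.
    return json.dumps(value, ensure_ascii=False)


def frontmatter_line(key: str, value: str) -> str:
    """Format one `key: "value"` frontmatter line."""
    return f"{key}: {yaml_quote(value)}"


# Without quoting, the colon after "Docker" could be misread as a key separator.
print(frontmatter_line("question", 'Docker: why is the "port in use"?'))
```

Quoting unconditionally is slightly noisier than quoting only when needed, but it avoids having to enumerate every character that is special in YAML.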
Images are stored in /images/[course]/ directories and referenced using Jekyll's absolute path syntax (/images/...) for proper display.
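As a small illustration of that convention, the sketch below builds a root-relative image path and the matching markdown image reference. The helper names and the example course folder and file name are hypothetical; the actual naming used by the processor is not specified here.

```python
from pathlib import PurePosixPath


def image_url(course: str, filename: str) -> str:
    """Build a root-relative URL of the form /images/<course>/<filename>."""
    # PurePosixPath keeps forward slashes on every platform, including Windows.
    return str(PurePosixPath("/images") / course / filename)


def image_markdown(course: str, filename: str, alt: str = "") -> str:
    """Return a markdown image reference that uses the absolute path."""
    return f"![{alt}]({image_url(course, filename)})"


# Hypothetical course folder and file name, for illustration only.
print(image_markdown("mlops-zoomcamp", "image_1.png", alt="screenshot"))
```

A leading slash makes the reference resolve from the site root, so the same markdown works from a question page and from a course index page.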