Merged
44 changes: 42 additions & 2 deletions README.md
@@ -7,10 +7,37 @@
[![Python](https://img.shields.io/badge/Python-3.8+-3776AB?logo=python)](https://www.python.org/)
[![GitHub Actions](https://img.shields.io/badge/CI/CD-GitHub%20Actions-2088FF?logo=github-actions)](https://github.com/features/actions)

## 🌐 GitHub Pages

The GitHub Pages site for this repository is available at: **[Delta Lake & Apache Iceberg Knowledge Hub](https://analytical-guide.github.io/Datalake-Guide/)**

## 🎯 Vision Statement

**Building the definitive, community-driven knowledge ecosystem for modern data lakehouse technologies.** This repository serves as a living, breathing whitepaper that evolves with the data engineering landscape, combining comprehensive technical comparisons, battle-tested code recipes, and AI-powered content curation to empower data engineers worldwide to make informed architectural decisions and implement best practices for Delta Lake and Apache Iceberg.

## 📁 Repository Content and Structure

This repository is organized into the following sections:

### Core Content

| Section | Location | Description |
|---------|----------|-------------|
| **Feature Matrix** | [`docs/comparisons/feature-matrix.md`](docs/comparisons/feature-matrix.md) | Comprehensive comparison of Delta Lake vs Apache Iceberg |
| **Code Recipes** | [`code-recipes/`](code-recipes/) | Production-ready code examples with validation |
| **Tutorials** | [`docs/tutorials/`](docs/tutorials/) | Step-by-step guides for common use cases |
| **Architecture** | [`docs/architecture/`](docs/architecture/) | Reference architectures and design patterns |
| **Best Practices** | [`docs/best-practices/`](docs/best-practices/) | Industry-tested patterns and recommendations |

### Learning Resources

| Resource | Location | Description |
|----------|----------|-------------|
| **Getting Started** | [`docs/tutorials/getting-started.md`](docs/tutorials/getting-started.md) | Quick start guide for beginners |
| **Migration Guide** | [`docs/tutorials/migration-guide.md`](docs/tutorials/migration-guide.md) | Moving from legacy systems |
| **Knowledge Quiz** | [`quiz/`](quiz/) | Test your Delta Lake & Iceberg knowledge |
| **Design System** | [`docs/design-system.md`](docs/design-system.md) | UI/UX guidelines for the project |

## 📚 Quick Links

- [🔍 **Feature Comparison Matrix**](docs/comparisons/feature-matrix.md) - Detailed side-by-side comparison of Delta Lake vs Apache Iceberg
@@ -72,6 +99,19 @@ Every recipe in our [code-recipes](code-recipes/) directory follows a standardiz
- **Best Practices**: Industry-tested patterns and anti-patterns
- **Architecture Guides**: Reference implementations for various scales

## 🚀 How to Use This Material

1. **Start with the Feature Comparison**: Begin by reading the [Feature Comparison Matrix](docs/comparisons/feature-matrix.md) for a comprehensive overview of Delta Lake vs Apache Iceberg.

2. **Explore the Getting Started Guide**: Use the [Getting Started Tutorial](docs/tutorials/getting-started.md) to set up your first lakehouse.

3. **Review Code Recipes**: Work through the [Code Recipes](code-recipes/) for hands-on implementation examples.

4. **Follow Best Practices**: Study the [Best Practices](docs/best-practices/) for production-ready implementations.

5. **Test Your Knowledge**: Take the [Knowledge Quiz](quiz/) to validate your understanding.

6. **Visit the Website**: Explore the full content at [GitHub Pages](https://analytical-guide.github.io/Datalake-Guide/).

## 🚀 Getting Started

@@ -88,7 +128,7 @@ Every recipe in our [code-recipes](code-recipes/) directory follows a standardiz
3. Review the [Code of Conduct](CODE_OF_CONDUCT.md)
4. Submit your first pull request!

## 🛠️ Development & Deployment

### Prerequisites

@@ -291,7 +331,7 @@ Monitor performance using:
- **WebPageTest**: External performance testing
- **GitHub Actions**: Automated performance checks

## 📈 Repository Stats

![GitHub stars](https://img.shields.io/github/stars/Analytical-Guide/Datalake-Guide?style=social)
![GitHub forks](https://img.shields.io/github/forks/Analytical-Guide/Datalake-Guide?style=social)
4 changes: 4 additions & 0 deletions _config.yml
@@ -10,6 +10,9 @@ github_username: Analytical-Guide
author: "Analytical Guide Community"
email: "[email protected]"

# Analytics (set to your GA4 measurement ID, e.g., G-XXXXXXXXXX)
google_analytics: "G-LHTGZTRTCX"

# Build settings
markdown: kramdown
highlighter: rouge
@@ -30,6 +33,7 @@ include:
- community
- _layouts
- _includes
- _data
- assets

# Exclude files
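
The config comment describes the `google_analytics` value as a GA4 measurement ID of the form `G-XXXXXXXXXX`. A minimal sketch of a shape check that could run before a deploy follows. The function name and the exact pattern are assumptions for illustration, not part of this change; the pattern only encodes the `G-` prefix followed by uppercase letters and digits.

```python
import re

# Assumption: a GA4 measurement ID is "G-" followed by uppercase letters and
# digits, as suggested by the G-XXXXXXXXXX placeholder in _config.yml.
GA4_ID_PATTERN = re.compile(r"G-[A-Z0-9]+")


def is_plausible_ga4_id(value: str) -> bool:
    """Return True if the whole string has the G-XXXXXXXXXX shape."""
    return bool(GA4_ID_PATTERN.fullmatch(value))


if __name__ == "__main__":
    # The value set in _config.yml by this change.
    print(is_plausible_ga4_id("G-LHTGZTRTCX"))
```

A check like this only guards against typos or a leftover placeholder; it does not confirm that the ID belongs to a real property.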
80 changes: 80 additions & 0 deletions _data/navigation.yml
@@ -0,0 +1,80 @@
- title: "I. Overview"
  url: "/"
  icon: "fas fa-home"
  children:
    - title: "Getting Started"
      url: "/docs/tutorials/getting-started/"
    - title: "Feature Comparison"
      url: "/docs/comparisons/feature-matrix/"

- title: "II. Comparisons"
  url: "/docs/comparisons/feature-matrix/"
  icon: "fas fa-balance-scale"
  children:
    - title: "Feature Matrix"
      url: "/docs/comparisons/feature-matrix/"
    - title: "Time Travel & Versioning"
      url: "/docs/comparisons/feature-matrix/#time-travel"
    - title: "Schema Evolution"
      url: "/docs/comparisons/feature-matrix/#schema-evolution"

- title: "III. Code Recipes"
  url: "/code-recipes/"
  icon: "fas fa-code"
  children:
    - title: "Recipe Catalog"
      url: "/code-recipes/"
    - title: "Basic Delta Table"
      url: "/code-recipes/examples/basic-delta-table/"
    - title: "Basic Iceberg Table"
      url: "/code-recipes/examples/basic-iceberg-table/"
    - title: "Streaming CDC Pipeline"
      url: "/code-recipes/examples/streaming-cdc-pipeline/"
    - title: "Time Series Forecasting"
      url: "/code-recipes/examples/time-series-forecasting/"

- title: "IV. Tutorials"
  url: "/docs/tutorials/"
  icon: "fas fa-graduation-cap"
  children:
    - title: "Tutorials Hub"
      url: "/docs/tutorials/"
    - title: "Getting Started"
      url: "/docs/tutorials/getting-started/"
    - title: "Migration Guide"
      url: "/docs/tutorials/migration-guide/"

- title: "V. Architecture"
  url: "/docs/architecture/"
  icon: "fas fa-cubes"
  children:
    - title: "Architecture Patterns"
      url: "/docs/architecture/"
    - title: "System Overview"
      url: "/docs/architecture/system-overview/"
    - title: "Blueprint"
      url: "/docs/BLUEPRINT/"

- title: "VI. Best Practices"
  url: "/docs/best-practices/"
  icon: "fas fa-check-circle"
  children:
    - title: "Best Practices Hub"
      url: "/docs/best-practices/"
    - title: "Production Readiness"
      url: "/docs/best-practices/production-readiness/"

- title: "VII. Quiz"
  url: "/quiz/"
  icon: "fas fa-brain"

- title: "VIII. Community"
  url: "/CONTRIBUTING/"
  icon: "fas fa-users"
  children:
    - title: "Contributing"
      url: "/CONTRIBUTING/"
    - title: "Code of Conduct"
      url: "/CODE_OF_CONDUCT/"
    - title: "Awesome List"
      url: "/docs/awesome-list/"
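
The navigation file is a two-level tree: top-level sections with `title`, `url`, and an optional `icon`, each optionally carrying a `children` list of `title`/`url` pairs. A minimal sketch of how that shape could be walked and sanity-checked follows. The excerpt is copied into Python literals so the example runs without a YAML parser, and the helper names `iter_entries` and `find_problems` are hypothetical, not part of the repository.

```python
from typing import Dict, Iterator, List

# Excerpt of _data/navigation.yml, written as Python literals (assumption:
# a real check would load the YAML file instead).
NAVIGATION: List[Dict] = [
    {
        "title": "I. Overview",
        "url": "/",
        "icon": "fas fa-home",
        "children": [
            {"title": "Getting Started", "url": "/docs/tutorials/getting-started/"},
            {"title": "Feature Comparison", "url": "/docs/comparisons/feature-matrix/"},
        ],
    },
    {"title": "VII. Quiz", "url": "/quiz/", "icon": "fas fa-brain"},
]


def iter_entries(items: List[Dict]) -> Iterator[Dict]:
    """Yield each top-level item followed by its children, if any."""
    for item in items:
        yield item
        yield from item.get("children", [])


def find_problems(items: List[Dict]) -> List[str]:
    """Collect human-readable problems: missing titles or non-rooted URLs."""
    problems = []
    for entry in iter_entries(items):
        if not entry.get("title"):
            problems.append("entry without a title")
        if not entry.get("url", "").startswith("/"):
            problems.append(f"{entry.get('title')!r}: url should start with '/'")
    return problems


if __name__ == "__main__":
    print(find_problems(NAVIGATION))
```

Checking that every URL starts with `/` matters here because the sidebar passes each one through `relative_url`, which prepends the site's base URL.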
10 changes: 10 additions & 0 deletions _includes/analytics.html
@@ -0,0 +1,10 @@
{% if site.google_analytics and site.google_analytics != "" %}
<!-- Google Analytics -->
<script async src="https://www.googletagmanager.com/gtag/js?id={{ site.google_analytics }}"></script>
<script>
  window.dataLayer = window.dataLayer || [];
  function gtag(){dataLayer.push(arguments);}
  gtag('js', new Date());
  gtag('config', '{{ site.google_analytics }}');
</script>
{% endif %}
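
The include emits the gtag snippet only when `site.google_analytics` is set and non-empty, and it substitutes the ID in two places: the script URL and the `config` call. A rough Python equivalent of that guard-and-substitute logic is sketched below. The function `render_analytics` is a hypothetical stand-in for the Liquid template, and the HTML escaping is an added precaution that the template itself does not perform.

```python
from html import escape
from typing import Optional

GTAG_URL = "https://www.googletagmanager.com/gtag/js"


def render_analytics(measurement_id: Optional[str]) -> str:
    """Return the gtag snippet, or an empty string when no ID is configured."""
    # Mirrors the Liquid guard: the value must be present and not "".
    if not measurement_id:
        return ""
    safe_id = escape(measurement_id, quote=True)
    return "\n".join([
        "<!-- Google Analytics -->",
        f'<script async src="{GTAG_URL}?id={safe_id}"></script>',
        "<script>",
        "  window.dataLayer = window.dataLayer || [];",
        "  function gtag(){dataLayer.push(arguments);}",
        "  gtag('js', new Date());",
        f"  gtag('config', '{safe_id}');",
        "</script>",
    ])


if __name__ == "__main__":
    print(render_analytics("G-LHTGZTRTCX"))
```

Leaving `google_analytics` unset or blank in `_config.yml` therefore disables tracking without touching the layout.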
26 changes: 26 additions & 0 deletions _includes/sidebar.html
@@ -0,0 +1,26 @@
<aside class="sidebar" id="sidebar">
  <button class="sidebar-toggle" aria-label="Toggle sidebar" aria-expanded="true" id="sidebar-toggle">
    <i class="fas fa-chevron-left"></i>
  </button>
  <h2><i class="fas fa-book"></i> Contents</h2>
  <ul class="nav-list">
    {% for item in site.data.navigation %}
    <li class="nav-item {% if page.url == item.url or page.url contains item.url %}active{% endif %}">
      <a href="{{ item.url | relative_url }}" class="nav-link {% if page.url == item.url %}active{% endif %}">
        {% if item.icon %}<i class="{{ item.icon }}"></i>{% endif %} {{ item.title }}
      </a>
      {% if item.children %}
      <ul class="nav-children">
        {% for child in item.children %}
        <li>
          <a href="{{ child.url | relative_url }}" class="{% if page.url == child.url %}active{% endif %}">
            {{ child.title }}
          </a>
        </li>
        {% endfor %}
      </ul>
      {% endif %}
    </li>
    {% endfor %}
  </ul>
</aside>
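
The sidebar marks a top-level item as active when the page URL equals the item URL or contains it as a substring, while links are marked active only on an exact match. The sketch below restates those two conditions in Python so their behaviour can be tried on sample URLs. The function names are hypothetical, and `is_item_active_strict` is an illustrative alternative rather than something this change implements. One property worth noting: because every page URL contains `/`, the substring test treats the "I. Overview" item (URL `/`) as active on every page.

```python
def is_item_active(page_url: str, item_url: str) -> bool:
    """Mirror of the Liquid test on <li>: equality or substring containment."""
    return page_url == item_url or item_url in page_url


def is_link_active(page_url: str, link_url: str) -> bool:
    """Mirror of the Liquid test on <a>: exact match only."""
    return page_url == link_url


def is_item_active_strict(page_url: str, item_url: str) -> bool:
    """Hypothetical alternative: prefix match, with the root matched exactly."""
    if item_url == "/":
        return page_url == "/"
    return page_url.startswith(item_url)


if __name__ == "__main__":
    page = "/quiz/"
    print(is_item_active(page, "/"))         # substring test on the root item
    print(is_item_active_strict(page, "/"))  # stricter variant on the same input
```

A prefix match with a special case for the root is one way to keep section highlighting while avoiding the always-active root item; whether that is desirable depends on how the stylesheet uses the `active` class.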