Welcome!
What is OpenTelemetry?
OpenTelemetry is the industry-standard, vendor-neutral framework for generating, collecting, and exporting observability data (traces, metrics, and logs). Think of it as the unified language that lets you understand what's happening inside your distributed systems.
This is a public learning project designed to help others follow along and build production-ready observability skills step-by-step. Whether you're a complete beginner or looking to get certified, you'll find structured content, hands-on examples, and comprehensive exam preparation here.
| Week | Theme | Focus | What We'll Build |
|---|---|---|---|
| 1 | Observability Fundamentals | Understanding traces, metrics, logs; semantic conventions; auto vs manual instrumentation | Mental models + reading real traces |
| 2 | OpenTelemetry APIs & SDK | Hands-on with Tracing API, Metrics API, Logs API, context propagation, SDK pipelines | Instrumented Node.js apps |
| 3 | OpenTelemetry Collector | Receivers, processors, exporters, transformations (OTTL), deployment models, scaling | Full collector pipeline with OTTL transformations |
| 4 | Production Readiness & Certification | Troubleshooting, debugging, security, Kubernetes Operator, production best practices, OTCA exam prep | Complete observable system + certification readiness |
- Understand observability fundamentals (why it matters, what problems it solves)
- Become fluent in tracing, metrics, and logs (the three pillars of observability)
- Master the OpenTelemetry API & SDK (creating telemetry in code)
- Build and configure the OpenTelemetry Collector (processing and routing telemetry)
- Learn production patterns including:
  - Debugging distributed traces and missing telemetry
  - Troubleshooting collector pipelines
  - Production best practices (security, performance, cost optimization)
  - Kubernetes deployment with OpenTelemetry Operator
- Pass the OpenTelemetry Certified Associate (OTCA) exam with comprehensive preparation including:
  - Complete coverage of all 4 certification domains
  - Practice questions and mock exam
  - Quick reference guide and exam strategies
Week 1: Observability Fundamentals
| Day | Topic | What We'll Learn |
|---|---|---|
| 1 | Why Observability Matters | The shift from monoliths to microservices and why traditional monitoring isn't enough |
| 2 | What is OpenTelemetry? | Vendor-neutral telemetry, the API/SDK separation, and why it won |
| 3 | Traces, Metrics, Logs | How requests flow through distributed systems; three lenses on the same event |
| 4 | Spans (Building Blocks) | Anatomy of a span, parent-child relationships, reading waterfall charts |
| 5 | Semantic Conventions | Why http.request.method (formerly http.method), not request_method; the shared language of observability |
| 6 | Instrumentation | Auto-instrumentation vs manual; when to use each; how spans get created |
| 7 | Week 1 Review | Consolidation + missing pieces (sampling preview, context propagation details) |
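
To make Day 4's parent-child idea concrete before touching the SDK, here is a minimal sketch in plain JavaScript. It is an illustrative model only, not the OpenTelemetry API: the span objects, their field names (`spanId`, `parentSpanId`, `startMs`, `endMs`), and the `renderWaterfall` helper are all assumptions made up for this example.

```javascript
// Illustrative model only: these plain objects are NOT the OpenTelemetry API.
// Field names and span names are hypothetical, chosen for this sketch.
const spans = [
  { spanId: "a", parentSpanId: null, name: "GET /checkout", startMs: 0, endMs: 120 },
  { spanId: "b", parentSpanId: "a", name: "auth.verify", startMs: 5, endMs: 25 },
  { spanId: "c", parentSpanId: "a", name: "db.query", startMs: 30, endMs: 110 },
  { spanId: "d", parentSpanId: "c", name: "cache.lookup", startMs: 32, endMs: 40 },
];

// Group spans by their parent ID so children can be looked up quickly.
function groupByParent(allSpans) {
  const children = new Map();
  for (const span of allSpans) {
    const key = span.parentSpanId;
    if (!children.has(key)) children.set(key, []);
    children.get(key).push(span);
  }
  return children;
}

// Walk the tree depth-first and emit one indented line per span,
// which is roughly what a waterfall chart shows visually.
function renderWaterfall(allSpans) {
  const children = groupByParent(allSpans);
  const lines = [];
  const visit = (span, depth) => {
    const durationMs = span.endMs - span.startMs;
    lines.push(`${"  ".repeat(depth)}${span.name} (${durationMs}ms)`);
    const kids = (children.get(span.spanId) || [])
      .slice()
      .sort((x, y) => x.startMs - y.startMs);
    for (const kid of kids) visit(kid, depth + 1);
  };
  // Root spans are the ones with no parent.
  for (const root of children.get(null) || []) visit(root, 0);
  return lines;
}

console.log(renderWaterfall(spans).join("\n"));
```

The indentation depth mirrors how many ancestors a span has, which is the same information a tracing backend uses to draw nested bars.
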
Week 2: OpenTelemetry APIs & SDK
| Day | Topic | What We'll Learn |
|---|---|---|
| 8 | API vs SDK | The architecture that makes OpenTelemetry portable; separation of concerns |
| 9 | Tracing API | Creating nested spans, span events, adding attributes; hands-on examples |
| 10 | Metrics API | Counters, gauges, histograms; measuring what matters |
| 11 | Logs API | Structured logging + trace correlation; logs as part of the telemetry story |
| 12 | Context Propagation | How trace context flows through your app and across services |
| 13 | Break Day | Rest and recharge |
| 14 | Week 2 Review | Recap of APIs, SDK basics, and what we've accomplished |
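
Day 12's context propagation usually travels between services in the W3C Trace Context `traceparent` HTTP header. The sketch below parses that header in plain JavaScript so the format is visible. It handles version `00` only, and `parseTraceparent` and `childTraceparent` are hypothetical helper names for this example, not OpenTelemetry SDK functions; in a real app the SDK's propagator does this work.

```javascript
// Sketch of the W3C Trace Context `traceparent` header, version 00 only.
// Format: 00-<32 hex trace-id>-<16 hex parent-id>-<2 hex trace-flags>
// Helper names below are hypothetical, not part of any OpenTelemetry package.
const TRACEPARENT_V00 = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

function parseTraceparent(header) {
  const match = TRACEPARENT_V00.exec(String(header).trim());
  if (!match) return null;
  const [, traceId, parentId, flags] = match;
  // All-zero trace or parent IDs are invalid.
  if (/^0+$/.test(traceId) || /^0+$/.test(parentId)) return null;
  // The lowest bit of trace-flags is the "sampled" flag.
  return { traceId, parentId, sampled: (parseInt(flags, 16) & 1) === 1 };
}

// Build the header a service would send downstream: same trace ID,
// but the parent ID becomes the ID of the span making the outgoing call.
function childTraceparent(parent, newSpanId) {
  const flags = parent.sampled ? "01" : "00";
  return `00-${parent.traceId}-${newSpanId}-${flags}`;
}

const incoming = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01";
const ctx = parseTraceparent(incoming);
console.log(ctx);
console.log(childTraceparent(ctx, "b7ad6b7169203331"));
```

The trace ID stays constant across every hop while the parent ID changes at each one, which is what lets a backend stitch spans from different services into a single trace.
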
Week 3: OpenTelemetry Collector Deep Dive
| Day | Topic | What We'll Learn |
|---|---|---|
| 15 | Collector Architecture | Why the Collector exists; receivers → processors → exporters pipeline flow |
| 16 | Receivers | OTLP, Filelog, Prometheus; how telemetry enters the Collector |
| 17 | Processors | Batch, attributes, filter, transform; essential data processing patterns |
| 18 | Exporters | Multi-backend strategies; Jaeger, Prometheus, Loki routing |
| 19 | Transformations (OTTL) | OpenTelemetry Transformation Language; business-friendly data manipulation |
| 20 | Deployment & Scaling | Agent vs Gateway patterns; basic scaling concepts |
| 21 | Week 3 Recap | Insights and wisdom; mindset shifts from learning to architecting |
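
The receivers → processors → exporters flow from Day 15 can be modeled in a few lines of JavaScript. This is a conceptual sketch only: the real Collector is written in Go and configured with YAML, and every name here (`createPipeline`, `filterProcessor`, `attributesProcessor`, the `example.team` attribute) is hypothetical.

```javascript
// Conceptual model only: NOT how the Collector is implemented or configured.
// All function names and attribute keys below are made up for illustration.

// Processor: drop items that match a predicate (similar in spirit to a filter processor).
const filterProcessor = (shouldDrop) => (items) =>
  items.filter((item) => !shouldDrop(item));

// Processor: add or overwrite one attribute on every item.
const attributesProcessor = (key, value) => (items) =>
  items.map((item) => ({
    ...item,
    attributes: { ...item.attributes, [key]: value },
  }));

// Pipeline: receive items, run processors in order, fan out to every exporter.
function createPipeline({ processors, exporters }) {
  return function receive(items) {
    const processed = processors.reduce((acc, proc) => proc(acc), items);
    for (const exportFn of exporters) exportFn(processed);
    return processed;
  };
}

// Two in-memory "backends" standing in for real exporters.
const exportedToA = [];
const exportedToB = [];

const receive = createPipeline({
  processors: [
    filterProcessor((span) => span.name === "GET /healthz"),
    attributesProcessor("example.team", "checkout"),
  ],
  exporters: [
    (items) => exportedToA.push(...items),
    (items) => exportedToB.push(...items),
  ],
});

receive([
  { name: "GET /healthz", attributes: {} },
  { name: "GET /checkout", attributes: {} },
]);

console.log(exportedToA);
```

Processor order matters in this model just as it does in a Collector pipeline: filtering first means later processors do less work on data that would be dropped anyway.
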
Week 4: Production Patterns & Certification Prep
| Day | Topic | What We'll Learn |
|---|---|---|
| 22 | Debugging the Collector | Systematic debugging approaches, the debug exporter (successor to the deprecated logging exporter), health checks, metrics analysis |
| 23 | Where the Heck is My Data? | Troubleshooting missing telemetry data through the entire pipeline (application → SDK → collector → backend) |
| 24 | Debugging Distributed Traces | Finding lost spans, broken context propagation, systematic troubleshooting workflows |
| 25 | Production Issues at Scale | Backpressure management, dropped spans, error handling, resource management |
| 26 | Production Best Practices | Security (TLS, PII protection), performance optimization, monitoring strategies |
| 27 | OpenTelemetry Operator Overview | Kubernetes-native observability, auto-instrumentation injection, CRDs, deployment modes |
| 28 | Final Project | Build a complete observability stack with Dash0 integration |
| 29 | Week 4 Recap | Systematic review of the debugging trilogy (Days 22-24), production patterns, and key concepts mastered |
| 30 | What's Next | Advanced OpenTelemetry topics and continuing your observability journey |
OTCA Certification Resources:
- OTCA Exam Preparation Guide - Complete study guide covering all 4 domains
- OTCA Mock Exam - 20 practice questions with detailed answers
- OTCA Quick Reference - Essential concepts and commands for exam day
To follow along, you'll need:
- Basic JavaScript/Node.js knowledge (examples use Express.js, but you can use any language with OpenTelemetry support)
- Docker installed (for running Jaeger and other backends)
- Terminal/command line familiarity
- Curiosity about observability!
Note: While our examples use Node.js, OpenTelemetry supports many languages including Python, Go, Java, .NET, Ruby, PHP, and more. The concepts are the same across all languages.
Optional but helpful:
- Experience with distributed systems
- Basic understanding of HTTP and APIs
- Familiarity with monitoring concepts
This is a learning project, which means I will make mistakes. If you spot:
- ❌ Technical errors or outdated information
- 💡 Better ways to explain concepts
- 🐛 Broken code examples
- 📚 Resources I should check out
- ❓ Questions I should explore
Please:
- Open an Issue (for discussion)
- Submit a PR (for fixes)
- Leave a comment on social media posts
- DM me directly
All contributions welcome, from typo fixes to major corrections.
Official Docs:
Books:
- Practical OpenTelemetry: Adopting Open Observability Standards Across Your Organization by Daniel Gomez Blanco
- Learning OpenTelemetry by Ted Young and Austin Parker
Community:
- OpenTelemetry Slack (#opentelemetry channel)
- CNCF OpenTelemetry SIG meetings
I'll add more as I discover them.
Learning OpenTelemetry too? Feel free to reach out! I'd love to connect with fellow learners and exchange insights.
Inspired by:
- The OpenTelemetry community
- OpenTelemetry Official Docs
- 90DaysofDevOps by my friend Michael Cade
- Everyone sharing their OpenTelemetry journey online
⭐ Star this repo if you find it helpful!
Let's make observability accessible to everyone. Happy learning!
This project is open source under the MIT License.