186 changes: 186 additions & 0 deletions projects/collector-docs-refactoring.md
@@ -0,0 +1,186 @@
# Collector Documentation Refactoring

## Background and description

The OpenTelemetry Collector is a critical component of the OpenTelemetry
ecosystem, serving as a vendor-agnostic way to receive, process, and export
telemetry data. The Collector documentation on
[opentelemetry.io](https://opentelemetry.io/docs/collector/) is currently
undergoing a significant refactoring effort to improve organization, clarity,
and completeness.

This project coordinates the ongoing refactoring of Collector documentation,
which is being tracked through the `sig:collector:refactor` label and organized
into multiple phases.

**Note**: This project is already underway with active issues and milestones.
This document serves to formalize the project scope and track progress within
the community projects framework.

### Current challenges

The current Collector documentation has several issues being addressed:

- **Information architecture needs improvement**: Pages are not organized
optimally for different user journeys (installation, deployment,
configuration, extension).
- **Large monolithic pages**: Some pages (like installation) need to be split
into focused child pages.
- **Inconsistent examples**: Examples mix the core and contrib distributions
without a consistent convention.
- **Missing operational guidance**: Documentation on best practices,
troubleshooting, and disaster recovery is incomplete.
- **Copy editing needed**: Many pages need style guide compliance, grammar
fixes, and clarity improvements.
- **Cross-reference maintenance**: Page renames require updating
cross-references throughout the documentation.

**Current structure verified in repository (`content/en/docs/collector/`):**

```
collector/
├── _index.md # Landing page
├── architecture.md # Architecture overview
├── benchmarks.md # Performance benchmarks
├── building/ # ✅ Custom collector building
│ ├── _index.md
│ ├── authenticator-extension.md
│ ├── connector/
│ └── receiver.md
├── components/ # ✅ Component documentation
│ ├── connector.md
│ ├── exporter.md
│ ├── extension.md
│ ├── processor.md
│ └── receiver.md
├── configuration.md # Configuration reference
├── custom-collector.md # Custom collector guide
├── deployment/ # ✅ Deployment patterns
│ ├── _index.md
│ ├── agent.md
│ ├── gateway/
│ └── no-collector.md
├── distributions.md # Distribution information
├── installation.md # Installation (needs splitting)
├── internal-telemetry.md # Internal metrics
├── management.md # OpAMP management
├── quick-start.md # Quick start guide
├── resiliency.md # Resiliency features
├── scaling.md # Scaling guide
├── transforming-telemetry.md
└── troubleshooting.md # Troubleshooting guide
```

**Progress on refactoring:**

- ✅ `building/` section exists with custom component guides
- ✅ `deployment/` section has agent, gateway, no-collector patterns
- ✅ `components/` section documents all component types
- ⚠️ `installation.md` still monolithic (issue #8353)
- ⚠️ Troubleshooting exists but needs expansion

### Goals, objectives, and requirements

The goal of this project is to:

1. **Restructure information architecture**: Create a logical hierarchy
(Install, Deploy, Extend sections).
2. **Split large pages**: Break monolithic pages into focused, scannable
content.
3. **Standardize examples**: Ensure consistent use of Collector distributions in
examples.
4. **Add operational content**: Include best practices, troubleshooting, and
scaling documentation.
5. **Copy edit all pages**: Ensure style guide compliance and clear writing.
6. **Maintain cross-references**: Update all links when pages are renamed or
moved.

## Deliverables

### Phase 1 (otelcol-phase-1) - Current

Focus on moving existing content to conform to the new information architecture:

- Rename and split Collector installation page.
- Rename Collector deployment pages and create new child pages.
- Create new "Extend the Collector" section.
- Copy edit installation, deployment, and extension pages.
- Update cross-references throughout documentation.
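
Part of the cross-reference update can be mechanized once the renames are known. The following is a minimal sketch, assuming links are written as standard inline Markdown links and that a rename map is maintained by hand. The function name `rewrite_links` and the example paths are hypothetical and are not part of the repository's tooling.

```python
import re

# Matches the target of an inline Markdown link: [text](target)
LINK_TARGET = re.compile(r"\]\(([^)\s]+)\)")


def rewrite_links(markdown: str, renames: dict[str, str]) -> tuple[str, int]:
    """Rewrite link targets that point at renamed pages.

    `renames` maps an old site path to its new path (hypothetical data).
    Fragments such as `#section` are preserved.
    Returns the updated text and the number of links changed.
    """
    changed = 0

    def replace(match: re.Match) -> str:
        nonlocal changed
        target = match.group(1)
        path, sep, fragment = target.partition("#")
        new_path = renames.get(path)
        if new_path is None:
            return match.group(0)  # Not a renamed page; leave the link alone.
        changed += 1
        return f"]({new_path}{sep}{fragment})"

    return LINK_TARGET.sub(replace, markdown), changed


# Hypothetical rename; the real page names are decided in the Phase 1 issues.
renames = {"/docs/collector/installation/": "/docs/collector/install/"}
text = "See [install](/docs/collector/installation/#docker) and [scaling](/docs/collector/scaling/)."
updated, count = rewrite_links(text, renames)
print(updated)
print(count)
```

Returning the change count alongside the text lets a wrapper script report which files were touched, which may help reviewers confirm that a rename was applied everywhere.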

### Phase 2 (otelcol-phase-2)

Focus on writing and editing content (details TBD based on Phase 1 completion).

### Phase 3 (otelcol-phase-3)

Gap analysis of the new documentation set:

- Audit documentation for missing or incorrect content.
- Identify which examples to add first.
- Use AI, user interviews, and community feedback for analysis.
- Address disaster recovery recommendations.
- Follow up on deployment documentation improvements.

## Timeline

- **Phase 1**: Due December 31, 2025.
- **Phase 2**: TBD.
- **Phase 3**: Due March 20, 2026.

## Labels

- `sig:collector`
- `sig:collector:refactor`
- `docs:information-architecture`
- `help wanted`
- `good first issue` (for some tasks)

## Related issues

### Phase 1 issues (otelcol-phase-1 milestone)

- [#8353 - Rename and split up Collector installation page](https://github.com/open-telemetry/opentelemetry.io/issues/8353)
(open)
- [#8354 - Copy edit the Collector installation page](https://github.com/open-telemetry/opentelemetry.io/issues/8354)
(open)
- [#8355 - Rename Collector deployment pages and create new child pages](https://github.com/open-telemetry/opentelemetry.io/issues/8355)
(open)
- [#8356 - Copy edit the Collector deployment landing, no collector, and agent pages](https://github.com/open-telemetry/opentelemetry.io/issues/8356)
(closed)
- [#8358 - Copy edit the Collector deployment gateway page](https://github.com/open-telemetry/opentelemetry.io/issues/8358)
(closed)
- [#8360 - Create new "Extend the Collector" section and rename child pages](https://github.com/open-telemetry/opentelemetry.io/issues/8360)
(open)
- [#8361 - Copy edit the Building a custom Collector page](https://github.com/open-telemetry/opentelemetry.io/issues/8361)
(open)
- [#8362 - Copy edit the custom component pages](https://github.com/open-telemetry/opentelemetry.io/issues/8362)
(open)

### Phase 3 issues (otelcol-phase-3 milestone)

- [#5932 - Disaster recovery recommendations for Daemonset deployments](https://github.com/open-telemetry/opentelemetry.io/issues/5932)
(open)
- [#2692 - Follow up on collector deployment](https://github.com/open-telemetry/opentelemetry.io/issues/2692)
(open)

### Other related issues

- [#7268 - Add example for validating internal telemetry connection](https://github.com/open-telemetry/opentelemetry.io/issues/7268)
(open)
- [#3699 - Benchmarks page clarity](https://github.com/open-telemetry/opentelemetry.io/issues/3699)
(open)

## Project Board

The project is tracked via GitHub milestones:

- [otelcol-phase-1](https://github.com/open-telemetry/opentelemetry.io/milestone/20)
- [otelcol-phase-3](https://github.com/open-telemetry/opentelemetry.io/milestone/22)

## SIG Meetings and Other Info

This project is coordinated through SIG Communications and SIG Collector
meetings.

- **Slack channel**: `#otel-comms`
- **Meeting notes**: To be linked.
178 changes: 178 additions & 0 deletions projects/docs-automation-cicd.md
@@ -0,0 +1,178 @@
# Documentation Automation and CI/CD Enhancement

## Background and description

The OpenTelemetry documentation website relies on various automated processes
for building, validating, and maintaining content. However, several manual
processes remain, and existing automation has limitations that cause maintenance
burden and occasional build failures.

This project aims to enhance documentation automation, improve CI/CD pipelines,
and implement systems for tracking documentation freshness and detecting
outdated content.

### Current challenges

The current documentation automation has several gaps:

- **Manual Collector metrics maintenance**: The list of Collector internal
metrics on the website is manually maintained and can become out of sync with
the actual metrics generated by `mdatagen`. This leads to incomplete or
inaccurate documentation.
- **Version update automation issues**: The automated version update process for
Collector releases can break builds when patch releases don't include all
components (e.g., Collector Builder or Supervisor). The `cascade` field in
Hugo front matter causes issues with partial releases.
- **No automated outdated content detection**: There's no systematic way to
identify documentation that has become stale or needs updating. Content can
remain outdated for extended periods without maintainer awareness.
- **Missing mdatagen integration**: The `mdatagen` tool generates documentation
for Collector components, but this isn't integrated into the website
documentation pipeline.
- **Limited pre-commit checks**: Contributors may submit PRs that fail CI checks
because there are no local pre-commit hooks to catch common issues before
pushing.

**Existing automation verified in repository (`scripts/`):**

```
scripts/
├── auto-update/ # Version update scripts
│ ├── all-versions.sh # Updates all component versions
│ └── version-in-file.sh # Updates versions in specific files
├── git/
│ └── pre-commit.sh # Basic pre-commit (only runs `npm run fix:format:staged`)
├── _md-rules/ # Markdown validation rules
│ ├── trim-code-blank-lines.js
│ ├── unindent-code-blocks.js
│ └── validate-links.js
├── check-build-log.sh # Build log validation
├── check-i18n.sh # Localization drift detection
├── check-registry-urls # Registry URL validation
├── registry-scanner/ # Registry scanning tools
├── stats/ # Documentation statistics
└── content-modules/ # Content module processing
```

**Gap analysis:**

- ⚠️ Version update scripts exist but have issues with patch releases
- ⚠️ Pre-commit hook exists but is minimal (only formatting)
- ⚠️ Markdown rules exist but are not integrated with pre-commit
- ❌ No mdatagen integration
- ❌ No automated freshness tracking
- ❌ No husky setup for comprehensive pre-commit hooks

If these challenges are not addressed:

- Documentation will continue to drift from actual software behavior.
- Maintainers will spend excessive time on manual updates and build fixes.
- Users will encounter inaccurate or outdated information.
- Contributor experience will suffer from slow feedback loops.

### Goals, objectives, and requirements

The goal of this project is to:

1. **Automate Collector metrics documentation**: Integrate mdatagen output into
the website.
2. **Fix version update automation**: Handle patch releases correctly without
breaking builds.
3. **Detect outdated content**: Implement systems to identify stale
documentation.
4. **Improve contributor experience**: Add pre-commit hooks and local validation
tools.
5. **Track documentation freshness**: Create visibility into documentation age
and update needs.

**Motivations for starting now**:

- The Collector documentation refactoring project (`sig:collector:refactor`)
would benefit from automated tooling.
- Recent build failures from patch releases highlight the urgency of fixing
version automation.
- Community feedback indicates that outdated content is a significant problem.

## Deliverables

### mdatagen integration

- **Automated metrics documentation**: Pipeline to extract and publish Collector
component metrics from mdatagen.
- **Component documentation sync**: Automated sync of component READMEs to
website.
- **Stability level tracking**: Automated tracking and display of component
stability levels.
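
One way to picture the metrics pipeline is as a rendering step that turns parsed metric metadata into a Markdown table. The sketch below assumes the metadata has already been loaded into a dictionary keyed by metric name, with `unit` and `description` fields. That shape, the function name `render_metrics_table`, and the sample entries are assumptions for illustration, not the actual `mdatagen` output format.

```python
def render_metrics_table(metrics: dict[str, dict[str, str]]) -> str:
    """Render parsed metric metadata as a Markdown table, sorted by name."""
    lines = ["| Metric | Unit | Description |", "| --- | --- | --- |"]
    for name in sorted(metrics):
        meta = metrics[name]
        # Escape pipes so a description cannot break the table layout.
        description = meta.get("description", "").replace("|", "\\|")
        unit = meta.get("unit", "")
        lines.append(f"| `{name}` | {unit} | {description} |")
    return "\n".join(lines)


# Illustrative input; real entries would come from each component's metadata.
sample = {
    "exporter_sent_spans": {
        "unit": "{spans}",
        "description": "Spans successfully sent.",
    },
    "exporter_enqueue_failed_spans": {
        "unit": "{spans}",
        "description": "Spans that failed to enqueue.",
    },
}
print(render_metrics_table(sample))
```

Sorting by name keeps the generated page stable between runs, so a diff only appears when a metric is added, removed, or changed.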

### Version automation improvements

- **Patch release handling**: Fix the cascade version update mechanism to handle
partial releases.
- **Component-specific versioning**: Separate version tracking for Collector
core, contrib, Builder, and Supervisor.
- **Release notes integration**: Automated linking to release notes for each
version.
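
The idea behind component-specific versioning can be sketched as a merge in which components absent from a release keep their current version. This is a minimal sketch under that assumption. The function name `apply_release`, the component keys, and the version numbers are made up, and it does not describe how the existing update scripts work.

```python
def apply_release(current: dict[str, str], released: dict[str, str]) -> dict[str, str]:
    """Return per-component versions after a (possibly partial) release.

    Components missing from `released` keep their current version instead
    of inheriting a version that was never published for them.
    """
    unknown = set(released) - set(current)
    if unknown:
        raise KeyError(f"unknown components: {sorted(unknown)}")
    return {name: released.get(name, version) for name, version in current.items()}


# Version numbers are invented for illustration.
current = {
    "core": "0.100.0",
    "contrib": "0.100.0",
    "builder": "0.100.0",
    "supervisor": "0.100.0",
}
patch = {"core": "0.100.1", "contrib": "0.100.1"}  # Builder and Supervisor not included
print(apply_release(current, patch))
```

Rejecting unknown component names is a deliberate choice here: a typo in the release data fails loudly instead of silently adding a new key.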

### Outdated content detection

- **Content age tracking**: System to track when content was last updated.
- **Drift detection alerts**: Automated notifications when content hasn't been
updated within a threshold.
- **External link validation**: Regular checks for broken external links.
- **API/SDK version comparisons**: Alerts when documented versions fall behind
latest releases.
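
Content age tracking with a threshold could look roughly like the following. The sketch assumes each page's last-update date is already known, for example from Git history. The function name `find_stale_pages`, the 365-day default, and the sample dates are assumptions chosen for illustration.

```python
from datetime import date, timedelta


def find_stale_pages(
    last_updated: dict[str, date], today: date, max_age_days: int = 365
) -> list[tuple[str, int]]:
    """Return (page, age_in_days) for pages older than the threshold, oldest first."""
    threshold = timedelta(days=max_age_days)
    stale = [
        (page, (today - updated).days)
        for page, updated in last_updated.items()
        if today - updated > threshold
    ]
    return sorted(stale, key=lambda item: item[1], reverse=True)


# Dates are invented for the example; in practice they might come from
# `git log -1 --format=%cs -- <file>` for each page.
pages = {
    "collector/benchmarks.md": date(2023, 1, 10),
    "collector/scaling.md": date(2025, 6, 1),
    "collector/troubleshooting.md": date(2024, 2, 1),
}
for page, age in find_stale_pages(pages, today=date(2025, 9, 1)):
    print(f"{page}: {age} days since last update")
```

Passing `today` explicitly keeps the function deterministic, which makes it straightforward to test and to run against historical snapshots.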

### Contributor tooling

- **Pre-commit hooks**: Using [husky](https://github.com/typicode/husky) for:
  - Markdown linting.
  - Spell checking.
  - Format validation.
  - Link checking (local).
- **Local validation scripts**: Easy-to-run scripts for common validation tasks.
- **Documentation**: Guide for setting up and using local tooling.
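
As an illustration of the local link check, the sketch below reports relative link targets that do not exist on disk. It assumes inline Markdown links without titles and treats external URLs and site-absolute paths as out of scope. The function name `broken_relative_links` is hypothetical, and how a husky hook would invoke such a script is left open.

```python
import re
from pathlib import Path

# Target of an inline Markdown link, captured without any #fragment.
RELATIVE_LINK = re.compile(r"\]\(([^)#\s]+)(?:#[^)\s]*)?\)")


def broken_relative_links(md_file: Path) -> list[str]:
    """Return relative link targets in `md_file` that do not exist on disk.

    External URLs and site-absolute paths are skipped; a local hook would
    leave those to the full CI link checker.
    """
    broken = []
    for target in RELATIVE_LINK.findall(md_file.read_text(encoding="utf-8")):
        if "://" in target or target.startswith(("/", "mailto:")):
            continue
        # Resolve the target relative to the directory of the linking file.
        if not (md_file.parent / target).exists():
            broken.append(target)
    return broken
```

A hook could call this function for each staged Markdown file and exit with a non-zero status when any returned list is non-empty, so the contributor sees the problem before pushing.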

### Freshness tracking

- **Documentation freshness dashboard**: Visibility into documentation age
across sections.
- **Update priority scoring**: System to prioritize which content needs updates
most urgently.
- **Maintainer notifications**: Automated notifications for high-priority
updates.

## Timeline

- **Month 1**: Audit current automation, document requirements.
- **Month 2**: Implement pre-commit hooks and local tooling.
- **Month 3**: Fix version update automation for patch releases.
- **Month 4**: Implement mdatagen integration for Collector metrics.
- **Month 5**: Build content freshness tracking system.
- **Month 6**: Testing, documentation, and rollout.

## Labels

- `docs:CI/infra`
- `sig:comms`
- `sig:collector`

## Related issues

- [#4523 - Automate the Collector list of internal metrics using `mdatagen` docs](https://github.com/open-telemetry/opentelemetry.io/issues/4523)
(open) - mdatagen integration for metrics documentation.
- [#7546 - Collector patch release updates break build because not all components are included](https://github.com/open-telemetry/opentelemetry.io/issues/7546)
(open) - Version automation issues.
- [#8313 - Set up git pre-commit and other hooks to make for a smoother contributor experience](https://github.com/open-telemetry/opentelemetry.io/issues/8313)
(open) - Pre-commit hooks using husky.

## Project Board

To be created upon project approval.

## SIG Meetings and Other Info

This project will be coordinated through SIG Communications meetings with input
from SIG Collector.

- **Slack channel**: `#otel-comms`
- **Meeting notes**: To be linked upon project start.