Commit a661e2d

mjwolf and claude authored
Advanced developer docs (#2569)
This adds advanced developer documentation for ECS tooling, which documents all steps of the generation pipeline, adds information on the ECS-OTel mapping process, and adds pydoc to all functions. It also adds documentation on field reuse and on the subset and exclude filters.

---------

Co-authored-by: Claude <noreply@anthropic.com>
1 parent 74084a3 commit a661e2d

30 files changed, with 4775 additions and 559 deletions.

USAGE.md

Lines changed: 176 additions & 315 deletions
Large diffs are not rendered by default.

docs/reference/ecs-artifacts.md

Lines changed: 2 additions & 2 deletions
@@ -8,7 +8,7 @@ applies_to:

 # Generated artifacts [ecs-artifacts]

-ECS maintains a collection of artifacts which are generated based on the schema. Examples include Elasticsearch index templates, CSV, and Beats field mappings. The maintained artifacts can be found in the [ECS Github repo](https://github.com/elastic/ecs/blob/master/generated#artifacts-generated-from-ecs).
+ECS maintains a collection of artifacts which are generated based on the schema. Examples include Elasticsearch index templates, CSV, and Beats field mappings. The maintained artifacts can be found in the [ECS Github repo](https://github.com/elastic/ecs/blob/main/generated#artifacts-generated-from-ecs).

-Users can generate custom versions of these artifacts using the ECS project’s tooling. See the tooling [usage documentation](https://github.com/elastic/ecs/blob/master/USAGE.md) for more detail.
+Users can generate custom versions of these artifacts using the ECS project’s tooling. See the tooling [usage documentation](https://github.com/elastic/ecs/blob/main/USAGE.md) for more detail.

docs/reference/ecs-converting.md

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@ Before you start a conversion, be sure that you understand the basics below.

 Make sure you understand the distinction between Core and Extended fields, as explained in the [Guidelines and Best Practices](/reference/ecs-guidelines.md).

-Core and Extended fields are documented in the [*ECS Field Reference*](/reference/ecs-field-reference.md) or, for a single page representation of all fields, please see the [generated CSV of fields](https://github.com/elastic/ecs/blob/master/generated/csv/fields.csv).
+Core and Extended fields are documented in the [*ECS Field Reference*](/reference/ecs-field-reference.md) or, for a single page representation of all fields, please see the [generated CSV of fields](https://github.com/elastic/ecs/blob/main/generated/csv/fields.csv).


 ### An approach to mapping an existing implementation [ecs-conv]

docs/reference/ecs-field-reference.md

Lines changed: 1 addition & 1 deletion
@@ -16,7 +16,7 @@ ECS defines multiple groups of related fields. They are called "field sets". The

 All other field sets are defined as objects in Elasticsearch, under which all fields are defined.

-For a single page representation of all fields, please see the [generated CSV of fields](https://github.com/elastic/ecs/blob/master/generated/csv/fields.csv).
+For a single page representation of all fields, please see the [generated CSV of fields](https://github.com/elastic/ecs/blob/main/generated/csv/fields.csv).


 ## Field sets [ecs-fieldsets]

docs/reference/ecs-user-usage.md

Lines changed: 1 addition & 1 deletion
@@ -345,5 +345,5 @@ Like the other fields in the [related](/reference/ecs-related.md) field set, `re

 ## Mapping examples [ecs-user-usage-mappings]

-For examples of mapping events from various sources, you can look at [RFC 0007 in section Source Data](https://github.com/elastic/ecs/blob/master/rfcs/text/0007-multiple-users.md#source-data).
+For examples of mapping events from various sources, you can look at [RFC 0007 in section Source Data](https://github.com/elastic/ecs/blob/main/rfcs/text/0007-multiple-users.md#source-data).

docs/reference/index.md

Lines changed: 1 addition & 1 deletion
@@ -41,5 +41,5 @@ ECS is a permissive schema. If your events have additional data that cannot be m

 ECS improvements are released following [Semantic Versioning](https://semver.org/). Major ECS releases are planned to be aligned with major Elastic Stack releases.

-Any feedback on the general structure, missing fields, or existing fields is appreciated. For contributions please read the [Contribution Guidelines](https://github.com/elastic/ecs/blob/master/CONTRIBUTING.md).
+Any feedback on the general structure, missing fields, or existing fields is appreciated. For contributions please read the [Contribution Guidelines](https://github.com/elastic/ecs/blob/main/CONTRIBUTING.md).

scripts/docs/README.md

Lines changed: 159 additions & 0 deletions
@@ -0,0 +1,159 @@
# ECS Scripts Developer Documentation

This directory contains developer-focused documentation for the ECS generation scripts.

## Purpose

The ECS repository includes a comprehensive toolchain for generating various artifacts from schema definitions. These developer guides explain:

- **How each component works** internally
- **Architecture and design decisions**
- **How to make changes** and extend functionality
- **Troubleshooting** common issues

## Documentation Structure

### Module-Specific Guides

Each major generator module has its own detailed guide:

- **[otel-integration.md](otel-integration.md)** - OpenTelemetry Semantic Conventions integration
  - Validation of ECS ↔ OTel mappings
  - Loading OTel definitions from GitHub
  - Generating alignment summaries

- **[markdown-generator.md](markdown-generator.md)** - Markdown documentation generation
  - Rendering ECS schemas to human-readable docs
  - Jinja2 template system and customization
  - OTel alignment documentation
  - Adding new page types

- **[intermediate-files.md](intermediate-files.md)** - Intermediate file generation
  - Flat and nested format representations
  - Bridge between schema processing and artifact generation
  - Top-level vs. reusable fieldsets
  - Data structure reference

- **[es-template.md](es-template.md)** - Elasticsearch template generation
  - Composable vs. legacy template formats
  - Field type mapping conversion
  - Template customization and settings
  - Installation and troubleshooting

- **[csv-generator.md](csv-generator.md)** - CSV field reference generation
  - Spreadsheet-compatible field export
  - Column structure and multi-field handling
  - Analysis and integration examples
  - Usage in Excel, Google Sheets, databases

- **[beats-generator.md](beats-generator.md)** - Beats field definition generation
  - YAML field definitions for Elastic Beats
  - Default field selection and allowlist
  - Contextual naming and field groups
  - Integration with Beat modules
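
The "flat and nested format representations" listed for the intermediate files can be pictured with a small sketch. The dictionary shape, the field names, and the `flatten` helper below are assumptions made for illustration only; the real intermediate files carry many more attributes and may be organized differently, so treat [intermediate-files.md](intermediate-files.md) as the reference.

```python
from typing import Any, Dict

# Hypothetical nested input: field sets, each holding its own fields.
# The shape is an assumption, not the actual intermediate file format.
NESTED: Dict[str, Any] = {
    "client": {
        "fields": {
            "client.ip": {"type": "ip"},
            "client.port": {"type": "long"},
        }
    },
    "host": {"fields": {"host.name": {"type": "keyword"}}},
}


def flatten(nested: Dict[str, Any]) -> Dict[str, Any]:
    """Merge the per-fieldset field dictionaries into one dict keyed by full field name."""
    flat: Dict[str, Any] = {}
    for fieldset in nested.values():
        flat.update(fieldset.get("fields", {}))
    return flat


flat = flatten(NESTED)
print(sorted(flat))
```

A nested view is convenient when output is grouped by field set, while a flat view suits per-field lookups; that is the general idea the sketch tries to convey, not a statement about how the generators use either file.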

### Quick Reference

For high-level usage information, see:
- **[../../USAGE.md](../../USAGE.md)** - User guide for running the generators
- **[../../CONTRIBUTING.md](../../CONTRIBUTING.md)** - Contribution guidelines

## Scripts Overview

The `scripts/` directory contains several key components:

### Core Modules

| Module | Purpose | Documentation |
|--------|---------|---------------|
| `generator.py` | **Main entry point** - orchestrates complete pipeline | Comprehensive docstrings in file |
| `generators/otel.py` | OTel integration and validation | [otel-integration.md](otel-integration.md) |
| `generators/markdown_fields.py` | Markdown documentation generation | [markdown-generator.md](markdown-generator.md) |
| `generators/intermediate_files.py` | Intermediate format generation | [intermediate-files.md](intermediate-files.md) |
| `generators/es_template.py` | Elasticsearch template generation | [es-template.md](es-template.md) |
| `generators/csv_generator.py` | CSV field reference export | [csv-generator.md](csv-generator.md) |
| `generators/beats.py` | Beats field definition generation | [beats-generator.md](beats-generator.md) |
| `generators/ecs_helpers.py` | Shared utility functions | See docstrings in file |

### Schema Processing

The schema processing pipeline transforms YAML schema definitions through multiple stages. See [schema-pipeline.md](schema-pipeline.md) for complete pipeline documentation.

| Module | Purpose | Documentation |
|--------|---------|---------------|
| **Pipeline Overview** | Complete schema processing flow | **[schema-pipeline.md](schema-pipeline.md)** |
| `schema/loader.py` | Load and parse YAML schemas → nested structure | [schema-pipeline.md#1-loaderpy---schema-loading](schema-pipeline.md#1-loaderpy---schema-loading) |
| `schema/cleaner.py` | Validate, normalize, apply defaults | [schema-pipeline.md#2-cleanerpy---validation--normalization](schema-pipeline.md#2-cleanerpy---validation--normalization) |
| `schema/finalizer.py` | Perform field reuse, calculate names | [schema-pipeline.md#3-finalizerpy---field-reuse--name-calculation](schema-pipeline.md#3-finalizerpy---field-reuse--name-calculation) |
| `schema/visitor.py` | Traverse field hierarchies (visitor pattern) | [schema-pipeline.md#visitorpy---field-traversal](schema-pipeline.md#visitorpy---field-traversal) |
| `schema/subset_filter.py` | Filter to include only specified fields | [schema-pipeline.md#4-subset_filterpy---subset-filtering-optional](schema-pipeline.md#4-subset_filterpy---subset-filtering-optional) |
| `schema/exclude_filter.py` | Explicitly remove specified fields | [schema-pipeline.md#5-exclude_filterpy---exclude-filtering-optional](schema-pipeline.md#5-exclude_filterpy---exclude-filtering-optional) |
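
The stage order in the table (load, clean, finalize, then the optional subset and exclude filters) can be sketched as a chain of small functions. Everything in the sketch is hypothetical: the function names, the list-of-dicts input, and the `"extended"` default are stand-ins chosen for illustration and do not mirror the actual signatures or behavior of the `schema/` modules.

```python
from typing import Dict, Iterable, List, Optional, Set

Fields = Dict[str, dict]


def load(raw: Iterable[dict]) -> List[dict]:
    """Stand-in for the loading stage: copy raw definitions into a working list."""
    return [dict(definition) for definition in raw]


def clean(definitions: List[dict]) -> List[dict]:
    """Stand-in for the cleaning stage: validate and apply a default (assumed) level."""
    for definition in definitions:
        if "type" not in definition:
            raise ValueError(f"field {definition.get('name')!r} has no type")
        definition.setdefault("level", "extended")
    return definitions


def finalize(definitions: List[dict]) -> Fields:
    """Stand-in for the finalizing stage: calculate full names and index by them."""
    return {f"{d['fieldset']}.{d['name']}": d for d in definitions}


def run_pipeline(
    raw: Iterable[dict],
    subset: Optional[Set[str]] = None,
    exclude: Optional[Set[str]] = None,
) -> Fields:
    """Run the stages in table order; both filters are optional."""
    fields = finalize(clean(load(raw)))
    if subset is not None:  # keep only the listed fields
        fields = {name: d for name, d in fields.items() if name in subset}
    if exclude is not None:  # then drop explicitly excluded fields
        fields = {name: d for name, d in fields.items() if name not in exclude}
    return fields


RAW = [
    {"fieldset": "host", "name": "name", "type": "keyword"},
    {"fieldset": "host", "name": "ip", "type": "ip"},
    {"fieldset": "client", "name": "port", "type": "long"},
]
result = run_pipeline(RAW, subset={"host.name", "host.ip"}, exclude={"host.ip"})
print(sorted(result))
```

Applying the exclude filter after the subset filter follows the numbering in the table; field reuse, which the finalizer is described as performing, is left out of the sketch entirely.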

### Types

| Module | Purpose |
|--------|---------|
| `ecs_types/schema_fields.py` | Core ECS type definitions |
| `ecs_types/otel_types.py` | OTel-specific types |

## Getting Started

If you're new to the ECS generator codebase:

1. **Start with the main orchestrator**: Read `generator.py` docstrings to understand the pipeline
2. **Understand schema processing**: Read [schema-pipeline.md](schema-pipeline.md)
3. **Pick a generator**: Choose a specific generator that interests you
4. **Read its documentation**: Start with the module-specific guide
5. **Explore the code**: Read the source with the guide as reference
6. **Run it**: Try generating artifacts to see it in action

### Quick Command Reference

```bash
# Standard generation (from local schemas)
python scripts/generator.py --semconv-version v1.24.0

# From specific git version
python scripts/generator.py --ref v8.10.0 --semconv-version v1.24.0

# With custom schemas
python scripts/generator.py --include custom/schemas/ --semconv-version v1.24.0

# Generate subset only
python scripts/generator.py --subset schemas/subsets/minimal.yml --semconv-version v1.24.0

# Strict validation mode
python scripts/generator.py --strict --semconv-version v1.24.0

# Intermediate files only (fast iteration)
python scripts/generator.py --intermediate-only --semconv-version v1.24.0
```

See `generator.py` docstrings for complete argument documentation.
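
For scripted runs, the commands above could be assembled programmatically. The helper below is a hypothetical convenience wrapper, not part of the repository. It uses only the flags that appear in the commands above, and it assumes those flags can be combined in one invocation, which the commands themselves do not confirm.

```python
import sys
from typing import List, Optional


def build_generator_command(
    semconv_version: str,
    ref: Optional[str] = None,
    include: Optional[str] = None,
    subset: Optional[str] = None,
    strict: bool = False,
    intermediate_only: bool = False,
) -> List[str]:
    """Assemble an argument list for scripts/generator.py (hypothetical helper)."""
    cmd = [sys.executable, "scripts/generator.py"]
    if ref:
        cmd += ["--ref", ref]
    if include:
        cmd += ["--include", include]
    if subset:
        cmd += ["--subset", subset]
    if strict:
        cmd.append("--strict")
    if intermediate_only:
        cmd.append("--intermediate-only")
    cmd += ["--semconv-version", semconv_version]
    return cmd


command = build_generator_command(
    "v1.24.0", subset="schemas/subsets/minimal.yml", strict=True
)
print(" ".join(command[1:]))
```

The resulting list can be passed to `subprocess.run(command, check=True)` from the repository root. Building a list instead of a shell string avoids quoting problems with paths.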

## Contributing Documentation

When adding or modifying generator code:

1. **Update docstrings**: Add comprehensive Python docstrings to all functions and classes
2. **Update/create guide**: Ensure a markdown guide exists explaining the component
3. **Update this README**: Add links to new documentation
4. **Include examples**: Show practical usage examples
5. **Document edge cases**: Explain tricky parts and gotchas

### Documentation Standards

- **Python docstrings**: Use Google-style docstrings with Args, Returns, Raises, Examples
- **Markdown guides**: Include Overview, Architecture, Usage Examples, Troubleshooting
- **Code examples**: Should be runnable (or clearly marked as pseudocode)
- **Diagrams**: Use ASCII/Unicode diagrams for flow visualization
- **Tables**: Use markdown tables for structured comparisons
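
As an illustration of the docstring standard, here is a minimal sketch of a Google-style docstring with the four named sections. The function `count_by_type` is hypothetical and exists only to carry the docstring; it is not taken from the ECS scripts.

```python
def count_by_type(fields: dict) -> dict:
    """Count fields per type.

    Args:
        fields: Mapping of flat field name to field definition. Each
            definition is expected to contain a ``type`` key.

    Returns:
        A dict mapping each type to the number of fields that use it.

    Raises:
        KeyError: If a field definition has no ``type`` key.

    Examples:
        >>> count_by_type({"host.name": {"type": "keyword"}})
        {'keyword': 1}
    """
    counts: dict = {}
    for definition in fields.values():
        field_type = definition["type"]
        counts[field_type] = counts.get(field_type, 0) + 1
    return counts
```

Writing the example in doctest form keeps it runnable, in line with the "Code examples" standard above.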

## Questions?

For questions about:
- **Using the tools**: See [USAGE.md](../../USAGE.md) or ask in the [Elastic community forums](https://discuss.elastic.co/)
- **Contributing**: See [CONTRIBUTING.md](../../CONTRIBUTING.md)
- **Architecture**: Read the relevant module guide in this directory
- **Bugs**: [Open an issue](https://github.com/elastic/ecs/issues)
