Commit 224b089

theletterf and Mpdreamz authored
Add docs-builder format command (#2084)
* Add format command
* Add docs
* Have the command parse docset
* Various refactors
* Remove test spaces
* Update src/Elastic.Markdown/Myst/Linters/SpaceNormalizer.cs

  Co-authored-by: Martijn Laarman <[email protected]>
* Add --check and --write commands
* Further refactors

---------

Co-authored-by: Martijn Laarman <[email protected]>
1 parent 20eedb5 commit 224b089

13 files changed: +527 -133 lines changed

docs/_docset.yml

Lines changed: 1 addition & 0 deletions
@@ -123,6 +123,7 @@ toc:
 - file: index.md
 - file: build.md
 - file: diff-validate.md
+- file: format.md
 - file: index-command.md
 - file: mv.md
 - file: serve.md

docs/cli/docset/format.md

Lines changed: 114 additions & 0 deletions
@@ -0,0 +1,114 @@
+# format
+
+Format documentation files by fixing common issues such as irregular space characters.
+
+## Usage
+
+```
+docs-builder format --check [options...]
+docs-builder format --write [options...]
+```
+
+## Options
+
+`--check`
+: Check if files need formatting without modifying them. Exits with code 1 if formatting is needed, 0 if all files are properly formatted. (required, mutually exclusive with --write)
+
+`--write`
+: Write formatting changes to files. (required, mutually exclusive with --check)
+
+`-p|--path` `<string>`
+: Path to the documentation folder, defaults to pwd. (optional)
+
+## Description
+
+The `format` command automatically detects and fixes formatting issues in your documentation files. The command only processes Markdown files (`.md`) that are included in your `_docset.yml` table of contents, ensuring that only intentional documentation files are modified.
+
+You must specify exactly one of `--check` or `--write`:
+- `--check` validates formatting without modifying files, useful for CI/CD pipelines
+- `--write` applies formatting changes to files
+
+Currently, the command handles only irregular space characters that may impair Markdown rendering.
+
+### Irregular Space Detection
+
+The format command detects and replaces 24 types of irregular space characters with regular spaces, including:
+
+- No-Break Space (U+00A0)
+- En Space (U+2002)
+- Em Space (U+2003)
+- Zero Width Space (U+200B)
+- Line Separator (U+2028)
+- Paragraph Separator (U+2029)
+- And 18 other irregular space variants
+
+These characters can cause unexpected rendering issues in Markdown and are often introduced accidentally through copy-paste operations from other applications.
+
+## Examples
+
+### Check if formatting is needed (CI/CD)
+
+```bash
+docs-builder format --check
+```
+
+Exit codes:
+- `0`: All files are properly formatted
+- `1`: Some files need formatting
+
+### Apply formatting changes
+
+```bash
+docs-builder format --write
+```
+
+### Check specific documentation folder
+
+```bash
+docs-builder format --check --path /path/to/docs
+```
+
+### Format specific documentation folder
+
+```bash
+docs-builder format --write --path /path/to/docs
+```
+
+## Output
+
+### Check mode output
+
+When using `--check`, the command reports which files need formatting:
+
+```
+Checking documentation in: /path/to/docs
+
+Formatting needed:
+Files needing formatting: 2
+irregular space fixes needed: 3
+
+Run 'docs-builder format --write' to apply changes
+```
+
+### Write mode output
+
+When using `--write`, the command reports the changes made:
+
+```
+Formatting documentation in: /path/to/docs
+Formatted index.md (2 change(s))
+
+Formatting complete:
+Files processed: 155
+Files modified: 1
+irregular space fixes: 2
+```
+
+## Future Enhancements
+
+The format command is designed to be extended with additional formatting capabilities in the future, such as:
+
+- Line ending normalization
+- Trailing whitespace removal
+- Consistent heading spacing
+- And other formatting fixes
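
To make the replacement described in `format.md` concrete, here is a minimal C# sketch of the normalization step. It is not the command's actual implementation, which this page does not show. `Normalize` and its tuple return shape are hypothetical names chosen for illustration. The character set mirrors the 24 code points listed in `SpaceNormalizerParser.IrregularSpaceChars`, and the sketch assumes .NET 8 or later for `SearchValues<char>`.

```csharp
using System;
using System.Buffers;

// The 24 irregular space characters, mirroring SpaceNormalizerParser.IrregularSpaceChars.
char[] irregularSpaceChars =
{
    '\u000B', '\u000C', '\u00A0', '\u0085', '\u1680', '\u180E', '\uFEFF', '\u2000',
    '\u2001', '\u2002', '\u2003', '\u2004', '\u2005', '\u2006', '\u2007', '\u2008',
    '\u2009', '\u200A', '\u200B', '\u2028', '\u2029', '\u202F', '\u205F', '\u3000'
};
SearchValues<char> irregularSpaces = SearchValues.Create(irregularSpaceChars);

// Hypothetical helper: returns the normalized text and the number of replacements made.
(string Text, int Fixes) Normalize(string input)
{
    // Fast path: content without irregular spaces is returned unchanged, with no allocation.
    if (input.AsSpan().IndexOfAny(irregularSpaces) < 0)
        return (input, 0);

    var buffer = input.ToCharArray();
    var count = 0;
    for (var i = 0; i < buffer.Length; i++)
    {
        if (!irregularSpaces.Contains(buffer[i]))
            continue;
        buffer[i] = ' '; // replace with a regular space (U+0020)
        count++;
    }
    return (new string(buffer), count);
}

var (text, changes) = Normalize("Elastic\u00A0Docs\u2003V3");
Console.WriteLine($"{text} ({changes} change(s))"); // prints "Elastic Docs V3 (2 change(s))"
```

A check mode in the spirit of `--check` could call the same helper and report a non-zero exit code whenever any file yields a count greater than zero, without writing anything back.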

docs/index.md

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ navigation_title: Elastic Docs v3
 
 # Welcome to Elastic Docs v3
 
-Elastic Docs V3 is our next-generation documentation platform designed to improve the experience of learning, using, and contributing to Elastic products. Built on a foundation of modern authoring tools and scalable infrastructure, V3 offers faster builds, streamlined versioning, and enhanced navigation to guide users through Elastic’s complex ecosystem.
+Elastic Docs V3 is our next-generation documentation platform designed to improve the experience of learning, using, and contributing to Elastic products. Built on a foundation of modern authoring tools and scalable infrastructure, V3 offers faster builds, streamlined versioning, and enhanced navigation to guide users through Elastic’s complex ecosystem.
 
 ## What do you want to do today?
 

src/Elastic.Markdown/Myst/Linters/SpaceNormalizer.cs

Lines changed: 108 additions & 0 deletions
@@ -0,0 +1,108 @@
+// Licensed to Elasticsearch B.V under one or more agreements.
+// Elasticsearch B.V licenses this file to you under the Apache 2.0 License.
+// See the LICENSE file in the project root for more information
+
+using System.Buffers;
+using Elastic.Markdown.Diagnostics;
+using Markdig;
+using Markdig.Helpers;
+using Markdig.Parsers;
+using Markdig.Parsers.Inlines;
+using Markdig.Renderers;
+using Markdig.Renderers.Html;
+using Markdig.Renderers.Html.Inlines;
+using Markdig.Syntax.Inlines;
+
+namespace Elastic.Markdown.Myst.Linters;
+
+public static class SpaceNormalizerBuilderExtensions
+{
+    public static MarkdownPipelineBuilder UseSpaceNormalizer(this MarkdownPipelineBuilder pipeline)
+    {
+        pipeline.Extensions.AddIfNotAlready<SpaceNormalizerBuilderExtension>();
+        return pipeline;
+    }
+}
+
+public class SpaceNormalizerBuilderExtension : IMarkdownExtension
+{
+    public void Setup(MarkdownPipelineBuilder pipeline) =>
+        pipeline.InlineParsers.InsertBefore<EmphasisInlineParser>(new SpaceNormalizerParser());
+
+    public void Setup(MarkdownPipeline pipeline, IMarkdownRenderer renderer) =>
+        renderer.ObjectRenderers.InsertAfter<EmphasisInlineRenderer>(new SpaceNormalizerRenderer());
+}
+
+public class SpaceNormalizerParser : InlineParser
+{
+    // Collection of irregular space characters that may impair Markdown rendering
+    private static readonly char[] IrregularSpaceChars =
+    [
+        '\u000B', // Line Tabulation (\v) - <VT>
+        '\u000C', // Form Feed (\f) - <FF>
+        '\u00A0', // No-Break Space - <NBSP>
+        '\u0085', // Next Line
+        '\u1680', // Ogham Space Mark
+        '\u180E', // Mongolian Vowel Separator - <MVS>
+        '\ufeff', // Zero Width No-Break Space - <BOM>
+        '\u2000', // En Quad
+        '\u2001', // Em Quad
+        '\u2002', // En Space - <ENSP>
+        '\u2003', // Em Space - <EMSP>
+        '\u2004', // Three-Per-Em
+        '\u2005', // Four-Per-Em
+        '\u2006', // Six-Per-Em
+        '\u2007', // Figure Space
+        '\u2008', // Punctuation Space - <PUNCSP>
+        '\u2009', // Thin Space
+        '\u200A', // Hair Space
+        '\u200B', // Zero Width Space - <ZWSP>
+        '\u2028', // Line Separator
+        '\u2029', // Paragraph Separator
+        '\u202F', // Narrow No-Break Space
+        '\u205F', // Medium Mathematical Space
+        '\u3000' // Ideographic Space
+    ];
+    private static readonly SearchValues<char> SpaceSearchValues = SearchValues.Create(IrregularSpaceChars);
+
+    // Track which files have already had the hint emitted to avoid duplicates
+    private static readonly HashSet<string> FilesWithHintEmitted = [];
+
+    public SpaceNormalizerParser() => OpeningCharacters = IrregularSpaceChars;
+
+    public override bool Match(InlineProcessor processor, ref StringSlice slice)
+    {
+        var span = slice.AsSpan().Slice(0, 1);
+        if (span.IndexOfAny(SpaceSearchValues) == -1)
+            return false;
+
+        processor.Inline = IrregularSpace.Instance;
+
+        // Emit a single hint per file on first detection
+        var context = processor.GetContext();
+        var filePath = context.MarkdownSourcePath.FullName;
+
+        lock (FilesWithHintEmitted)
+        {
+            if (!FilesWithHintEmitted.Contains(filePath))
+            {
+                _ = FilesWithHintEmitted.Add(filePath);
+                processor.EmitHint(processor.Inline, 1, "Irregular space detected. Run 'docs-builder format --write' to automatically fix all instances.");
+            }
+        }
+
+        slice.SkipChar();
+        return true;
+    }
+}
+
+public class IrregularSpace : LeafInline
+{
+    public static readonly IrregularSpace Instance = new();
+};
+
+public class SpaceNormalizerRenderer : HtmlObjectRenderer<IrregularSpace>
+{
+    protected override void Write(HtmlRenderer renderer, IrregularSpace obj) =>
+        renderer.Write(' ');
+}
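
The once-per-file hint logic in `Match` can be looked at in isolation with the following sketch. `ShouldEmitHint` and the file paths are hypothetical and are not part of the commit. The sketch relies only on the documented behavior that `HashSet<T>.Add` returns `false` when the element is already present, so the membership check and the insert can happen in a single call under the lock.

```csharp
using System;
using System.Collections.Generic;

// Paths that have already produced a hint; shared state guarded by a lock.
var filesWithHintEmitted = new HashSet<string>();

// Hypothetical helper: returns true only the first time a given path is seen.
bool ShouldEmitHint(string filePath)
{
    lock (filesWithHintEmitted)
    {
        // Add returns false if the path was already in the set.
        return filesWithHintEmitted.Add(filePath);
    }
}

Console.WriteLine(ShouldEmitHint("/docs/index.md"));  // True
Console.WriteLine(ShouldEmitHint("/docs/index.md"));  // False
Console.WriteLine(ShouldEmitHint("/docs/format.md")); // True
```

This is behaviorally equivalent to the `Contains` check followed by `Add` in the committed code. Note that the set in the commit is static and is not cleared anywhere in the code shown, so the hint appears to be emitted at most once per file path for the lifetime of the process.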

src/Elastic.Markdown/Myst/Linters/WhiteSpaceNormalizer.cs

Lines changed: 0 additions & 127 deletions
This file was deleted.

src/Elastic.Markdown/Myst/MarkdownParser.cs

Lines changed: 1 addition & 1 deletion
@@ -169,7 +169,7 @@ public static MarkdownPipeline Pipeline
     .UseEnhancedCodeBlocks()
     .UseHtmxLinkInlineRenderer()
     .DisableHtml()
-    .UseWhiteSpaceNormalizer()
+    .UseSpaceNormalizer()
     .UseHardBreaks();
 _ = builder.BlockParsers.TryRemove<IndentedCodeBlockParser>();
 PipelineCached = builder.Build();
