This file provides guidance for AI coding agents working on the sphinx-codelinks repository.
sphinx-codelinks is a Sphinx extension that provides fast source code traceability for Sphinx-Needs. It enables:
- Code analysis: Scan source code files (C++, Python, C#, Rust, YAML) for special comment markers
- Automatic documentation generation: Create Sphinx-Needs items from discovered code markers
- Source tracing: Link documentation to exact source code locations with line numbers
- Multiple languages: Support for various programming languages via tree-sitter parsers
- CLI interface: Command-line tools for analyzing code and generating RST documentation
The project integrates with Sphinx-Needs to provide seamless source code traceability in technical documentation.
pyproject.toml # Project configuration and dependencies
tox.ini # Tox test environment configuration
README.md # Project README
LICENSE # MIT License
src/sphinx_codelinks/ # Main source code
├── __init__.py # Package init with Sphinx setup() entry point
├── cmd.py # CLI commands using Typer
├── config.py # Configuration models using Pydantic
├── logger.py # Logging utilities
├── needextend_write.py # Write RST files with Sphinx-Needs directives
├── analyse/ # Code analysis module
│ ├── analyse.py # Main analysis orchestration
│ ├── models.py # Pydantic models for analysis results
│ ├── oneline_parser.py # One-line comment parser
│ ├── projects.py # Project-specific analyzers (C++, Python, etc.)
│ └── utils.py # Analysis utilities
├── source_discover/ # Source file discovery
│ ├── config.py # Discovery configuration
│ └── source_discover.py # File discovery logic
└── sphinx_extension/ # Sphinx extension components
  ├── source_tracing.py # Main Sphinx extension setup
  ├── html_wrapper.py # HTML output wrapper for traced source
  ├── debug.py # Debug utilities
  ├── ub_sct.css # CSS for source tracing UI
  └── directives/ # Custom Sphinx directives
tests/ # Test suite
├── __init__.py
├── conftest.py # Pytest fixtures and configuration
├── test_analyse.py # Analysis tests
├── test_*.py # Various test modules
├── __snapshots__/ # Syrupy snapshot test fixtures
└── data/ # Test data and fixtures
docs/ # Documentation source (RST)
├── conf.py # Sphinx configuration
├── source/
│ ├── index.rst # Documentation index
│ ├── basics/ # Basic usage documentation
│ ├── components/ # Component documentation
│ └── development/ # Development documentation
All commands should be run via tox for consistency. The project uses tox-uv for faster environment creation.
# Run default test environment
tox
# Run tests for specific Python/Sphinx combination
tox -e py312-sphinx8
# Run a specific test file
tox -e py312-sphinx8 -- tests/test_analyse.py
# Run a specific test function
tox -e py312-sphinx8 -- tests/test_analyse.py::test_function_name
# Run with coverage
tox -e py312-sphinx8 -- --cov=sphinx_codelinks
# Update snapshot test fixtures
tox -e py312-sphinx8 -- --snapshot-update

# Build docs (clean)
tox -e docs-clean
# Build docs (incremental, after clean build)
tox -e docs-update
# Build with different builder (e.g., linkcheck)
BUILDER=linkcheck tox -e docs-clean
# Live rebuild with browser auto-reload
tox -e docs-live

# Type checking with mypy
tox -e mypy
# Linting with ruff (check only)
tox -e ruff-check
# Auto-format with ruff
tox -e ruff-fmt
# Run pre-commit hooks on all files
pre-commit run --all-files

- Formatter/Linter: Ruff (configured in `pyproject.toml`)
- Type Checking: Mypy with strict settings (configured in `pyproject.toml`)
- Markdown: Follow markdownlint rules for consistent and well-formatted Markdown files
- Pre-commit: Use pre-commit hooks for consistent code style
- Type annotations: Use complete type annotations for all function signatures. Use Pydantic models for configuration and data structures.
- Docstrings: Use Sphinx-style docstrings (`:param:`, `:return:`, `:raises:`). Types are not required in docstrings because they belong in type hints.
- Markdown formatting: Write clear, well-structured Markdown that adheres to markdownlint rules. Use proper headings, lists, and code blocks.
- Immutability: Prefer immutable data structures where possible. Use frozen Pydantic models for configuration.
- Pure functions: Where possible, write pure functions without side effects.
- Error handling: Raise descriptive exceptions with helpful error messages. Use custom exception types where appropriate.
- Testing: Write tests for all new functionality. Use syrupy for snapshot testing of complex outputs.
from pathlib import Path


def discover_source_files(
    root_dir: Path,
    include_patterns: list[str],
    exclude_patterns: list[str],
    *,
    respect_gitignore: bool = True,
) -> list[Path]:
    """Discover source files matching the given patterns.

    :param root_dir: The root directory to search from.
    :param include_patterns: Glob patterns for files to include.
    :param exclude_patterns: Glob patterns for files to exclude.
    :param respect_gitignore: Whether to respect .gitignore rules.
    :return: List of discovered file paths.
    :raises ValueError: If root_dir does not exist.
    """
    ...

- Tests use `pytest` with fixtures from `conftest.py`
- Snapshot testing uses `syrupy` for complex output comparisons
- Test data is in the `tests/data/` directory
- Sphinx integration tests use actual Sphinx projects in `tests/doc_test/`
- For code analysis tests, create test data in `tests/data/` with source files
- Use `syrupy` for comparing complex analysis outputs (JSON, doctrees, etc.)
- For Sphinx integration, create minimal projects in `tests/doc_test/`
- Use parametrized tests for testing multiple language parsers
- Test coverage: Write tests for all new functionality and bug fixes
- Isolation: Each test should be independent and not rely on state from other tests
- Descriptive names: Test function names should describe what is being tested
- Snapshot testing: Use `snapshot.assert_match()` for complex output comparisons
- Parametrization: Use `@pytest.mark.parametrize` for multiple test scenarios
- Fixtures: Define reusable fixtures in `conftest.py`
import pytest
from pathlib import Path


def test_analyse_cpp_file(snapshot, tmp_path):
    """Test C++ file analysis produces correct output."""
    # Arrange
    source_file = tmp_path / "test.cpp"
    source_file.write_text("""
    // @req{REQ-001}
    void function() {}
    """)

    # Act
    # analyse_file stands in for the function under test; import it from the module being tested
    result = analyse_file(source_file)

    # Assert
    assert snapshot == result

Use this format:
<EMOJI> <KEYWORD>: Summarize in 72 chars or less (#<PR>)
Optional detailed explanation.
Keywords:
- `✨ NEW:` – New feature
- `🐛 FIX:` – Bug fix
- `👌 IMPROVE:` – Improvement (no breaking changes)
- `‼️ BREAKING:` – Breaking change
- `📚 DOCS:` – Documentation
- `🔧 MAINTAIN:` – Maintenance changes only (typos, etc.)
- `🧪 TEST:` – Tests or CI changes only
- `♻️ REFACTOR:` – Refactoring
Use the same format as for commit messages, but in the title you can omit the KEYWORD and use only the EMOJI.
When submitting changes:
- Description: Include a meaningful description or link explaining the change
- Tests: Include test cases for new functionality or bug fixes
- Documentation: Update docs if behavior changes or new features are added
- Changelog: Update relevant changelog or release notes
- Code Quality: Ensure `pre-commit run --all-files` passes
The code analysis follows a multi-stage pipeline:
Source Files → Discovery → Parsing → Analysis → Results (JSON) → RST Generation
- Discovery (`source_discover/`): Scan directories for source files matching patterns
- Parsing (`analyse/oneline_parser.py`): Use tree-sitter to parse source code AST
- Analysis (`analyse/analyse.py`, `analyse/projects.py`): Extract markers and metadata
- Output (`needextend_write.py`): Generate RST with Sphinx-Needs directives
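
The last two stages of the pipeline (Results (JSON) → RST Generation) can be pictured with a minimal sketch. The `Marker` fields and the `:source_file:` / `:source_line:` option names are hypothetical stand-ins, not the project's actual schema; only the `needextend` directive itself comes from Sphinx-Needs.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Marker:
    """Hypothetical analysis result: one marker found in a source file."""

    need_id: str
    file: str
    line: int


def markers_to_json(markers: list[Marker]) -> str:
    """Serialise analysis results (the 'Results (JSON)' stage)."""
    return json.dumps([asdict(marker) for marker in markers], indent=2)


def json_to_rst(payload: str) -> str:
    """Render one needextend block per record (the 'RST Generation' stage)."""
    blocks = []
    for record in json.loads(payload):
        blocks.append(
            f".. needextend:: {record['need_id']}\n"
            f"   :source_file: {record['file']}\n"  # hypothetical option name
            f"   :source_line: {record['line']}\n"  # hypothetical option name
        )
    return "\n".join(blocks)


rst = json_to_rst(markers_to_json([Marker("REQ-001", "src/main.cpp", 12)]))
print(rst)
```

Keeping JSON as the hand-off format between the two stages means each stage can be tested separately, which fits the snapshot-testing approach described earlier.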
The Sphinx extension hooks into multiple build events to provide source tracing:
flowchart TB
subgraph init["Initialization (config-inited)"]
setup["setup() in __init__.py"]
load_toml["load_config_from_toml()"]
sn_options["update_sn_extra_options()"]
sn_types["update_sn_types()"]
check_config["check_sphinx_configuration()"]
end
subgraph prepare["Build Preparation"]
builder_init["builder_inited: Copy CSS assets"]
env_prepare["env-before-read-docs: prepare_env()"]
end
subgraph generate["Page Generation (html-collect-pages)"]
gen_pages["generate_code_page()"]
html_wrap["html_wrapper()"]
end
subgraph context["Page Context (html-page-context)"]
add_css["add_custom_css()"]
end
subgraph finish["Build Finished"]
warnings["emit_warnings()"]
timing["debug.process_timing()"]
end
setup --> load_toml --> sn_options --> sn_types --> check_config
check_config --> builder_init --> env_prepare
env_prepare --> gen_pages --> html_wrap
html_wrap --> add_css --> warnings --> timing
style load_toml fill:#e1f5fe
style gen_pages fill:#e1f5fe
style html_wrap fill:#e1f5fe
The extension connects to these Sphinx events (in execution order):
| Event | Handler | Purpose |
|---|---|---|
| `config-inited` | `load_config_from_toml()` | Load configuration from TOML file if specified |
| `config-inited` | `update_sn_extra_options()` | Register sphinx-needs extra options (project, file, directory, URLs) |
| `config-inited` | `update_sn_types()` | Add `srctrace` need type to sphinx-needs |
| `config-inited` | `check_sphinx_configuration()` | Validate configuration and raise errors |
| `builder-inited` | `builder_inited()` | Copy CSS assets to output directory |
| `env-before-read-docs` | `prepare_env()` | Initialize timing measurements and debug filters |
| `html-collect-pages` | `generate_code_page()` | Generate HTML pages for traced source files |
| `html-page-context` | `add_custom_css()` | Inject custom CSS for source tracing UI |
| `build-finished` | `emit_warnings()` | Emit collected warnings from analysis |
| `build-finished` | `debug.process_timing()` | Output timing measurements if enabled |
- sphinx-needs Dependency: The extension requires sphinx-needs and checks for its presence in `setup()`. It adds extra options (`project`, `file`, `directory`, URL fields) and a custom need type (`srctrace`).
- TOML Configuration: Configuration can be loaded from a TOML file specified in `conf.py` via `src_trace_config_from_toml`. The TOML is parsed and values are set on the Sphinx config object.
- Source Page Generation: The `generate_code_page()` function yields tuples of `(pagename, context, template)` for each traced source file, allowing Sphinx to generate standalone HTML pages with syntax-highlighted source code and line-number anchors.
- CSS Injection: Custom CSS (`ub_sct.css`) is copied to `_static/source_tracing/` and added only to pages that contain traced source code.
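
The page-generation contract described above (one `(pagename, context, template)` tuple per traced file) can be sketched as a plain generator. The function signature, the `_source/` page prefix, the context keys, and the template name are all hypothetical; the real handler receives the Sphinx app and derives its data from there.

```python
from __future__ import annotations

from collections.abc import Iterator
from typing import Any


def generate_code_page(
    traced_sources: dict[str, str],
) -> Iterator[tuple[str, dict[str, Any], str]]:
    """Yield (pagename, context, templatename) for each traced source file.

    ``traced_sources`` maps a relative file path to its text content.
    """
    for path, text in traced_sources.items():
        # One entry per source line so a template can emit line-number anchors.
        lines = [
            {"number": number, "anchor": f"L{number}", "text": line}
            for number, line in enumerate(text.splitlines(), start=1)
        ]
        context = {"title": path, "lines": lines}
        yield f"_source/{path}", context, "source_page.html"  # hypothetical names


pages = list(generate_code_page({"src/main.cpp": "// @req{REQ-001}\nvoid f() {}\n"}))
pagename, context, template = pages[0]
anchors = [entry["anchor"] for entry in context["lines"]]
print(pagename, anchors, template)
```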
Pydantic models define all configuration options:
- `AnalyseConfig`: Main analysis configuration with source paths, patterns, markers
- Uses Pydantic v2 with validation and serialization
- Configuration loaded from TOML files
- `discover_source_files()`: Find source files matching include/exclude patterns
- Respects `.gitignore` rules using `gitignore-parser`
- Returns filtered list of files to analyze
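
A minimal sketch of include/exclude filtering, assuming simple `fnmatch`-style patterns matched against POSIX-style relative paths. This is not the project's implementation: `.gitignore` handling is omitted, and `fnmatch` lets `*` match across `/`, which differs from stricter glob semantics.

```python
import tempfile
from fnmatch import fnmatch
from pathlib import Path


def discover_source_files(
    root_dir: Path,
    include_patterns: list[str],
    exclude_patterns: list[str],
) -> list[Path]:
    """Return files under root_dir matching an include and no exclude pattern."""
    if not root_dir.is_dir():
        raise ValueError(f"root_dir does not exist: {root_dir}")
    found = []
    for path in sorted(root_dir.rglob("*")):
        if not path.is_file():
            continue
        relative = path.relative_to(root_dir).as_posix()
        included = any(fnmatch(relative, pattern) for pattern in include_patterns)
        excluded = any(fnmatch(relative, pattern) for pattern in exclude_patterns)
        if included and not excluded:
            found.append(path)
    return found


# Build a throwaway tree to show the filter in action.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "src").mkdir()
    (root / "build").mkdir()
    (root / "src" / "main.cpp").write_text("// @req{REQ-001}\n")
    (root / "src" / "notes.txt").write_text("not source code\n")
    (root / "build" / "gen.cpp").write_text("\n")
    files = discover_source_files(root, ["*.cpp"], ["build/*"])
    names = [found.relative_to(root).as_posix() for found in files]

print(names)
```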
- `analyse.py`: Main orchestrator that coordinates analysis across all source files
- `projects.py`: Language-specific analyzers (C++, Python, C#, Rust, YAML)
- `oneline_parser.py`: Tree-sitter based parser for extracting comment markers
- `models.py`: Pydantic models for analysis results (markers, line ranges, etc.)
- `utils.py`: Helper functions for path handling, marker extraction
- Uses tree-sitter parsers for each supported language
- Extracts comments from AST nodes
- Parses special marker syntax (e.g., `@req{ID}`, `@test{ID}`)
- Maintains line number information for source tracing
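
The marker extraction step can be approximated with a regular expression. The sketch below replaces tree-sitter with a naive line-by-line comment split, so it will misfire on comment prefixes inside string literals; the pattern and the `extract_markers` helper are illustrative assumptions, not the parser in `oneline_parser.py`.

```python
import re

# Matches markers such as @req{REQ-001} or @test{TC-7}; illustrative pattern.
MARKER = re.compile(r"@(?P<kind>\w+)\{(?P<need_id>[^}]+)\}")


def extract_markers(
    source: str, comment_prefix: str = "//"
) -> list[tuple[str, str, int]]:
    """Return (kind, need_id, line_number) for markers found in line comments."""
    results = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        _, found, comment = line.partition(comment_prefix)
        if not found:
            continue  # no comment on this line
        for match in MARKER.finditer(comment):
            results.append((match["kind"], match["need_id"], line_number))
    return results


code = "// @req{REQ-001}\nvoid function() {}\nint x; // @test{TC-7}\n"
markers = extract_markers(code)
print(markers)
```

Line numbers are 1-based here so they can be used directly as source-location anchors.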
- `source_tracing.py`: Main extension setup with `setup()` function
- `html_wrapper.py`: Wraps source code blocks with tracing metadata
- `debug.py`: Debug utilities for development
- Hooks into Sphinx build events to inject source tracing information
The CLI uses Typer for command definitions:
- `codelinks analyse <config>`: Analyze source code and output JSON
- `codelinks write <format> <input> --outpath <file>`: Generate RST from JSON
- `pyproject.toml` - Project configuration, dependencies, and tool settings
- `src/sphinx_codelinks/__init__.py` - Package entry point with `setup()` for Sphinx
- `src/sphinx_codelinks/cmd.py` - CLI commands and argument parsing
- `src/sphinx_codelinks/config.py` - Pydantic configuration models
- `src/sphinx_codelinks/analyse/analyse.py` - Main analysis orchestration
- `src/sphinx_codelinks/analyse/projects.py` - Language-specific analyzers
- `src/sphinx_codelinks/analyse/oneline_parser.py` - Tree-sitter comment parser
- `src/sphinx_codelinks/sphinx_extension/source_tracing.py` - Sphinx extension setup
- `tests/conftest.py` - Pytest fixtures and test configuration
- Use `--pdb` with pytest to drop into the debugger on failures: `tox -e py312-sphinx8 -- --pdb`
- Use `-v` for verbose test output: `tox -e py312-sphinx8 -- -v`
- Build docs with the `-T` flag for full tracebacks: `tox -e docs-clean -- -T`
- Set logging level in tests: `tox -e py312-sphinx8 -- --log-cli-level=DEBUG`
- Use `debug.py` module functions for development debugging
- Add tree-sitter parser dependency to `pyproject.toml` (e.g., `tree-sitter-java`)
- Create language-specific analyzer in `analyse/projects.py`:

      class JavaAnalyzer(BaseAnalyzer):
          language = "java"
          parser_language = "java"

          def get_comment_nodes(self, tree):
              # Return comment nodes from tree
              ...

- Register analyzer in `LANGUAGE_ANALYZERS` dict in `projects.py`
- Add test files in `tests/data/<language>/`
- Add tests in `tests/test_analyse.py`
- Update marker regex patterns in `config.py` or analyzer
- Update `models.py` if new fields are needed
- Update parsing logic in `oneline_parser.py`
- Update RST generation in `needextend_write.py` if needed
- Add tests with new marker examples
- Add command function in `cmd.py` using Typer decorators:

      @app.command()
      def new_command(arg: str = typer.Argument(..., help="Description")):
          """Command description."""
          # Implementation

- Add tests in `tests/test_cmd.py`
- Update documentation in `docs/source/components/cli.rst`
- Add field to `AnalyseConfig` or relevant Pydantic model in `config.py`
- Add validation if needed using Pydantic validators
- Update TOML configuration examples in `docs/` and `tests/data/configs/`
- Add tests for new configuration option
- Document in `docs/source/components/configuration.rst`