feat(docs): auto-generate database documentation from lib.py #36805
Rebuild the database documentation system so that lib.py is the single source of truth. The script outputs JSON that React components consume to render the documentation pages.
Changes:
- Add comprehensive DATABASE_DOCS dictionary to lib.py with 53 databases
- Create generate-database-docs.mjs build script
- Create DatabaseIndex and DatabasePage React components
- Replace 1900 lines of manual markdown with component-based rendering
- Integrate into docs build pipeline (yarn start/build)
To update documentation, just update DATABASE_DOCS in lib.py.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Run the diagnostic tests with Flask context to get actual feature scores for each database engine spec.
Top scores:
- Presto: 159/201
- Trino: 149/201
- Apache Hive/Spark: 140/201
- PostgreSQL: 104/201
Add compatible databases (YugabyteDB, TimescaleDB, Hologres) to the overview table with a link to their parent database's documentation. Compatible DBs show a "PostgreSQL compatible" tag and inherit feature scores from their parent.
- Update generate-database-docs.mjs to create individual MDX files
- Each database now has its own page at /docs/configuration/databases/{slug}
- Overview page at /docs/configuration/databases/ with filterable table
- Fix category counts in filter dropdown
- Links in table now point to individual pages
- Use cached databases.json when it has full diagnostic data
Generated 64 database pages + index page.
- Move Databases section from Configuration to top-level navigation
- Add Databases to Documentation dropdown menu in navbar
- Set "Next" version as default documentation version
- Improve database page layout with larger logos (height: 120)
- Hide duplicate H1 headings via hide_title frontmatter
- Fix diagnostics preservation in fallback mode when Flask context unavailable
- Add logos and homepage URLs to DATABASE_DOCS in lib.py
- Show compatible databases (e.g., YugabyteDB) in overview table
- Dynamically generate front page database grid from databases.json
The generate-database-docs script now updates the main README.md with database logos between marker comments:
- <!-- SUPPORTED_DATABASES_START -->
- <!-- SUPPORTED_DATABASES_END -->
This ensures the README stays in sync with DATABASE_DOCS in lib.py. Also updated docs links to point to new /docs/databases path.
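As a rough illustration of the marker-based update, here is a minimal Python sketch. The actual script is JavaScript; the function name `replace_between_markers` and the sample README text are hypothetical, and only the two marker comments come from the commit message.

```python
import re

# Marker comments named in the commit message.
START = "<!-- SUPPORTED_DATABASES_START -->"
END = "<!-- SUPPORTED_DATABASES_END -->"


def replace_between_markers(readme: str, new_block: str) -> str:
    """Replace everything between the markers, keeping the markers themselves."""
    pattern = re.compile(re.escape(START) + r".*?" + re.escape(END), re.DOTALL)
    if not pattern.search(readme):
        raise ValueError("marker comments not found in README")
    replacement = f"{START}\n{new_block}\n{END}"
    # A function replacement avoids backslash interpretation in the new block.
    return pattern.sub(lambda _match: replacement, readme, count=1)


readme = f"intro\n{START}\nold logos\n{END}\noutro\n"
updated = replace_between_markers(readme, "new logos")
```

Text outside the markers is left untouched, so hand-written README content is safe from regeneration.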
The generate-database-docs script now only updates README.md when explicitly requested via:
- --update-readme flag
- UPDATE_README=true env var
Added npm script: yarn update:readme-db-logos
This prevents CI from failing due to uncommitted README changes during docs builds.
Fixes CodeQL security alert: incomplete string escaping. Backslashes must be escaped before quotes to prevent malformed YAML frontmatter in generated MDX files.
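The ordering matters because escaping quotes first would introduce backslashes that the second pass then doubles. A minimal sketch of the correct order, written in Python rather than the script's JavaScript; the function name is hypothetical:

```python
def escape_yaml_double_quoted(value: str) -> str:
    """Escape a value for use inside a double-quoted YAML scalar."""
    # Backslashes first: the backslashes added for quotes below must not be doubled.
    escaped = value.replace("\\", "\\\\")
    return escaped.replace('"', '\\"')


def escape_wrong_order(value: str) -> str:
    """The buggy order, kept only to show the difference."""
    return value.replace('"', '\\"').replace("\\", "\\\\")
```

With the wrong order, a lone `"` becomes `\\"`, which YAML reads as an escaped backslash followed by a quote that ends the string early.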
Fixes CodeQL security warning about shell commands built from environment values. Now uses spawnSync with:
- cwd option instead of cd in shell command
- env option for environment variables
- arguments passed as array (no shell parsing)
Security fix for CodeQL warning about shell command injection. Converted extractDatabaseDocs() and extractDatabaseDocsSimple() to use spawnSync with cwd option instead of execSync with shell string interpolation.
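The script itself uses Node's spawnSync, but the same pattern can be sketched with Python's `subprocess` module: arguments go in as a list, and the working directory and environment are passed as options, so no shell ever parses the values. The helper name `run_python_snippet` and the `DEMO_VALUE` variable are hypothetical.

```python
import os
import subprocess
import sys


def run_python_snippet(code: str, cwd: str, extra_env: dict[str, str]) -> str:
    """Run a Python snippet in a child process without going through a shell."""
    result = subprocess.run(
        [sys.executable, "-c", code],    # argument list: no shell parsing
        cwd=cwd,                         # instead of "cd ... &&" in a command string
        env={**os.environ, **extra_env}, # instead of "VAR=value cmd" interpolation
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# A value containing shell metacharacters is passed through verbatim.
output = run_python_snippet(
    "import os; print(os.environ['DEMO_VALUE'])",
    cwd=".",
    extra_env={"DEMO_VALUE": "a; echo injected"},
)
```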
Compatible databases can share the same name across multiple parent engines. Using only the name as rowKey leads to duplicate React keys. Fixed by combining parent engine name with database name for compatible database entries.
The language prop was defined but never used since CodeBlock doesn't implement syntax highlighting.
Added three new licensing categories to DatabaseCategory:
- OPEN_SOURCE: Self-hosted open source databases (PostgreSQL, MySQL, ClickHouse, etc.)
- HOSTED_OPEN_SOURCE: Managed services running open source software (Aurora, MotherDuck, Databricks)
- PROPRIETARY: Closed source databases (Snowflake, BigQuery, Oracle, etc.)
Updated all 60 database engine specs with appropriate licensing categories. Also added categories to compatible_databases entries (Aurora MySQL/PostgreSQL, MotherDuck, IBM Db2 for i) and updated CompatibleDatabase TypedDict to support the categories field.
This gives users three dimensions to filter databases:
1. Cloud provider (AWS, GCP, Azure)
2. Database type (Analytical, RDBMS, NoSQL, etc.)
3. Licensing (Open Source, Hosted Open Source, Proprietary)
Updated the docs generation and React component to properly handle the categories array (instead of singular category):
- generate-database-docs.mjs: Fixed byCategory stats to use docs.categories array and map constant names to display names
- DatabaseIndex.tsx: Updated to render multiple category tags per database and filter by any matching category
- types.ts: Changed category to categories (array) in TypeScript types
- Regenerated databases.json with correct category mappings
Each database now correctly shows all its categories (database type, cloud provider, and licensing) as separate tags.
Added 'categories' to NON_INHERITABLE_FIELDS in the deep_merge function. This prevents child classes from accumulating parent categories, which was causing databases like Apache Spark SQL to show duplicate category tags. Each engine spec class now defines only its own categories without inheriting from parent classes.
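A minimal sketch of how such a merge might behave, assuming lists normally accumulate from parent to child. Only the names `deep_merge`, `NON_INHERITABLE_FIELDS`, and `categories` come from the commit message; the signature, the merge rules, and the sample data are assumptions and may differ from the code in lib.py.

```python
from typing import Any

# Assumption: fields listed here come from the child class only.
NON_INHERITABLE_FIELDS = {"categories"}


def deep_merge(parent: dict[str, Any], child: dict[str, Any]) -> dict[str, Any]:
    """Merge child metadata over parent metadata without mutating either input."""
    # Start from the parent, minus anything that must not be inherited.
    merged: dict[str, Any] = {
        key: value for key, value in parent.items() if key not in NON_INHERITABLE_FIELDS
    }
    for key, value in child.items():
        if key in NON_INHERITABLE_FIELDS:
            merged[key] = value  # child value replaces, never accumulates
        elif isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        elif isinstance(value, list) and isinstance(merged.get(key), list):
            merged[key] = merged[key] + value  # ordinary lists accumulate
        else:
            merged[key] = value
    return merged


# Hypothetical sample data.
parent = {"categories": ["ANALYTICAL"], "drivers": ["pyhive"], "docs": {"a": 1}}
child = {"categories": ["OPEN_SOURCE"], "drivers": ["spark"], "docs": {"b": 2}}
merged = deep_merge(parent, child)
```

In this sketch the child ends up with only its own `categories`, while `drivers` still combines both lists.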
The generate-database-docs.mjs script was not adding a trailing newline when writing databases.json, causing the end-of-file-fixer pre-commit hook to fail.
When hovering over the time grain count in the database index table, users now see a tooltip listing all supported time grains for that database (e.g., "Second, Minute, Hour, Day, Week, Month, Quarter, Year"). Time grain names are formatted for readability (e.g., FIVE_MINUTES -> "5 min").
Display time grains as individual tags wrapped in a flex container instead of a comma-separated string.
Added 5 new feature indicators to the Features column:
- File Upload (38 DBs) - Can upload CSV/Excel files
- Query Cancel (15 DBs) - Can cancel running queries
- Cost Estimation (7 DBs) - Can estimate query cost before running
- User Impersonation (7 DBs) - Supports user impersonation for RLS
- SQL Validation (2 DBs) - Can validate SQL syntax
All features are filterable in the table header dropdown.
Added compatible_databases for cloud-hosted versions of open source databases:
- MotherDuck: Now has its own metadata with motherduck.png logo (was inheriting DuckDB's logo)
- StarRocks: Added CelerData (cloud-hosted StarRocks)
- ClickHouse: Added ClickHouse Cloud and Altinity.Cloud
- Trino: Added Starburst Galaxy and Starburst Enterprise
- Elasticsearch: Added Elastic Cloud and Amazon OpenSearch Service
Also deduplicated the logo wall on the docs homepage by filtering out duplicate logo filenames (fixes duplicate DB2 logos appearing).
Imply is the enterprise/cloud distribution of Apache Druid.
…rds smaller
Added unique logos for:
- CelerData (StarRocks cloud)
- Starburst (Trino cloud/enterprise)
- Altinity (ClickHouse managed)
- Imply (Druid cloud/enterprise)
Also made the database logo cards smaller on the homepage:
- 8 columns instead of 5
- Smaller card height (80px vs 120px)
- Tighter spacing
Added optional `link` prop to SectionHeader component and used it to link the "Supported Databases" title to /docs/databases.
…s" sidebar
Reorganized the sidebar structure so individual database pages are
nested under a collapsible "Supported Databases" section:
Before:
- Databases
  - Overview
  - Amazon Athena
  - Apache Druid
  - ...
After:
- Databases
  - Overview
  - Supported Databases (collapsible)
    - Amazon Athena
    - Apache Druid
    - ...
Updated:
- generate-database-docs.mjs to output MDX to supported/ subdirectory
- DatabaseIndex.tsx links to use /supported/ path
- Homepage database card links to use /supported/ path
Summary
This PR introduces an automated database documentation system that generates documentation pages from engine spec `metadata` attributes. Each database engine spec now contains its own documentation metadata, providing a single source of truth.
Key Features
- `metadata` attribute (removed 1150+ line `DATABASE_DOCS` dict from lib.py)
- `metadata` via AST - no Flask context required for CI
Architecture
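As a rough illustration of reading `metadata` via AST, the following Python sketch pulls a class-level `metadata` dict out of engine spec source without importing the module, which is why no Flask context is needed. The class name, the field names, and the helper `extract_metadata` are hypothetical, and the sketch assumes `metadata` is written as a plain literal; values that reference constants or other names would need extra handling.

```python
import ast
from typing import Any

# Hypothetical engine spec source; class and field names are illustrative only.
SPEC_SOURCE = '''
class MyDatabaseEngineSpec:
    engine = "mydatabase"
    metadata = {
        "description": "An example database.",
        "categories": ["OPEN_SOURCE"],
        "homepage_url": "https://example.com",
    }
'''


def extract_metadata(source: str) -> dict[str, dict[str, Any]]:
    """Map each class name to its literal `metadata` dict without importing the module."""
    found: dict[str, dict[str, Any]] = {}
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ClassDef):
            continue
        for stmt in node.body:
            is_metadata_assignment = isinstance(stmt, ast.Assign) and any(
                isinstance(target, ast.Name) and target.id == "metadata"
                for target in stmt.targets
            )
            if is_metadata_assignment:
                # literal_eval accepts only plain literals, so no spec code is executed.
                found[node.name] = ast.literal_eval(stmt.value)
    return found


specs = extract_metadata(SPEC_SOURCE)
```

Because the source is only parsed, never executed, this approach works in CI environments where Superset and its dependencies are not installed.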
How to Add a New Database
1. Create an engine spec in `superset/db_engine_specs/` (e.g., `mydatabase.py`)
2. Add a `metadata` attribute with the required fields
3. Run `python superset/db_engine_specs/lint_metadata.py`
4. Add a logo to `docs/static/img/databases/`
Changes
New Files:
- `superset/db_engine_specs/arc.py` - Arc data platform stub spec
- `superset/db_engine_specs/d1.py` - Cloudflare D1 stub spec
- `superset/db_engine_specs/hologres.py` - Alibaba Cloud Hologres (PostgreSQL-compatible)
- `superset/db_engine_specs/timescaledb.py` - TimescaleDB (PostgreSQL-compatible)
- `superset/db_engine_specs/yugabytedb.py` - YugabyteDB (PostgreSQL-compatible)
- `superset/db_engine_specs/lint_metadata.py` - Metadata completeness linter
- `superset/db_engine_specs/METADATA_STATUS.md` - Auto-generated status report
Modified Files:
- `superset/db_engine_specs/lib.py` - Removed `DATABASE_DOCS` dict (~1150 lines), updated `get_documentation_metadata()`
- `superset/db_engine_specs/README.md` - Added comprehensive "How to Add a Database" guide
- `superset/db_engine_specs/*.py` - Added `metadata` attributes to 60+ engine specs
- `docs/scripts/generate-database-docs.mjs` - Simplified to read from engine spec `metadata` via AST (removed `DATABASE_DOCS` fallback)
Documentation Build Modes
The `generate-database-docs.mjs` script supports two modes:
- Calls `diagnose()` to get detailed feature scores. Requires Superset installed locally.
- Reads engine spec `metadata` attributes. Works without Flask.
Metadata Completeness
Current status (63 engine specs with metadata):
Run `python superset/db_engine_specs/lint_metadata.py` to see the full report.
Screenshots
Test Plan
- Run `python superset/db_engine_specs/lint_metadata.py` to verify metadata extraction
- Run `cd docs && yarn build` to verify documentation generation
- Check `databases.json` for all 63 databases
- Check pages at `/docs/databases/<database-name>`