-
Notifications
You must be signed in to change notification settings - Fork 132
Migrate studio documentation to datachain #1394
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Migrate studio documentation to datachain #1394
Conversation
Migrate comprehensive Studio documentation from the DVC.org repository (removed in iterative/dvc.org#5446) to DataChain documentation under the Studio section. ## Changes ### New Documentation Structure - **studio/index.md**: DataChain Studio overview and introduction - **studio/user-guide/**: Complete user guide with sections for: - Account management and authentication (SSO, OpenID Connect) - Datasets (create, explore, share, visualize) - Jobs (create, run, monitor) - Git connections (GitHub App, GitLab) - Team collaboration and troubleshooting - **studio/api/index.md**: Comprehensive REST API documentation - **studio/self-hosting/**: Self-hosting guides and configuration ### Content Adaptations - Updated all references from "DVC Studio"/"Iterative Studio" to "DataChain Studio" - Adapted content for DataChain workflows (datasets and jobs vs experiments) - Updated URLs from studio.iterative.ai to studio.datachain.ai - Revised feature descriptions to match DataChain Studio capabilities ### Navigation Updates - Added comprehensive Studio navigation structure to mkdocs.yml - Organized documentation into logical sections with proper hierarchy - Ensured all links are properly structured for the new layout ### Technical Changes - Fixed mkdocstrings configuration for compatibility - Updated navigation paths to match new file structure - Maintained existing webhooks.md with minor updates ## Files Added - 20 new Studio documentation files - Complete user guide covering all major Studio features - Self-hosting documentation for enterprise deployments - API documentation adapted for DataChain Studio ## Validation - Tested mkdocs build in strict mode - Validated navigation structure and internal links - Ensured proper markdown formatting and compatibility This migration provides DataChain users with comprehensive Studio documentation while maintaining the existing structure and adding DataChain-specific adaptations.
Reviewer's GuideThis PR migrates the Studio documentation into the DataChain repository by updating the MkDocs configuration and adding a complete Studio section under docs/studio, including an overview page, a comprehensive user guide, API and Webhooks references, and self-hosting instructions—all rebranded and structured for DataChain Studio. File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
|
Cursor Agent can help with this pull request. Just |
Deploying datachain-documentation with
|
| Latest commit: |
208166a
|
| Status: | ✅ Deploy successful! |
| Preview URL: | https://66c4d776.datachain-documentation.pages.dev |
| Branch Preview URL: | https://cursor-migrate-studio-docume.datachain-documentation.pages.dev |
for more information, see https://pre-commit.ci
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #1394 +/- ##
=======================================
Coverage 87.77% 87.77%
=======================================
Files 160 160
Lines 15161 15161
Branches 2173 2173
=======================================
Hits 13307 13307
Misses 1351 1351
Partials 503 503
Flags with carried forward coverage won't be shown. Click here to find out more. 🚀 New features to boost your workflow:
|
Create all missing self-hosting documentation files to resolve mkdocs strict mode warnings: ## Files Added - installation/aws-ami.md - AWS AMI installation guide - installation/k8s-helm.md - Kubernetes Helm installation guide - configuration/index.md - Main configuration overview - configuration/ssl-tls.md - SSL/TLS certificate configuration - configuration/ca-certificates.md - Custom CA certificate setup - configuration/git-forges/*.md - Git forge integration guides - upgrading/*.md - Upgrade procedures (regular and airgap) - troubleshooting/*.md - Troubleshooting guides and support bundle ## Content Features - Comprehensive installation guides for AWS AMI and Kubernetes - Detailed configuration documentation with examples - Complete Git forge integration (GitHub, GitLab, Bitbucket) - Step-by-step upgrade procedures for both connected and air-gapped environments - Troubleshooting guides for common issues like 502 errors - Support bundle generation for diagnostic information ## Technical Validation - ✅ mkdocs builds successfully in strict mode (no warnings) - ✅ All navigation links resolve correctly - ✅ Content adapted for DataChain Studio terminology - ✅ Internal links and references properly structured This completes the comprehensive Studio documentation migration with full self-hosting support for enterprise deployments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry @amritghimire, your pull request is larger than the review limit of 150000 diff characters
|
@amritghimire some tests are failing, PTAL |
just remove it, or better mention that it also has DVC + Git integration for experiments, and model registry.
find mentions of Git - remove them when context is not about DVC please |
|
@amritghimire I see that you removed the whole thing about DVC and experiments? Can we keep it as a separate section? It is still good to have I think. |
Co-authored-by: cursor <[email protected]>
|
pre-commit.ci autofix |
Co-authored-by: cursor <[email protected]>
for more information, see https://pre-commit.ci
Address feedback on Git workflow mentions and restore DVC capabilities: ## Key Improvements ### Content Clarifications - **Dual Workflow Support**: Updated index to clearly show Studio supports both: - DataChain workflows for unstructured data processing - DVC + Git workflows for ML experiment tracking and model registry - **Context-Specific Git References**: Updated Git workflow mentions to be specific to DVC-based projects where appropriate - **Architecture Description**: Changed 'Git Integration' to 'Repository Integration' for broader accuracy ### DVC Experiments Section Added - **New experiments/index.md**: Comprehensive guide for DVC experiment tracking - **Navigation Updated**: Added Experiments (DVC) section to documentation structure - **Feature Coverage**: Documents experiment tracking, model registry, visualization - **Integration Guidance**: Shows how DataChain and DVC workflows complement each other - **Migration Guide**: Helps users transition from standalone DVC to Studio ### Workflow Clarity - **Separated Concerns**: Clear distinction between DataChain jobs and DVC experiments - **Use Case Guidance**: When to use each workflow type - **Hybrid Workflows**: How to use both approaches together - **Best Practices**: Integration patterns for teams using both systems ## Technical Validation - ✅ mkdocs builds successfully in strict mode (0 warnings) - ✅ All internal links resolve correctly - ✅ Pre-commit hooks pass (trailing whitespace fixed) - ✅ Navigation structure properly updated This maintains the comprehensive nature of the documentation while providing clear guidance on both DataChain and DVC capabilities.
|
pre-commit.ci autofix |
for more information, see https://pre-commit.ci
|
@sourcery-ai review |
|
Sorry @amritghimire, your pull request is larger than the review limit of 150000 diff characters |
1 similar comment
|
Sorry @amritghimire, your pull request is larger than the review limit of 150000 diff characters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR migrates the comprehensive DataChain Studio documentation from DVC.org to the DataChain repository. The content has been rebranded from DVC Studio to DataChain Studio and integrated into the docs/studio/ section, providing complete documentation for user guides, API references, webhooks, self-hosting, and troubleshooting.
- Adds complete DataChain Studio documentation covering all aspects from user guides to enterprise self-hosting
- Updates mkdocs configuration to include the Studio documentation navigation structure
- Migrates and rebrands content to reference "DataChain Studio" and "studio.datachain.ai"
Reviewed Changes
Copilot reviewed 38 out of 38 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| pyproject.toml | Adds duplicate mkdocs-section-index dependency version |
| mkdocs.yml | Extensive navigation updates and mkdocstrings configuration changes |
| Multiple docs/studio/ files | Complete Studio documentation migration with user guides, API, webhooks, and self-hosting content |
Comments suppressed due to low confidence (1)
mkdocs.yml:1
- The show_submodules configuration appears to be duplicated between the old rendering section and the new options section. The old rendering section should be completely removed to avoid conflicts.
site_name: 'DataChain'
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
|
@sourcery-ai review |
|
Sorry @amritghimire, your pull request is larger than the review limit of 150000 diff characters |
|
I have reviewed the code and think this is good enough for first pass. PTAL cc. @shcheklein |
|
Sorry @amritghimire, your pull request is larger than the review limit of 150000 diff characters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 46 out of 46 changed files in this pull request and generated 7 comments.
Comments suppressed due to low confidence (2)
pyproject.toml:1
- The mkdocs-section-index package is listed twice with conflicting version constraints. Keep a single entry (prefer the newer >=0.3.10) to avoid resolver ambiguity and keep dependencies clear.
[build-system]
docs/studio/user-guide/model-registry/use-models.md:1
- The and elements are not recognized by MkDocs/Material; they will not render as interactive tabs. Replace with Material tabs syntax (e.g.,
=== \"CLI (DVC)\"sections) or remove the UI constructs.
# Use models
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@amritghimire check GItHub AI reviews, to make sure everything works fine.
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
|
@shcheklein Updated the pull request. Lets merge this and pick up on followup if we need to change anything. |
Migrate Studio documentation from DVC.org to the DataChain repository, updating all content for DataChain branding and features.
This PR reintroduces comprehensive Studio documentation, including user guides, API references, and self-hosting guides, which were previously removed from DVC.org. The content has been adapted to reference "DataChain Studio" and "studio.datachain.ai" and integrated into the
docs/studio/section of this repository.Completely generated by AI.
Summary by Sourcery
Migrate and integrate the full DataChain Studio documentation into the DataChain repository, rebranding and adapting content previously hosted on DVC.org and updating the mkdocs configuration to include Studio’s user guides, API reference, webhooks, self-hosting and troubleshooting materials.
Build:
Documentation:
(Closes https://github.com/iterative/itops/issues/5861)