Proposal: Automating content sync and delivery for the Actor Whitepaper Astro Site #57

@vancura

Below is an overview and proposal for merging and automating the content from this repository to the Astro-based site currently in another repository: @vancura/developer-actor. The goal is to create a single source of truth for the Whitepaper, ensure that content is published cleanly to the Astro site, and automate the process using GitHub Actions or a similar CI/CD approach.

Background and current setup

  1. The actor-whitepaper content (this repo) is currently manually copied into the Astro-based site. This repetitive process is prone to errors and omissions.
  2. The Astro site code lives in a separate repo, @vancura/developer-actor.
  3. The original @apify/actor-whitepaper repository includes:
  4. The Astro site uses its own structure under /src/content/pages with lowercase filenames (e.g., key-value-store-schema.mdx).
  5. We want to unify and automate the flow so that updates to @apify/actor-whitepaper automatically appear on the Astro website.
  6. The Astro site is currently built and hosted by Vercel.

Requirements and goals

1. Single source of truth

The @apify/actor-whitepaper repository should contain all the canonical content. All new changes and additions happen here.

2. Automated syncing or processing

Changes in @apify/actor-whitepaper should feed into the Astro site automatically (e.g., via GitHub Actions), converting .md to .mdx as needed.

3. Filename normalization

Convert uppercase filenames in the Whitepaper repo to lowercase .mdx in the Astro site.

4. Image handling and optimization

Move images from /img in the Whitepaper repo to the appropriate location under src/content/pages/img in the Astro repo so that Sharp can properly handle them.

5. Formatting differences and MDX compatibility

Adjust syntax, insert frontmatter, and use Astro components (<Illustration>, <Picture>, <CodeSwitcher>) as needed.

6. Repository structure decision

Decide between keeping two separate repos (with automated sync) or merging into one. Consider commit history “noise” and potential reuse of Whitepaper content elsewhere.

7. Delivery method

Confirm whether to continue using Vercel or switch to GitHub Actions for build and deployment. Check build times and runner constraints.


Possible approaches

Two-repository model with automated sync

  • Keep @apify/actor-whitepaper as a standalone repo:
    • All content changes happen here in Markdown.
    • Can remain public or internal.
  • The Astro site repo remains separate:
    • Houses Astro configuration, styling, and additional site content.
    • Should be moved to the @apify account.
  • Sync step with GitHub Actions:
    • On push/release to @apify/actor-whitepaper, an Action:
      • Clones/checks out the Astro site repo.
      • Copies/transforms .md to .mdx, normalizes filenames.
      • Copies images into the Astro repo.
      • Commits and pushes changes to the Astro site repo.
    • Astro site repo triggers its deployment pipeline (GitHub Actions or Vercel).

Pros

  • Clear separation between content and site code.
  • Keeps large doc changes out of the site commit history.

Cons

  • Maintains two repos.
  • More complexity in CI and potential debugging if sync fails.

Single monorepo (combining Whitepaper + Astro)

  • Merge @apify/actor-whitepaper content into the Astro site’s codebase.
  • Store raw content in /docs or a dedicated folder and import directly.
  • Build and deploy from a single repository.

Pros

  • One repository to manage, no sync pipeline.
  • Simpler local development.

Cons

  • Larger commit logs mixing docs and code.
  • Harder to reuse Whitepaper content independently.

Subtree or submodule approach

  • Keep @apify/actor-whitepaper as a Git submodule (or subtree) in the Astro site repository.
  • A hybrid approach similar to the two-repo model, but possibly more complex for contributors.

Detailed considerations

Renaming & transforming files

  • Rename KEY_VALUE_STORE_SCHEMA.md to key-value-store-schema.mdx etc.
  • Insert frontmatter in MDX if required.
  • Convert Markdown links, images, and code blocks to MDX-compatible syntax.
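
The renaming rule can be sketched as a small helper. This is only an illustration, assuming that every Whitepaper file follows the UPPER_SNAKE_CASE.md pattern shown above; the function name normalize_filename is hypothetical and not part of any existing script.

```python
from pathlib import PurePosixPath


def normalize_filename(source_name: str) -> str:
    """Map a Whitepaper filename to the lowercase .mdx name used by the Astro site.

    Assumption: underscores and spaces become hyphens, and the extension
    is always replaced with .mdx.
    """
    stem = PurePosixPath(source_name).stem
    stem = stem.lower().replace("_", "-").replace(" ", "-")
    return f"{stem}.mdx"


print(normalize_filename("KEY_VALUE_STORE_SCHEMA.md"))  # key-value-store-schema.mdx
```

Files that need a different target name (for example, a landing page) would need an explicit mapping on top of this rule.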

Formatting & linting

  • The Astro site may run Prettier or custom linters on .mdx.
  • Ensure formatting is consistent, possibly by running the site's formatter after each sync.

MDX differences & components

  • Astro uses custom MDX components like <Illustration>, <Picture>, <CodeExample>, <CodeSwitcher>.
  • Standard Markdown images (![Alt](img.png)) might need converting into <Picture> or <Illustration> tags.
  • Some .md files may contain comment placeholders like <!-- IMAGE: right, illuApifyStore, "Apify Actor Store" --> to guide the script on how to transform images.
  • Code snippets might need grouping into <CodeSwitcher> components to provide tabbed code examples.
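
As a rough sketch of how the comment placeholders could be handled, the snippet below parses the three comma-separated fields from the example above and emits a component tag. The prop names (position, name, alt) are assumptions made for illustration; the real props of <Illustration> in the Astro site may differ and should be checked before use.

```python
import re

# Matches placeholders such as:
# <!-- IMAGE: right, illuApifyStore, "Apify Actor Store" -->
PLACEHOLDER = re.compile(
    r'<!--\s*IMAGE:\s*(?P<position>[\w-]+)\s*,'
    r'\s*(?P<name>\w+)\s*,'
    r'\s*"(?P<alt>[^"]*)"\s*-->'
)


def render_placeholder(match: re.Match) -> str:
    # Hypothetical prop names; adjust to the component's actual interface.
    return '<Illustration position="{position}" name="{name}" alt="{alt}" />'.format(
        **match.groupdict()
    )


def transform_placeholders(markdown: str) -> str:
    """Replace every IMAGE placeholder comment with a component tag."""
    return PLACEHOLDER.sub(render_placeholder, markdown)
```

Comments that do not match the pattern are left untouched, so ordinary HTML comments in the Whitepaper are not affected.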

Special markup (e.g., layout reset)

  • <div class="clear-both" /> might be needed after floated images to prevent wrapping.
  • Insert these strategically where required, possibly hinted by placeholders in original .md files.

Frontmatter & layouts

  • .mdx files may need a frontmatter block specifying a layout.
  • The sync script can insert a standard frontmatter snippet at the top of each file.
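
A minimal sketch of the frontmatter step might look like this. The layout path and the title field are placeholders chosen for illustration; the actual keys depend on how the Astro site's pages are configured.

```python
import json


def add_frontmatter(body: str, title: str,
                    layout: str = "../../layouts/Page.astro") -> str:
    """Prepend a YAML frontmatter block unless the file already has one.

    The default layout path is hypothetical. Values are written with
    json.dumps because a JSON string is also a valid YAML double-quoted string.
    """
    if body.startswith("---\n"):
        return body  # assume frontmatter is already present
    header = "\n".join([
        "---",
        f"layout: {json.dumps(layout)}",
        f"title: {json.dumps(title)}",
        "---",
        "",
    ])
    return header + "\n" + body
```

Skipping files that already start with a frontmatter fence keeps the step idempotent, so running the sync twice does not stack headers.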

Image handling

  • Copy images from @apify/actor-whitepaper/img to astro-repo/src/content/pages/img/whitepaper (or similar).
  • Ensure references are updated to the new paths, and if needed, wrap them in <Picture> or <Illustration>.
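
Updating the references could be done with a simple rewrite of standard Markdown image links, sketched below. The target prefix ./img/whitepaper/ follows the "or similar" location mentioned above and is an assumption, as is the premise that source files reference images as img/... or ./img/...

```python
import re

# Matches ![Alt](img/file.png) and ![Alt](./img/file.png); remote URLs and
# links with a title part are deliberately not matched in this sketch.
IMAGE_REF = re.compile(r'!\[(?P<alt>[^\]]*)\]\((?:\./)?img/(?P<file>[^)\s]+)\)')


def rewrite_image_paths(markdown: str,
                        new_prefix: str = "./img/whitepaper/") -> str:
    """Point local image references at the folder used in the Astro repo."""
    return IMAGE_REF.sub(
        lambda m: f"![{m['alt']}]({new_prefix}{m['file']})",
        markdown,
    )
```

Wrapping the result in <Picture> or <Illustration> would be a separate step, since it depends on the components' props.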

CI/CD

  • Use GitHub Actions for syncing and possibly also for building/deploying.
  • If Vercel continues to be used, just trigger a rebuild via a webhook after syncing.
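
If the webhook route is chosen, the trigger could be a single HTTP request at the end of the sync job. The sketch below assumes a deploy hook URL has been created for the project and stored in an environment variable; the variable name DEPLOY_HOOK_URL is hypothetical.

```python
import os
import urllib.request


def build_deploy_request(hook_url: str) -> urllib.request.Request:
    """Build an empty POST request for the deploy hook."""
    return urllib.request.Request(hook_url, data=b"", method="POST")


def trigger_rebuild(hook_url: str, timeout: float = 30.0) -> int:
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(build_deploy_request(hook_url), timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # DEPLOY_HOOK_URL is an assumed secret name, not an existing setting.
    url = os.environ.get("DEPLOY_HOOK_URL")
    if url:
        print(trigger_rebuild(url))
```

If the Astro repo keeps its Git integration with Vercel, the pushed sync commit may already trigger a deployment, in which case this step would be unnecessary.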

Git history & scope

  • The two-repo model keeps docs separate from the site code history.
  • A single repo mixes everything, which may be fine for smaller projects but is less flexible.

Handling special site files (e.g., 404.mdx)

  • The Astro site may have additional MDX files like 404.mdx or other custom pages not present in the Whitepaper repo.
  • The sync script must not overwrite or remove these special site-level pages.
  • Ensure the pipeline only updates files originating from the Whitepaper repository’s content, leaving the Astro site’s unique pages intact.
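
One way to keep site-level pages safe is for the sync script to record which files it wrote and to touch only those on later runs. The sketch below illustrates that idea; the manifest filename .whitepaper-sync.json and the function sync_pages are hypothetical.

```python
import json
from pathlib import Path

MANIFEST_NAME = ".whitepaper-sync.json"  # hypothetical manifest file


def sync_pages(generated: dict[str, str], pages_dir: Path) -> None:
    """Write generated pages and remove stale ones, leaving site-owned files alone.

    `generated` maps target filenames to their .mdx content.
    """
    manifest_path = pages_dir / MANIFEST_NAME
    previous = set()
    if manifest_path.exists():
        previous = set(json.loads(manifest_path.read_text(encoding="utf-8")))

    # A file that exists but was never written by the sync belongs to the site.
    conflicts = [name for name in generated
                 if (pages_dir / name).exists() and name not in previous]
    if conflicts:
        raise FileExistsError(f"Refusing to overwrite site-owned pages: {conflicts}")

    for name, content in generated.items():
        (pages_dir / name).write_text(content, encoding="utf-8")

    # Delete only files that an earlier sync created and that are now gone upstream.
    for stale in previous - set(generated):
        (pages_dir / stale).unlink(missing_ok=True)

    manifest_path.write_text(json.dumps(sorted(generated)), encoding="utf-8")
```

With this approach, a page such as 404.mdx is never listed in the manifest, so it is neither overwritten nor deleted.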

Recommendation & example workflow

Recommended: Two-repository model with automated sync

  1. Keep @apify/actor-whitepaper as the source of truth.
  2. Add a GitHub Action in @apify/actor-whitepaper that on push:
    • Checks out the Astro site repo.
    • Runs a custom script to:
      • Convert .md to .mdx, fix filenames, and insert frontmatter.
      • Use the comment placeholders to transform images into <Illustration> or <Picture> components.
      • Group related code blocks into <CodeSwitcher> components.
      • Copy images to the correct folder.
      • Avoid overwriting special site pages (like 404.mdx).
    • Commits and pushes changes back to the Astro site repo.
  3. The Astro site repo deploys automatically (via GitHub Actions or Vercel).

If we ever decide a single repo is simpler, we can merge them later. For now, this approach ensures clarity and maintainability.

Why not a single repo?

Keeping the Whitepaper separate is beneficial if it may be reused in multiple contexts (e.g., NPM packages and other sites). It avoids mixing deployment code with document content and keeps the content portable.

Another critical driver for keeping them separate is that the actor-whitepaper repository is intended as a narrowly focused RFC project, where contributors only see and discuss changes or issues related to the specification itself. If we put the website build code in the same repo, people interested solely in the Whitepaper’s evolution would also receive GitHub notifications and issues about the site’s build or styling changes. This can be noisy and confusing for those who want to collaborate on the spec without wading through the broader website infrastructure.

Potential pitfalls

  • Sync script might need maintenance if the file structure changes drastically.
  • Parsing comment placeholders and grouping code blocks may be non-trivial.
  • Good documentation and stable workflows will reduce risks.
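
To give a sense of the code-block grouping problem, here is a minimal sketch that wraps runs of adjacent fenced blocks in a <CodeSwitcher>. It assumes that blocks separated only by blank lines belong together, that fences are exactly three backticks, and that <CodeSwitcher> accepts fenced blocks as children; none of these are confirmed by the site code.

```python
FENCE = "```"


def split_segments(markdown: str) -> list[tuple[str, str]]:
    """Split text into ("code", block) and ("text", line) segments."""
    segments, block = [], None
    for line in markdown.split("\n"):
        if block is not None:
            block.append(line)
            if line.strip() == FENCE:  # closing fence
                segments.append(("code", "\n".join(block)))
                block = None
        elif line.startswith(FENCE):  # opening fence, e.g. ```js
            block = [line]
        else:
            segments.append(("text", line))
    if block is not None:  # unterminated fence: keep its lines as plain text
        segments.extend(("text", line) for line in block)
    return segments


def group_code_blocks(markdown: str) -> str:
    """Wrap two or more adjacent fenced blocks in a <CodeSwitcher>."""
    out, run, blanks = [], [], []

    def flush() -> None:
        if len(run) >= 2:
            out.append("<CodeSwitcher>\n\n" + "\n\n".join(run) + "\n\n</CodeSwitcher>")
        else:
            out.extend(run)
        out.extend(blanks)  # blank lines that followed the last block
        run.clear()
        blanks.clear()

    for kind, text in split_segments(markdown):
        if kind == "code":
            blanks.clear()  # blank lines between grouped blocks are normalized
            run.append(text)
        elif text.strip() == "" and run:
            blanks.append(text)
        else:
            flush()
            out.append(text)
    flush()
    return "\n".join(out)
```

A single fenced block is left unchanged, and any prose between two blocks ends the run. An explicit placeholder comment in the source files would be a more reliable signal than adjacency if the heuristic turns out to group unrelated snippets.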

Conclusion

By preserving @apify/actor-whitepaper as the single source of truth and automatically updating the Astro site repository, we achieve the following:

  • Automated syncing (no manual duplication).
  • Clean separation of docs and site code.
  • Potential for reusing the Whitepaper content elsewhere.

We can revisit this decision later if we prefer a single codebase. For now, the two-repo sync approach via GitHub Actions is the least disruptive path to unifying content and deployment.

/c @B4nan @mtrunkat @netmilk