AGENTS.md

Orientation guide for AI agents (and humans) working in plotly/graphing-library-docs.

This repo is a Jekyll site that publishes Plotly's graphing-library documentation to https://plotly.com/{graphing-libraries,javascript,python,r,julia,matlab,csharp,fsharp,...}. Some content lives here; some is fetched from sibling repos at build time. Read the "What's in this repo, what isn't" map before editing anything under _posts/.


TL;DR for agents

  • Static site generator: Jekyll (Ruby 2.7), with vanilla CSS + a SCSS/Gulp toolchain.
  • Content lives in _posts/<language>/<category>/<yyyy-mm-dd-name>.{html,md} with YAML front-matter. The front-matter is load-bearing — it drives URLs, sidebar placement, ordering, and search indexing.
  • Only JavaScript tutorial content is authored here (_posts/plotly_js/). Python, R, Julia, Matlab, C#, F# tutorial content is cloned from upstream repos by make fetch_upstream_files at CI time and never committed.
  • Two CI gates must pass for every PR: front-matter-ci.py and check-or-enforce-order.py. Both can be run locally. The latter has an enforce mode that rewrites front-matter to fix ordering.
  • Build flow: make fetch_upstream_files → python front-matter-ci.py _posts → python check-or-enforce-order.py <dir> (per language) → python generate-sitemaps.py → bundle exec jekyll build → Percy snapshots → deploy _site/ to the plotly/documentation repo's gh-pages branch. The full pipeline is in .github/workflows/build.yml.
  • Search is a separate pipeline from the site build. Algolia indexes are updated by dedicated make update_*_search targets, not by jekyll build.

What's in this repo, what isn't

Source path | Authored here? | Where the content actually lives
--- | --- | ---
_posts/plotly_js/ | Yes | This repo. JavaScript tutorials, HTML files with embedded JS examples.
_posts/python/ | No (fetched) | plotly/plotly.py-docs (built branch)
_posts/r/md/ | No (fetched) | plotly/plotly.r-docs (built branch)
_posts/ggplot2/md/ | No (fetched) | Same plotly.r-docs repo, ggplot2/ subdir, moved into place by make fetch_upstream_files
_posts/julia/html/ | No (fetched) | plotly/plotlyjs.jl-docs (built branch)
_posts/matlab/md/ | No (fetched) | plotly/plotly.matlab-docs (built branch)
_posts/fsharp/html/ | No (fetched) | plotly/plotly.net-docs (built branch)
_posts/csharp/html/ | No (fetched) | Same plotly.net-docs repo, csharp/ subdir, moved into place by make fetch_upstream_files
_posts/reference_pages/ | Yes (mostly) | Schema-driven reference pages, regenerated by _data/make_ref_pages.py from plotschema.json
_posts/python-v3/ | Yes (legacy) | Chart-Studio-era Python v3 docs; not actively maintained.
_posts/{nodejs,scala,misc,dashboards}/ | Yes (legacy) | Older content kept for historical permalinks.

Implication for agents: If a user asks you to edit a Python or R tutorial, the file is not in this repo — it lives in plotly.py (under doc/) or plotly.r-docs. Direct them to the upstream repo. JavaScript tutorials are edited here.

Implication for grep/exploration: After make fetch_upstream_files runs locally, the _posts/{python,r,ggplot2,julia,matlab,csharp,fsharp}/... directories will be populated — but those files are gitignored and not part of this repo's history. git status and the file mtimes will tell you whether you're looking at fetched content or authored content.


Directory map

.
├── _config.yml                # Production Jekyll config
├── _config_dev.yml            # Local dev config — excludes _posts/plotly_js and python-v3 for speed
├── _config_personal.yml       # User-local override, gitignored
├── _config_python_search.yml  # Algolia config for Python index (Jekyll plugin reads this)
├── _config_r_search.yml       # Algolia config for R/ggplot2 index
├── _layouts/                  # Top-level Jekyll layouts: base.html, langindex.html, blank.html
├── _includes/                 # Reusable Liquid partials, organized into:
│   ├── layouts/               #   chrome (header, sidebar, footer, head, breadcrumb, etc.)
│   ├── posts/                 #   content rendering (auto_examples.html, documentation_eg.html,
│   │                          #   reference-block.html, reference-trace.html, plotschema-reference.html)
│   └── both/                  #   shared notes (py4_note.html)
├── _plugins/                  # Three Ruby plugins:
│   ├── jekyll_notifier.rb     #   desktop notification when Jekyll finishes
│   ├── _capitalize_all.rb     #   `capitalize_all` Liquid filter
│   └── roundnum.rb            #   `round_num` Liquid filter
├── _data/                     # Data accessible to Liquid as site.data.*
│   ├── display_as.yml         #   section-id ↔ sidebar-label mapping (legacy / single-lang)
│   ├── display_as_py_r_js.yml #   same, for the multi-lang py/r/js view
│   ├── orderings.json         #   canonical attribute order applied to plotschema.json by get_plotschema.py
│   ├── plotschema.json        #   committed snapshot of the plotly.js schema (drives reference pages)
│   ├── jsversion.json         #   plotly.js version used in CodePen "Try It" links
│   ├── pyversion.json         #   plotly.py version (used in plug-ins / display)
│   ├── get_plotschema.py      #   regenerates plotschema.json + jsversion.json for a new plotly.js release
│   └── make_ref_pages.py      #   regenerates _posts/reference_pages/<lang>/*.html from plotschema.json
├── _posts/                    # All published content; see "What's in this repo" table
├── _sass/, scss/, all_static/css/  # Styling. See style_README.md.
├── all_static/                # Images, JS, fonts, vendored libraries served at /all_static/
├── _site/                     # Jekyll build output, gitignored
├── .github/workflows/build.yml  # CI: validate → build → Percy → deploy to plotly/documentation
├── makefile                   # Entry point for setup, upstream fetches, search-index updates
├── front-matter-ci.py         # Validates front-matter across _posts/
├── check-or-enforce-order.py  # Per-language `order:` sequencing check (with enforce mode)
├── generate-sitemaps.py       # Writes python/sitemap.xml and javascript/sitemap.xml
├── update_js_docs_search.py   # Pushes js_docs Algolia index (called by `make update_js_search`)
├── update_ref_search.py       # Pushes schema Algolia index from plotschema.json
├── process_python_md.py       # Hoists Jupytext front-matter for the Python search index
├── process_r_md.sh            # Renames .Rmd → .md for the R search index
├── ref_names.txt              # Long list of reference attribute names (used by tooling)
└── params.json                # GitHub Pages project metadata (legacy)

Build and run

One-time setup

Requires Ruby 2.7.4 and a recent Python 3 (CI uses 3.12).

make setup

That runs (in order): gem install bundler && bundle install, pip install -r requirements.txt, npm install, make fetch_upstream_files.

If you only want to iterate on Python doc content locally without re-cloning from GitHub each time, keep a plotly.py checkout next to this one and run make fetch_adjacent_python_files instead — it copies from ../plotly.py/doc/build/html.

Serving locally

bundle exec jekyll serve --config _config_dev.yml

Site comes up at http://localhost:4000. _config_dev.yml excludes _posts/plotly_js and _posts/python-v3 for build speed.

To customize what's included, copy _config_dev.yml to _config_personal.yml (gitignored) and adjust the exclude: list, then serve with --config _config_personal.yml.

Styling workflow

Two parallel CSS systems coexist:

  • SCSS via Gulp. Source in scss/, compiled to all_static/css/main.css. gulp build regenerates main.css and updates _data/cache_bust_css.yml (used as a cache-busting query string). Commit both.
  • Vanilla CSS. all_static/css/css.css for one-off fixes that don't need the SCSS toolchain.

bundle exec jekyll build itself regenerates styles.css, which can clobber your SCSS output — run gulp build after Jekyll if both are in play. See style_README.md for the full SCSS workflow.
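
For reference, the CI step md5sum all_static/css/main.css > _data/cache_bust_css.yml boils down to hashing the compiled stylesheet. The sketch below mirrors that computation in Python. The function name cache_bust_line is hypothetical, and gulp build may write the file in a different shape, so treat this as an illustration of the idea rather than the toolchain's actual code.

```python
import hashlib


def cache_bust_line(css_bytes, path):
    """Return one line in `md5sum` output format for the given file contents."""
    digest = hashlib.md5(css_bytes).hexdigest()
    # md5sum prints "<hex digest>  <path>" (two spaces) for text-mode input.
    return f"{digest}  {path}\n"


if __name__ == "__main__":
    # Hypothetical stylesheet contents, used only for demonstration.
    sample = b"body { margin: 0; }"
    print(cache_bust_line(sample, "all_static/css/main.css"), end="")
```

Any change to main.css changes the digest, which is what makes it usable as a cache-busting query string.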


Front-matter contract

Every post under _posts/ is a YAML-front-matter file. The front-matter is parsed by Jekyll and by the Python CI scripts, so its shape matters. The fields agents most often touch:

Field | Required? | Purpose
--- | --- | ---
name | Yes* | Human-readable page title and search hit text. Required unless this post is a redirect_to stub.
description | Recommended | SEO meta description and Algolia search hit body.
permalink | Yes | Final URL. Must end with a trailing slash (CI check). Must be unique across the whole site.
language | Yes | One of python, python/v3, plotly_js, r, julia, matlab, nodejs, fsharp, ggplot2.
display_as | Yes | Section key; drives sidebar grouping and the per-language landing page. See list below.
order | Yes | Integer position within display_as. Must form a consecutive sequence per display_as per language.
layout | Yes | base for examples, langindex for landing/index pages.
thumbnail | Yes (CI) | Path under https://images.plot.ly/plotly-documentation/ for the landing-page card image.
page_type | Sometimes | example_index for posts with order < 5 (forces inclusion on the index page); reference for ref pages.
redirect_from | Optional | Old permalinks to 301 from. Validated for uniqueness.
redirect_to | Optional | If present, the post is a pure redirect stub and name is not required.
markdown_content | Optional | Markdown rendered above examples (headings here appear in the right-hand TOC).
plot_url | Optional (JS) | If set, the post embeds an iframe instead of running an inline example.
arrangement | Optional | horizontal to stretch the example full-width (twelve columns) instead of split.
suite | Optional (JS) | Groups a set of small example snippets under a parent chart-type index page.

display_as valid values (from _data/display_as.yml and _data/display_as_py_r_js.yml): file_settings, basic (also chart_type historically), statistical, scientific, financial, maps, 3d_charts, multiple_axes, controls, animations, streaming, chart_events, ai_ml, bio, advanced_charts, chart_studio, advanced_opt, aesthetics, geoms, faceting, other, theme, plus several specialty keys used by individual landing pages.

The CI categories that get strict ordering enforcement are a subset: file_settings, basic, financial, statistical, scientific, maps, 3d_charts, multiple_axes, ai_ml (see categories list in both front-matter-ci.py and check-or-enforce-order.py).

Example: a JavaScript tutorial post

---
description: How to make a D3.js-based bar chart in javascript.
display_as: basic
language: plotly_js
layout: base
name: Bar Charts
order: 3
page_type: example_index
permalink: javascript/bar-charts/
redirect_from: javascript-graphing-library/bar-charts/
thumbnail: thumbnail/bar.jpg
---
var data = [{ x: ['a','b','c'], y: [1,2,3], type: 'bar' }];
Plotly.newPlot('myDiv', data);

For the JS examples, the body is plain JavaScript. _includes/posts/auto_examples.html detects the 'myDiv' target and auto-generates the rendered chart plus a "Try It on CodePen" button using the version from _data/jsversion.json.

Example: a suite example snippet (no front-matter name is needed for the landing page; many small examples roll up into a single index)

---
name: Select Hover Points
language: plotly_js
suite: area
order: 3
sitemap: false
arrangement: horizontal
---

CI checks (run them locally before pushing)

front-matter-ci.py

Validates all posts whose language is in {python, python/v3, plotly_js, r} and whose display_as is one of the strict categories.

python front-matter-ci.py _posts

Checks (every one fails the build if violated):

  1. Non-redirect posts have name.
  2. No posts use title (use name).
  3. No duplicate permalink or redirect_from.
  4. Every post has thumbnail.
  5. Every permalink ends with /.

In upstream repos this script also runs against build/html (plotly.py-docs) and build (plotly.r-docs), which is why the path is a CLI argument.
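
To make the five checks concrete, here is a minimal sketch of the same rules applied to front-matter that has already been parsed into dicts. It is not the real front-matter-ci.py: the function name check_front_matter, the dict shape, and the error wording are all assumptions made for illustration, and the language/display_as filtering described above is left out.

```python
from collections import Counter


def check_front_matter(posts):
    """Return a list of violation messages for a list of front-matter dicts."""
    errors = []
    seen_urls = Counter()
    for meta in posts:
        label = meta.get("permalink") or meta.get("name") or "<unknown post>"
        # Check 1: non-redirect posts need a name.
        if "redirect_to" not in meta and not meta.get("name"):
            errors.append(f"{label}: missing name")
        # Check 2: title is not allowed; name is the supported field.
        if "title" in meta:
            errors.append(f"{label}: uses title, use name instead")
        # Check 4: every post needs a thumbnail.
        if not meta.get("thumbnail"):
            errors.append(f"{label}: missing thumbnail")
        # Check 5: permalinks end with a slash.
        permalink = meta.get("permalink", "")
        if not permalink.endswith("/"):
            errors.append(f"{label}: permalink must end with /")
        # Collect permalink and redirect_from values together for check 3.
        redirects = meta.get("redirect_from", [])
        if isinstance(redirects, str):
            redirects = [redirects]
        for url in [permalink, *redirects]:
            if url:
                seen_urls[url] += 1
    # Check 3: no URL may appear twice as a permalink or redirect_from.
    for url, count in seen_urls.items():
        if count > 1:
            errors.append(f"{url}: used {count} times as permalink or redirect_from")
    return errors


if __name__ == "__main__":
    bad_post = {"title": "Lines", "permalink": "javascript/line-charts",
                "thumbnail": "thumbnail/line.jpg"}
    for message in check_front_matter([bad_post]):
        print(message)
```

Counting permalink and redirect_from values in one pool reflects the rule that a redirect_from collides with a permalink just as another permalink would.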

check-or-enforce-order.py

Per-language: verifies that, within each display_as category, the order field forms [1, 2, 3, ..., N] with no gaps or duplicates.

python check-or-enforce-order.py _posts/plotly_js          # check only
python check-or-enforce-order.py _posts/plotly_js enforce  # rewrite order fields in place

CI runs the check (without enforce) on _posts/{python,python-v3,r,matlab,plotly_js}. If you add a post with order: 3 to a category that already has 1, 2, 3, 4, 5, you'll need to renumber. Enforce mode does this for you by sorting paths by their current order and assigning sequential integers from 1.

Caveat: in plotly.py docs, the script reads/writes post.metadata["jupyter"]["plotly"]["order"] (Jupytext-style) instead of the top-level order. In this repo, it's the top-level order.
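
The check and the enforce pass can be sketched as follows. This is an illustration of the rule, not the script itself: it works on in-memory dicts with top-level display_as and order keys, the names check_order and enforce_order are hypothetical, and ties between equal order values are broken by input order, which is an assumption about behavior the real script may handle differently.

```python
from collections import defaultdict


def check_order(posts):
    """Return {display_as: sorted orders} for categories that are not exactly 1..N."""
    orders_by_category = defaultdict(list)
    for post in posts:
        orders_by_category[post["display_as"]].append(post["order"])
    problems = {}
    for category, orders in orders_by_category.items():
        if sorted(orders) != list(range(1, len(orders) + 1)):
            problems[category] = sorted(orders)
    return problems


def enforce_order(posts):
    """Renumber posts in place so each category's order values become 1..N."""
    posts_by_category = defaultdict(list)
    for post in posts:
        posts_by_category[post["display_as"]].append(post)
    for group in posts_by_category.values():
        # sorted() is stable, so posts with equal order keep their input order.
        for position, post in enumerate(sorted(group, key=lambda p: p["order"]), start=1):
            post["order"] = position
    return posts


if __name__ == "__main__":
    posts = [{"name": n, "display_as": "basic", "order": o}
             for n, o in [("a", 1), ("b", 2), ("new", 3), ("c", 3), ("d", 4)]]
    print(check_order(posts))
    enforce_order(posts)
    print([(p["name"], p["order"]) for p in posts])
```

Because ties resolve by position, where a new post lands relative to an existing post with the same order depends on how the inputs are listed; review the renumbered result before committing.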

generate-sitemaps.py

Walks _posts/python/ and _posts/plotly_js/ (plus their reference-page counterparts) and writes python/sitemap.xml and javascript/sitemap.xml. The build pipeline then copies python/sitemap.xml into _site/python/. It filters out dash.plotly.com URLs and chart-studio links.
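
A rough sketch of the filtering-and-writing step is shown below. It is not generate-sitemaps.py: the function name build_sitemap, the SITE_ROOT constant, and the substring test used to recognize chart-studio links are assumptions, and the sketch takes a ready-made list of permalinks instead of walking _posts/.

```python
import xml.etree.ElementTree as ET

SITE_ROOT = "https://plotly.com/"  # assumed base URL for relative permalinks
EXCLUDED_SUBSTRINGS = ("dash.plotly.com", "chart-studio")  # assumed match rule


def build_sitemap(permalinks):
    """Return sitemap XML for the given permalinks, skipping excluded URLs."""
    urlset = ET.Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
    for permalink in permalinks:
        # Absolute URLs pass through; relative permalinks are joined to the site root.
        if permalink.startswith("http"):
            url = permalink
        else:
            url = SITE_ROOT + permalink.lstrip("/")
        if any(marker in url for marker in EXCLUDED_SUBSTRINGS):
            continue
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_sitemap(["python/bar-charts/", "https://dash.plotly.com/layout/"]))
```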

What CI does, end-to-end

From .github/workflows/build.yml:

checkout
→ setup Ruby 2.7 (bundler cache) + Python 3.12
→ uv pip install PyYAML==6.0.1 python-frontmatter==0.5.0
→ make fetch_upstream_files                       # clones py, r, julia, matlab, csharp, fsharp content
→ write MAPBOX_TOKEN secret into _data/mapbox_token.yml
→ python front-matter-ci.py _posts
→ python check-or-enforce-order.py _posts/python
→ python check-or-enforce-order.py _posts/python-v3
→ python check-or-enforce-order.py _posts/r/
→ python check-or-enforce-order.py _posts/matlab
→ python check-or-enforce-order.py _posts/plotly_js
→ python generate-sitemaps.py
→ md5sum all_static/css/main.css > _data/cache_bust_css.yml
→ bundle exec jekyll build
→ cp python/sitemap.xml _site/python/sitemap.xml
→ rm _data/mapbox_token.yml                       # never deployed
→ assemble snapshots/ directory, bundle exec percy snapshot
→ if branch == master AND repo == plotly/graphing-library-docs:
  → mint GitHub App token (GRAPHING_LIBRARIES_CI_GHAPP)
  → checkout plotly/documentation @ gh-pages
  → cp -r _site/* documentation/ && git commit && git push

Deployment is not a git push from this repo — it commits _site/'s contents into plotly/documentation's gh-pages branch and that's what GitHub Pages serves.


Search indexes (Algolia)

Search is updated out-of-band from site builds. There are four indexes:

Index | Powers search on | Update target | Script
--- | --- | --- | ---
js_docs | https://plotly.com/javascript/ | make update_js_search | update_js_docs_search.py
python_docs | https://plotly.com/python/, https://plotly.com/pandas/ | make update_python_search | bundle exec jekyll algolia push --config _config_python_search.yml (with a fresh clone of plotly.py)
r_docs | https://plotly.com/r/, https://plotly.com/ggplot2/ | make update_r_search | similar, with plotly.r-docs
schema | reference pages | make update_ref_search | update_ref_search.py

All four require ALGOLIA_API_KEY in the environment (the write key; the read-only key is committed in the configs). Request it from a Plotly maintainer by opening an issue on this repo.

When to update:

  • Add/remove a tutorial in _posts/plotly_js/ → make update_js_search.
  • Add/remove a tutorial in upstream plotly.py-docs → make update_python_search.
  • Add/remove a tutorial in upstream plotly.r-docs → make update_r_search.
  • New plotly.js release (after regenerating plotschema.json) → make update_ref_search.

Exclusion rules live in the algolia.excluded_files list inside each _config_*_search.yml. The configs differ from _config.yml mainly in which _posts/<lang>/ directories they exclude — each language's search config excludes every other language so the index stays focused.

The update_js_docs_search.py script is intentionally simple: it grep-parses front-matter line-by-line out of *index* files only, so non-index pages don't end up in the JS search hits. Don't expect it to round-trip exotic YAML.
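
To see why exotic YAML does not round-trip, consider this minimal line-by-line parser. It is a sketch in the spirit of the description above, not the actual code of update_js_docs_search.py; the function name parse_front_matter_lines is hypothetical.

```python
def parse_front_matter_lines(text):
    """Naively read `key: value` lines between the first two `---` markers."""
    fields = {}
    inside = False
    for line in text.splitlines():
        if line.strip() == "---":
            if inside:
                break  # closing marker: stop reading
            inside = True
            continue
        # Indented lines are YAML continuations; a line-based reader drops them.
        if inside and ":" in line and not line.startswith((" ", "\t")):
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return fields


if __name__ == "__main__":
    post = "\n".join([
        "---",
        "name: Bar Charts",
        "description: >",
        "  How to make a bar chart",
        "  in JavaScript.",
        "permalink: javascript/bar-charts/",
        "---",
        "var data = [];",
    ])
    print(parse_front_matter_lines(post))
```

In the demo, the folded description comes back as just the `>` indicator, because its text lives on continuation lines that the parser never reads. Single-line values survive intact.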

The schema index is built from _data/plotschema.json: update_ref_search.py walks the layout + traces tree recursively and emits one Algolia object per attribute, with a hierarchical name like scatter traces > marker > color.
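
The recursive walk can be pictured with a toy schema. Everything in this sketch is an assumption for illustration: the function name walk_attributes, the use of a valType key to recognize a leaf attribute, and the record fields. The real script and the real plotschema.json layout may differ, for instance in whether container attributes also get their own objects.

```python
def walk_attributes(node, path):
    """Yield one record per leaf attribute found under `node`.

    `path` is the list of names leading to `node`; the record name joins
    them with " > ".
    """
    for key, value in node.items():
        if not isinstance(value, dict):
            continue  # skip scalar metadata such as description strings
        child_path = path + [key]
        if "valType" in value:
            # Assumed leaf marker: an attribute definition carries a valType.
            yield {
                "name": " > ".join(child_path),
                "description": value.get("description", ""),
            }
        else:
            yield from walk_attributes(value, child_path)


if __name__ == "__main__":
    # Toy stand-in for one trace's attribute tree.
    toy_scatter = {
        "marker": {
            "color": {"valType": "color", "description": "Sets the marker color."},
            "size": {"valType": "number"},
        },
        "mode": {"valType": "flaglist"},
    }
    for record in walk_attributes(toy_scatter, ["scatter traces"]):
        print(record["name"])
```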


Reference pages (the _posts/reference_pages/ tree)

Reference pages are not hand-authored. They are generated from _data/plotschema.json by _data/make_ref_pages.py, one set per supported language (Python, JavaScript, MATLAB, R, Julia, F#). Each language gets:

  • One 2020-07-20-<attr>.html file per top-level layout subsection (xaxis, yaxis, coloraxis, scene, polar, ternary, smith, geo, mapbox, sliders, updatemenus, annotations, shapes, images, selections, global).
  • One 2020-07-20-<trace>.html file per trace type (scatter, bar, heatmap, …).

These files are thin Jekyll stubs that pull data from site.data.plotschema and render via _includes/posts/reference-block.html and reference-trace.html.

Refreshing for a new plotly.js release

cd _data
python get_plotschema.py 3.0.0       # downloads & re-orders plot-schema.json from plotly.js v3.0.0
python make_ref_pages.py             # regenerates all _posts/reference_pages/<lang>/*.html
cd ..
make update_ref_search                # push new schema index to Algolia

get_plotschema.py also writes _data/jsversion.json, which is the version string the CodePen "Try It" links inject. Keep these in sync.

orderings.json is the canonical attribute display order. If get_plotschema.py reports missing key in <section>: <key> during regeneration, that's a heads-up that plotly.js added a new attribute — add it to orderings.json in the appropriate position.
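
The reordering idea can be sketched like this. It is a guess at the shape of the logic, not the code in get_plotschema.py: the function name reorder_section is hypothetical, and appending unknown keys at the end is an assumption (the real script may treat them differently). Only the missing key in <section>: <key> message format comes from the description above.

```python
def reorder_section(section, attributes, canonical_order):
    """Return (attributes in canonical order, keys absent from the ordering)."""
    ordered = {}
    for key in canonical_order:
        if key in attributes:
            ordered[key] = attributes[key]
    # Keys present in the schema but not in the ordering are likely new attributes.
    missing = [key for key in attributes if key not in ordered]
    for key in missing:
        print(f"missing key in {section}: {key}")
        ordered[key] = attributes[key]  # assumption: keep them, appended last
    return ordered, missing


if __name__ == "__main__":
    attrs = {"y": {}, "x": {}, "brand_new": {}}
    ordered, missing = reorder_section("scatter", attrs, ["x", "y"])
    print(list(ordered), missing)
```

Whatever the script does with unknown keys, the fix on your side is the same: add the reported key to orderings.json where it belongs and rerun.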


Layouts and includes — how a page renders

There are three Jekyll layouts:

  • base.html — the default for tutorial example pages. Page chrome (header, sidebar, breadcrumb, footer) wraps {{ content }}. Most posts use this.
  • langindex.html — landing/index pages for a language or chart category. Same chrome as base.html, but the body section is --tutorial-index and a Plotly Studio promo image is appended for Python and JavaScript pages.
  • blank.html — empty shell, rarely used (1 line).

Key includes you'll touch when adding categories or tweaking rendering:

  • _includes/layouts/side-bar.html — the per-language left sidebar. Reads page.url to figure out which language, then iterates posts to decide which sections to surface. If you add a new display_as value, you'll need to extend this file.
  • _includes/posts/documentation_eg.html — drives the language-landing pages (the grid of chart-type cards). Add a new chart type here when you add a new chart-type directory under _posts/plotly_js/.
  • _includes/posts/auto_examples.html — renders a list of suite examples on an index page. For plotly.js, generates the inline chart + the CodePen "Try It" form (using site.data.jsversion.version and substituting site.data.mapbox_token['token'] into snippets that need it).
  • _includes/posts/reference-block.html / reference-trace.html / plotschema-reference.html — render the schema reference pages from site.data.plotschema.

Adding a new JavaScript tutorial

(Full walkthrough is in _posts/plotly_js/README.md. Summary:)

  1. New chart category? Create a directory under _posts/plotly_js/<chart_type>/ and inside it yyyy-mm-dd-<chart_type>_plotly_js_index.html using the layout: langindex, page_type: example_index, order: 5 template from the README. Then update:

    • _includes/posts/documentation_eg.html (add the card to the landing page)
    • _includes/layouts/side-bar.html (add the sidebar entry)
    • _data/display_as_py_r_js.yml (add the display_as key if new)
    • front-matter-ci.py and check-or-enforce-order.py categories lists (if the new section needs strict ordering)
  2. Add the tutorial post in _posts/plotly_js/<chart_type>/yyyy-mm-dd-<name>.html using the example post template in the README. The body is the JavaScript that draws into myDiv.

  3. Verify locally:

    python front-matter-ci.py _posts
    python check-or-enforce-order.py _posts/plotly_js
    bundle exec jekyll serve --config _config_dev.yml

    (If you're only changing JS content, you'll need a _config_personal.yml that does not exclude _posts/plotly_js.)

  4. Update search after merge: make update_js_search (needs ALGOLIA_API_KEY).

order rules to know:

  • order values per display_as per language must be 1..N consecutive.
  • Posts with order < 5 must also have page_type: example_index (so they appear on the landing page).
  • Index pages for categories themselves use order: 5.

Gotchas / things that bit previous contributors

  • Don't add title: front-matter. Use name:. title: triggers a CI failure.
  • Permalinks must end with /. Easy to miss when copy-pasting; CI catches it.
  • Duplicate permalinks across languages still collide. A redirect_from also counts as a permalink for uniqueness purposes.
  • _posts/python is populated by fetch_upstream_files and gitignored. If you run git status and see no Python files, that's normal: they're not authored here.
  • make fetch_upstream_files clones with --depth 1 -b built. It always overwrites local content. Don't put hand-edits there; edit upstream.
  • Jekyll's watch mode is slow on this repo because of the post count. Either use _config_dev.yml (which excludes the two biggest dirs) or rebuild on demand with bundle exec jekyll build.
  • CSS regeneration races. bundle exec jekyll … writes styles.css; gulp build writes all_static/css/main.css. If you're editing SCSS, run gulp build after any Jekyll build to avoid a stale main.css.
  • _data/mapbox_token.yml is created at CI time from a secret and deleted before deploy. It will appear briefly in _data/ during a CI run; never commit it. Locally, Mapbox tiles won't render unless you create the file with token: <your_mapbox_token>.
  • Production config vs. dev config. _config.yml is what production builds use; _config_dev.yml is for local. If you change the production exclude list, change both unless you intentionally want them to diverge.
  • Algolia write key is private. Do not commit it. The committed read_only_api_key is fine.
  • The algoliasearch==1.20.0 SDK pinned in requirements.txt is an old major version. Its APIs differ from current Algolia Python SDKs; keep the pin unless you're prepared to refactor the search scripts.
  • update_js_docs_search.py does line-by-line parsing, not real YAML parsing. Multi-line front-matter values won't survive it; keep search-relevant fields (name, description, display_as, tags, permalink) as single-line strings on JS index pages.

When the user asks you to…

  • "Edit the Python docs for X." → It's not here. Direct them to plotly/plotly.py under doc/. Same for R (plotly/plotly.r-docs), Julia (plotly/plotlyjs.jl-docs), Matlab (plotly/plotly.matlab-docs), C# / F# (plotly/plotly.net-docs).
  • "Edit the JavaScript docs for X." → _posts/plotly_js/<category>/... here.
  • "Add a chart type." → Walk the steps in Adding a new JavaScript tutorial and update _includes/posts/documentation_eg.html, _includes/layouts/side-bar.html, _data/display_as_py_r_js.yml. If it's a category that needs ordering enforced, also update the categories list in front-matter-ci.py and check-or-enforce-order.py.
  • "Reorder posts." → Edit order: fields, then run python check-or-enforce-order.py _posts/<lang> enforce to renumber to 1..N. Commit the resulting changes.
  • "Bump the plotly.js version on the docs." → cd _data && python get_plotschema.py <new_version>, then python make_ref_pages.py, then make update_ref_search. Commit _data/plotschema.json and _data/jsversion.json and the regenerated _posts/reference_pages/.
  • "The search is missing my new tutorial." → Run the appropriate make update_*_search target. The site build does not update search.
  • "The build passed locally but failed in CI." → Likely either (a) you didn't run make fetch_upstream_files locally so a cross-language permalink collision wasn't visible, or (b) a display_as/order interaction with the fetched upstream content. Reproduce with make fetch_upstream_files then the CI commands above.
  • "Deploy this change to prod." → Merging to master triggers the workflow, which after Percy approval auto-deploys to plotly/documentation@gh-pages. There is no manual deploy step from this repo.