| 1 | +--- |
| 2 | +pcx_content_type: how-to |
| 3 | +title: AI consumability |
| 4 | +meta: |
| 5 | + title: AI consumability | How we docs |
| 6 | +--- |
| 7 | + |
| 8 | +import { Tabs, TabItem, Width } from "~/components"; |
| 9 | + |
| 10 | +We use several approaches to make our content discoverable by AI and easy to consume in a plain-text format. |
| 11 | + |
| 12 | +## AI discoverability |
| 13 | + |
| 14 | +The primary proposal in this space is [`llms.txt`](https://llmstxt.org/), which offers a well-known path for a Markdown list of all your pages. |
| 15 | + |
| 16 | +We have implemented `llms.txt`, `llms-full.txt` and also created per-page Markdown links as follows: |
| 17 | + |
| 18 | +- [`llms.txt`](/llms.txt) |
| 19 | +- [`llms-full.txt`](/llms-full.txt) |
| 20 | + - We also provide an `llms-full.txt` file on a per-product basis, for example [`/workers/llms-full.txt`](/workers/llms-full.txt) |
| 21 | +- [`/$page/index.md`](index.md) |
| 22 | + - Add `/index.md` to the end of any page URL to get the Markdown version, for example [`/style-guide/index.md`](/style-guide/index.md) |
| 23 | +- [`/markdown.zip`](/markdown.zip) |
| 24 | + - An export of all of our documentation in the aforementioned `index.md` format. |
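As a minimal sketch of the per-page rule above, the Markdown path can be derived from a page path by appending `index.md`. The helper name `toMarkdownPath` is hypothetical and used only for illustration; it is not part of the docs codebase.

```typescript
// Hypothetical helper: derive the Markdown path for a docs page path.
// Assumes the input is a page path such as "/style-guide/" or "/style-guide".
function toMarkdownPath(pagePath: string): string {
  // Drop any trailing slashes so exactly one separator precedes index.md.
  const trimmed = pagePath.replace(/\/+$/, "");
  return trimmed + "/index.md";
}

// Usage: both forms of the page path resolve to the same Markdown path.
const withSlash = toMarkdownPath("/style-guide/");
const withoutSlash = toMarkdownPath("/style-guide");
```

Normalizing the trailing slash first avoids producing a doubled separator such as `/style-guide//index.md`.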
| 25 | + |
| 26 | +In the top right of this page, you will see a `Page options` button where you can copy the current page as Markdown that can be given to your LLM of choice. |
| 27 | + |
| 28 | +<Width size="medium"> |
| 29 | +  |
| 31 | +</Width> |
| 32 | + |
| 33 | +## Textual representation of interactive elements |
| 34 | + |
| 35 | +Although HTML is easily parsed (after all, the browser has to parse it to decide how to render the page you're reading now), it tends not to be very _portable_. This limitation is especially painful in an AI context, because all the extra presentation information consumes additional tokens. |
| 36 | + |
| 37 | +For example, with our [`Tabs`](/style-guide/components/tabs/) component, each panel is hidden until its tab is clicked: |
| 38 | + |
| 39 | +<Tabs> |
| 40 | + <TabItem label="One">One Content</TabItem> |
| 41 | + <TabItem label="Two">Two Content</TabItem> |
| 42 | +</Tabs> |
| 43 | + |
| 44 | +If we run the resulting HTML from this component through a solution like [`turndown`](https://www.npmjs.com/package/turndown), we get: |
| 45 | + |
| 46 | +```md |
| 47 | +- [One](#tab-panel-6) |
| 48 | +- [Two](#tab-panel-7) |
| 49 | + |
| 50 | +One Content |
| 51 | + |
| 52 | +Two Content |
| 53 | +``` |
| 54 | + |
| 55 | +The references to each panel's `id`, usually handled by JavaScript, are visible but non-functional. |
| 56 | + |
| 57 | +### Turning our components into "Markdownable" HTML |
| 58 | + |
| 59 | +To solve this, we created a [`rehype plugin`](https://github.com/cloudflare/cloudflare-docs/blob/d5a19deded110bce6a7c5d45e702d36527da0a4e/src/plugins/rehype/filter-elements.ts) for: |
| 60 | + |
| 61 | +- Removing non-content tags (`script`, `style`, `link`, etc.) via a [tags allowlist](https://github.com/cloudflare/cloudflare-docs/blob/d5a19deded110bce6a7c5d45e702d36527da0a4e/src/plugins/rehype/filter-elements.ts#L19-L104) |
| 62 | +- [Transforming custom elements](https://github.com/cloudflare/cloudflare-docs/blob/d5a19deded110bce6a7c5d45e702d36527da0a4e/src/plugins/rehype/filter-elements.ts#L189-L227) like `starlight-tabs` into standard unordered lists |
| 63 | +- [Adapting our Expressive Code codeblocks HTML](https://github.com/cloudflare/cloudflare-docs/blob/d5a19deded110bce6a7c5d45e702d36527da0a4e/src/plugins/rehype/filter-elements.ts#L143-L178) to the [HTML that CommonMark expects](https://spec.commonmark.org/0.31.2/#example-142) |
| 64 | + |
| 65 | +Taking the `Tabs` example from the previous section and running it through our plugin will now give us a normal unordered list with the content properly associated with a given list item: |
| 66 | + |
| 67 | +```md |
| 68 | +- One |
| 69 | + |
| 70 | + One Content |
| 71 | + |
| 72 | +- Two |
| 73 | + |
| 74 | + Two Content |
| 75 | +``` |
| 76 | + |
| 77 | +For example, take a look at our Markdown test fixture (or any page by appending `/index.md` to the URL): |
| 78 | + |
| 79 | +- [`/style-guide/fixtures/markdown/`](/style-guide/fixtures/markdown/) |
| 80 | +- [`/style-guide/fixtures/markdown/index.md`](/style-guide/fixtures/markdown/index.md) |
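The tab-to-list transformation described in this section can be sketched as follows. This is a simplified illustration, not the plugin's actual code: the real plugin rewrites HTML nodes before conversion, whereas this sketch assumes a hypothetical `Tab` shape and emits Markdown directly. It indents each panel's content by two spaces, which is what CommonMark requires for content to belong to a `- ` list item.

```typescript
// Hypothetical, simplified shape for one tab; the real plugin operates on HTML nodes.
interface Tab {
  label: string;
  content: string;
}

// Render tabs as an unordered list with each panel nested under its label.
function tabsToMarkdownList(tabs: Tab[]): string {
  return tabs
    .map((tab) => {
      // Indent non-empty lines by two spaces so they attach to the list item.
      const body = tab.content
        .split("\n")
        .map((line) => (line.length > 0 ? "  " + line : line))
        .join("\n");
      return "- " + tab.label + "\n\n" + body;
    })
    .join("\n\n");
}

const tabsMarkdown = tabsToMarkdownList([
  { label: "One", content: "One Content" },
  { label: "Two", content: "Two Content" },
]);
```

Because the content sits inside its list item, a reader (human or model) can tell which panel belongs to which tab without any JavaScript or anchor links.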
| 81 | + |
| 82 | +### Saving on tokens |
| 83 | + |
| 84 | +Most AI pricing is based on input and output tokens, and our approach greatly reduces the number of input tokens required. |
| 85 | + |
| 86 | +For example, let's take a look at the number of tokens required for the [Workers Get Started](/workers/get-started/guide/) guide, as counted by [OpenAI's tokenizer](https://platform.openai.com/tokenizer): |
| 87 | + |
| 88 | +- HTML: 15,229 tokens |
| 89 | +- turndown: 3,401 tokens (4.48x fewer than HTML) |
| 90 | +- index.md: 2,110 tokens (7.22x fewer than HTML) |
| 91 | + |
| 92 | +When providing our content to AI, we can see a real-world ~7x saving in input token cost. |
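The reduction factors above follow directly from the token counts. As a quick check, this sketch divides the HTML count by each alternative and rounds to two decimal places; the names `tokenCounts` and `reductionFactor` are illustrative only.

```typescript
// Token counts quoted above for the Workers Get Started guide.
const tokenCounts = { html: 15229, turndown: 3401, indexMd: 2110 };

// How many times smaller `candidate` is than `baseline`, rounded to two decimals.
function reductionFactor(baseline: number, candidate: number): number {
  return Math.round((baseline / candidate) * 100) / 100;
}

const turndownFactor = reductionFactor(tokenCounts.html, tokenCounts.turndown); // 4.48
const indexMdFactor = reductionFactor(tokenCounts.html, tokenCounts.indexMd); // 7.22
```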