Commit 233263f

updates README
1 parent c430536 commit 233263f

1 file changed: +24 -1 lines changed
.ai/README.md

Lines changed: 24 additions & 1 deletion
@@ -46,5 +46,28 @@ python scripts/generate_llms.py

 The scripts for generating LLM-related files are located in `polkadot-docs/scripts`, which contains the following:

-- **`llms_config.json`**: single point of configuration for the LLM files.
+- **`llms_config.json`**: Single point of configuration for the LLM files.
+- **`generate_llms.py`**: Pipeline for generating updated LLM files.
+- **`generate_ai_pages.py`**: Creates one resolved Markdown file per documentation page and outputs them to the `/.ai/pages` directory.
+- **`generate_llms_txt.py`**: Creates the `llms.txt` site index file using the Markdown file URLs and outputs it to the `/polkadot-docs/` directory.
+- **`generate_site_index.py`**: Creates two files covering the full site's content:
+    - `llms-full.jsonl`: Contains the entire documentation site, enhanced with metadata for improved indexing and chunking. Replaces the previously used `llms-full.txt` file.
+    - `site-index.json`: A lightweight version of the full documentation site that uses content previews rather than full content bodies to keep the file size small.
+- **`generate_category_bundles.py`**: Bundles pages with the same category tag together, along with context from the Basics and Reference categories, and outputs them to `/.ai/categories/` as Markdown files.
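As a rough illustration of the difference between the two outputs described for `generate_site_index.py`, the sketch below writes the same pages once as JSON Lines with full bodies and once as a preview-only index. The field names (`slug`, `title`, `body`, `preview`), the function names, and the 200-character preview length are assumptions made for this example; they are not taken from the actual script.

```python
import json

# Hypothetical values: the real schema and preview length may differ.
PREVIEW_CHARS = 200


def to_jsonl(pages):
    """Serialize pages as JSON Lines: one object per line, full body included."""
    return "\n".join(json.dumps(page, ensure_ascii=False) for page in pages)


def to_site_index(pages, preview_chars=PREVIEW_CHARS):
    """Serialize pages as one JSON array, with a short preview replacing the body."""
    index = []
    for page in pages:
        # Copy every metadata field except the full body.
        entry = {key: value for key, value in page.items() if key != "body"}
        entry["preview"] = page["body"][:preview_chars]
        index.append(entry)
    return json.dumps(index, ensure_ascii=False)


# Made-up sample pages, used only to exercise the two functions.
pages = [
    {"slug": "intro", "title": "Intro", "body": "x" * 1000},
    {"slug": "setup", "title": "Setup", "body": "y" * 50},
]
full = to_jsonl(pages)
index = to_site_index(pages)
```

Because each JSONL line is a complete record, a consumer can stream and chunk the full file page by page, while the preview-based index stays small enough to load in one go.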
+
+## FAQs
+
+### Why are we now using Markdown instead of `.txt` files?
+
+- Markdown syntax gives LLMs explicit cues for identifying headings, bullet lists, and other structural elements. In comparison, a `.txt` file presents as a flattened sequence of words, so the model has to work harder to identify the structure of the content.
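To make the point about structural cues concrete, here is a minimal sketch of how a consumer might split a Markdown page into sections by its headings, something that is not possible with flattened text. The function name `split_by_headings` and the section dictionary layout are hypothetical, and the sketch deliberately ignores edge cases such as `#` lines inside fenced code blocks.

```python
import re

# ATX headings: one to six '#' characters followed by whitespace and a title.
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def split_by_headings(markdown):
    """Group the lines of a Markdown document under the heading that precedes them."""
    sections = []
    current = {"level": 0, "title": None, "lines": []}
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            # A new heading closes the section collected so far.
            sections.append(current)
            current = {
                "level": len(match.group(1)),
                "title": match.group(2).strip(),
                "lines": [],
            }
        else:
            current["lines"].append(line)
    sections.append(current)
    # Drop the untitled preamble when it holds no content.
    return [
        s for s in sections
        if s["title"] is not None or any(l.strip() for l in s["lines"])
    ]


doc = "# Guide\nIntro text.\n## Install\n- step one\n- step two\n"
sections = split_by_headings(doc)
```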
+
+### What do you mean by "resolved Markdown" files?
+
+- Resolved Markdown files have been processed to replace all code snippet and variable placeholders with their intended contents and to strip any HTML comments.
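The resolution step described above could look roughly like the following sketch. The placeholder syntaxes are assumptions for illustration only: `{{ name }}` for variables and a `--8<-- 'path'` line for code snippets. The function name `resolve_markdown` and the lookup dictionaries are likewise hypothetical, so treat this as an outline of the idea rather than what `generate_ai_pages.py` actually does.

```python
import re

# Assumed placeholder syntaxes (not confirmed by the README):
#   {{ name }}          -> variable placeholder
#   --8<-- 'some/path'  -> code snippet placeholder on its own line
VARIABLE = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")
SNIPPET = re.compile(r"""^--8<--[ \t]+['"]([^'"]+)['"][ \t]*$""", re.MULTILINE)
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def resolve_markdown(text, variables, snippets):
    """Strip HTML comments and fill in every placeholder that has a known value."""
    text = HTML_COMMENT.sub("", text)
    # Snippets go first so that variables inside snippet bodies are resolved too.
    # Unknown placeholders are left untouched rather than guessed.
    text = SNIPPET.sub(lambda m: snippets.get(m.group(1), m.group(0)), text)
    text = VARIABLE.sub(lambda m: variables.get(m.group(1), m.group(0)), text)
    return text


# Made-up input, used only to exercise the function.
raw = (
    "# Install <!-- internal note -->\n"
    "Use version {{ version }}.\n"
    "--8<-- 'code/hello.py'\n"
)
resolved = resolve_markdown(
    raw,
    variables={"version": "1.2.3"},
    snippets={"code/hello.py": "print('hello')"},
)
```

Leaving unknown placeholders in place makes missing values easy to spot in the output instead of silently dropping them.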
+
+### Why use the `/.ai/pages` and `/.ai/categories` directories rather than outputting the files to `/llms-files/` like before?
+
+- The Markdown files must be located in a directory that is not included in the site build, which prevents MkDocs from converting the Markdown to HTML when building the site.
+
+
