| 1 | +--- |
| 2 | +title: Checking a Local Folder with URL Remapping |
| 3 | +description: Checking a local folder of HTML files which will be uploaded to a particular URL. |
| 4 | +--- |
| 5 | +{/* vim: set syntax=markdown: */} |
| 6 | + |
| 7 | +import { Code } from "@astrojs/starlight/components"; |
| 8 | + |
| 9 | +Often, you will want to check a local folder of HTML files before the folder |
| 10 | +gets uploaded to a website (as part of a static site workflow, for example). |
| 11 | +Sometimes, this is complicated by local files that use fully-qualified |
| 12 | +URLs pointing to the _future_ online locations of those files. |
| 13 | + |
| 14 | +For instance, suppose you write a new blog post which will be uploaded to |
| 15 | +`https://example.com/docs/2025-01-01-post.html`. You might use that URL in |
| 16 | +certain places (like permalinks and canonical links), even though the URL |
| 17 | +doesn't exist _yet_. |
| 18 | +<Code |
| 19 | + code={`<h1>My blog post</h1> |
| 20 | +<a href="https://example.com/docs/2025-01-01-post.html">Permalink</a>`} |
| 21 | + lang="html" |
| 22 | + title="docs/2025-01-01-post.html" |
| 23 | +/> |
| 24 | + |
| 25 | +This can cause problems for link checking, because lychee would check these |
| 26 | +links against the version of the site that is currently online, which could be |
| 27 | +outdated or missing newly added files. To solve this problem, we can tell lychee |
| 28 | +that certain online URLs should be *mapped* to local folder paths. |
| 29 | + |
| 30 | +This works by mapping the content's future URLs to local files on your |
| 31 | +computer, using lychee's URL remapping feature. For links to these URLs, lychee |
| 32 | +will check that the corresponding files exist inside the local directory, |
| 33 | +rather than checking the online website. |
| 34 | + |
| 35 | +:::tip[Please give feedback!] |
| 36 | +This page covers a fairly complicated topic, so feedback is appreciated! If |
| 37 | +something is unclear or not working as you expect, please let us know. You |
| 38 | +can open an issue or discussion for [this docs |
| 39 | +website](https://github.com/lycheeverse/lycheeverse.github.io) or [lychee |
| 40 | +itself](https://github.com/lycheeverse/lychee). |
| 41 | +::: |
| 42 | + |
| 43 | +:::note[Limitations] |
| 44 | +This guide uses lychee's URL remapping feature, which is based on regular |
| 45 | +expressions and has certain limitations. See [Limitations](#limitations). |
| 46 | +::: |
| 47 | + |
| 48 | +## Do You Need URL Remapping? |
| 49 | + |
| 50 | +In simple cases, you don't! |
| 51 | + |
| 52 | +By default, lychee can already resolve relative links to adjacent local files. |
| 53 | +By adding [`--root-dir`][root-dir], lychee can also resolve root-relative links |
| 54 | +(beginning with `/`) to the given root directory. In simple cases, this is all |
| 55 | +you need. |
| 56 | + |
| 57 | +Continue reading if: |
| 58 | +- you have fully-qualified links to files which exist locally but aren't online yet, or |
| 59 | +- your local folder will be uploaded to a _subdirectory_ of the website domain. |
| 60 | + |
| 61 | +[root-dir]: /recipes/root-dir/ |
| 62 | + |
| 63 | +## Mapping a Remote Domain to a Local Folder |
| 64 | + |
| 65 | +Suppose you have a local directory `out` which will be uploaded to |
| 66 | +the domain *root* at `https://docs.example.com`. |
| 67 | + |
| 68 | +You can map URLs beginning with this domain into the local directory: |
| 69 | +```bash |
| 70 | +lychee ./out --root-dir ./out --remap "^https://docs\.example\.com file://$(pwd)/out" |
| 71 | +``` |
| 72 | +This remaps URLs so that, for example, `https://docs.example.com/page.html` |
| 73 | +is checked against the local file `./out/page.html`. |
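
To preview what a remap of this shape does to a URL, the substitution can be imitated with `sed`. This is only a rough sketch. It assumes the remap behaves like a plain regex search-and-replace on the URL text, and `sed -E` is not the regex engine lychee uses. The `remap_preview` helper is hypothetical and is not part of lychee.

```shell
# Hypothetical helper: imitate a remap with sed (not lychee's own regex engine).
# Usage: remap_preview PATTERN REPLACEMENT URL
# Assumes neither PATTERN nor REPLACEMENT contains a comma or an ampersand,
# because the comma is used as the sed delimiter and & is special to sed.
remap_preview() {
  printf '%s\n' "$3" | sed -E "s,$1,$2,"
}

# Preview the remap from the command above for a single URL.
remap_preview '^https://docs\.example\.com' "file://$(pwd)/out" \
  'https://docs.example.com/page.html'
```

The printed result should be a `file://` URL ending in `/out/page.html`. URLs on other domains pass through unchanged because the pattern does not match them.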
| 74 | + |
| 75 | + |
| 76 | +## Mapping a Remote Subfolder to a Local Folder |
| 77 | + |
| 78 | +If, instead, your local folder will be uploaded to a _subdirectory_ of the website |
| 79 | +(rather than the domain root), you will need some more setup. |
| 80 | + |
| 81 | +Suppose that the local directory `out` will be uploaded to a subfolder at |
| 82 | +`https://example.com/docs/`. |
| 83 | + |
| 84 | +:::tip |
| 85 | +Try the "Simple Case" first, even if you're not sure which case to use. If all |
| 86 | +links check successfully, then it's all good! Otherwise, if you see "not found" |
| 87 | +errors or "root dir" errors, move on to [More Complex |
| 88 | +Cases](#more-complex-cases). |
| 89 | +::: |
| 90 | + |
| 91 | +### Simple Case ("Portable" Websites) |
| 92 | + |
| 93 | +If your website files are _portable_, then you can use a simple setup |
| 94 | +similar to the whole-domain case above. |
| 95 | + |
| 96 | +Portable means that the local folder could be uploaded to any path on any |
| 97 | +domain and all its pages would work correctly. This is common for HTML files |
| 98 | +generated by a documentation generator such as Doxygen or Javadoc. |
| 99 | + |
| 100 | +As a guide, a local folder is likely to be portable if: |
| 101 | +- the local folder contains all needed resources (e.g., CSS, JS, images), and |
| 102 | +- the local HTML files _do not_ use root-relative links (beginning with `/`). |
| 103 | + |
| 104 | +In this simple case, you can use: |
| 105 | +```bash |
| 106 | +lychee ./out --remap "^https://example\.com/docs file://$(pwd)/out" |
| 107 | +``` |
| 108 | +This remaps remote URLs within the `/docs` subpath into the local folder. |
| 109 | +`--root-dir` is intentionally omitted because root-relative links cannot work |
| 110 | +correctly without more setup—see below if you need root directory support. |
| 111 | + |
| 112 | +### More Complex Cases |
| 113 | + |
| 114 | +To make `--root-dir` work in this context, your folder structure has to mimic |
| 115 | +the structure of the remote website. We can make a "temporary root dir" which |
| 116 | +has the right structure and sits next to the original local folder. In this |
| 117 | +way, we avoid needing to change our existing local folder structure, and we can |
| 118 | +use symbolic links to point to the existing files. |
| 119 | +``` |
| 120 | +├── out |
| 121 | +│ └── page.html |
| 122 | +└── temp-root-dir |
| 123 | + └── docs -> ../out |
| 124 | +``` |
| 125 | +```bash |
| 126 | +mkdir temp-root-dir |
| 127 | +ln -s ../out temp-root-dir/docs |
| 128 | +``` |
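
If this setup runs repeatedly (for example, in CI), it may help to make it idempotent and to confirm that the link resolves before running lychee. The following is a minimal sketch under some assumptions: `make_temp_root` is a hypothetical helper, and unlike the relative `../out` link above it links to an absolute path, so that the link stays valid wherever the temporary directory lives.

```shell
# Hypothetical helper: build a temporary root dir whose SUBPATH links to LOCAL_DIR.
# Usage: make_temp_root LOCAL_DIR TEMP_ROOT SUBPATH
# SUBPATH is assumed to be a single path component such as "docs".
make_temp_root() {
  local_dir=$1 temp_root=$2 subpath=$3
  [ -d "$local_dir" ] || { echo "missing directory: $local_dir" >&2; return 1; }
  mkdir -p "$temp_root" || return 1
  # -f replaces a stale link; -n stops ln from following an existing link
  # and creating the new link inside the directory it points to.
  ln -sfn "$(cd "$local_dir" && pwd -P)" "$temp_root/$subpath" || return 1
  # Confirm that the link resolves to the same directory as LOCAL_DIR.
  [ "$(cd "$temp_root/$subpath" && pwd -P)" = "$(cd "$local_dir" && pwd -P)" ]
}
```

With the layout shown above, this would be called as `make_temp_root ./out ./temp-root-dir docs`. A non-zero exit status means the link was not created or does not resolve.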
| 129 | + |
| 130 | +Additionally, since the local folder is only a subset of the website, certain |
| 131 | +root-relative links should be treated as links to the online website (for example, |
| 132 | +the site root `/`). In effect, this means that paths inside `./temp-root-dir` |
| 133 | +but outside of `./temp-root-dir/docs` must be redirected to the online website. |
| 134 | + |
| 135 | +Putting it all together, the lychee command looks like this: |
| 136 | +```bash |
| 137 | +lychee ./out \ |
| 138 | + --root-dir ./temp-root-dir \ |
| 139 | + --remap "^https://example\.com/docs file://$(pwd)/out" \ |
| 140 | + --remap "file://$(pwd)/temp-root-dir/docs file://$(pwd)/temp-root-dir/docs" \ |
| 141 | + --remap "file://$(pwd)/temp-root-dir https://example.com" |
| 142 | +``` |
| 143 | +Note that the order of remaps is significant—earlier remaps are tried |
| 144 | +first and have priority over later ones. |
| 145 | + |
| 146 | +## Limitations |
| 147 | + |
| 148 | +- Remaps are applied textually. As an example, the remap |
| 149 | + ``` |
| 150 | + --remap "^https://example\.com/docs file://$(pwd)/out" |
| 151 | + ``` |
| 152 | + applies to any URL _beginning_ with that string even if it's inside a different subfolder. |
| 153 | + For instance, it would also apply to a URL of `https://example.com/docs-2/page`. |
| 154 | + |
| 155 | + If you need to guard against this, you can change the regex to end with |
| 156 | + `([?#/]|$)` and add `$1` to the replacement, like so: |
| 157 | + ``` |
| 158 | + --remap "^https://example\.com/docs([?#/]|$) file://$(pwd)/out\$1" |
| 159 | + ``` |
| 160 | + Note that the `$1` must be escaped to avoid being treated as a shell |
| 161 | + variable. |
| 162 | + |
| 163 | +- Remap patterns are regular expressions, so many common URL symbols |
| 164 | + should be escaped to avoid being treated as regex metacharacters |
| 165 | + (including `.?$+` and brackets). For example, the remaps in this |
| 166 | + page use `\.` in domain names. |
| 167 | + |
| 168 | +- If you are using remaps for multiple purposes, be aware of potential |
| 169 | + conflicts between them. For each URL, remaps are tried in order and the |
| 170 | + *first* matching remap will be applied. |
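
The prefix-matching limitation in the first item can be explored without running lychee by testing candidate patterns against sample URLs. This sketch uses `grep -E` as a stand-in. It assumes these simple patterns behave the same way in POSIX extended regular expressions as they do in lychee's regex engine. The `matches` helper and the `loose` and `strict` variable names are hypothetical.

```shell
# Hypothetical helper: succeed if URL matches PATTERN (uses grep -E, not lychee).
# Usage: matches PATTERN URL
matches() {
  printf '%s\n' "$2" | grep -Eq -- "$1"
}

# The unguarded pattern and the guarded variant from the first item above.
loose='^https://example\.com/docs'
strict='^https://example\.com/docs([?#/]|$)'

for url in \
  'https://example.com/docs/page' \
  'https://example.com/docs' \
  'https://example.com/docs-2/page'
do
  matches "$loose" "$url" && l=yes || l=no
  matches "$strict" "$url" && s=yes || s=no
  printf '%s loose=%s strict=%s\n' "$url" "$l" "$s"
done
```

Only the `docs-2` URL should differ between the two patterns: the unguarded pattern matches it, while the guarded one does not.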
| 171 | + |
| 172 | +## See Also |
| 173 | + |
| 174 | +If your URLs make use of automatic index files or automatic file extensions, see |
| 175 | +[Pretty URLs](/recipes/pretty-urls/) to enable the same features for local |
| 176 | +files. |
| 177 | + |
| 178 | +This documentation page was motivated by certain issue reports |
| 179 | +([#1918](https://github.com/lycheeverse/lychee/issues/1918), |
| 180 | +[#1594](https://github.com/lycheeverse/lychee/issues/1594)). |
| 181 | +In particular, the UX/documentation issue was discussed in |
| 182 | +[#1718](https://github.com/lycheeverse/lychee/issues/1718). |
| 183 | +These links are for historical background and might not reflect |
| 184 | +the current version of lychee. |
| 185 | + |