Description
Background
Currently there is a feature to crawl through rendered HTML for additional links.
It is enabled by passing in a boolean.
Issue #127 was opened asking for the ability to exclude some domains from being rendered.
Change
We could change the crawl option from a boolean to an optional function that the consumer passes in to decide which links should be rendered.
Option 1: On all HTML
The crawl function could be called once a render is complete.
It would be responsible for finding all links on the page and returning an array of new pages to render.
Optionally, we could add a getHrefsFromHtml convenience function to save each consumer from writing this parser.
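To make the idea concrete, here is a rough sketch of what such a helper might look like. It is only an illustration: getHrefsFromHtml does not exist in the plugin today, and the regex-based extraction is an assumption chosen to keep the example dependency-free. A real implementation would more likely use a proper HTML parser.

```javascript
// Hypothetical helper (not part of the plugin): collect the href values of
// <a> tags in an HTML string. Only quoted href attributes are handled.
function getHrefsFromHtml(html) {
  const anchorHref = /<a\b[^>]*?\shref\s*=\s*(?:"([^"]*)"|'([^']*)')/gi;
  const hrefs = [];
  let match;
  while ((match = anchorHref.exec(html)) !== null) {
    // Group 1 holds double-quoted values, group 2 single-quoted ones.
    hrefs.push(match[1] !== undefined ? match[1] : match[2]);
  }
  return hrefs;
}

// Example input: one internal link and one link to a domain we want to skip.
const html =
  '<a href="/about">About</a> <a class="x" href=\'https://bad.com/x\'>Bad</a>';
const pages = getHrefsFromHtml(html).filter(href => !href.includes('bad.com'));
```

With a helper along these lines, the consumer's crawl function stays a one-liner and the filtering logic remains entirely in their hands.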
import StaticSiteGeneratorPlugin, { getHrefsFromHtml } from 'static-site-generator-webpack-plugin';
new StaticSiteGeneratorPlugin({
  // Return the pages to render next, skipping any link to bad.com
  crawl: ({ html }) =>
    getHrefsFromHtml(html).filter(href => !href.includes('bad.com'))
});
Option 2: On each link
The crawl function could be called once for each link found after we've parsed the rendered HTML, returning true if that link should be rendered.
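As a sketch of how the plugin side of this option might work, the function below applies a consumer-supplied predicate to links that have already been extracted. The name selectLinksToRender is hypothetical, and treating index as the link's position among the links found on the page is an assumption, since the proposal does not define it.

```javascript
// Hypothetical plugin-side helper: decide which extracted links to render.
// hrefs: links already parsed out of one rendered page
// html:  the rendered HTML those links came from
// crawl: optional consumer predicate receiving { href, html, index }
function selectLinksToRender(hrefs, html, crawl) {
  // Assumption: with no crawl function supplied, no links are followed.
  if (typeof crawl !== 'function') return [];
  return hrefs.filter((href, index) => Boolean(crawl({ href, html, index })));
}

const renderedHtml = '<a href="/docs">Docs</a><a href="https://bad.com">Bad</a>';
const found = ['/docs', 'https://bad.com'];
const toRender = selectLinksToRender(
  found,
  renderedHtml,
  ({ href }) => !href.includes('bad.com')
);
```

Compared with Option 1, the consumer never sees the full list of links here, so the plugin keeps control of parsing and the consumer only answers yes or no per link.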
import StaticSiteGeneratorPlugin from 'static-site-generator-webpack-plugin';
new StaticSiteGeneratorPlugin({
  // Return true to render the linked page, false to skip it
  crawl: ({ href, html, index }) => !href.includes('bad.com')
});
Feedback
Feedback is welcome. Please comment below with your thoughts.