# proof-html

proof-html is a [GitHub Action](https://github.com/features/actions) to validate HTML and CSS using the [Nu HTML Validator](https://github.com/validator/validator) and check links, images, and more using [HTMLProofer](https://github.com/gjtorikian/html-proofer).

## Usage

```yaml
- uses: step-security/proof-html@v2
  with:
    directory: ./site
```

See below for a [full example](#full-example).

## Options

| Name | Description | Default |
| --- | --- | --- |
| `directory` | The directory to scan | (required) |
| `check_html` | Validate HTML | true |
| `check_css` | Validate CSS | true |
| `validator_ignore` | Regex of HTML/CSS validator errors to ignore | (empty) |
| `check_external_hash` | Check whether external anchors exist | true |
| `check_favicon` | Check whether favicons are valid | true |
| `check_opengraph` | Check images and URLs in Open Graph metadata | true |
| `ignore_empty_alt` | Allow images with empty `alt` attributes | false |
| `ignore_missing_alt` | Allow images with missing `alt` attributes | false |
| `allow_missing_href` | Allow anchors with missing `href` attributes | false |
| `enforce_https` | Require that links use HTTPS | true |
| `swap_urls` | JSON-encoded map of URL rewrite rules | (empty) |
| `disable_external` | Disable the external link checker | false |
| `ignore_url` | Newline-separated list of URLs to ignore | (empty) |
| `ignore_url_re` | Newline-separated list of URL regexes to ignore | (empty) |
| `connect_timeout` | HTTP connection timeout, in seconds | 30 |
| `tokens` | JSON-encoded map of domains to authorization tokens | (empty) |
| `max_concurrency` | Maximum number of concurrent requests | 50 |
| `timeout` | HTTP request timeout, in seconds | 120 |
| `retries` | Number of times to retry checking links | 3 |

Most of the options correspond directly to [configuration options for
HTMLProofer](https://github.com/gjtorikian/html-proofer#configuration).

**validator_ignore**

`validator_ignore` is a _regex pattern_ of HTML/CSS validation errors to
ignore, corresponding to the [`--filterpattern`
option](https://github.com/validator/validator?tab=readme-ov-file#--filterpattern-regexp)
of the Nu validator.

For example, you might see the following errors:

```
"file:/build/index.html":0.1-0.6: error: Start tag seen without seeing a doctype first. Expected “<!DOCTYPE html>”.
"file:/build/index.html":1.9-1.15: error: Element “head” is missing a required instance of child element “title”.
"file:/build/style.css":2.8-2.8: error: CSS: “foo”: Property “foo” doesn't exist.
```

To ignore the first error and all errors about non-existent CSS properties,
you could set the `validator_ignore` argument to:

```
Start tag seen without seeing a doctype first.*|CSS: “.*”: Property “.*” doesn't exist.
```
| 65 | + |
| 66 | +**tokens** |
| 67 | + |
| 68 | +`tokens` is a _JSON-encoded_ map of domains to authorization tokens. So it's |
| 69 | +"doubly encoded": the workflow file is written in YAML and `tokens` is a string |
| 70 | +(not a map!), a JSON encoding of the data. This option can be used to provide |
| 71 | +bearer tokens to use in certain scenarios, which is useful for e.g. avoiding |
| 72 | +rate limiting. Tokens are only sent to the specified websites. Note that |
| 73 | +domains must not have a trailing slash. Here is an example of an encoding of |
| 74 | +tokens: |
| 75 | + |
| 76 | +```yaml |
| 77 | +tokens: | |
| 78 | + {"https://github.com": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx", |
| 79 | + "https://twitter.com": "yyyyyyyyyyyyyyyyyyyyyyy"} |
| 80 | +``` |
| 81 | + |
| 82 | +You can also see the full example below for how to pass on the `GITHUB_TOKEN` |
| 83 | +supplied by the workflow runner. |
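
When experimenting outside a workflow, the same JSON string has to be produced by hand. As a minimal sketch, the hypothetical `make_tokens` helper below (not part of proof-html) builds a one-entry string and rejects a domain with a trailing slash. It assumes the domain and token contain no quotes or backslashes, so it does no JSON escaping.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of proof-html): build a one-entry tokens string.
# Assumes neither argument contains quotes or backslashes (no JSON escaping).
make_tokens() {
  local domain="$1" token="$2"
  case "$domain" in
    */)
      # The README notes that domains must not have a trailing slash.
      echo "domain must not have a trailing slash: $domain" >&2
      return 1
      ;;
  esac
  printf '{"%s": "%s"}\n' "$domain" "$token"
}

make_tokens "https://github.com" "your-token-here"
```

The result is a plain string, which is the form the option expects.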

**swap_urls**

`swap_urls` is a _JSON-encoded_ map from regexes to replacement strings. This
can be useful to strip a base path for an internal domain. For example:

```yaml
swap_urls: |
  {"^https:\\/\\/example\\.com\\/": "/"}
```

You can also use capture groups and back-references here. For example, to
skip hash checking for GitHub URLs (like
`https://github.com/step-security/proof-html#options`), you can use:

```yaml
swap_urls: |
  {"^(https:\\/\\/github\\.com\\/.*)#.*$": "\\1"}
```
| 103 | + |
| 104 | +## Full Example |
| 105 | + |
| 106 | +This is the entire `.github/workflows/build.yml` file for a GitHub Pages / |
| 107 | +[Jekyll](https://jekyllrb.com/docs/github-pages/) site. |
| 108 | + |
| 109 | +```yaml |
| 110 | +name: CI |
| 111 | +on: |
| 112 | + push: |
| 113 | + schedule: |
| 114 | + - cron: '0 8 * * 6' |
| 115 | +jobs: |
| 116 | + build: |
| 117 | + runs-on: ubuntu-latest |
| 118 | + steps: |
| 119 | + - uses: actions/checkout@v5 |
| 120 | + - uses: actions/setup-ruby@v1 |
| 121 | + with: |
| 122 | + ruby-version: 2.7.x |
| 123 | + - uses: actions/cache@v4 |
| 124 | + with: |
| 125 | + path: vendor/bundle |
| 126 | + key: ${{ runner.os }}-gems-${{ hashFiles('**/Gemfile.lock') }} |
| 127 | + restore-keys: | |
| 128 | + ${{ runner.os }}-gems- |
| 129 | + - run: | |
| 130 | + bundle config path vendor/bundle |
| 131 | + bundle install --jobs 4 --retry 3 |
| 132 | + - run: bundle exec jekyll build |
| 133 | + - uses: step-security/proof-html@v2 |
| 134 | + with: |
| 135 | + directory: ./_site |
| 136 | + enforce_https: false |
| 137 | + tokens: | |
| 138 | + {"https://github.com": "${{ secrets.GITHUB_TOKEN }}"} |
| 139 | + ignore_url: | |
| 140 | + http://www.example.com/ |
| 141 | + https://en.wikipedia.org/wiki/Main_Page |
| 142 | + ignore_url_re: | |
| 143 | + ^https://twitter.com/ |
| 144 | +``` |
| 145 | + |
## Running locally

You can build the Docker container locally with `docker build . -t proof-html`.

The GitHub Action passes arguments to the Docker container as strings through
environment variables: an argument like `ignore_url` becomes `INPUT_IGNORE_URL`
(uppercase the name and prepend `INPUT_`). When running the Docker container
locally, you need to do this translation yourself. You can mount a local
directory in the Docker container with the `-v` argument and pass the directory
name as the `INPUT_DIRECTORY` argument. For example, if you compiled a site into
the `build` directory, you can run:

```bash
docker run --rm \
  -e INPUT_DIRECTORY=build \
  -v "${PWD}/build:/build" \
  proof-html:latest
```

You can pass additional arguments as additional environment variables, e.g.
`-e INPUT_ENFORCE_HTTPS=0` or
`-e INPUT_TOKENS='{"https://github.com": "your-token-here"}'`.
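
The name translation described above is mechanical, so it can be scripted. The `input_env_name` function below is a hypothetical convenience, not something shipped with proof-html. It only uppercases the option name and prepends `INPUT_`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: derive the container's environment variable name
# from an option name by uppercasing it and prepending INPUT_.
input_env_name() {
  printf 'INPUT_%s\n' "$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')"
}

input_env_name ignore_url        # prints INPUT_IGNORE_URL
input_env_name connect_timeout   # prints INPUT_CONNECT_TIMEOUT
```

The output can then be used to build `-e NAME=value` arguments for `docker run`.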

## License

Copyright (c) Anish Athalye. Copyright (c) StepSecurity. Released under the MIT License. See
[LICENSE](LICENSE) for details.