Commit da2a27b

Improve local Dockerfile (and remove its HTTP server)
This changes the Docker-mediated build (i.e. build.sh --docker) to behave the same as the non-Docker-mediated version. In particular, it:

* Operates on local input/cache/output directories, instead of copying the input into the container every build, and managing cache and output entirely within the container
* No longer creates and exposes a HTTP server, since the output is written directly to the disk (outside the container)
* Respects all command-line options, including --no-update

Other improvements:

* Changes the strategy for clearing directories, since deleting the directory was causing some problems in my local testing
* Slims down the size of the Docker container significantly, by using debian:stable-slim, copying less of html-build in, and no longer copying the source directory in repeatedly
* Does not bother checking for local Wattsi/highlighter if we're going the Docker route

1 parent df0c72b commit da2a27b

5 files changed: +70 -108 lines changed

.dockerignore

Lines changed: 6 additions & 5 deletions
@@ -1,5 +1,6 @@
-.git/
-.cache/
-.temp/
-output/
-html/.git/
+*
+!entities/out
+!quotes/out
+!*.pl
+!build.sh
+!lint.sh

Dockerfile

Lines changed: 5 additions & 27 deletions
@@ -1,35 +1,13 @@
-FROM debian:stable
-
-## dependency installation: nginx and other build tools
+FROM debian:stable-slim
 RUN apt-get update && \
-    apt-get install -y ca-certificates curl git unzip nginx python3 python3-pip && \
-    rm -rf /etc/nginx/sites-enabled/* && \
+    apt-get install -y ca-certificates curl git unzip python3 python3-pip && \
     rm -rf /var/lib/apt/lists/*
 
 COPY --from=whatwg/wattsi:latest /whatwg/wattsi/bin/wattsi /bin/wattsi
 
-ADD . /whatwg/build
-
 RUN pip3 install bs-highlighter
 
-ARG html_source_dir
-ADD $html_source_dir /whatwg/html
-ENV HTML_SOURCE /whatwg/html
-
-WORKDIR /whatwg/build
-
-## build and copy assets to final nginx dir
-
-ARG verbose_or_quiet_flag
-ARG no_update_flag
-ARG sha_override
-
-# no_update_flag doesn't really work; .cache directory is re-created empty each time
-RUN SKIP_BUILD_UPDATE_CHECK=true SHA_OVERRIDE=$sha_override \
-    ./build.sh $verbose_or_quiet_flag $no_update_flag && \
-    rm -rf /var/www/html && \
-    mv output /var/www/html && \
-    chmod -R o+rX /var/www/html && \
-    cp site.conf /etc/nginx/sites-enabled/
+COPY . /whatwg/html-build/
 
-CMD ["nginx", "-g", "daemon off;"]
+ENV SKIP_BUILD_UPDATE_CHECK true
+ENTRYPOINT ["bash", "/whatwg/html-build/build.sh"]

README.md

Lines changed: 7 additions & 22 deletions
@@ -34,28 +34,9 @@ Run the `build.sh` script from inside your `html-build` working directory, like
 
 The first time this runs, it will ask for your input on where to clone the HTML source from, or where on your system to find it if you've already done that. If you're working to submit a pull request to [whatwg/html](https://github.com/whatwg/html), be sure to give it the URL of your fork.
 
-### Output
-
-After you complete the build steps above, the build will run and generate the single-page version of the spec, the multipage version, and more. If all goes well, you should very soon have all the following in your `output/` directory:
-
-- `404.html`
-- `demos/*`
-- `dev/*`
-- `entities.json`
-- `fonts/*`
-- `html-dfn.js`
-- `images/*`
-- `index.html`
-- `link-fixup.js`
-- `multipage/*`
-- `robots.txt`
-- `xrefs.json`
-
-Now you're ready to edit the `html/source` file—and after you make your changes, you can run the `build.sh` script again to see the new output.
-
 ## Building using a Docker container
 
-The Dockerized version of the build allows you to run the build entirely inside a "container" (lightweight virtual machine). This includes tricky dependencies like a local copy of Wattsi, as well an HTTP server setup similar to that of https://html.spec.whatwg.org.
+The Dockerized version of the build allows you to run the build entirely inside a "container" (lightweight virtual machine). This includes tricky dependencies like a local copy of Wattsi and Python.
 
 To perform a Dockerized build, use the `--docker` flag:
 

@@ -65,9 +46,13 @@ To perform a Dockerized build, use the `--docker` flag:
 
 The first time you do this, Docker will download a bunch of stuff to set up the container properly, but subsequent runs will simply build the standard and be very fast.
 
-After building the standard, this will launch a HTTP server that allows you to view the result at `http://localhost:8080`. (OS X and Windows users will need to use the IP address of their docker-machine VM instead of `localhost`. You can get this with the `docker-machine env` command.)
+If you get permissions errors on Windows, you need to first [configure](https://docs.docker.com/docker-for-windows/#file-sharing) your `html-build/` and `html/` directories to be shareable with Docker.
 
-Note that due to the way Docker works, the HTML source repository must be contained in a subdirectory of the `html-build` working directory. This will happen automatically if you let `build.sh` clone for you, but if you have a preexisting clone you'll need to move it.
+## Output
+
+After you complete the build steps above, the build will run and generate the single-page version of the spec, the multipage version, and more. If all goes well, you should very soon have an `output/` directory containing important files like `index.html`, `multipage/`, and `dev/`.
+
+Now you're ready to edit the `html/source` file—and after you make your changes, you can run the `build.sh` script again to see the new output.
 
 ## A note on Git history
 

build.sh

Lines changed: 52 additions & 50 deletions
@@ -36,14 +36,14 @@ export HTML_TEMP
 # Used specifically when the Dockerfile calls this script
 SKIP_BUILD_UPDATE_CHECK=${SKIP_BUILD_UPDATE_CHECK:-false}
 SHA_OVERRIDE=${SHA_OVERRIDE:-}
-HIGHLIGHT_SERVER_URL="http://127.0.0.1:8080" # this needs to be coordinated with the bs-highlighter package
+BUILD_SHA_OVERRIDE=${BUILD_SHA_OVERRIDE:-}
+
+# This needs to be coordinated with the bs-highlighter package
+HIGHLIGHT_SERVER_URL="http://127.0.0.1:8080"
 
 function main {
   processCommandLineArgs "$@"
 
-  checkWattsi
-  ensureHighlighterInstalled
-
   # $SKIP_BUILD_UPDATE_CHECK is set inside the Dockerfile so that we don't check for updates both inside and outside
   # the Docker container.
   if [[ $DO_UPDATE == "true" && $SKIP_BUILD_UPDATE_CHECK != "true" ]]; then
@@ -52,14 +52,22 @@ function main {
 
   findHTMLSource
 
-  HTML_GIT_DIR="$HTML_SOURCE/.git/"
-  HTML_SHA=${SHA_OVERRIDE:-$(git --git-dir="$HTML_GIT_DIR" rev-parse HEAD)}
+  clearDir "$HTML_OUTPUT"
+  # Set these up so rsync will not complain about either being missing
+  mkdir -p "$HTML_OUTPUT/commit-snapshots"
+  mkdir -p "$HTML_OUTPUT/review-drafts"
 
   if [[ $USE_DOCKER == "true" ]]; then
     doDockerBuild
     exit 0
   fi
 
+  checkWattsi
+  ensureHighlighterInstalled
+
+  HTML_GIT_DIR="$HTML_SOURCE/.git/"
+  HTML_SHA=${SHA_OVERRIDE:-$(git --git-dir="$HTML_GIT_DIR" rev-parse HEAD)}
+
   $QUIET || echo "Linting the source file..."
   ./lint.sh "$HTML_SOURCE/source" || {
     echo
@@ -71,11 +79,6 @@ function main {
 
   updateRemoteDataFiles
 
-  rm -rf "$HTML_OUTPUT" && mkdir -p "$HTML_OUTPUT"
-  # Set these up so rsync will not complain about either being missing
-  mkdir -p "$HTML_OUTPUT/commit-snapshots"
-  mkdir -p "$HTML_OUTPUT/review-drafts"
-
   startHighlightServer
 
   processSource "source" "default"
@@ -110,7 +113,7 @@ function processCommandLineArgs {
   do
     case $arg in
       clean)
-        rm -rf "$HTML_CACHE"
+        clearDir "$HTML_CACHE"
         exit 0
         ;;
       help)
@@ -120,7 +123,7 @@ function processCommandLineArgs {
         echo " $0 help Show this usage statement."
         echo
         echo "Build options:"
-        echo " -d|--docker Use Docker to build in and serve from a container."
+        echo " -d|--docker Use Docker to build in a container."
         echo " -n|--no-update Don't update before building; just build."
         echo " -q|--quiet Don't emit any messages except errors/warnings."
         echo " -v|--verbose Show verbose output from every build step."
@@ -182,9 +185,8 @@ function checkHTMLBuildIsUpToDate {
 # - Output:
 # - Either bs-highlighter-server will be in the $PATH, or a warning will be echoed
 function ensureHighlighterInstalled {
-  # If we're using Docker then this will be installed inside the container.
   # If we're not using local Wattsi then we won't use the local highlighter.
-  if [[ $USE_DOCKER != "true" && $LOCAL_WATTSI == "true" ]]; then
+  if [[ $LOCAL_WATTSI == "true" ]]; then
     if hash pip3 2>/dev/null; then
       if ! hash bs-highlighter-server 2>/dev/null; then
         pip3 install bs-highlighter
@@ -386,33 +388,23 @@ function relativePath {
 # Arguments: none
 # Output: A web server with the build output will be running inside the Docker container
 function doDockerBuild {
-  if [[ $HTML_SOURCE != $(pwd)/* ]]; then
-    echo "When using Docker, the HTML source must be checked out in a subdirectory of the html-build repo. Cannot continue."
-    exit 1
-  fi
-
-  # $SOURCE_RELATIVE helps on Windows with Git Bash, where /c/... is a symlink, which Docker doesn't like.
-  SOURCE_RELATIVE=$(relativePath "$(pwd)" "$HTML_SOURCE")
-
-  VERBOSE_OR_QUIET_FLAG=""
-  $QUIET && VERBOSE_OR_QUIET_FLAG+="--quiet"
-  $VERBOSE && VERBOSE_OR_QUIET_FLAG+="--verbose"
-
-  NO_UPDATE_FLAG="--no-update"
-  $DO_UPDATE && NO_UPDATE_FLAG=""
-
-  DOCKER_ARGS=( --tag whatwg-html \
-                --build-arg "html_source_dir=$SOURCE_RELATIVE" \
-                --build-arg "verbose_or_quiet_flag=$VERBOSE_OR_QUIET_FLAG" \
-                --build-arg "no_update_flag=$NO_UPDATE_FLAG" \
-                --build-arg "sha_override=$HTML_SHA" )
-  if $QUIET; then
-    DOCKER_ARGS+=( --quiet )
-  fi
-
-  docker build "${DOCKER_ARGS[@]}" .
-  echo "Running server on http://localhost:8080"
-  docker run --rm -it -p 8080:80 whatwg-html
+  DOCKER_BUILD_ARGS=( --tag whatwg-html )
+  $QUIET && DOCKER_BUILD_ARGS+=( --quiet )
+
+  docker build "${DOCKER_BUILD_ARGS[@]}" .
+
+  DOCKER_RUN_ARGS=( whatwg-html )
+  $QUIET && DOCKER_RUN_ARGS+=( --quiet )
+  $VERBOSE && DOCKER_RUN_ARGS+=( --verbose )
+  $DO_UPDATE || DOCKER_RUN_ARGS+=( --no-update )
+
+  # Pass in the html-build SHA (since there's no .git directory inside the container)
+  docker run --rm --interactive --tty \
+             --env "BUILD_SHA_OVERRIDE=$(git rev-parse HEAD)" \
+             --mount "type=bind,source=$HTML_SOURCE,destination=/whatwg/html-build/html,readonly=1" \
+             --mount "type=bind,source=$HTML_CACHE,destination=/whatwg/html-build/.cache" \
+             --mount "type=bind,source=$HTML_OUTPUT,destination=/whatwg/html-build/output" \
+             "${DOCKER_RUN_ARGS[@]}"
 }
 
 # Clears the $HTML_CACHE directory if the build tools have been updated since last run.
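
The flag forwarding in the new `doDockerBuild` relies on a Bash idiom: `$QUIET`, `$VERBOSE`, and `$DO_UPDATE` hold the strings `true` or `false`, which are run as commands, so `&&` and `||` append to the array only when the condition holds. The following is a minimal sketch of that idiom in isolation. The helper name `forwardedFlags` and its three positional arguments are hypothetical and do not appear in build.sh.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of build.sh): prints the build.sh flags that
# would be forwarded to the container for a given combination of settings.
# Arguments:
# - $1: quiet ("true" or "false")
# - $2: verbose ("true" or "false")
# - $3: do_update ("true" or "false")
function forwardedFlags {
  local quiet=$1 verbose=$2 do_update=$3
  local args=()

  # `true` and `false` are executed as commands here, so each append only
  # happens when the corresponding setting selects it.
  $quiet && args+=( --quiet )
  $verbose && args+=( --verbose )
  $do_update || args+=( --no-update )

  echo "${args[*]}"
}

forwardedFlags false false false # prints "--no-update"
```

Because the image's `ENTRYPOINT` is `build.sh`, everything that follows the image name in `docker run` is handed to the script as its arguments, which is how these flags reach the build inside the container.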
@@ -422,13 +414,12 @@ function doDockerBuild {
 function clearCacheIfNecessary {
   if [[ -d "$HTML_CACHE" ]]; then
     PREV_BUILD_SHA=$( cat "$HTML_CACHE/last-build-sha.txt" 2>/dev/null || echo )
-    CURRENT_BUILD_SHA=$( git rev-parse HEAD )
+    CURRENT_BUILD_SHA=${BUILD_SHA_OVERRIDE:-$(git rev-parse HEAD)}
 
     if [[ $PREV_BUILD_SHA != "$CURRENT_BUILD_SHA" ]]; then
       $QUIET || echo "Build tools have been updated since last run; clearing the cache..."
       DO_UPDATE=true
-      rm -rf "$HTML_CACHE"
-      mkdir -p "$HTML_CACHE"
+      clearDir "$HTML_CACHE"
       echo "$CURRENT_BUILD_SHA" > "$HTML_CACHE/last-build-sha.txt"
     fi
   else
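
The `${BUILD_SHA_OVERRIDE:-$(git rev-parse HEAD)}` expansion above uses the override when it is set and non-empty, and only falls back to the command substitution otherwise, so `git` does not need a `.git` directory inside the container. A small sketch of that fallback behavior follows. `currentSha` is a hypothetical stand-in for `git rev-parse HEAD` so that the example runs outside a Git checkout, and `resolveBuildSha` is likewise an illustrative name.

```shell
# Hypothetical stand-in for `git rev-parse HEAD`
currentSha() {
  echo "sha-from-git"
}

# Hypothetical wrapper around the expansion used in clearCacheIfNecessary
resolveBuildSha() {
  # The default after `:-` is only expanded when the variable is unset or empty
  echo "${BUILD_SHA_OVERRIDE:-$(currentSha)}"
}

BUILD_SHA_OVERRIDE=""
resolveBuildSha # prints "sha-from-git"

BUILD_SHA_OVERRIDE="da2a27b"
resolveBuildSha # prints "da2a27b"
```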
@@ -477,7 +468,7 @@ function updateRemoteDataFiles {
 # - Output:
 # - $HTML_OUTPUT will contain the built files
 function processSource {
-  rm -rf "$HTML_TEMP" && mkdir -p "$HTML_TEMP"
+  clearDir "$HTML_TEMP"
 
   $QUIET || echo "Pre-processing the source..."
   SOURCE_LOCATION="$1"
@@ -529,10 +520,9 @@ function processSource {
     cp -p "$HTML_TEMP/wattsi-output/xrefs.json" "$HTML_OUTPUT"
 
     # Multipage HTML and Dev Edition
-    rm -rf "$HTML_OUTPUT/multipage"
     mv "$HTML_TEMP/wattsi-output/multipage-html" "$HTML_OUTPUT/multipage"
     mv "$HTML_TEMP/wattsi-output/multipage-dev" "$HTML_OUTPUT/dev"
-    rm -rf "$HTML_TEMP"
+    clearDir "$HTML_TEMP"
 
     echo "User-agent: *
 Disallow: /commit-snapshots/
@@ -583,8 +573,7 @@ function checkWattsi {
 # - $HTML_TEMP/wattsi-output directory will contain the output from Wattsi on success
 # - $HTML_TEMP/wattsi-output.txt will contain the output from Wattsi, on both success and failure
 function runWattsi {
-  rm -rf "$2"
-  mkdir "$2"
+  clearDir "$2"
 
   WATTSI_ARGS=()
   if $QUIET; then
@@ -680,4 +669,17 @@ function stopHighlightServer {
   fi
 }
 
+# Ensures the given directory exists, but is empty
+# Arguments:
+# - $1: the directory to clear
+# Output: the directory will be empty (but guaranteed to exist)
+function clearDir {
+  # We use this implementation strategy, instead of `rm -rf`ing the directory, because deleting the
+  # directory itself can run into permissions issues, e.g. if the directory is open in another
+  # program, or in the Docker case where we have permission to write to the directory but not delete
+  # it.
+  mkdir -p "$1"
+  find "$1" -mindepth 1 -delete
+}
+
 main "$@"
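
To illustrate what `clearDir` guarantees, here is a small sketch that exercises it against a scratch directory: an existing directory is emptied (including dotfiles and nested content) without being removed, and a missing directory is created. The `demo_dir` variable and the file names are made up for the example. It assumes a `find` that supports `-mindepth` and `-delete`, as GNU and BSD `find` do.

```shell
# Same body as the clearDir added to build.sh
clearDir() {
  mkdir -p "$1"
  # -mindepth 1 skips the directory itself; -delete removes children depth-first
  find "$1" -mindepth 1 -delete
}

# Scratch area for the demonstration (hypothetical paths)
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/out/multipage"
touch "$demo_dir/out/index.html" "$demo_dir/out/.hidden" "$demo_dir/out/multipage/page.html"

clearDir "$demo_dir/out"        # existing directory: contents removed, directory kept
clearDir "$demo_dir/new/nested" # missing directory: created empty

ls -A "$demo_dir/out" # prints nothing
```

Keeping the directory itself in place matters for the Docker route, where `output/` and `.cache/` are bind-mount targets that the container can write into but cannot delete.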

site.conf

Lines changed: 0 additions & 4 deletions
This file was deleted.
