Description
I recently discovered that dispatch-rules and related forms make it hard to handle trailing /s on URLs sensibly.
The URL https://example.com/foo/ (with a trailing /) is treated differently than https://example.com/foo (without a trailing /). Strictly speaking, this is correct: those are two distinct URLs, and in principle one could serve completely unrelated content at each. In practice, though, users don't expect such URLs to be distinct. Serving different non-error content at one than the other is a terrible idea, and even returning a successful response from one variant but, e.g., a 404 error from the other (as my site was doing) is likely to cause confusion. I expect most programs will want to either redirect the less-preferred variant to the canonical variant or simply treat both variants as equivalent.
Unfortunately, web-server/dispatch doesn't provide a great way to implement such behavior. Here's an example in code of the current state of affairs:
```racket
#lang racket

(require web-server/dispatch
         web-server/http
         web-server/servlet-env
         net/url
         rackunit)

;; Curried handler: responds to any request with the given string.
(define ((handler str) req)
  (response/output
   (λ (out) (write-string str out))))

;; One rule per path shape, so the response body shows which rule matched.
(define start
  (dispatch-case
   [() (handler "a")]
   [("") (handler "b")]
   [("foo") (handler "c")]
   [("foo" "") (handler "d")]
   [("foo" "" "") (handler "e")]
   [("foo" "" "bar") (handler "f")]))

(define (do-request str)
  (port->string (get-pure-port (string->url str))))

(check-equal?
 (let ([th (thread
            (λ ()
              (serve/servlet start
                             #:servlet-regexp #rx""
                             #:banner? #f
                             #:launch-browser? #f)))])
   ;; Give the server time to start listening.
   (sleep 5)
   (begin0 (list (do-request "http://localhost:8000")
                 (do-request "http://localhost:8000/")
                 (do-request "http://localhost:8000/foo")
                 (do-request "http://localhost:8000/foo/")
                 (do-request "http://localhost:8000/foo//")
                 (do-request "http://localhost:8000/foo//bar"))
     (kill-thread th)))
 '("b" "b" "c" "d" "e" "f"))
```

This illustrates a few things:
- Because of how HTTP requests work, the root URL is always equivalent with or without a trailing / and has a single, empty path element (""). () is allowed as a pattern, but (due to the above) will never match anything. I think it might be better to make this a syntax error (or a warning that will become an error in a future release).
- A trailing / adds an empty path element ("") to the end of the URL.
- There is no cleanse-path-like case for multiple adjacent / separators.
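
To see where the empty path elements come from, it may help to look at how net/url splits a path, independently of the web server. This is only a small sketch: `path-elements` is a helper name I made up for illustration, and it inspects client-side parsing with `string->url` rather than what the dispatcher does internally.

```racket
#lang racket
(require net/url)

;; Hypothetical helper: the path elements of a URL string, as strings.
(define (path-elements str)
  (map path/param-path (url-path (string->url str))))

(path-elements "http://localhost:8000/")         ; '("")
(path-elements "http://localhost:8000/foo")      ; '("foo")
(path-elements "http://localhost:8000/foo/")     ; '("foo" "")
(path-elements "http://localhost:8000/foo//bar") ; '("foo" "" "bar")
```

These shapes line up with the dispatch-case patterns that matched in the test above.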
I have changed my application to handle trailing / separators by using dispatch-rules+applies instead of dispatch-rules (which involved removing my else clause). If the original request does not satisfy the predicate from dispatch-rules+applies, but a version with a normalized path would, the application responds with a redirect to the normalized path.
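
Roughly, the workaround looks like the sketch below. It is not my exact application code: `foo-page`, `normalize-url`, and `with-path` are names invented for this example, and "normalized" here is assumed to mean "all empty path elements dropped", which is only one possible policy.

```racket
#lang racket
(require web-server/dispatch
         web-server/http
         net/url)

;; Hypothetical page, standing in for the application's real handlers.
(define (foo-page req)
  (response/output (λ (out) (write-string "foo" out))))

(define-values (dispatch to-url applies?)
  (dispatch-rules+applies
   [("foo") foo-page]))

;; Drop empty path elements, so "/foo/" and "/foo//" become "/foo".
;; The root path stays a single "" element.
(define (normalize-url u)
  (define elems
    (filter (λ (pp) (not (equal? (path/param-path pp) "")))
            (url-path u)))
  (url (url-scheme u) (url-user u) (url-host u) (url-port u)
       (url-path-absolute? u)
       (if (null? elems) (list (path/param "" '())) elems)
       (url-query u) (url-fragment u)))

;; Copy of `req` with its URI replaced by `u`.
(define (with-path req u)
  (make-request (request-method req) u
                (request-headers/raw req)
                (request-bindings/raw-promise req)
                (request-post-data/raw req)
                (request-host-ip req) (request-host-port req)
                (request-client-ip req)))

(define (start req)
  (define normalized (normalize-url (request-uri req)))
  (cond
    ;; Only the normalized variant matches a rule: redirect to it.
    [(and (not (applies? req))
          (applies? (with-path req normalized)))
     (redirect-to (url->string normalized) permanently)]
    ;; Otherwise dispatch as usual; with no else clause, an unmatched
    ;; request falls through to the next dispatcher.
    [else (dispatch req)]))
```

The sketch always answers with a permanent redirect, regardless of the request method.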
I think it would be better to extend web-server/dispatch to support this sort of thing, but I'm not sure yet what the best way would be to do so. A few things I've been thinking about so far:
- The current language for dispatch-rules patterns doesn't have a notion of "splicing" patterns: each string literal or bidi-match-expander use applies to a single path element.
- Simply giving the same response to all variants seems relatively straightforward to add as a keyword option. However, that is (at least arguably) less optimal than redirecting to the canonical URL. Redirects open another can of worms, though, since 301 Moved Permanently has issues for methods other than GET and HEAD and 308 Permanent Redirect doesn't have a straightforward fallback for older browsers. I would want to do a 301 Moved Permanently for GET and HEAD requests and 307 Temporary Redirect for other methods, which RFC 7231 suggests is reasonable. The higher-level question is the trade-off between a simple API that "does the right thing" and configurability.
- Building on that theme, there are some cases other than trailing / support where a similar ability to handle variants on canonical URLs would be nice: for example, case-insensitivity. I can imagine possible extensions to the dispatch-rules API to add a general concept of non-canonical URLs, and it is appealing from one perspective to avoid making trailing / support a baked-in special feature. On the other hand, though, there seems a risk of getting beyond the scope of this library.
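
For concreteness, the status-code policy from the second point could be sketched as follows. This is not a proposed API: `canonical-redirect-status` and `redirect-to-canonical` are names made up for illustration, and the response is built by hand with `response/full` so the sketch does not depend on which redirect statuses a given web-server version provides.

```racket
#lang racket
(require web-server/http)

;; 301 for GET and HEAD; 307 for everything else, so that other
;; methods are not rewritten to GET by the client.
;; `method` is bytes, as returned by `request-method`.
(define (canonical-redirect-status method)
  (if (member method '(#"GET" #"HEAD")) 301 307))

;; Build a bodiless redirect response pointing at `location` (a string).
(define (redirect-to-canonical req location)
  (define code (canonical-redirect-status (request-method req)))
  (response/full code
                 (if (= code 301) #"Moved Permanently" #"Temporary Redirect")
                 (current-seconds)
                 #f
                 (list (header #"Location" (string->bytes/utf-8 location)))
                 '()))
```

A configurable version would presumably let the caller supply the method-to-status mapping, which is exactly the simplicity-versus-configurability trade-off mentioned above.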