8 changes: 5 additions & 3 deletions read.lisp

@@ -61,10 +61,12 @@ HTTP-REQUEST. Returns NIL if there is no such header amongst
 HEADERS."
   (when-let (content-type (header-value :content-type headers))
     (with-sequence-from-string (stream content-type)
-      (let* ((*current-error-message* "Corrupted Content-Type header:")
+      (let* ((*current-error-message* (format nil "Corrupted Content-Type header:(~s)" content-type))
              (type (read-token stream))
-             (subtype (and (assert-char stream #\/)
-                           (read-token stream)))
+             (subtype (let ((subtype-pos (position #\/ type :test #'char=)))
+                        (cond (subtype-pos
+                               (prog1 (subseq type (1+ subtype-pos))
+                                 (setf type (subseq type 0 subtype-pos)))))))
              (parameters (read-name-value-pairs stream)))
         (values type subtype parameters)))))

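The rewritten parser reads the whole media type as a single token and splits it on the slash itself, instead of asserting a slash mid-stream with ASSERT-CHAR. A standalone sketch of that splitting logic (the helper name is hypothetical, not part of Drakma's API):

```lisp
;; Hypothetical sketch of the type/subtype split used above:
;; given a token like "text/html", divide it at the first #\/.
(defun split-media-type (token)
  (let ((pos (position #\/ token :test #'char=)))
    (if pos
        (values (subseq token 0 pos) (subseq token (1+ pos)))
        (values token nil))))

;; (split-media-type "text/html") => "text", "html"
```

Because the split happens after tokenization, a header with no slash at all now yields a NIL subtype instead of signaling an error from ASSERT-CHAR.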
42 changes: 28 additions & 14 deletions request.lisp

@@ -223,12 +223,14 @@ headers of the chunked stream \(if any) as a second value."
       want-stream
       stream
       preserve-uri
+      (encode-unicode-path-p t)
       decode-content ; default to nil for backwards compatibility
       #+(or abcl clisp lispworks mcl openmcl sbcl)
       (connection-timeout 20)
       #+:lispworks (read-timeout 20)
       #+(and :lispworks (not :lw-does-not-have-write-timeout))
       (write-timeout 20 write-timeout-provided-p)
+      #+sbcl (io-timeout 20)
       #+:openmcl
       deadline
       &aux (unparsed-uri (if (stringp uri) (copy-seq uri) (puri:copy-uri uri))))
@@ -574,6 +576,10 @@ Any encodings in Transfer-Encoding, such as chunking, are always performed."
                                        connection-timeout
                                        :nodelay :if-supported)))
           raw-http-stream http-stream)
+    #+sbcl
+    (when io-timeout
+      (setf (sb-impl::fd-stream-timeout http-stream)
+            (coerce io-timeout 'single-float)))
     #+:openmcl
     (when deadline
       ;; it is correct to set the deadline here even though
@@ -653,20 +659,28 @@ Any encodings in Transfer-Encoding, such as chunking, are always performed."
           (puri:uri-query uri) nil))
     (write-http-line "~A ~A ~A"
                      (string-upcase method)
-                     (if (and preserve-uri
-                              (stringp unparsed-uri))
-                         (trivial-uri-path unparsed-uri)
-                         (puri:render-uri (if (and proxy
-                                                   (null stream)
-                                                   (not proxying-https-p)
-                                                   (not real-host))
-                                              uri
-                                              (make-instance 'puri:uri
-                                                             :path (puri:uri-path uri)
-                                                             :parsed-path (puri:uri-parsed-path uri)
-                                                             :query (puri:uri-query uri)
-                                                             :escaped t))
-                                          nil))
+                     (let ((uri-string (if (and preserve-uri
+                                                (stringp unparsed-uri))
+                                           (trivial-uri-path unparsed-uri)
+                                           (puri:render-uri (if (and proxy
+                                                                     (null stream)
+                                                                     (not proxying-https-p)
+                                                                     (not real-host))
+                                                                uri
+                                                                (make-instance 'puri:uri
+                                                                               :path (puri:uri-path uri)
+                                                                               :parsed-path (puri:uri-parsed-path uri)
+                                                                               :query (puri:uri-query uri)
+                                                                               :escaped t))
+                                                            nil))))
+                       (if encode-unicode-path-p
+                           (with-output-to-string (*standard-output*)
+                             (loop for c across uri-string
+                                   if (> (char-code c) 255)
+                                     ;; It's not a latin-1 character, so we need to encode it.
Member:
URLs must only contain US-ASCII characters, everything else must be encoded.

Author:
Drakma raised an exception when encountering URLs with Unicode characters in the path or the query parameters. To make URLs more accessible to non-English users, many websites have incorporated Unicode characters into these parts of their URLs, even though the HTTP protocol says a URL may only contain US-ASCII characters.

I wonder whether we need to support this inside Drakma; if not, I'll try to revert the related code change.

Member:
Hello @jingtaozf,

I understand what you're trying to accomplish. What I mean to say is that in the encoded URL, only US-ASCII characters are permitted, but you're checking for (> (char-code c) 255), which would pass non-US-ASCII characters as well. There also is the issue of determining the correct encoding for those characters. Nowadays, UTF-8 can mostly be assumed, but some web servers may actually try to use the Content-Type to determine the encoding. Some experimentation will be needed, I think.

In any case, I'd recommend that you check for (> (char-code c) 126) and percent-encode the character's UTF-8 octets.

-Hans

+                                     do (write-string (funcall url-encoder (format nil "~c" c) external-format-in))
+                                   else do (write-char c)))
+                           uri-string))
                      (string-upcase protocol))
     (when (not (assoc "Host" additional-headers :test #'string-equal))
       (write-header "Host" "~A~@[:~A~]" (puri:uri-host uri) (non-default-port uri)))
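The reviewer's recommendation (check for char codes above 126 and percent-encode as UTF-8) can be sketched as follows. This is an illustration, not the patch itself; the function name is hypothetical, and it assumes FLEXI-STREAMS (which Drakma already depends on) for producing the UTF-8 octets:

```lisp
;; Sketch of the suggested approach: leave US-ASCII characters
;; alone and percent-encode everything else as UTF-8 octets.
(defun percent-encode-non-ascii (string)
  (with-output-to-string (out)
    (loop for c across string
          if (> (char-code c) 126)
            ;; Encode the character as UTF-8, then emit each octet
            ;; as %XX with two uppercase hex digits.
            do (loop for octet across (flexi-streams:string-to-octets
                                       (string c) :external-format :utf-8)
                     do (format out "%~2,'0X" octet))
          else do (write-char c out))))

;; (percent-encode-non-ascii "/café") => "/caf%C3%A9"
```

Note this sketch deliberately does not touch characters like #\Space or reserved delimiters, which RFC 3986 also requires to be encoded in certain URI components; handling those is a separate question from the Unicode issue discussed above.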