Changes from 5 commits
2 changes: 1 addition & 1 deletion lib/webrick/httpservlet/cgihandler.rb
@@ -96,7 +96,7 @@ def do_GET(req, res)
   "Premature end of script headers: #{@script_filename}" if body.nil?

 begin
-  header = HTTPUtils::parse_header(raw_header)
+  header = HTTPUtils::parse_header(raw_header, cgi_mode: true)
   if /^(\d+)/ =~ header['status'][0]
     res.status = $1.to_i
     header.delete('status')
15 changes: 12 additions & 3 deletions lib/webrick/httputils.rb
@@ -168,17 +168,26 @@ def join(separator = "; ")
   "cookie" => CookieHeader,
 })

-def parse_header(raw)
+REGEXP_HEADER_LINE = /^([A-Za-z0-9!\#$%&'*+\-.^_`|~]+):([^\r\n\0]*?)\r\n\z/m
+REGEXP_CGI_HEADER_LINE = /^([A-Za-z0-9!\#$%&'*+\-.^_`|~]+):([^\r\n\0]*?)\r?\n\z/m
+REGEXP_CONTINUED_HEADER_LINE = /^[ \t]+([^\r\n\0]*?)\r\n/m
+REGEXP_CONTINUED_CGI_HEADER_LINE = /^[ \t]+([^\r\n\0]*?)\r?\n/m
+
+def parse_header(raw, cgi_mode: false)

Member

@ioquatix Mar 27, 2025

Is this a public interface change? or is it internal to CGIHandler?

Contributor

It is a public interface change. However, adding an optional keyword argument should be a backwards compatible change.
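
To illustrate the compatibility point, here is a minimal sketch. `DemoParser` and `parse_line` are hypothetical stand-ins, not WEBrick's code, and the keyword name `allow_bare_lf` is only illustrative. It shows that a call site passing just the positional argument behaves as before once an optional keyword with a default is added.

```ruby
# Hypothetical stand-in for a parser; this is not WEBrick's implementation.
module DemoParser
  module_function

  # `allow_bare_lf` is an illustrative optional keyword. Its default keeps
  # the strict (CRLF only) behavior that existing callers rely on.
  def parse_line(raw, allow_bare_lf: false)
    terminator = allow_bare_lf ? /\r?\n\z/ : /\r\n\z/
    raw.match?(terminator) ? raw.sub(terminator, "") : nil
  end
end

# An existing call site that passes only the positional argument is unaffected.
old_style = DemoParser.parse_line("Host: example.com\r\n")

# A new call site can opt in to accepting a bare LF terminator.
new_style = DemoParser.parse_line("Host: example.com\n", allow_bare_lf: true)

# Without opting in, a bare LF line is still rejected (nil here).
strict = DemoParser.parse_line("Host: example.com\n")
```

The sketch only covers direct callers; code that overrides or wraps the method with its own fixed signature would still need to be checked separately.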

Member

If that's the case, I'd like to set the bar a little higher on the naming of cgi_mode & related documentation.

Contributor

Considering the general state of WEBrick's documentation, lack of documentation hardly seems like a blocker (though documentation improvements are obviously welcomed). If you don't like the argument name, please pick a new one (allow_bare_lf?) and I'm sure we can switch to it.

Contributor Author

Thanks for the feedback. I've updated the comment for parse_header.

Member

> The separate regexes are a performance optimization, so we don't need to allocate 2 regex per call.

If these are an implementation detail, can we make them private?

> Yes, but that's also true of many methods in Ruby, so I don't see why it should be a blocker.

It's not, it's an observation to explain my position.

> I don't want to make structural changes when they aren't necessary to fix a bug.

Sometimes the shortest path from A to B is not the best one.

As this is a CGI-specific code path, my preference is for this code not to leak outside CGIHandler. I'd like to hear back from @paulownia, but Jeremy, I don't mind if you merge this after that. I am not planning on fixing WEBrick's design issues.

Contributor Author

Thank you for the detailed explanation. I understand the point about separating line reading from header parsing, and keeping CGI-specific code within CGIHandler. I agree that a cleaner design would be ideal if possible.

Since no major design changes are required, I will move the CGI-related code. But would simply moving the two constants to CGIHandler be sufficient? I'm not sure this is the best approach—any suggestions?

Member

Yes and marking them as private would also be a good idea.

Contributor Author

I fixed the code, but it seems better to pass the Regexp itself instead of cgi_mode. This way, we can use private_constant to make the constants completely private.
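
A rough sketch of what passing the Regexp could look like. Everything here is an assumption for illustration: `DemoUtils`, `DemoCGIHandler`, the constant names, and the positional `header_line` parameter are hypothetical, and the parsing is simplified (no continuation lines, no `HEADER_CLASSES`). The two patterns are the ones shown in the diff.

```ruby
# Hypothetical sketch; module, class, and parameter names are illustrative.
module DemoUtils
  STRICT_HEADER_LINE = /^([A-Za-z0-9!\#$%&'*+\-.^_`|~]+):([^\r\n\0]*?)\r\n\z/m

  module_function

  # Assumed signature: the caller may supply its own header line pattern.
  def parse_header(raw, header_line = STRICT_HEADER_LINE)
    header = Hash.new { |hash, key| hash[key] = [] }
    raw.each_line do |line|
      match = header_line.match(line)
      header[match[1].downcase] << match[2].strip if match
    end
    header
  end
end

class DemoCGIHandler
  # The relaxed pattern lives with the CGI code and is hidden from outside.
  CGI_HEADER_LINE = /^([A-Za-z0-9!\#$%&'*+\-.^_`|~]+):([^\r\n\0]*?)\r?\n\z/m
  private_constant :CGI_HEADER_LINE

  def parse(raw)
    DemoUtils.parse_header(raw, CGI_HEADER_LINE)
  end
end

parsed = DemoCGIHandler.new.parse("Content-Type: text/plain\n")
```

With `private_constant`, referencing `DemoCGIHandler::CGI_HEADER_LINE` from outside the class raises `NameError`, so the CGI-specific pattern does not become part of the shared module's surface.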

Contributor Author

I realize I forgot to request a review. When you have time, could you take a look? I’d appreciate it!

   header = Hash.new([].freeze)
   field = nil
+
+  header_line = cgi_mode ? REGEXP_CGI_HEADER_LINE : REGEXP_HEADER_LINE
+  continued_header_lines = cgi_mode ? REGEXP_CONTINUED_CGI_HEADER_LINE : REGEXP_CONTINUED_HEADER_LINE
+
   raw.each_line{|line|
     case line
-    when /^([A-Za-z0-9!\#$%&'*+\-.^_`|~]+):([^\r\n\0]*?)\r\n\z/om
+    when header_line
       field, value = $1, $2
       field.downcase!
       header[field] = HEADER_CLASSES[field].new unless header.has_key?(field)
       header[field] << value
-    when /^[ \t]+([^\r\n\0]*?)\r\n/om
+    when continued_header_lines
       unless field
         raise HTTPStatus::BadRequest, "bad header '#{line}'."
       end
2 changes: 1 addition & 1 deletion sig/httputils.rbs
@@ -26,7 +26,7 @@ module WEBrick

   HEADER_CLASSES: Hash[String, untyped]

-  def self?.parse_header: (String raw) -> Hash[String, Array[String]]
+  def self?.parse_header: (String raw, ?cgi_mode: bool) -> Hash[String, Array[String]]

   def self?.split_header_value: (String str) -> Array[String]
11 changes: 11 additions & 0 deletions test/webrick/test_cgi.rb
@@ -145,4 +145,15 @@ def test_bad_header
       assert_not_match(CtrlPat, s)
     }
   end
+
+  def test_bare_lf_in_cgi_header
+    TestWEBrick.start_cgi_server do |server, addr, port, log|
+      http = Net::HTTP.new(addr, port)
+      req = Net::HTTP::Get.new("/webrick_bare_lf.cgi")
+      assert_nothing_raised do
+        res = http.request(req)
+        assert_equal res['Content-Type'], 'text/plain'
+      end
+    end
+  end
 end
8 changes: 8 additions & 0 deletions test/webrick/webrick_bare_lf.cgi
@@ -0,0 +1,8 @@
+#!ruby
+
+body = "test for bare LF in cgi header"
+
+print "Content-Type: text/plain\n"
+print "Content-Length: #{body.size}\n"
+print "\n"
+print body
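
For reference, a small standalone check of how the two header line patterns from the `httputils.rb` diff treat a header like the one this script prints. The variable names are made up for the sketch; only the two regular expressions come from the diff.

```ruby
# Patterns copied from the diff; local variable names are illustrative.
strict_pattern  = /^([A-Za-z0-9!\#$%&'*+\-.^_`|~]+):([^\r\n\0]*?)\r\n\z/m
relaxed_pattern = /^([A-Za-z0-9!\#$%&'*+\-.^_`|~]+):([^\r\n\0]*?)\r?\n\z/m

crlf_line = "Content-Type: text/plain\r\n" # terminator used on the wire by HTTP
lf_line   = "Content-Type: text/plain\n"   # what the CGI script above prints

results = {
  strict_crlf:  strict_pattern.match?(crlf_line),
  strict_lf:    strict_pattern.match?(lf_line),
  relaxed_crlf: relaxed_pattern.match?(crlf_line),
  relaxed_lf:   relaxed_pattern.match?(lf_line),
}

results.each { |name, matched| puts "#{name}: #{matched}" }
```

Only the strict pattern rejects the bare LF line; the relaxed pattern accepts both terminators, which is the case `test_bare_lf_in_cgi_header` exercises.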