Commit 67685b7

Improve handling for wildcard URLs
fixes #38
1 parent f7c0f1a commit 67685b7

1 file changed: +12 -1 lines changed

lib/wayback_machine_downloader.rb

Lines changed: 12 additions & 1 deletion
@@ -184,18 +184,29 @@ def initialize params
   end
 
   def backup_name
-    url_to_process = @base_url.end_with?('/*') ? @base_url.chomp('/*') : @base_url
+    url_to_process = @base_url
+    url_to_process = url_to_process.chomp('/*') if url_to_process&.end_with?('/*')
+
     raw = if url_to_process.include?('//')
       url_to_process.split('/')[2]
     else
       url_to_process
     end
 
+    # if it looks like a wildcard pattern, normalize to a safe host-ish name
+    if raw&.start_with?('*.')
+      raw = raw.sub(/\A\*\./, 'all-')
+    end
+
     # sanitize for Windows (and safe cross-platform) to avoid ENOTDIR on mkdir (colon in host:port)
     if Gem.win_platform?
       raw = raw.gsub(/[:*?"<>|]/, '_')
       raw = raw.gsub(/[ .]+\z/, '')
+    else
+      # still good practice to strip path separators (and maybe '*' for POSIX too)
+      raw = raw.gsub(/[\/:*?"<>|]/, '_')
     end
+
     raw = 'site' if raw.nil? || raw.empty?
     raw
   end
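
To see what names the updated method would produce, the logic in the hunk above can be lifted into a standalone function. This is only a sketch: `backup_name_for` and its `windows:` keyword are hypothetical stand-ins for the instance method, `@base_url`, and `Gem.win_platform?`, and the sample URLs are made up for illustration.

```ruby
# Standalone sketch of the backup_name logic after this commit.
# ASSUMPTION: `backup_name_for` and `windows:` are hypothetical; the real
# method reads @base_url and calls Gem.win_platform? directly.
def backup_name_for(base_url, windows: false)
  # Drop a trailing "/*" wildcard suffix, if present.
  url_to_process = base_url
  url_to_process = url_to_process.chomp('/*') if url_to_process&.end_with?('/*')

  # With a scheme ("//"), take the host[:port] segment; otherwise use the input as is.
  raw = if url_to_process.include?('//')
    url_to_process.split('/')[2]
  else
    url_to_process
  end

  # Leading "*." wildcard host becomes an "all-" prefix.
  raw = raw.sub(/\A\*\./, 'all-') if raw&.start_with?('*.')

  if windows
    # Replace characters Windows rejects, then strip trailing spaces and dots.
    raw = raw.gsub(/[:*?"<>|]/, '_')
    raw = raw.gsub(/[ .]+\z/, '')
  else
    # Same character set plus the "/" path separator.
    raw = raw.gsub(/[\/:*?"<>|]/, '_')
  end

  raw = 'site' if raw.nil? || raw.empty?
  raw
end

puts backup_name_for('https://example.com/*')       # => example.com
puts backup_name_for('https://*.example.com/*')     # => all-example.com
puts backup_name_for('*.example.com')               # => all-example.com
puts backup_name_for('http://localhost:8080/docs')  # => localhost_8080
puts backup_name_for('example.com/blog')            # => example.com_blog
```

Note that the `&.` guards only cover `end_with?` and `start_with?`. The `include?` and `gsub` calls are unguarded, so in this sketch a `nil` input raises `NoMethodError` before the `'site'` fallback is reached.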
