Description
The first time I used ParamSpider, it raised a ValueError when reading the port of a URL parsed with urlparse. Here is the traceback:
Traceback (most recent call last):
File "", line 198, in _run_module_as_main
File "", line 88, in run_code
File "C:\Users\xxx\AppData\Local\Programs\Python\Python311\Scripts\paramspider.exe_main.py", line 7, in
File "C:\Users\xxx\AppData\Local\Programs\Python\Python311\Lib\site-packages\paramspider\main.py", line 161, in main
fetch_and_clean_urls(domain, extensions, args.stream, args.proxy, args.placeholder)
File "C:\Users\xxx\AppData\Local\Programs\Python\Python311\Lib\site-packages\paramspider\main.py", line 100, in fetch_and_clean_urls
cleaned_urls = clean_urls(urls, extensions, placeholder)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxx\AppData\Local\Programs\Python\Python311\Lib\site-packages\paramspider\main.py", line 71, in clean_urls
cleaned_url = clean_url(url)
^^^^^^^^^^^^^^
File "C:\Users\xxx\AppData\Local\Programs\Python\Python311\Lib\site-packages\paramspider\main.py", line 53, in clean_url
if (parsed_url.port == 80 and parsed_url.scheme == "http") or (parsed_url.port == 443 and parsed_url.scheme == "https"):
^^^^^^^^^^^^^^^
File "C:\Users\xxx\AppData\Local\Programs\Python\Python311\Lib\urllib\parse.py", line 184, in port
raise ValueError("Port out of range 0-65535")
ValueError: Port out of range 0-65535
I think the Wayback Archive data is an unreliable source: judging by the traceback, at least one fetched URL carries a port outside 0-65535. An additional filtering layer may be needed to make sure the data is usable, but this is my first time using the tool, so I can't tell how common this is. You may have a better solution, so I won't add unnecessary details.
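For illustration only, here is a rough sketch of the kind of filtering I mean. The helper names `has_valid_port` and `drop_malformed_urls` are hypothetical and not part of ParamSpider; the only library behaviour relied on is that reading `.port` on a `urlparse` result raises ValueError for an out-of-range port, as the traceback shows.

```python
from urllib.parse import urlparse


def has_valid_port(url):
    """Return True if the URL parses and its port (if any) is usable.

    Hypothetical helper, not part of ParamSpider.
    """
    try:
        # Reading .port is what raises ValueError in the traceback above,
        # so touching it here surfaces the problem early.
        urlparse(url).port
    except ValueError:
        return False
    return True


def drop_malformed_urls(urls):
    """Keep only the URLs whose port can be read without an error."""
    return [url for url in urls if has_valid_port(url)]


if __name__ == "__main__":
    sample = [
        "http://example.com:8080/page?id=1",
        "http://example.com:99999/page?id=1",  # port out of range
    ]
    print(drop_malformed_urls(sample))
```

Running such a filter over the fetched URLs before they are cleaned would skip the bad entries instead of aborting the whole run. Whether to drop them silently or log them is a design choice I leave to you.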