Description
This error does not occur on every run. It appears at random, so I can never predict when it will happen.
It occurs when the code uses CDP to clear all of a site's storage.
Before executing the CDP command, I run JavaScript in the site's top-level context that blocks all requests, to stop new cookies or other data from being generated.
I chose this approach instead of disabling the network through CDP because, when the network is disabled through CDP before a new page is accessed, it has to be enabled again afterwards, and in my tests disabling the network through CDP was not as effective for my purpose.
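For comparison, the CDP-based alternative mentioned above could look roughly like the sketch below. It assumes that "disabling the network" means toggling offline mode with Network.emulateNetworkConditions, which may not be the exact command that was tried. The helper names offline_params and set_offline are hypothetical, and driver is any object exposing Selenium's execute_cdp_cmd.

```python
def offline_params(offline: bool) -> dict:
    """Build arguments for the CDP command Network.emulateNetworkConditions."""
    return {
        "offline": offline,
        "latency": 0,              # no extra latency
        "downloadThroughput": -1,  # -1 disables download throttling
        "uploadThroughput": -1,    # -1 disables upload throttling
    }


def set_offline(driver, offline: bool) -> None:
    """Toggle emulated offline mode (hypothetical helper, not from the report)."""
    driver.execute_cdp_cmd("Network.emulateNetworkConditions", offline_params(offline))
```

With this approach, set_offline(driver, True) would be called before clearing storage and set_offline(driver, False) before the next driver.get, which is the extra re-enabling step described above.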
After this, the script attempts to delete all of the site's storage:
driver.execute_cdp_cmd("Storage.clearDataForOrigin", {
    "origin": "https://site.com",  # Specify the site's origin
    "storageTypes": "all"
})
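The origin passed to Storage.clearDataForOrigin is a scheme plus host, without a path. As a minimal sketch, the helpers below derive it from a full page URL so that it cannot drift from the address actually loaded. The names origin_of and clear_origin_storage are hypothetical and not part of Selenium. Default ports are not normalized here, and storage_types is passed through unchanged as a comma-separated list of storage types (or "all").

```python
from urllib.parse import urlsplit


def origin_of(url: str) -> str:
    """Return 'scheme://host[:port]' for an absolute URL."""
    parts = urlsplit(url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {url!r}")
    return f"{parts.scheme}://{parts.netloc}"


def clear_origin_storage(driver, url: str, storage_types: str = "all") -> dict:
    """Send Storage.clearDataForOrigin for the origin of `url`.

    `driver` is any object exposing Selenium's execute_cdp_cmd.
    Returns the parameters that were sent, which is handy for logging.
    """
    params = {"origin": origin_of(url), "storageTypes": storage_types}
    driver.execute_cdp_cmd("Storage.clearDataForOrigin", params)
    return params
```

For example, clear_origin_storage(driver, driver.current_url) would clear storage for whichever site is currently loaded.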
Most of the time the cleanup succeeds, but sometimes it raises this error:
urllib3.exceptions.ReadTimeoutError
HTTPConnectionPool(host='localhost', port=62196): Read timed out. (read timeout=120)
driver = webdriver.Chrome()
After the 'Read timed out' error, the program becomes very slow and can no longer execute basic driver functions such as driver.get("https://site.com/").
Any subsequent driver command takes several minutes and always ends in an exception. For example, driver.get(site) can take up to 10 minutes before it finally raises. In other words, the session is unusable after the 'Read timed out' error.
The script only returns to normal if I go to the Chrome window started by the webdriver and reload the site manually by clicking the reload button.
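Until the cause is found, a script might contain the failure by discarding the stuck session instead of reusing it. The sketch below is only a possible workaround and does not address the underlying hang. It assumes that a freshly created driver is not affected, which has not been verified here. The name run_with_fresh_driver is hypothetical, make_driver is any zero-argument factory such as webdriver.Chrome, and the exception types to retry on are passed in by the caller.

```python
def run_with_fresh_driver(make_driver, action, timeout_errors=(TimeoutError,), max_attempts=3):
    """Run action(driver) on a new driver per attempt, retrying on timeouts.

    Every driver is quit after its attempt, whether it succeeded or not.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for _ in range(max_attempts):
        driver = make_driver()
        try:
            return action(driver)
        except timeout_errors as exc:
            last_error = exc  # remember the failure and retry with a new session
        finally:
            try:
                driver.quit()
            except Exception:
                pass  # a stuck session may also fail to quit cleanly
    raise last_error
```

In the scenario above, the call could be run_with_fresh_driver(webdriver.Chrome, my_cleanup, timeout_errors=(ReadTimeoutError,)), where my_cleanup is a hypothetical function taking the driver. Note that driver.quit() on a stuck session may itself wait for the same read timeout.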
Reproducible Code
from selenium import webdriver
from urllib3.exceptions import ReadTimeoutError

driver = webdriver.Chrome()
driver.execute_cdp_cmd("Network.enable", {})


def acessar_site():
    # Keep retrying until the page loads
    while True:
        try:
            driver.get("https://x.com/")
            print("Site accessed successfully!")
            break
        except Exception:
            print("Could not access the site, trying again")
            driver.stop_client()


def limpeza_completa_offline():
    try:
        acessar_site()
        # JavaScript that blocks all requests made by the page
        block_requests_script = """
        window.blockAllRequests = () => {
            // Replace fetch
            const originalFetch = window.fetch;
            window.fetch = (...args) => {
                console.log('fetch request blocked:', args);
                return new Promise(() => {}); // Return a Promise that never settles
            };
            // Replace XMLHttpRequest
            const originalXhrOpen = window.XMLHttpRequest.prototype.open;
            const originalXhrSend = window.XMLHttpRequest.prototype.send;
            window.XMLHttpRequest.prototype.open = function(method, url, ...rest) {
                this._url = url;
                originalXhrOpen.call(this, method, url, ...rest);
            };
            window.XMLHttpRequest.prototype.send = function(...args) {
                if (this._url) {
                    console.log('XMLHttpRequest request blocked:', this._url);
                }
            };
            // Replace EventSource (SSE)
            const originalEventSource = window.EventSource;
            window.EventSource = function(url, config) {
                console.log('EventSource request blocked:', url);
            };
            // Replace WebSocket
            const originalWebSocket = window.WebSocket;
            window.WebSocket = function(url, protocols) {
                console.log('WebSocket request blocked:', url);
                throw new Error("WebSocket blocked!");
            };
        };
        window.blockAllRequests();
        """
        print("Injecting JavaScript into the page context: blocking requests")
        # Run the JavaScript that blocks the requests
        driver.execute_script(block_requests_script)
        print("CDP command: clearing Local Storage, IndexedDB, Cache Storage, and stopping and unregistering Service Workers.")
        # Clear Local Storage, IndexedDB and Cache Storage, and stop and unregister the Service Workers
        driver.execute_cdp_cmd("Storage.clearDataForOrigin", {
            "origin": "https://x.com",  # Specify the site's origin
            "storageTypes": "all"
        })
    except ReadTimeoutError as RTE:
        print(f"Exception: {RTE}")
        print("Breakpoint; First try to execute the def in pdb and after this try to continue the execution of the script >> acessar_site()")
        breakpoint()
        acessar_site()


while True:
    # At some point the error appears
    limpeza_completa_offline()
Debugging Logs
HTTPConnectionPool(host='localhost', port=62196): Read timed out. (read timeout=120)