Description
What happened?
We run the Jenkins Acceptance Test harness. We recently migrated from Selenium 4.11.0 to 4.18.1 (jenkinsci/acceptance-test-harness#1499) but got stuck because of execution issues that were not occurring before.
Our environment runs subsets of the test suite inside a Docker environment:
- a container running the container image `selenium/standalone-firefox:4.18.1`
- another container running the test harness using Maven and targeting the Selenium server.
All tests run sequentially, so it is expected that only one selenium session is active at a given time.
Client and server are on the same Selenium version.
We managed to narrow down the issue to the 4.13.0 -> 4.14.1 selenium upgrade.
The symptoms are the following:
- Between 34 and 36 selenium sessions run successfully
- A new session is requested by the client
- Server HTTP logging indicates the session request is processed, a session is created and returned (`HTTP 200` on `POST /session`)
- However, the client never receives the response, and ends up timing out after 3 minutes
- Since the session is never used on server side, it ends up timing out after 5 minutes.
- Sometimes, after a longer while (~1 hour) some tests start working again, but then some later tests go back to the same problem and time out trying to grab a new session.
All the tests are launched using the same initialization sequence (link), so the problem is clearly not tied to the content of any one test.
I managed to find a workaround for this problem: I added a proxy server (`caddy reverse-proxy --from :4444 --to firefox:4444 --insecure`) between the client and the Selenium server, and the problem went away.
How can we reproduce the issue?
I never managed to reproduce the problem locally on a Mac, running the same tests against a local selenium server.
Even though I never tracked down the root cause, it is probably related to the network stack. The fact that it first appeared right after migrating to 4.14, which coincides with the switch to the JDK HTTP client, makes me think the two are related, even though the problem seems to affect the server side (a new JVM is forked for each new test class).
I'm hoping this eventually helps to diagnose a possible issue, and I'm leaving it here for posterity in case anyone faces a similar problem.
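A small probe along these lines might help separate a stalled HTTP exchange from a Selenium client problem. It is only a sketch: it uses the JDK `java.net.http.HttpClient` directly to call the standard WebDriver `GET /status` endpoint with a short timeout. The class name `StatusProbe`, the default URL `http://firefox:4444` (taken from the proxy command above), the timeout values, and the choice to pin HTTP/1.1 are assumptions for illustration, not something confirmed to affect this issue.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.net.http.HttpTimeoutException;
import java.time.Duration;

// Hypothetical diagnostic helper, not part of Selenium or the test harness.
class StatusProbe {

    // Builds a GET request for the WebDriver /status endpoint.
    // Pinning HTTP/1.1 is an assumption made to rule out protocol upgrades.
    static HttpRequest buildStatusRequest(String baseUrl, Duration timeout) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/status"))
                .version(HttpClient.Version.HTTP_1_1)
                .timeout(timeout)
                .GET()
                .build();
    }

    public static void main(String[] args) throws Exception {
        // Pass the proxy URL as the first argument to compare both paths.
        String baseUrl = args.length > 0 ? args[0] : "http://firefox:4444";
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(5))
                .build();
        HttpRequest request = buildStatusRequest(baseUrl, Duration.ofSeconds(10));

        long start = System.nanoTime();
        try {
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println("HTTP " + response.statusCode() + " in " + elapsedMs + " ms");
        } catch (HttpTimeoutException e) {
            // Covers both connect timeouts and response timeouts.
            System.out.println("No response within the timeout: " + e.getMessage());
        }
    }
}
```

Running it from the test-harness container, once directly against the server and once through the proxy, at the moment sessions start hanging could show whether plain requests stall as well.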
Relevant log output
Client side exception
Caused by: org.openqa.selenium.SessionNotCreatedException: Could not start a new session. Possible causes are invalid address of the remote server or browser start-up failure.
Host info: host: '739ec46b65a0', ip: '172.18.0.3'
at org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:537)
at org.openqa.selenium.remote.RemoteWebDriver.startSession(RemoteWebDriver.java:233)
at org.openqa.selenium.remote.RemoteWebDriver.<init>(RemoteWebDriver.java:162)
at org.openqa.selenium.remote.RemoteWebDriver.<init>(RemoteWebDriver.java:142)
at org.jenkinsci.test.acceptance.FallbackConfig.buildRemoteWebDriver(FallbackConfig.java:162)
at org.jenkinsci.test.acceptance.FallbackConfig.createWebDriver(FallbackConfig.java:149)
at org.jenkinsci.test.acceptance.FallbackConfig.createWebDriver(FallbackConfig.java:315)
at org.jenkinsci.test.acceptance.FallbackConfig$$FastClassByGuice$$65dd6.GUICE$TRAMPOLINE(<generated>)
at org.jenkinsci.test.acceptance.FallbackConfig$$FastClassByGuice$$65dd6.apply(<generated>)
at com.google.inject.internal.ProviderMethod$FastClassProviderMethod.doProvision(ProviderMethod.java:260)
at com.google.inject.internal.ProviderMethod.doProvision(ProviderMethod.java:171)
at com.google.inject.internal.InternalProviderInstanceBindingImpl$CyclicFactory.provision(InternalProviderInstanceBindingImpl.java:185)
at com.google.inject.internal.InternalProviderInstanceBindingImpl$CyclicFactory.get(InternalProviderInstanceBindingImpl.java:162)
at com.google.inject.internal.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:40)
at org.jenkinsci.test.acceptance.guice.TestLifecycle.lambda$scope$0(TestLifecycle.java:58)
at com.google.inject.internal.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:45)
at com.google.inject.internal.SingleFieldInjector.inject(SingleFieldInjector.java:50)
at com.google.inject.internal.MembersInjectorImpl.injectMembers(MembersInjectorImpl.java:146)
at com.google.inject.internal.ConstructorInjector.provision(ConstructorInjector.java:124)
at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:91)
at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:300)
at com.google.inject.internal.InjectorImpl$1.get(InjectorImpl.java:1148)
... 25 more
Caused by: org.openqa.selenium.TimeoutException: java.util.concurrent.TimeoutException
Build info: version: '4.18.1', revision: 'b1d3319b48'
System info: os.name: 'Linux', os.arch: 'amd64', os.version: '5.15.146+', java.version: '17.0.10'
Driver info: driver.version: RemoteWebDriver
at org.openqa.selenium.remote.http.jdk.JdkHttpClient.execute0(JdkHttpClient.java:403)
at org.openqa.selenium.remote.http.AddSeleniumUserAgent.lambda$apply$0(AddSeleniumUserAgent.java:42)
at org.openqa.selenium.remote.http.Filter.lambda$andFinally$1(Filter.java:55)
at org.openqa.selenium.remote.http.jdk.JdkHttpClient.execute(JdkHttpClient.java:359)
at org.openqa.selenium.remote.tracing.TracedHttpClient.execute(TracedHttpClient.java:54)
at org.openqa.selenium.remote.ProtocolHandshake.createSession(ProtocolHandshake.java:114)
at org.openqa.selenium.remote.ProtocolHandshake.createSession(ProtocolHandshake.java:95)
at org.openqa.selenium.remote.ProtocolHandshake.createSession(ProtocolHandshake.java:67)
at org.openqa.selenium.remote.HttpCommandExecutor.execute(HttpCommandExecutor.java:162)
at org.openqa.selenium.remote.TracedCommandExecutor.execute(TracedCommandExecutor.java:51)
at org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:519)
... 46 more
Caused by: java.util.concurrent.TimeoutException
at java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1960)
at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2095)
at org.openqa.selenium.remote.http.jdk.JdkHttpClient.execute0(JdkHttpClient.java:386)
... 56 more
Server-side logging
13:38:07.643 INFO [LocalDistributor.newSession] - Session request received by the Distributor:
[Capabilities {acceptInsecureCerts: true, browserName: firefox, moz:debuggerAddress: true, moz:firefoxOptions: {prefs: {dom.disable_beforeunload: false, dom.max_chrome_script_run_time: 1500000, dom.max_script_run_time: 1500000, intl.accept_languages: en}}}]
13:38:09.783 INFO [LocalNode.newSession] - Session created by the Node. Id: 62f46cbb-85a4-4e39-9959-0747e54be196, Caps: Capabilities {acceptInsecureCerts: true, browserName: firefox, browserVersion: 123.0, moz:accessibilityChecks: false, moz:buildID: 20240213221259, moz:debuggerAddress: 127.0.0.1:12793, moz:firefoxOptions: {prefs: {dom.disable_beforeunload: false, dom.max_chrome_script_run_time: 1500000, dom.max_script_run_time: 1500000, intl.accept_languages: en}}, moz:geckodriverVersion: 0.34.0, moz:headless: false, moz:platformVersion: 5.15.146+, moz:processID: 13765, moz:profile: /tmp/rust_mozprofile9K3PEI, moz:shutdownTimeout: 60000, moz:webdriverClick: true, moz:windowless: false, pageLoadStrategy: normal, platformName: linux, proxy: Proxy(), se:bidiEnabled: false, se:cdp: ws://172.18.0.2:4444/sessio..., se:cdpVersion: 85.0, se:noVncPort: 7900, se:vnc: ws://172.18.0.2:4444/sessio..., se:vncEnabled: true, se:vncLocalAddress: ws://172.18.0.2:7900, setWindowRect: true, strictFileInteractability: false, timeouts: {implicit: 0, pageLoad: 300000, script: 30000}, unhandledPromptBehavior: dismiss and notify}
13:38:09.785 INFO [LocalDistributor.newSession] - Session created by the Distributor. Id: 62f46cbb-85a4-4e39-9959-0747e54be196
Caps: Capabilities {acceptInsecureCerts: true, browserName: firefox, browserVersion: 123.0, moz:accessibilityChecks: false, moz:buildID: 20240213221259, moz:debuggerAddress: 127.0.0.1:12793, moz:firefoxOptions: {prefs: {dom.disable_beforeunload: false, dom.max_chrome_script_run_time: 1500000, dom.max_script_run_time: 1500000, intl.accept_languages: en}}, moz:geckodriverVersion: 0.34.0, moz:headless: false, moz:platformVersion: 5.15.146+, moz:processID: 13765, moz:profile: /tmp/rust_mozprofile9K3PEI, moz:shutdownTimeout: 60000, moz:webdriverClick: true, moz:windowless: false, pageLoadStrategy: normal, platformName: linux, proxy: Proxy(), se:bidiEnabled: false, se:cdp: ws://172.18.0.2:4444/sessio..., se:cdpVersion: 85.0, se:noVncPort: 7900, se:vnc: ws://172.18.0.2:4444/sessio..., se:vncEnabled: true, se:vncLocalAddress: ws://172.18.0.2:7900, setWindowRect: true, strictFileInteractability: false, timeouts: {implicit: 0, pageLoad: 300000, script: 30000}, unhandledPromptBehavior: dismiss and notify}
13:38:09.786 INFO [SeleniumSpanExporter$1.lambda$export$3] - {"traceId": "069d010f096ae47834e48fd852c8ba33","eventTime": 1710941889785258977,"eventName": "HTTP request execution complete","attributes": {"http.flavor": 1,"http.handler_class": "org.openqa.selenium.grid.sessionqueue.local.LocalNewSessionQueue","http.host": "firefox:4444","http.method": "POST","http.request_content_length": "444","http.scheme": "HTTP","http.status_code": 200,"http.target": "\u002fsession","http.user_agent": "selenium\u002f4.18.1 (java unix)"}}
13:43:18.869 INFO [LocalNode.stopTimedOutSession] - Session id 62f46cbb-85a4-4e39-9959-0747e54be196 timed out, stopping...
13:43:19.185 INFO [LocalSessionMap.lambda$new$0] - Deleted session from local Session Map, Id: 62f46cbb-85a4-4e39-9959-0747e54be196
13:43:19.185 INFO [GridModel.release] - Releasing slot for session id 62f46cbb-85a4-4e39-9959-0747e54be196
13:43:19.185 INFO [SessionSlot.stop] - Stopping session 62f46cbb-85a4-4e39-9959-0747e54be196
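As a sanity check on the timings in the server log above, the gap between session creation (13:38:09.785) and the timeout (13:43:18.869) can be computed with `java.time`. The class and method names are made up for this illustration.

```java
import java.time.Duration;
import java.time.LocalTime;

// Hypothetical helper for reading the log timeline; not part of Selenium.
class SessionTimeline {

    // Elapsed time between two HH:mm:ss.SSS timestamps from the same day.
    static Duration elapsed(String from, String to) {
        return Duration.between(LocalTime.parse(from), LocalTime.parse(to));
    }

    public static void main(String[] args) {
        // Timestamps copied from the "Session created by the Distributor"
        // and "timed out, stopping..." log lines.
        Duration d = elapsed("13:38:09.785", "13:43:18.869");
        System.out.println(d.toMinutes() + " min " + d.toSecondsPart() + " s");
        // prints "5 min 9 s"
    }
}
```

The result is a little over five minutes, in line with the server-side session timeout described in the symptoms.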
### Operating System
Ubuntu
### Selenium version
4.18.1
### What are the browser(s) and version(s) where you see this issue?
Firefox 123
### What are the browser driver(s) and version(s) where you see this issue?
GeckoDriver 0.34
### Are you using Selenium Grid?
_No response_