Skip to content

[πŸ› Bug]: Client times out creating a session after running a number of successful sessionsΒ #13718

@Vlatombe

Description

@Vlatombe

What happened?

We run the Jenkins Acceptance Test harness. We recently migrated from Selenium 4.11.0 to 4.18.1 (jenkinsci/acceptance-test-harness#1499) but got stuck because of execution issues that were not occurring before.

Our environment runs subsets of the tests suite inside a docker environment:

  • a container running the container image selenium/standalone-firefox:4.18.1
  • another container running the test harness using maven and targetting the selenium server.

All tests run sequentially, so it is expected that only one selenium session is active at a given time.

Obviously both client and server are aligned.

We managed to narrow down the issue to the 4.13.0 -> 4.14.1 selenium upgrade.

The symptoms are the following:

  • Between 34 and 36 selenium sessions run successfully
  • A new session is requested by the client
  • Server http logging indicates the session request is processed, a session is created and returned (HTTP 200 on POST /session)
  • However the client never receives the response, and ends up timing out after 3 minutes
  • Since the session is never used on server side, it ends up timing out after 5 minutes.
  • Sometimes, after a longer while (~1 hour) some tests start working again, but then some later tests go back to the same problem and time out trying to grab a new session.

All the tests are launched using the initialization sequence (link), so the problem is clearly not tied to one test content.

I managed to find a workaround for this problem: I added a proxy server (caddy reverse-proxy --from :4444 --to firefox:4444 --insecure) between the client and selenium server and the problem went away.

How can we reproduce the issue?

I never managed to reproduce the problem locally on a Mac, running the same tests against a local selenium server.

Even though I never tracked down the root cause for this, it is probably network stack related. The fact it started to appear just after migrating to 4.14, matching with the JDK http client switch, makes me think it could be related to the problem, even though it seems to affect the server side (a new JVM is forked for each new test class).

I'm hoping this eventually helps to diagnose a possible issue, and leave it for posterity in case anyone faces a similar problem.

Relevant log output

Client side exception

Caused by: org.openqa.selenium.SessionNotCreatedException: Could not start a new session. Possible causes are invalid address of the remote server or browser start-up failure. 
Host info: host: '739ec46b65a0', ip: '172.18.0.3'
	at org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:537)
	at org.openqa.selenium.remote.RemoteWebDriver.startSession(RemoteWebDriver.java:233)
	at org.openqa.selenium.remote.RemoteWebDriver.<init>(RemoteWebDriver.java:162)
	at org.openqa.selenium.remote.RemoteWebDriver.<init>(RemoteWebDriver.java:142)
	at org.jenkinsci.test.acceptance.FallbackConfig.buildRemoteWebDriver(FallbackConfig.java:162)
	at org.jenkinsci.test.acceptance.FallbackConfig.createWebDriver(FallbackConfig.java:149)
	at org.jenkinsci.test.acceptance.FallbackConfig.createWebDriver(FallbackConfig.java:315)
	at org.jenkinsci.test.acceptance.FallbackConfig$$FastClassByGuice$$65dd6.GUICE$TRAMPOLINE(<generated>)
	at org.jenkinsci.test.acceptance.FallbackConfig$$FastClassByGuice$$65dd6.apply(<generated>)
	at com.google.inject.internal.ProviderMethod$FastClassProviderMethod.doProvision(ProviderMethod.java:260)
	at com.google.inject.internal.ProviderMethod.doProvision(ProviderMethod.java:171)
	at com.google.inject.internal.InternalProviderInstanceBindingImpl$CyclicFactory.provision(InternalProviderInstanceBindingImpl.java:185)
	at com.google.inject.internal.InternalProviderInstanceBindingImpl$CyclicFactory.get(InternalProviderInstanceBindingImpl.java:162)
	at com.google.inject.internal.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:40)
	at org.jenkinsci.test.acceptance.guice.TestLifecycle.lambda$scope$0(TestLifecycle.java:58)
	at com.google.inject.internal.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:45)
	at com.google.inject.internal.SingleFieldInjector.inject(SingleFieldInjector.java:50)
	at com.google.inject.internal.MembersInjectorImpl.injectMembers(MembersInjectorImpl.java:146)
	at com.google.inject.internal.ConstructorInjector.provision(ConstructorInjector.java:124)
	at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:91)
	at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:300)
	at com.google.inject.internal.InjectorImpl$1.get(InjectorImpl.java:1148)
	... 25 more
Caused by: org.openqa.selenium.TimeoutException: java.util.concurrent.TimeoutException
Build info: version: '4.18.1', revision: 'b1d3319b48'
System info: os.name: 'Linux', os.arch: 'amd64', os.version: '5.15.146+', java.version: '17.0.10'
Driver info: driver.version: RemoteWebDriver
	at org.openqa.selenium.remote.http.jdk.JdkHttpClient.execute0(JdkHttpClient.java:403)
	at org.openqa.selenium.remote.http.AddSeleniumUserAgent.lambda$apply$0(AddSeleniumUserAgent.java:42)
	at org.openqa.selenium.remote.http.Filter.lambda$andFinally$1(Filter.java:55)
	at org.openqa.selenium.remote.http.jdk.JdkHttpClient.execute(JdkHttpClient.java:359)
	at org.openqa.selenium.remote.tracing.TracedHttpClient.execute(TracedHttpClient.java:54)
	at org.openqa.selenium.remote.ProtocolHandshake.createSession(ProtocolHandshake.java:114)
	at org.openqa.selenium.remote.ProtocolHandshake.createSession(ProtocolHandshake.java:95)
	at org.openqa.selenium.remote.ProtocolHandshake.createSession(ProtocolHandshake.java:67)
	at org.openqa.selenium.remote.HttpCommandExecutor.execute(HttpCommandExecutor.java:162)
	at org.openqa.selenium.remote.TracedCommandExecutor.execute(TracedCommandExecutor.java:51)
	at org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:519)
	... 46 more
Caused by: java.util.concurrent.TimeoutException
	at java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1960)
	at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2095)
	at org.openqa.selenium.remote.http.jdk.JdkHttpClient.execute0(JdkHttpClient.java:386)
	... 56 more

Server-side logging

 13:38:07.643 INFO [LocalDistributor.newSession] - Session request received by the Distributor: 
  [Capabilities {acceptInsecureCerts: true, browserName: firefox, moz:debuggerAddress: true, moz:firefoxOptions: {prefs: {dom.disable_beforeunload: false, dom.max_chrome_script_run_time: 1500000, dom.max_script_run_time: 1500000, intl.accept_languages: en}}}]
 13:38:09.783 INFO [LocalNode.newSession] - Session created by the Node. Id: 62f46cbb-85a4-4e39-9959-0747e54be196, Caps: Capabilities {acceptInsecureCerts: true, browserName: firefox, browserVersion: 123.0, moz:accessibilityChecks: false, moz:buildID: 20240213221259, moz:debuggerAddress: 127.0.0.1:12793, moz:firefoxOptions: {prefs: {dom.disable_beforeunload: false, dom.max_chrome_script_run_time: 1500000, dom.max_script_run_time: 1500000, intl.accept_languages: en}}, moz:geckodriverVersion: 0.34.0, moz:headless: false, moz:platformVersion: 5.15.146+, moz:processID: 13765, moz:profile: /tmp/rust_mozprofile9K3PEI, moz:shutdownTimeout: 60000, moz:webdriverClick: true, moz:windowless: false, pageLoadStrategy: normal, platformName: linux, proxy: Proxy(), se:bidiEnabled: false, se:cdp: ws://172.18.0.2:4444/sessio..., se:cdpVersion: 85.0, se:noVncPort: 7900, se:vnc: ws://172.18.0.2:4444/sessio..., se:vncEnabled: true, se:vncLocalAddress: ws://172.18.0.2:7900, setWindowRect: true, strictFileInteractability: false, timeouts: {implicit: 0, pageLoad: 300000, script: 30000}, unhandledPromptBehavior: dismiss and notify}
 13:38:09.785 INFO [LocalDistributor.newSession] - Session created by the Distributor. Id: 62f46cbb-85a4-4e39-9959-0747e54be196 
  Caps: Capabilities {acceptInsecureCerts: true, browserName: firefox, browserVersion: 123.0, moz:accessibilityChecks: false, moz:buildID: 20240213221259, moz:debuggerAddress: 127.0.0.1:12793, moz:firefoxOptions: {prefs: {dom.disable_beforeunload: false, dom.max_chrome_script_run_time: 1500000, dom.max_script_run_time: 1500000, intl.accept_languages: en}}, moz:geckodriverVersion: 0.34.0, moz:headless: false, moz:platformVersion: 5.15.146+, moz:processID: 13765, moz:profile: /tmp/rust_mozprofile9K3PEI, moz:shutdownTimeout: 60000, moz:webdriverClick: true, moz:windowless: false, pageLoadStrategy: normal, platformName: linux, proxy: Proxy(), se:bidiEnabled: false, se:cdp: ws://172.18.0.2:4444/sessio..., se:cdpVersion: 85.0, se:noVncPort: 7900, se:vnc: ws://172.18.0.2:4444/sessio..., se:vncEnabled: true, se:vncLocalAddress: ws://172.18.0.2:7900, setWindowRect: true, strictFileInteractability: false, timeouts: {implicit: 0, pageLoad: 300000, script: 30000}, unhandledPromptBehavior: dismiss and notify}
 13:38:09.786 INFO [SeleniumSpanExporter$1.lambda$export$3] - {"traceId": "069d010f096ae47834e48fd852c8ba33","eventTime": 1710941889785258977,"eventName": "HTTP request execution complete","attributes": {"http.flavor": 1,"http.handler_class": "org.openqa.selenium.grid.sessionqueue.local.LocalNewSessionQueue","http.host": "firefox:4444","http.method": "POST","http.request_content_length": "444","http.scheme": "HTTP","http.status_code": 200,"http.target": "\u002fsession","http.user_agent": "selenium\u002f4.18.1 (java unix)"}}
 
 13:43:18.869 INFO [LocalNode.stopTimedOutSession] - Session id 62f46cbb-85a4-4e39-9959-0747e54be196 timed out, stopping...
 13:43:19.185 INFO [LocalSessionMap.lambda$new$0] - Deleted session from local Session Map, Id: 62f46cbb-85a4-4e39-9959-0747e54be196
 13:43:19.185 INFO [GridModel.release] - Releasing slot for session id 62f46cbb-85a4-4e39-9959-0747e54be196
 13:43:19.185 INFO [SessionSlot.stop] - Stopping session 62f46cbb-85a4-4e39-9959-0747e54be196


### Operating System

Ubuntu

### Selenium version

4.18.1

### What are the browser(s) and version(s) where you see this issue?

Firefox 123

### What are the browser driver(s) and version(s) where you see this issue?

GeckoDriver 0.34

### Are you using Selenium Grid?

_No response_

Metadata

Metadata

Assignees

No one assigned

    Labels

    B-gridEverything grid and server relatedC-javaJava BindingsI-defectSomething is not working as intended

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions