
acc: Use fixed ports instead of dynamically allocated ones for run-local tests #3490

Merged
andrewnester merged 5 commits into main from acc/fix-run-local
Aug 27, 2025

@andrewnester andrewnester commented Aug 26, 2025

Changes

Use fixed ports instead of dynamically allocated ones for run-local tests

Why

After researching and trying to stabilize the tests, I concluded that dynamic port allocation is likely the root cause of their flakiness. The hypothesis is that allocate_ports.py calls socket.close() after finding a free port, which releases the port back to the OS. However, the port does not become instantly available for another application to use; it typically enters a TIME_WAIT state to handle any stray packets.
This could explain the "connection refused" error: it would occur when the curl command tries to connect after the Python script has closed the socket but before the app has successfully bound to the port and is ready to accept connections.
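To make the suspected window concrete, here is a minimal sketch of the "find a free port, close the socket, bind later" pattern. It is not the actual contents of allocate_ports.py; the helper names `find_free_port` and `can_connect` are hypothetical, and a plain listening socket stands in for the app.

```python
import socket


def find_free_port() -> int:
    """Ask the OS for a free port and release it again.

    Hypothetical stand-in for allocate_ports.py: once the `with` block
    exits, the socket is closed and nothing listens on the port.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))  # port 0 lets the OS pick any free port
        return s.getsockname()[1]


def can_connect(port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on 127.0.0.1:port."""
    try:
        with socket.create_connection(("127.0.0.1", port), timeout=timeout):
            return True
    except OSError:  # includes ConnectionRefusedError and timeouts
        return False


port = find_free_port()

# Window between "port chosen" and "app bound": a client such as curl
# gets "connection refused" because no one is listening yet.
refused_before_bind = not can_connect(port)

# Stand-in for the app finally binding to the chosen port and listening.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind(("127.0.0.1", port))
    server.listen()
    accepted_after_bind = can_connect(port)

print(refused_before_bind, accepted_after_bind)
```

On a typical machine both flags should be true: the connection is refused while nothing is bound and succeeds once the listener is up. The sketch only illustrates the timing gap; it does not show whether that gap is what actually breaks the run-local tests.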

Tests

After making the ports fixed in the script, all tests for all runners succeeded (tried 5 times).

eng-dev-ecosystem-bot commented Aug 26, 2025

Run: 17270404259

| Env | ✅ pass | ❌ FAIL | 🔄 flaky | 🙈 skip |
|---|---|---|---|---|
| 🔄 aws linux | 305 | | 3 | 497 |
| 🔄 aws windows | 306 | | 3 | 496 |
| ❌ aws-ucws linux | 418 | 6 | | 393 |
| ❌ aws-ucws windows | 419 | 6 | | 392 |
| ✅ azure linux | 308 | | | 496 |
| 🔄 azure windows | 306 | | 3 | 495 |
| 🔄 azure-ucws linux | 422 | | 2 | 392 |
| ✅ azure-ucws windows | 425 | | | 391 |
| 🔄 gcp linux | 301 | | 6 | 498 |
| 🔄 gcp windows | 302 | | 6 | 497 |

24 failing tests:

| Test Name | aws linux | aws windows | aws-ucws linux | aws-ucws windows | azure windows | azure-ucws linux | gcp linux | gcp windows |
|---|---|---|---|---|---|---|---|---|
| TestAccept | 🔄 flaky | 🔄 flaky | ❌ FAIL | ❌ FAIL | 🔄 flaky | ✅ pass | ✅ pass | ✅ pass |
| TestAccept/bundle/deploy/lakebase/database-catalog | 🙈 skip | 🙈 skip | ❌ FAIL | ❌ FAIL | 🙈 skip | ✅ pass | 🙈 skip | 🙈 skip |
| TestAccept/bundle/deploy/lakebase/database-instance/single-instance | 🙈 skip | 🙈 skip | ❌ FAIL | ❌ FAIL | 🙈 skip | ✅ pass | 🙈 skip | 🙈 skip |
| TestAccept/bundle/deploy/lakebase/database-instance/single-instance/DATABRICKS_CLI_DEPLOYMENT=direct-exp | | | ❌ FAIL | ❌ FAIL | | ✅ pass | | |
| TestAccept/bundle/deploy/lakebase/database-instance/single-instance/DATABRICKS_CLI_DEPLOYMENT=terraform | | | ❌ FAIL | ❌ FAIL | | ✅ pass | | |
| TestAccept/bundle/deploy/lakebase/synced-database-table | 🙈 skip | 🙈 skip | ❌ FAIL | ❌ FAIL | 🙈 skip | ✅ pass | 🙈 skip | 🙈 skip |
| TestAccept/bundle/destroy/jobs-and-pipeline | ✅ pass | ✅ pass | ✅ pass | ✅ pass | 🔄 flaky | ✅ pass | ✅ pass | 🔄 flaky |
| TestAccept/bundle/destroy/jobs-and-pipeline/DATABRICKS_CLI_DEPLOYMENT=direct-exp | ✅ pass | ✅ pass | ✅ pass | ✅ pass | 🔄 flaky | ✅ pass | ✅ pass | ✅ pass |
| TestAccept/bundle/destroy/jobs-and-pipeline/DATABRICKS_CLI_DEPLOYMENT=terraform | ✅ pass | ✅ pass | ✅ pass | ✅ pass | ✅ pass | ✅ pass | ✅ pass | 🔄 flaky |
| TestAccept/bundle/resources/pipelines/update/DATABRICKS_CLI_DEPLOYMENT=direct-exp | ✅ pass | ✅ pass | ✅ pass | ✅ pass | ✅ pass | ✅ pass | ✅ pass | 🔄 flaky |
| TestAccept/bundle/resources/pipelines/update/DATABRICKS_CLI_DEPLOYMENT=terraform | ✅ pass | ✅ pass | ✅ pass | ✅ pass | ✅ pass | ✅ pass | ✅ pass | 🔄 flaky |
| TestAccept/bundle/templates/default-python/combinations/classic | ✅ pass | 🔄 flaky | ✅ pass | ✅ pass | ✅ pass | ✅ pass | ✅ pass | ✅ pass |
| TestAccept/bundle/templates/default-python/combinations/classic/DATABRICKS_CLI_DEPLOYMENT=direct-exp/DLT=no/NBOOK=yes/PY=no | ✅ pass | ✅ pass | ✅ pass | ✅ pass | ✅ pass | ✅ pass | 🔄 flaky | ✅ pass |
| TestAccept/bundle/templates/default-python/combinations/classic/DATABRICKS_CLI_DEPLOYMENT=direct-exp/DLT=yes/NBOOK=no/PY=no | ✅ pass | 🔄 flaky | ✅ pass | ✅ pass | ✅ pass | ✅ pass | ✅ pass | ✅ pass |
| TestAccept/bundle/templates/default-python/combinations/classic/DATABRICKS_CLI_DEPLOYMENT=direct-exp/DLT=yes/NBOOK=yes/PY=no | ✅ pass | ✅ pass | ✅ pass | ✅ pass | ✅ pass | ✅ pass | 🔄 flaky | ✅ pass |
| TestAccept/bundle/templates/default-python/combinations/classic/DATABRICKS_CLI_DEPLOYMENT=terraform/DLT=yes/NBOOK=no/PY=no | ✅ pass | ✅ pass | ✅ pass | ✅ pass | ✅ pass | ✅ pass | ✅ pass | 🔄 flaky |
| TestAccept/bundle/templates/default-python/combinations/classic/DATABRICKS_CLI_DEPLOYMENT=terraform/DLT=yes/NBOOK=yes/PY=no | ✅ pass | ✅ pass | ✅ pass | ✅ pass | ✅ pass | ✅ pass | ✅ pass | 🔄 flaky |
| TestAccept/bundle/templates/default-python/combinations/classic/DATABRICKS_CLI_DEPLOYMENT=terraform/DLT=yes/NBOOK=yes/PY=yes | ✅ pass | ✅ pass | ✅ pass | ✅ pass | ✅ pass | ✅ pass | 🔄 flaky | ✅ pass |
| TestAccept/bundle/templates/default-python/integration_classic | 🔄 flaky | ✅ pass | ✅ pass | ✅ pass | ✅ pass | ✅ pass | 🔄 flaky | ✅ pass |
| TestAccept/bundle/templates/default-python/integration_classic/DATABRICKS_CLI_DEPLOYMENT=direct-exp/UV_PYTHON=3.12 | ✅ pass | ✅ pass | ✅ pass | ✅ pass | ✅ pass | ✅ pass | 🔄 flaky | ✅ pass |
| TestAccept/bundle/templates/default-python/integration_classic/DATABRICKS_CLI_DEPLOYMENT=terraform/UV_PYTHON=3.10 | 🔄 flaky | ✅ pass | ✅ pass | ✅ pass | ✅ pass | ✅ pass | ✅ pass | ✅ pass |
| TestAccept/bundle/templates/default-python/integration_classic/DATABRICKS_CLI_DEPLOYMENT=terraform/UV_PYTHON=3.9 | ✅ pass | ✅ pass | ✅ pass | ✅ pass | ✅ pass | ✅ pass | 🔄 flaky | ✅ pass |
| TestFilerReadWrite | ✅ pass | ✅ pass | ✅ pass | ✅ pass | ✅ pass | 🔄 flaky | ✅ pass | ✅ pass |
| TestFilerReadWrite/files | 🙈 skip | 🙈 skip | ✅ pass | ✅ pass | 🙈 skip | 🔄 flaky | 🙈 skip | 🙈 skip |

@andrewnester andrewnester changed the title from "acc: Fix app run-local test" to "acc: Use fixed ports instead of dynamically allocated ones for run-local tests" Aug 27, 2025
@andrewnester andrewnester marked this pull request as ready for review August 27, 2025 14:12
denik commented Aug 27, 2025

Thanks for looking into this! Should we remove allocate_ports.py as well?

@andrewnester andrewnester enabled auto-merge August 27, 2025 14:44
@andrewnester andrewnester disabled auto-merge August 27, 2025 15:28
@andrewnester andrewnester merged commit 6031437 into main Aug 27, 2025
12 of 13 checks passed
@andrewnester andrewnester deleted the acc/fix-run-local branch August 27, 2025 15:28
andrewnester added a commit that referenced this pull request Aug 28, 2025
…r run-local tests" (#3505)

Reverts #3490

The tests are still failing:
https://github.com/databricks/cli/actions/runs/17289620395/job/49073700178

```
+++ /var/folders/x7/ch5v91h56_zbvbd1y2f600dm0000gn/T/TestAcceptcmdworkspaceappsrun-local1357262386/001/output.txt
        @@ -10,25 +10,6 @@
         === Waiting
         === Checking app is running...
         >>> curl -s -o - http://127.0.0.1:$(port)/
        -{
        -  "Accept": "*/*",
        -  "Accept-Encoding": "gzip",
        -  "Host": "127.0.0.1:$(port)",
        -  "User-Agent": "curl/(version)",
        -  "X-Forwarded-Email": "[USERNAME]",
        -  "X-Forwarded-Host": "localhost",
        -  "X-Forwarded-Preferred-Username": "",
        -  "X-Forwarded-User": "[USERNAME]",
        -  "X-Real-Ip": "127.0.0.1",
        -  "X-Request-Id": "[UUID]"
        -}
        +jq: parse error: Invalid numeric literal at line 1, column 6
  ```
@andrewnester andrewnester restored the acc/fix-run-local branch August 28, 2025 14:00