Commit 4b77986
[SPARK-54868][PYTHON][INFRA] Fail hanging tests and log the tracebacks
### What changes were proposed in this pull request?
Fail hanging tests and log their tracebacks.
The timeout, in seconds, is set by the environment variable `PYSPARK_TEST_TIMEOUT`.
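
The run log further down shows the runner waiting on the test process with `proc.wait(timeout=timeout)`. A minimal sketch of that pattern, assuming the value is parsed as an integer number of seconds and that an unset variable means no limit; `run_with_timeout` is a hypothetical helper, not the code in `run-tests.py`:

```python
import os
import subprocess
import sys


def run_with_timeout(cmd, env_var="PYSPARK_TEST_TIMEOUT"):
    """Hypothetical helper: run cmd and return (returncode, timed_out)."""
    raw = os.environ.get(env_var)
    # Assumption: an unset or empty variable disables the timeout.
    timeout = int(raw) if raw else None
    proc = subprocess.Popen(cmd)
    try:
        return proc.wait(timeout=timeout), False
    except subprocess.TimeoutExpired:
        # The child is still running here; stop it so the run can fail fast.
        proc.kill()
        return proc.wait(), True
```

With `PYSPARK_TEST_TIMEOUT=1`, calling `run_with_timeout([sys.executable, "-c", "import time; time.sleep(30)"])` should return after about a second with `timed_out` set to `True`, instead of blocking for the full sleep.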
### Why are the changes needed?
When a test gets stuck, the logs contain no useful information about where it hung.
### Does this PR introduce _any_ user-facing change?
no, dev-only
### How was this patch tested?
1. PR builder with
```
PYSPARK_TEST_TIMEOUT: 100
```
https://github.com/zhengruifeng/spark/actions/runs/20522703690/job/58962106131
2. Manual check
```
(spark_dev_313) ➜ spark git:(py_test_timeout) PYSPARK_TEST_TIMEOUT=15 python/run-tests -k --python-executables python3 --testnames 'pyspark.ml.tests.connect.test_parity_clustering'
Running PySpark tests. Output is in /Users/ruifeng.zheng/spark/python/unit-tests.log
Will test against the following Python executables: ['python3']
Will test the following Python tests: ['pyspark.ml.tests.connect.test_parity_clustering']
python3 python_implementation is CPython
python3 version is: Python 3.13.5
Starting test(python3): pyspark.ml.tests.connect.test_parity_clustering (temp output: /Users/ruifeng.zheng/spark/python/target/c014880c-80d2-49db-8fb1-a26ab4e5246d/python3__pyspark.ml.tests.connect.test_parity_clustering__u8n7t6zc.log)
Got TimeoutExpired while running pyspark.ml.tests.connect.test_parity_clustering with python3
Traceback (most recent call last):
File "/Users/ruifeng.zheng/spark/./python/run-tests.py", line 157, in run_individual_python_test
retcode = proc.wait(timeout=timeout)
File "/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/lib/python3.13/subprocess.py", line 1280, in wait
return self._wait(timeout=timeout)
~~~~~~~~~~^^^^^^^^^^^^^^^^^
File "/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/lib/python3.13/subprocess.py", line 2058, in _wait
raise TimeoutExpired(self.args, timeout)
subprocess.TimeoutExpired: Command '['/Users/ruifeng.zheng/spark/bin/pyspark', 'pyspark.ml.tests.connect.test_parity_clustering']' timed out after 15 seconds
Running tests...
----------------------------------------------------------------------
WARNING: Using incubator modules: jdk.incubator.vector
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
/Users/ruifeng.zheng/spark/python/pyspark/sql/connect/conf.py:64: UserWarning: Failed to set spark.connect.execute.reattachable.senderMaxStreamDuration to Some(1s) due to [CANNOT_MODIFY_STATIC_CONFIG] Cannot modify the value of the static Spark config: "spark.connect.execute.reattachable.senderMaxStreamDuration". SQLSTATE: 46110
warnings.warn(warn)
/Users/ruifeng.zheng/spark/python/pyspark/sql/connect/conf.py:64: UserWarning: Failed to set spark.connect.execute.reattachable.senderMaxStreamSize to Some(123) due to [CANNOT_MODIFY_STATIC_CONFIG] Cannot modify the value of the static Spark config: "spark.connect.execute.reattachable.senderMaxStreamSize". SQLSTATE: 46110
warnings.warn(warn)
/Users/ruifeng.zheng/spark/python/pyspark/sql/connect/conf.py:64: UserWarning: Failed to set spark.connect.authenticate.token to Some(deadbeef) due to [CANNOT_MODIFY_STATIC_CONFIG] Cannot modify the value of the static Spark config: "spark.connect.authenticate.token". SQLSTATE: 46110
warnings.warn(warn)
test_assert_remote_mode (pyspark.ml.tests.connect.test_parity_clustering.ClusteringParityTests.test_assert_remote_mode) ... ok (0.450s)
/Users/ruifeng.zheng/spark/python/pyspark/ml/clustering.py:1016: FutureWarning: Deprecated in 3.0.0. It will be removed in future versions. Use ClusteringEvaluator instead. You can also get the cost on the training dataset in the summary.
warnings.warn(
ok (6.541s)
test_distributed_lda (pyspark.ml.tests.connect.test_parity_clustering.ClusteringParityTests.test_distributed_lda) ... Thread 0x0000000173083000 (most recent call first):
File "/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/lib/python3.13/site-packages/grpc/_channel.py", line 1727 in channel_spin
File "/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/lib/python3.13/threading.py", line 994 in run
File "/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/lib/python3.13/threading.py", line 1043 in _bootstrap_inner
File "/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/lib/python3.13/threading.py", line 1014 in _bootstrap
Thread 0x000000017509b000 (most recent call first):
File "/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/lib/python3.13/concurrent/futures/thread.py", line 90 in _worker
File "/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/lib/python3.13/threading.py", line 994 in run
File "/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/lib/python3.13/threading.py", line 1043 in _bootstrap_inner
File "/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/lib/python3.13/threading.py", line 1014 in _bootstrap
Thread 0x000000017408f000 (most recent call first):
File "/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/lib/python3.13/concurrent/futures/thread.py", line 90 in _worker
File "/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/lib/python3.13/threading.py", line 994 in run
File "/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/lib/python3.13/threading.py", line 1043 in _bootstrap_inner
File "/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/lib/python3.13/threading.py", line 1014 in _bootstrap
Thread 0x00000001719e7000 (most recent call first):
File "/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/lib/python3.13/selectors.py", line 398 in select
File "/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/lib/python3.13/socketserver.py", line 235 in serve_forever
File "/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/lib/python3.13/threading.py", line 994 in run
File "/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/lib/python3.13/threading.py", line 1043 in _bootstrap_inner
File "/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/lib/python3.13/threading.py", line 1014 in _bootstrap
Thread 0x00000001709db000 (most recent call first):
File "/Users/ruifeng.zheng/spark/python/lib/py4j-0.10.9.9-src.zip/py4j/clientserver.py", line 58 in run
File "/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/lib/python3.13/threading.py", line 1043 in _bootstrap_inner
File "/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/lib/python3.13/threading.py", line 1014 in _bootstrap
Current thread 0x00000001f372e200 (most recent call first):
File "/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/lib/python3.13/threading.py", line 363 in wait
File "/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/lib/python3.13/site-packages/grpc/_common.py", line 114 in _wait_once
File "/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/lib/python3.13/site-packages/grpc/_common.py", line 154 in wait
File "/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/lib/python3.13/site-packages/grpc/_channel.py", line 953 in _next
File "/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/lib/python3.13/site-packages/grpc/_channel.py", line 538 in __next__
File "/Users/ruifeng.zheng/spark/python/pyspark/sql/connect/client/reattach.py", line 164 in <lambda>
File "/Users/ruifeng.zheng/spark/python/pyspark/sql/connect/client/reattach.py", line 266 in _call_iter
File "/Users/ruifeng.zheng/spark/python/pyspark/sql/connect/client/reattach.py", line 163 in _has_next
File "/Users/ruifeng.zheng/spark/python/pyspark/sql/connect/client/reattach.py", line 139 in send
File "<frozen _collections_abc>", line 360 in __next__
File "/Users/ruifeng.zheng/spark/python/pyspark/sql/connect/client/core.py", line 1625 in _execute_and_fetch_as_iterator
File "/Users/ruifeng.zheng/spark/python/pyspark/sql/connect/client/core.py", line 1664 in _execute_and_fetch
File "/Users/ruifeng.zheng/spark/python/pyspark/sql/connect/client/core.py", line 1162 in execute_command
File "/Users/ruifeng.zheng/spark/python/pyspark/ml/util.py", line 308 in remote_call
File "/Users/ruifeng.zheng/spark/python/pyspark/ml/util.py", line 322 in wrapped
File "/Users/ruifeng.zheng/spark/python/pyspark/ml/clustering.py", line 1548 in toLocal
File "/Users/ruifeng.zheng/spark/python/pyspark/ml/tests/test_clustering.py", line 449 in test_distributed_lda
File "/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/lib/python3.13/unittest/case.py", line 606 in _callTestMethod
File "/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/lib/python3.13/unittest/case.py", line 651 in run
File "/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/lib/python3.13/unittest/case.py", line 707 in __call__
File "/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/lib/python3.13/unittest/suite.py", line 122 in run
File "/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/lib/python3.13/unittest/suite.py", line 84 in __call__
File "/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/lib/python3.13/unittest/suite.py", line 122 in run
File "/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/lib/python3.13/unittest/suite.py", line 84 in __call__
File "/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/lib/python3.13/site-packages/xmlrunner/runner.py", line 67 in run
File "/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/lib/python3.13/unittest/main.py", line 270 in runTests
File "/Users/ruifeng.zheng/.dev/miniconda3/envs/spark_dev_313/lib/python3.13/unittest/main.py", line 104 in __init__
File "/Users/ruifeng.zheng/spark/python/pyspark/testing/__init__.py", line 30 in unittest_main
File "/Users/ruifeng.zheng/spark/python/pyspark/ml/tests/connect/test_parity_clustering.py", line 37 in <module>
File "<frozen runpy>", line 88 in _run_code
File "<frozen runpy>", line 198 in _run_module_as_main
Had test failures in pyspark.ml.tests.connect.test_parity_clustering with python3; see logs.
```
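
The per-thread dump in the log above (`Thread 0x... (most recent call first):`) has the format written by Python's standard `faulthandler` module. A minimal sketch of getting such a dump from a hung process, assuming a watchdog armed with `faulthandler.dump_traceback_later`; the patch may trigger the dump another way (for example through a signal handler), which is not visible here, and `CHILD_SOURCE` and `run_hanging_child` are hypothetical names:

```python
import subprocess
import sys

# Source of a stand-in "hanging test" (hypothetical; not a PySpark test).
CHILD_SOURCE = """
import faulthandler
import sys
import time

# After 1 second, write every thread's stack to stderr and exit with status 1.
faulthandler.dump_traceback_later(1, exit=True, file=sys.stderr)
time.sleep(30)  # simulates the hang
"""


def run_hanging_child():
    """Run the stand-in test and return (returncode, stderr text)."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE],
        capture_output=True,
        text=True,
        timeout=25,  # safety net for the sketch itself
    )
    return result.returncode, result.stderr
```

Calling `run_hanging_child()` should return after about a second with a non-zero status and a stderr dump pointing at the line where the child was blocked.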
### Was this patch authored or co-authored using generative AI tooling?
no
Closes #53528 from zhengruifeng/py_test_timeout.
Authored-by: Ruifeng Zheng <[email protected]>
Signed-off-by: Ruifeng Zheng <[email protected]>
1 parent 1bc5b29 commit 4b77986
3 files changed (+33, -5 lines) under `.github/workflows`, `python`, and `pyspark/testing`.