Commit 2c3500c

wangyum, Fokko, panbingkun, and pan3793 authored and committed
[SPARK-51549][BUILD][3.5] Bump Parquet 1.15.1
### What changes were proposed in this pull request?

Bump Parquet to 1.15.1.

### Why are the changes needed?

To fix critical CVE: https://www.cve.org/CVERecord?id=CVE-2025-30065

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass GHA.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#50528 from wangyum/parquet-branch-3.5.

Lead-authored-by: yumwang@ebay.com <yumwang@ebay.com>
Co-authored-by: Fokko <fokko@apache.org>
Co-authored-by: Fokko Driesprong <fokko@tabular.io>
Co-authored-by: panbingkun <panbingkun@baidu.com>
Co-authored-by: Fokko Driesprong <fokko@apache.org>
Co-authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: yangjie01 <yangjie01@baidu.com>

1 parent 8eb9e34 commit 2c3500c

7 files changed: 369 additions & 363 deletions

dev/deps/spark-deps-hadoop-3-hive-2.3

Lines changed: 6 additions & 6 deletions
@@ -218,12 +218,12 @@ orc-shims/1.9.5//orc-shims-1.9.5.jar
 oro/2.0.8//oro-2.0.8.jar
 osgi-resource-locator/1.0.3//osgi-resource-locator-1.0.3.jar
 paranamer/2.8//paranamer-2.8.jar
-parquet-column/1.13.1//parquet-column-1.13.1.jar
-parquet-common/1.13.1//parquet-common-1.13.1.jar
-parquet-encoding/1.13.1//parquet-encoding-1.13.1.jar
-parquet-format-structures/1.13.1//parquet-format-structures-1.13.1.jar
-parquet-hadoop/1.13.1//parquet-hadoop-1.13.1.jar
-parquet-jackson/1.13.1//parquet-jackson-1.13.1.jar
+parquet-column/1.15.1//parquet-column-1.15.1.jar
+parquet-common/1.15.1//parquet-common-1.15.1.jar
+parquet-encoding/1.15.1//parquet-encoding-1.15.1.jar
+parquet-format-structures/1.15.1//parquet-format-structures-1.15.1.jar
+parquet-hadoop/1.15.1//parquet-hadoop-1.15.1.jar
+parquet-jackson/1.15.1//parquet-jackson-1.15.1.jar
 pickle/1.3//pickle-1.3.jar
 py4j/0.10.9.7//py4j-0.10.9.7.jar
 remotetea-oncrpc/1.1.2//remotetea-oncrpc-1.1.2.jar
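
The manifest entries above follow a `name/version//file.jar` layout, so a bump like this one can be checked mechanically. The following is a minimal sketch, not part of the commit: the helper names `parse_version` and `stale_parquet_entries` are hypothetical, and the 1.15.1 floor is taken from the commit message.

```python
import re

# Layout inferred from the manifest lines in this diff: name/version//file.jar
MANIFEST_LINE = re.compile(r"^(?P<name>[^/]+)/(?P<version>[^/]+)//(?P<jar>.+\.jar)$")


def parse_version(text):
    """Turn '1.15.1' into (1, 15, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def stale_parquet_entries(manifest_text, minimum="1.15.1"):
    """Return (name, version) pairs for parquet-* jars older than `minimum`."""
    floor = parse_version(minimum)
    stale = []
    for line in manifest_text.splitlines():
        match = MANIFEST_LINE.match(line.strip())
        # Skip lines that are not manifest entries or not Parquet artifacts.
        if not match or not match["name"].startswith("parquet-"):
            continue
        if parse_version(match["version"]) < floor:
            stale.append((match["name"], match["version"]))
    return stale


# Hypothetical mixed sample built from entries that appear in the diff.
sample = """\
paranamer/2.8//paranamer-2.8.jar
parquet-column/1.13.1//parquet-column-1.13.1.jar
parquet-hadoop/1.15.1//parquet-hadoop-1.15.1.jar
py4j/0.10.9.7//py4j-0.10.9.7.jar
"""
print(stale_parquet_entries(sample))  # [('parquet-column', '1.13.1')]
```

Versions are compared as integer tuples rather than strings, since a plain string comparison would order `1.9.x` after `1.15.x`.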

pom.xml

Lines changed: 7 additions & 1 deletion
@@ -140,7 +140,7 @@
 <kafka.version>3.4.1</kafka.version>
 <!-- After 10.15.1.3, the minimum required version is JDK9 -->
 <derby.version>10.14.2.0</derby.version>
-<parquet.version>1.13.1</parquet.version>
+<parquet.version>1.15.1</parquet.version>
 <orc.version>1.9.5</orc.version>
 <orc.classifier>shaded-protobuf</orc.classifier>
 <jetty.version>9.4.56.v20240826</jetty.version>
@@ -2663,6 +2663,12 @@
   <version>${parquet.version}</version>
   <scope>${parquet.test.deps.scope}</scope>
   <classifier>tests</classifier>
+  <exclusions>
+    <exclusion>
+      <groupId>com.h2database</groupId>
+      <artifactId>h2</artifactId>
+    </exclusion>
+  </exclusions>
 </dependency>
 <dependency>
   <groupId>org.apache.parquet</groupId>
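
The second hunk adds an exclusion for `com.h2database:h2` to the `tests`-classified dependency. As a rough illustration of how such an exclusion could be verified, the sketch below parses a dependency fragment and lists its excluded coordinates. The fragment contains only the elements visible in the hunk (the dependency's own `groupId` and `artifactId` lie outside it and are left out), and `excluded_coordinates` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Only the elements visible in the hunk; the dependency's own coordinates
# are outside the diff context and are deliberately omitted here.
fragment = """
<dependency>
  <version>${parquet.version}</version>
  <scope>${parquet.test.deps.scope}</scope>
  <classifier>tests</classifier>
  <exclusions>
    <exclusion>
      <groupId>com.h2database</groupId>
      <artifactId>h2</artifactId>
    </exclusion>
  </exclusions>
</dependency>
"""


def excluded_coordinates(dependency_xml):
    """Return (groupId, artifactId) for each <exclusion> under a <dependency>."""
    root = ET.fromstring(dependency_xml)
    return [
        (node.findtext("groupId"), node.findtext("artifactId"))
        for node in root.findall("./exclusions/exclusion")
    ]


print(excluded_coordinates(fragment))  # [('com.h2database', 'h2')]
```

A full `pom.xml` declares the Maven POM XML namespace, so the element paths in this sketch would need namespace handling before it could run against the real file.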

sql/core/benchmarks/BuiltInDataSourceWriteBenchmark-results.txt

Lines changed: 35 additions & 35 deletions
@@ -2,69 +2,69 @@
 Parquet writer benchmark
 ================================================================================================

-OpenJDK 64-Bit Server VM 1.8.0_362-b09 on Linux 5.15.0-1031-azure
-Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+OpenJDK 64-Bit Server VM 1.8.0_442-b06 on Linux 6.8.0-1021-azure
+AMD EPYC 7763 64-Core Processor
 Parquet(PARQUET_1_0) writer benchmark:  Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
 ------------------------------------------------------------------------------------------------------------------------
-Output Single Int Column                         2724          2758         49        5.8        173.2      1.0X
-Output Single Double Column                      2816          2829         20        5.6        179.0      1.0X
-Output Int and String Column                     8999          9080        115        1.7        572.1      0.3X
-Output Partitions                                5003          5086        117        3.1        318.1      0.5X
-Output Buckets                                   6911          6956         64        2.3        439.4      0.4X
+Output Single Int Column                         1685          1742         81        9.3        107.1      1.0X
+Output Single Double Column                      1675          1774        139        9.4        106.5      1.0X
+Output Int and String Column                     5038          5126        125        3.1        320.3      0.3X
+Output Partitions                                2904          2927         33        5.4        184.6      0.6X
+Output Buckets                                   4051          4058         10        3.9        257.6      0.4X

-OpenJDK 64-Bit Server VM 1.8.0_362-b09 on Linux 5.15.0-1031-azure
-Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+OpenJDK 64-Bit Server VM 1.8.0_442-b06 on Linux 6.8.0-1021-azure
+AMD EPYC 7763 64-Core Processor
 Parquet(PARQUET_2_0) writer benchmark:  Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
 ------------------------------------------------------------------------------------------------------------------------
-Output Single Int Column                         2761          2806         64        5.7        175.5      1.0X
-Output Single Double Column                      2652          2678         37        5.9        168.6      1.0X
-Output Int and String Column                     8377          8518        199        1.9        532.6      0.3X
-Output Partitions                                4865          4914         70        3.2        309.3      0.6X
-Output Buckets                                   6622          6664         59        2.4        421.0      0.4X
+Output Single Int Column                         1545          1551          9       10.2         98.2      1.0X
+Output Single Double Column                      1605          1629         34        9.8        102.0      1.0X
+Output Int and String Column                     5077          5107         42        3.1        322.8      0.3X
+Output Partitions                                2819          2822          3        5.6        179.2      0.5X
+Output Buckets                                   3911          3911          0        4.0        248.7      0.4X


 ================================================================================================
 ORC writer benchmark
 ================================================================================================

-OpenJDK 64-Bit Server VM 1.8.0_362-b09 on Linux 5.15.0-1031-azure
-Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+OpenJDK 64-Bit Server VM 1.8.0_442-b06 on Linux 6.8.0-1021-azure
+AMD EPYC 7763 64-Core Processor
 ORC writer benchmark:                   Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
 ------------------------------------------------------------------------------------------------------------------------
-Output Single Int Column                         1575          1627         74       10.0        100.1      1.0X
-Output Single Double Column                      2021          2087         94        7.8        128.5      0.8X
-Output Int and String Column                     6533          6800        377        2.4        415.4      0.2X
-Output Partitions                                3577          3635         82        4.4        227.4      0.4X
-Output Buckets                                   4895          4923         41        3.2        311.2      0.3X
+Output Single Int Column                          944           974         32       16.7         60.0      1.0X
+Output Single Double Column                      1514          1518          6       10.4         96.3      0.6X
+Output Int and String Column                     4797          4801          6        3.3        305.0      0.2X
+Output Partitions                                2270          2272          3        6.9        144.3      0.4X
+Output Buckets                                   3201          3222         30        4.9        203.5      0.3X


 ================================================================================================
 JSON writer benchmark
 ================================================================================================

-OpenJDK 64-Bit Server VM 1.8.0_362-b09 on Linux 5.15.0-1031-azure
-Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+OpenJDK 64-Bit Server VM 1.8.0_442-b06 on Linux 6.8.0-1021-azure
+AMD EPYC 7763 64-Core Processor
 JSON writer benchmark:                  Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
 ------------------------------------------------------------------------------------------------------------------------
-Output Single Int Column                         2415          2465         71        6.5        153.6      1.0X
-Output Single Double Column                      3690          3856        236        4.3        234.6      0.7X
-Output Int and String Column                     6922          6930         12        2.3        440.1      0.3X
-Output Partitions                                4619          4622          4        3.4        293.7      0.5X
-Output Buckets                                   6674          6756        116        2.4        424.3      0.4X
+Output Single Int Column                         1659          1671         17        9.5        105.4      1.0X
+Output Single Double Column                      2260          2262          4        7.0        143.7      0.7X
+Output Int and String Column                     4963          4964          2        3.2        315.5      0.3X
+Output Partitions                                2912          2915          3        5.4        185.2      0.6X
+Output Buckets                                   3868          3870          3        4.1        245.9      0.4X


 ================================================================================================
 CSV writer benchmark
 ================================================================================================

-OpenJDK 64-Bit Server VM 1.8.0_362-b09 on Linux 5.15.0-1031-azure
-Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+OpenJDK 64-Bit Server VM 1.8.0_442-b06 on Linux 6.8.0-1021-azure
+AMD EPYC 7763 64-Core Processor
 CSV writer benchmark:                   Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
 ------------------------------------------------------------------------------------------------------------------------
-Output Single Int Column                         4276          4368        130        3.7        271.8      1.0X
-Output Single Double Column                      5273          5346        104        3.0        335.2      0.8X
-Output Int and String Column                     8999          9139        199        1.7        572.1      0.5X
-Output Partitions                                6466          6526         85        2.4        411.1      0.7X
-Output Buckets                                   8844          8878         48        1.8        562.3      0.5X
+Output Single Int Column                         2603          2606          4        6.0        165.5      1.0X
+Output Single Double Column                      2887          2888          1        5.4        183.6      0.9X
+Output Int and String Column                     6464          6492         40        2.4        411.0      0.4X
+Output Partitions                                3844          3896         73        4.1        244.4      0.7X
+Output Buckets                                   5662          5671         13        2.8        360.0      0.5X

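
The derived columns in these tables appear to follow directly from `Best Time(ms)`. The sketch below reproduces `Rate(M/s)`, `Per Row(ns)`, and `Relative` for a few of the new Parquet(PARQUET_1_0) rows. The row count of 15 * 1024 * 1024 is an assumption inferred from the reported figures, not something stated in this commit, and the function names are hypothetical.

```python
# Assumption: each case writes 15 * 1024 * 1024 rows. This is inferred from the
# reported numbers (best time divided by per-row time), not stated on this page.
NUM_ROWS = 15 * 1024 * 1024


def per_row_ns(best_ms):
    """Nanoseconds spent per row, from the best run time in milliseconds."""
    return best_ms * 1_000_000 / NUM_ROWS


def rate_m_per_s(best_ms):
    """Millions of rows written per second."""
    return (NUM_ROWS / 1_000_000) / (best_ms / 1000)


def relative(baseline_ms, best_ms):
    """Speed relative to the first row of the same table (1.0 = same speed)."""
    return baseline_ms / best_ms


# New Parquet(PARQUET_1_0) figures taken from the table in this diff.
single_int_ms, partitions_ms = 1685, 2904
print(round(per_row_ns(single_int_ms), 1))               # 107.1, as reported
print(round(rate_m_per_s(single_int_ms), 1))             # 9.3, as reported
print(round(relative(single_int_ms, partitions_ms), 1))  # 0.6, as reported
```

The old and new runs differ in JDK build, kernel, and CPU (Intel Xeon E5-2673 v4 versus AMD EPYC 7763), and the ORC, JSON, and CSV writers improve by broadly similar margins. The lower times are therefore more plausibly explained by the benchmark environment than by the Parquet upgrade itself.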
