Commit f71806e
# Rationale for this change
This is problematic if you implement your own `FileIO`: streams are then
opened both through the `FileIO` and through the `FileSystem` directly.
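To illustrate the problem, here is a minimal sketch using only the standard library. `InMemoryFileIO`, `read_via_file_io`, and `read_bypassing_file_io` are hypothetical stand-ins invented for this example; they are not PyIceberg's classes or functions. The sketch only shows that a read which bypasses the custom `FileIO` is invisible to it.

```python
import io
from typing import Dict, List


class InMemoryFileIO:
    """Hypothetical stand-in for a user-defined FileIO; not PyIceberg's class."""

    def __init__(self, files: Dict[str, bytes]) -> None:
        self._files = files
        self.opened: List[str] = []  # every location opened through this FileIO

    def open_input(self, location: str) -> io.BytesIO:
        # A custom FileIO could add auditing, caching, or credential handling here.
        self.opened.append(location)
        return io.BytesIO(self._files[location])


def read_via_file_io(file_io: InMemoryFileIO, location: str) -> bytes:
    # Reader that goes through the FileIO, so the custom logic always runs.
    with file_io.open_input(location) as stream:
        return stream.read()


def read_bypassing_file_io(files: Dict[str, bytes], location: str) -> bytes:
    # Reader that talks to the underlying storage directly; the FileIO never sees it.
    return files[location]


storage = {"s3://bucket/data.parquet": b"rows", "s3://bucket/manifest.avro": b"meta"}
file_io = InMemoryFileIO(storage)

read_via_file_io(file_io, "s3://bucket/manifest.avro")
read_bypassing_file_io(storage, "s3://bucket/data.parquet")

# Only the manifest read is visible to the custom FileIO.
print(file_io.opened)
```

Routing every read through one entry point is what lets a custom implementation observe or control all streams.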
# Are these changes tested?
Yes, existing tests.
# Are there any user-facing changes?
No, but I think this also makes the code more aesthetically pleasing by
removing complexity.
# Numbers
A while ago I inspected the calls being made to S3. To be sure that this
change doesn't alter anything, I collected some stats using a small
local "benchmark":
```python
import time

from pyarrow import parquet as pq
from pyiceberg.catalog.rest import RestCatalog


def test_fokko(session_catalog: RestCatalog):
    parquet_file = "/Users/fokko.driesprong/Downloads/yellow_tripdata_2024-01.parquet"
    df = pq.read_table(parquet_file)

    # Start from a clean slate; ignore the error if the table does not exist yet.
    try:
        session_catalog.drop_table("default.taxi")
    except Exception:
        pass

    tbl = session_catalog.create_table("default.taxi", schema=df.schema)

    # Partition by hour of the pickup timestamp before writing.
    with tbl.update_spec() as tx:
        tx.add_field("tpep_pickup_datetime", "hour")

    tbl.append(df)

    # Time 22 full table scans, in milliseconds.
    rounds = []
    for _ in range(22):
        start = round(time.time() * 1000)
        assert len(tbl.scan().to_arrow()) == 2964624
        stop = round(time.time() * 1000)
        rounds.append(stop - start)

    print(f"Took: {sum(rounds) / len(rounds)} ms on average")
```
Main:
Took: 1715.1818181818182 ms on average
```
> mc admin trace --stats minio
Call                        Count       RPM    Avg Time  Min Time  Max Time  Avg TTFB  Max TTFB  Avg Size      Rate /min     Errors
s3.GetObject                77 (29.2%)  697.9  701µs     153µs     1.6ms     463µs     838µs     ↑159B ↓712K   ↑108K ↓485M   0
s3.HeadObject               73 (27.7%)  661.6  192µs     107µs     735µs     177µs     719µs     ↑153B         ↑99K          0
s3.CompleteMultipartUpload  37 (14.0%)  335.4  8.2ms     1.9ms     17.5ms    8.2ms     17.5ms    ↑397B ↓507B   ↑130K ↓166K   0
s3.NewMultipartUpload       37 (14.0%)  335.4  6.2ms     2.1ms     14.2ms    6.1ms     14.1ms    ↑153B ↓437B   ↑50K ↓143K    0
s3.PutObjectPart            37 (14.0%)  335.4  18.4ms    5.1ms     38.8ms    18.4ms    38.8ms    ↑1.4M         ↑469M         0
s3.PutObject                3 (1.1%)    27.2   5.4ms     3.4ms     8.8ms     5.3ms     8.8ms     ↑2.8K         ↑75K          0
```
Branch:
Took: 1723.1818181818182 ms on average
```
> mc admin trace --stats minio
Call                        Count       RPM    Avg Time  Min Time  Max Time  Avg TTFB  Max TTFB  Avg Size      Rate /min     Errors
s3.GetObject                77 (29.2%)  696.3  927µs     171µs     4.5ms     610µs     3.5ms     ↑159B ↓712K   ↑108K ↓484M   0
s3.HeadObject               73 (27.7%)  660.1  222µs     109µs     1.2ms     205µs     1.2ms     ↑153B         ↑99K          0
s3.CompleteMultipartUpload  37 (14.0%)  334.6  4.4ms     1.2ms     14.2ms    4.4ms     14.2ms    ↑397B ↓507B   ↑130K ↓166K   0
s3.NewMultipartUpload       37 (14.0%)  334.6  4.3ms     1.2ms     15ms      4.3ms     15ms      ↑153B ↓437B   ↑50K ↓143K    0
s3.PutObjectPart            37 (14.0%)  334.6  14.5ms    2.6ms     30.7ms    14.5ms    30.7ms    ↑1.4M         ↑468M         0
s3.PutObject                3 (1.1%)    27.1   6.6ms     2.8ms     10.4ms    6.5ms     10.3ms    ↑2.8K         ↑75K          0
```
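The call counts in the two traces are identical (77 `GetObject`, 73 `HeadObject`, and so on), so the only thing left to compare is the average scan time. As a quick sanity check, the difference between the two reported averages can be worked out as follows; the variable names are just for illustration:

```python
# Average scan times reported above, in milliseconds.
main_ms = 1715.1818181818182
branch_ms = 1723.1818181818182

delta_ms = branch_ms - main_ms
relative = delta_ms / main_ms

# Absolute and relative difference of the branch compared to main.
print(f"{delta_ms:.1f} ms ({relative:.2%})")  # 8.0 ms (0.47%)
```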
1 parent ccdb9b7, commit f71806e: 3 files changed, +13 / -57 lines.