@joechenrh joechenrh commented Dec 25, 2025

What problem does this PR solve?

Issue Number: close #65261

Problem Summary:

The problem is caused by the interaction between arrow-go and the Azure reader.

arrow-go first calls r.Seek(0, io.SeekEnd). The Azure reader's Seek then tries to reopen the underlying reader at the end of the file, and that request fails:

	if realOffset < 0 || realOffset > r.totalSize {
		return 0, errors.Annotatef(berrors.ErrInvalidArgument, "Seek: offset is %d, but length of content is only %d", realOffset, r.totalSize)
	}
	if realOffset == r.pos {
		return r.pos, nil
	}
	r.pos = realOffset
	// azblob reader can only read forward, so we need to reopen the reader
	if err := r.reopenReader(); err != nil {
		return 0, err
	}
	return r.pos, nil
}

What changed and how does it work?

Open a no-op reader in this case, just as the other readers (S3 and GCS) do.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Load data from azure:

Before:

MySQL [test]> import into students from "azblob://xxx/test.students.parquet?account-name=xxx&sas-token=xxx";                                                                                                                                                                         
ERROR 8160 (HY000): Failed to read source files. Reason: Failed to read data from azure blob, data info: pos='1046': GET https://xxx/test.students.parquet      
--------------------------------------------------------------------------------                                                                                                                                        
RESPONSE 416: 416 The range specified is invalid for the current size of the resource.                                                                                                                                  
ERROR CODE: InvalidRange                                                            

After:

MySQL [test]> import into students from "azblob://testimportdata/testdata/test.students.parquet?account-name=xxx&sas-token=xxx";
+--------+-----------+-----------------------------------------------------------------------------+-------------------+----------+-------+----------+------------------+---------------+----------------+----------------------------+----------------------------+----------------------------+------------+----------------------------+----------+-------------------------+---------------------+-----------------------+----------------+--------------+
| Job_ID | Group_Key | Data_Source                                                                 | Target_Table      | Table_ID | Phase | Status   | Source_File_Size | Imported_Rows | Result_Message | Create_Time                | Start_Time                 | End_Time                   | Created_By | Last_Update_Time           | Cur_Step | Cur_Step_Processed_Size | Cur_Step_Total_Size | Cur_Step_Progress_Pct | Cur_Step_Speed | Cur_Step_ETA |
+--------+-----------+-----------------------------------------------------------------------------+-------------------+----------+-------+----------+------------------+---------------+----------------+----------------------------+----------------------------+----------------------------+------------+----------------------------+----------+-------------------------+---------------------+-----------------------+----------------+--------------+
|      1 | NULL      | azblob://testimportdata/testdata/test.students.parquet?account-name=xxx&xxx | `test`.`students` |      114 |       | finished | 1.021KiB         |             5 |                | 2025-12-25 06:35:30.176518 | 2025-12-25 06:35:30.723001 | 2025-12-25 06:35:39.727013 | root@%     | 2025-12-25 06:35:39.727013 | NULL     | NULL                    | NULL                | NULL                  | NULL           | NULL         |
+--------+-----------+-----------------------------------------------------------------------------+-------------------+----------+-------+----------+------------------+---------------+----------------+----------------------------+----------------------------+----------------------------+------------+----------------------------+----------+-------------------------+---------------------+-----------------------+----------------+--------------+
1 row in set (14.096 sec)                             
                                                      
MySQL [test]> select * from test.students;            
+----+---------+------+                               
| id | name    | age  |  
+----+---------+------+
|  1 | Alice   |   25 |  
|  2 | Bob     |   30 |
|  3 | Charlie |   35 |  
|  4 | David   |   40 |  
|  5 | Eve     |   45 |                               
+----+---------+------+  
5 rows in set (0.002 sec)
                                                      
MySQL [test]> 

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

Signed-off-by: Ruihao Chen <[email protected]>
@ti-chi-bot ti-chi-bot bot added do-not-merge/needs-tests-checked release-note-none Denotes a PR that doesn't merit a release note. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. and removed do-not-merge/needs-tests-checked labels Dec 25, 2025

tiprow bot commented Dec 25, 2025

Hi @joechenrh. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Signed-off-by: Ruihao Chen <[email protected]>

codecov bot commented Dec 25, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 71.3283%. Comparing base (0eb881e) to head (2048211).
⚠️ Report is 41 commits behind head on master.

Additional details and impacted files
@@               Coverage Diff                @@
##             master     #65262        +/-   ##
================================================
+ Coverage   70.7500%   71.3283%   +0.5783%     
================================================
  Files          1895       1902         +7     
  Lines        518041     522220      +4179     
================================================
+ Hits         366514     372491      +5977     
+ Misses       127012     125243      -1769     
+ Partials      24515      24486        -29     
Flag Coverage Δ
integration 48.1237% <0.0000%> (-0.0458%) ⬇️
unit 66.0750% <100.0000%> (+0.5491%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
dumpling 52.8700% <ø> (+0.1132%) ⬆️
parser ∅ <ø> (∅)
br 58.5320% <100.0000%> (+0.2982%) ⬆️

@ti-chi-bot ti-chi-bot bot added the needs-cherry-pick-release-8.5 Should cherry pick this PR to release-8.5 branch. label Dec 25, 2025
@joechenrh
Contributor Author

/retest


tiprow bot commented Dec 25, 2025

@joechenrh: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

Details

In response to this:

/retest


@ti-chi-bot ti-chi-bot bot added approved needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Dec 25, 2025
@ti-chi-bot ti-chi-bot bot added lgtm and removed needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Dec 25, 2025

ti-chi-bot bot commented Dec 25, 2025

[LGTM Timeline notifier]

Timeline:

  • 2025-12-25 11:58:51.150699447 +0000 UTC m=+2338275.964477019: ☑️ agreed by D3Hunter.
  • 2025-12-25 12:16:48.373895452 +0000 UTC m=+2339353.187673034: ☑️ agreed by zimulala.


tiprow bot commented Dec 25, 2025

@joechenrh: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name         Commit   Details  Required  Rerun command
fast_test_tiprow  fe68587  link     true      /test fast_test_tiprow

Full PR test history. Your PR dashboard.


Signed-off-by: Ruihao Chen <[email protected]>
@ti-chi-bot ti-chi-bot bot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Dec 25, 2025

ti-chi-bot bot commented Dec 25, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: D3Hunter, wjhuang2016, zimulala

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot merged commit 310e154 into pingcap:master Dec 27, 2025
61 of 70 checks passed
@ti-chi-bot
Member

In response to a cherrypick label: new pull request created to branch release-8.5: #65300.


Labels

approved lgtm needs-cherry-pick-release-8.5 Should cherry pick this PR to release-8.5 branch. release-note-none Denotes a PR that doesn't merit a release note. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.

Development

Successfully merging this pull request may close these issues.

Parquet import meets error when loading from azure

5 participants