⚡️ Speed up function `dataframe_merge` by 1,305% #98
📄 1,305% (13.05x) speedup for `dataframe_merge` in `src/numpy_pandas/dataframe_operations.py`
⏱️ Runtime: 119 milliseconds → 8.48 milliseconds (best of 267 runs)
📝 Explanation and details
The optimized code achieves a 13x speedup by eliminating per-row pandas indexing, the slowest operation in the original, and reading from the underlying NumPy arrays instead.
Key optimizations:
- Eliminated `.iloc[]` calls: The original code used `left.iloc[i]` and `right.iloc[right_idx]` for every row access, and each of those calls is expensive. The optimized version extracts the underlying NumPy arrays once using `.values` and accesses rows directly via array indexing.
- Pre-cached column indices: Instead of repeatedly looking up column names during the merge loop, the optimized code pre-computes column indices using `get_loc()` and stores them in dictionaries for O(1) lookup.
- Vectorized right-side dictionary building: Uses `enumerate(right_values[:, right_on_idx])` to build the key mapping in one pass, avoiding an individual `.iloc[]` call for each row of the right DataFrame.
Performance impact by test case:
- `.iloc[]` overhead scales poorly with DataFrame size.
The line profiler confirms this: the original code spent 33.4% of its time in `right.iloc[i][right_on]` and 27.6% in `left.iloc[i]` calls, while the optimized version eliminates both bottlenecks entirely. The optimization is particularly effective for datasets with hundreds or thousands of rows, where pandas overhead becomes the dominant cost.
✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
To edit these changes, run `git checkout codeflash/optimize-dataframe_merge-mfejj4uk` and push.
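For readers who want to see how the three optimizations in the explanation fit together, here is a minimal sketch of an inner join written in that style. It is not the code from this PR: the function name `merge_with_arrays` and all local variable names are hypothetical, and the sketch assumes unique column names (so that `get_loc()` returns a plain integer) and that the right key column is dropped from the output.

```python
import pandas as pd


def merge_with_arrays(left, right, left_on, right_on):
    """Inner-join sketch (hypothetical; not the PR's actual implementation)."""
    # Extract the underlying NumPy arrays once instead of calling .iloc per row.
    left_values = left.values
    right_values = right.values

    # Pre-compute column positions so the loop never looks up names in pandas.
    # Assumes unique column names, so get_loc() returns an int.
    left_cols = list(left.columns)
    right_cols = [c for c in right.columns if c != right_on]
    left_idx = {c: left.columns.get_loc(c) for c in left_cols}
    right_idx = {c: right.columns.get_loc(c) for c in right_cols}
    left_on_idx = left.columns.get_loc(left_on)
    right_on_idx = right.columns.get_loc(right_on)

    # Map each right-side key to the row positions holding it, in one pass.
    right_rows_by_key = {}
    for i, key in enumerate(right_values[:, right_on_idx]):
        right_rows_by_key.setdefault(key, []).append(i)

    # Emit one output row per matching (left row, right row) pair.
    rows = []
    for left_row in left_values:
        for i in right_rows_by_key.get(left_row[left_on_idx], ()):
            right_row = right_values[i]
            new_row = {c: left_row[left_idx[c]] for c in left_cols}
            new_row.update((c, right_row[right_idx[c]]) for c in right_cols)
            rows.append(new_row)
    return pd.DataFrame(rows)
```

Calling `merge_with_arrays(left, right, "id", "key")` on two small frames returns the matched rows with the left columns first, followed by the right columns other than the key. Note that `.values` on a mixed-dtype frame yields an object array, so this approach trades dtype fidelity inside the loop for faster row access.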