Commit da51e3d
committed
mount: performance improvement
### Performance Comparison: `bytearray.extend()` vs. Concatenation
The efficiency difference between `bytearray.extend(bytes(N))` and `self.meta = self.meta + bytes(N)` stems from how Python manages memory and objects during these operations.
#### 1. In-Place Modification vs. Object Creation
- **`bytearray.extend()`**: This is an **in-place** operation. If the current memory block allocated for the `bytearray` has enough extra capacity (pre-allocated space), Python simply writes the new bytes into that space and updates the length. If it needs more space, it uses `realloc()`, which can often expand the existing memory block without moving the entire data set to a new location.
- **Concatenation (`+`)**: This creates a **completely new** `bytearray` object. It allocates a new memory block large enough to hold the sum of both parts, copies the contents of `self.meta`, copies the contents of `bytes(N)`, and then reassigns the variable `self.meta` to this new object.
#### 2. Computational Complexity
- **`bytearray.extend()`**: In the best case (when capacity exists), it is **O(K)**, where K is the number of bytes being added. In the worst case (reallocation), it is **O(N + K)**, but Python uses an over-allocation strategy (growth factor) that amortizes this cost, making it significantly faster on average.
- **Concatenation (`+`)**: It is always **O(N + K)** because it must copy the existing `N` bytes every single time. As the `bytearray` grows larger (e.g., millions of items in a backup), this leads to **O(N²)** total time complexity across multiple additions, as you are repeatedly copying an ever-growing buffer.
#### 3. Memory Pressure and Garbage Collection
- Concatenation briefly requires memory for **both** the old buffer and the new buffer simultaneously before the old one is garbage collected. This increases the peak memory usage of the process.
- `extend()` is more memory-efficient as it minimizes the need for multiple large allocations and relies on the underlying memory manager's ability to resize buffers efficiently.
In the context of `borg mount`, where `self.meta` can grow to be many megabytes or even gigabytes for very large repositories, using concatenation causes a noticeable slowdown as the number of archives or files increases, whereas `extend()` remains performant.1 parent 7d63832 commit da51e3d
1 file changed
+2
-2
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
161 | 161 | | |
162 | 162 | | |
163 | 163 | | |
164 | | - | |
| 164 | + | |
165 | 165 | | |
166 | 166 | | |
167 | 167 | | |
| |||
199 | 199 | | |
200 | 200 | | |
201 | 201 | | |
202 | | - | |
| 202 | + | |
203 | 203 | | |
204 | 204 | | |
205 | 205 | | |
| |||
0 commit comments