Skip to content

Comments

Improve performance of flatten in DenseMatrix#3400

Merged
josdejong merged 9 commits intodevelopfrom
faster-denseMatrix-flatten
Mar 12, 2025
Merged

Improve performance of flatten in DenseMatrix#3400
josdejong merged 9 commits intodevelopfrom
faster-denseMatrix-flatten

Conversation

@dvd101x
Copy link
Collaborator

@dvd101x dvd101x commented Feb 21, 2025

Hi, this improves the performance of flattening matrices by 60%.

@dvd101x dvd101x changed the title Implemented function that flattens an array with a validated size (rectangular) Improve performance of flatten in DenseMatrix Feb 21, 2025
@josdejong
Copy link
Owner

Thanks David. I've run the benchmark comparing develop and this PR, but I get smaller performance gains than you, in the order of 10-20% improvement. I'm a bit in doubt whether we should merge this PR: it adds quite some complexity to the code, but the performance improvement seems relatively small. It feels a bit like micro optimization (a different nodejs version or browser type may give bigger performance differences than the code optimization here). What do you think?

# develop

flatten(array)             0.77 µs   ±0.39%
flatten(genericMatrix)     1.99 µs   ±0.44%
flatten(numberMatrix)      2.00 µs   ±0.57%

# PR 3400

flatten(array)             0.60 µs   ±0.19%
flatten(genericMatrix)     1.88 µs   ±0.63%
flatten(numberMatrix)      1.87 µs   ±0.54%

@dvd101x
Copy link
Collaborator Author

dvd101x commented Feb 27, 2025

Thanks for reviewing this Jos, let me check what happened and report back.

@dvd101x
Copy link
Collaborator Author

dvd101x commented Mar 4, 2025

Hi, I was missing a parenthesis. Was using Matrix.size instead of Matrix.size()

These are the results:
Sin título

# develop

flatten(array)             0.69 µs   ±0.03%
flatten(genericMatrix)     1.83 µs   ±0.17%
flatten(numberMatrix)      1.83 µs   ±0.19%

# PR 3400

flatten(array)             0.67 µs   ±0.15%
flatten(genericMatrix)     1.47 µs   ±0.25%
flatten(numberMatrix)      1.46 µs   ±0.19%

Faster by 25% on my end, could you please validate?

I think the initial improvement I was showing was comparing to develop before #3354

@josdejong
Copy link
Owner

Thanks David, I get the same results as you when running the benchmark.

I have a couple of thoughts:

  1. All the matrix versions of flatten does is run flatten and then convert the result in a DenseMatrix. So the difference between the first test of the benchmark and the other two should be only caused by the create function of DenseMatrix. Since this PR only changes the source code of the low level array function, it may be good to only test it on Array (and then test the two variances: with and without the extra knowledge of maxDepth, otherwise it's a bit comparing apples and pears.

  2. I'm still a bit in doubt if 25% performance gain is worth the additional code.

  3. Instead of passing a maxDepth (which can be used wrongly), it may be enough to pass a boolean flag denoting that the function can be sure that the passed array is a matrix where every row has the same length. The function can then determine the matrix size in a cheap way using arraySize or just looking at array[0] to determine whether to recurse further or not.

  4. I did some experimenting too 😁, it's too tempting. The following function improves the first benchmark test flatten(array) from 0.69 µs to 0.41 µs on my machine, without need for much extra code and complexity (does not need 2 implementations or an extra parameter), maybe that is good enough?

    export function flatten (array) {
      if (!Array.isArray(array)) {
        // if not an array, return as is
        return array
      }
    
      const flat = []
    
      function _flatten (array) {
        for (let i = 0; i < array.length; i++) {
          const item = array[i]
          if (Array.isArray(item)) {
            _flatten(item)
          } else {
            flat.push(item)
          }
        }
      }
    
      _flatten(array)
    
      return flat
    }

@dvd101x
Copy link
Collaborator Author

dvd101x commented Mar 5, 2025

Hi Jos!

Thank you for reviewing this!

  1. Ok, I will include a specific test on Array. I was testing only high level code, since at a high level math.flatten doesn't distinguish between different types of arrays.
  2. Ok, that's a fair concern.
  3. Since maxDepth is only used internally I thought it might be ok. Initially I was working with array[0] but in my opinion the algorithm was easier to understand with maxDepth. I think using array[0] allows for another optimization that I will test and show if possible.
  4. That's very nice, I will review in depth.

I will make a new benchmark between arrays including your example, compare using maxDepth and array[0] and report back.

I completely understand if in the end the added complexity of a second algorithm isn't worth it. I expect something positive from comparing apples to apples on the following days.

@dvd101x
Copy link
Collaborator Author

dvd101x commented Mar 6, 2025

Hi Jos,

I included a new commit, with your code proposal for regular arrays and modified the code for rectangular arrays to take a boolean as an argument instead of the maximum depth.

Test Time PR A Error PR A Time Dev Error Dev Improvement A
flatten(array) 0.44 µs ±0.19% 0.69 µs ±0.04% 36.23%
flattenArray(array) 0.45 µs ±0.25% 0.69 µs ±0.09% 34.78%
flattenArray(array, false) 0.44 µs ±0.19% 0.69 µs ±0.05% 36.23%
flattenArray(array, true) 0.33 µs ±0.29% 0.69 µs ±0.05% 52.17%
flatten(genericMatrix) 1.51 µs ±0.16% 1.84 µs ±0.18% 17.93%
flatten(numberMatrix) 1.51 µs ±0.16% 1.82 µs ±0.16% 17.03%

As a summary your code proposal is 36% faster than dev for arrays in general, for rectangular arrays the specialized algorithm shows 52% improvement. As you mentioned the improvement is lost in the creation of a Matrix, in the end it improves 18%.

I tested your proposed algorithm for arrays in general using it also for matrices and the improvement is 14% (instead of 18%).

Compared to the previous commit, with changes only in the algorithm for rectangular arrays but using maxDepth.

Test Time PR B Error PR B Time Dev Error Dev Improvement B
flatten(array) 0.67 µs ±0.14% 0.69 µs ±0.05% 2.89%
flattenArray(array) 0.67 µs ±0.23% 0.69 µs ±0.10% 2.89%
flattenArray(array, -1) 0.68 µs ±0.18% 0.69 µs ±0.04% 1.44%
flattenArray(array, 2) 0.33 µs ±0.30% 0.69 µs ±0.04% 52.17%
flatten(genericMatrix) 1.44 µs ±0.20% 1.83 µs ±0.19% 21.31%
flatten(numberMatrix) 1.43 µs ±0.20% 1.83 µs ±0.20% 21.86%

It had the same 52% improvement, I think just by luck it shows 21% improvement for Matrices (vs 18% of the new commit)

A nice improvement of using a boolean indicating if it's rectangular, is that you don't need to know the size of the array, just that it is rectangular, as it's the case for the function reshape it's algorithm assumes that the array is rectangular. It could be tasted how much faster would reshape be.

My conclusions:

  • Your proposal is very good, applying it's logic instead of array.forEach could have significant improvements (in this case 36 %)
  • It's important to differentiate between rectangular and jagged arrays (in this case 52% vs 36% both with the same overall logic of your proposal)
  • This isn't the main issue to address, as the matrix creation affects the most, but I don't have a proposal for that at the moment.
  • Maybe it's a good idea to differentiate arrays, both in code and in documentation.
  • On various PR's I made the introduction of maxDepth as I thought might be easier to understand and I think I found marginal improvements, those could be easily be refactored to use a boolean.

@josdejong
Copy link
Owner

Thanks for the updates David!

  1. I think the performance improvement of flatten (without matrix create) is worth the additional code, being up to twice as fast.
  2. I like passing an optional flag to denote whether the matrix is rectangular or jagged over passing a maxDepth, so that is a nice improvement! I think the name "homogeneous" can be confusing: it can also used to describe whether a matrix contains a single data type (like number or complex). How about naming it "isRectangular" instead?
  3. Looking into the performance of creating a DenseMatrix is an interesting next step :). Probably one of the validation steps on creation is slow, and if so, we may be able to improve on that by introducing a way to skip the validation tests when they are not needed (similar to ensuring flatten that the passed Array is rectengular by passing a boolean flag).

I think this PR is ready to merge, agree?

@dvd101x
Copy link
Collaborator Author

dvd101x commented Mar 6, 2025

Thanks!

Yes I agree this is ready to merge, just renamed "homogeneous" to "rectangular" and found a case in hypot where a rectangular array is expected. There might be many others, but that could be addressed later.

@dvd101x
Copy link
Collaborator Author

dvd101x commented Mar 7, 2025

Please wait, I found a possible regression.

@dvd101x
Copy link
Collaborator Author

dvd101x commented Mar 7, 2025

All good, this is ready to merge.

@josdejong
Copy link
Owner

👍 thanks again David

@josdejong josdejong merged commit 5186a3d into develop Mar 12, 2025
15 checks passed
@dvd101x dvd101x deleted the faster-denseMatrix-flatten branch March 12, 2025 18:05
@josdejong
Copy link
Owner

Published now in v14.4.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants