GSoC 2025 ‐ Aayush Khanna

Aayush Khanna edited this page Aug 26, 2025 · 38 revisions

About me

Hey there! I'm Aayush Khanna from Noida, Uttar Pradesh, India. I am a third-year undergraduate pursuing civil engineering at the Indian Institute of Technology (BHU), Varanasi. I am interested in all things tech, and more recently I've been trying to learn how interpreters work. I am also a huge football enthusiast :)

Project overview

My project aims to advance the state of LAPACK routines in stdlib by extending the conventional LAPACK APIs to ensure easy compatibility with stdlib ndarrays and by adding support for both row-major (C-style) and column-major (Fortran-style) storage layouts. The project covers both lower-level helper routines and higher-level user-facing routines. To further optimize these routines, techniques such as loop reordering and loop tiling were used. A lot of time was also spent on benchmarking and testing these routines against the reference LAPACK implementations. The initial goal was to cover all LAPACK routines up to dgeev, but unfortunately we didn't quite make it there.

Project recap

I started off by parsing the LAPACK source code into a directed graph in which each node represents a LAPACK routine and each edge represents a dependency between two routines. To decide which routines to work on first, I topologically sorted the graph and started with the routines that have no dependencies. I've documented the process in this repository!
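As a rough illustration of the ordering step (not the code from the repository), here is a minimal sketch of a topological sort using Kahn's algorithm. The routine names and dependency edges below are made up for the example and were not parsed from the LAPACK source; the function name `topologicalOrder` is also hypothetical.

```javascript
// Hypothetical dependency map: routine -> routines it calls.
// These names and edges are illustrative only.
const deps = {
	'routineA': [],
	'routineB': [],
	'routineC': [ 'routineA' ],
	'routineD': [ 'routineB', 'routineC' ]
};

// Returns the routines in an order where every routine appears after
// all of its dependencies (Kahn's algorithm). Assumes every dependency
// is itself a key of `graph`.
function topologicalOrder( graph ) {
	const names = Object.keys( graph );
	const remaining = {};  // number of not-yet-emitted dependencies per routine
	const dependents = {}; // reverse edges: routine -> routines that call it
	for ( const name of names ) {
		remaining[ name ] = graph[ name ].length;
		dependents[ name ] = [];
	}
	for ( const name of names ) {
		for ( const dep of graph[ name ] ) {
			dependents[ dep ].push( name );
		}
	}
	// Start with the routines that have no dependencies:
	const queue = names.filter( ( name ) => remaining[ name ] === 0 );
	const order = [];
	while ( queue.length > 0 ) {
		const name = queue.shift();
		order.push( name );
		for ( const parent of dependents[ name ] ) {
			remaining[ parent ] -= 1;
			if ( remaining[ parent ] === 0 ) {
				queue.push( parent );
			}
		}
	}
	if ( order.length !== names.length ) {
		throw new Error( 'dependency graph contains a cycle' );
	}
	return order;
}

const order = topologicalOrder( deps );
console.log( order );
```

Routines with no dependencies come out first, which matches the idea of implementing the leaf routines before the ones that build on them.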

After that, I started implementing these routines one by one. My workflow consisted of first writing a base API that took strides and offsets as input parameters; this was kept private and not exported. I would then write the ndarray wrapper over it, along with another API consistent with the LAPACK function signature.
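To make the three layers concrete, here is a minimal sketch for a hypothetical routine that copies an M-by-N matrix A into B. The function names (`base`, `copyNdarray`, `copy`) and the way `order` and the leading dimensions are mapped to strides are assumptions made for illustration, not the exact code of any stdlib package.

```javascript
// Base implementation (private): works purely in terms of strides and offsets.
function base( M, N, A, strideA1, strideA2, offsetA, B, strideB1, strideB2, offsetB ) {
	for ( let i = 0; i < M; i++ ) {
		for ( let j = 0; j < N; j++ ) {
			const ia = offsetA + ( i*strideA1 ) + ( j*strideA2 );
			const ib = offsetB + ( i*strideB1 ) + ( j*strideB2 );
			B[ ib ] = A[ ia ];
		}
	}
	return B;
}

// ndarray-style API: exposes the strides and offsets to the caller.
function copyNdarray( M, N, A, strideA1, strideA2, offsetA, B, strideB1, strideB2, offsetB ) {
	return base( M, N, A, strideA1, strideA2, offsetA, B, strideB1, strideB2, offsetB );
}

// LAPACK-style API: takes a layout and leading dimensions, then derives strides.
function copy( order, M, N, A, LDA, B, LDB ) {
	if ( order === 'row-major' ) {
		// Moving to the next row skips LDA elements; the next column skips 1.
		return base( M, N, A, LDA, 1, 0, B, LDB, 1, 0 );
	}
	if ( order === 'column-major' ) {
		// Moving to the next row skips 1 element; the next column skips LDA.
		return base( M, N, A, 1, LDA, 0, B, 1, LDB, 0 );
	}
	throw new TypeError( 'invalid argument. First argument must be a valid order. Value: `' + order + '`.' );
}

// Usage: a 2x3 matrix [ 1 2 3; 4 5 6 ] stored in row-major order.
const A = new Float64Array( [ 1, 2, 3, 4, 5, 6 ] );

// Same layout in and out:
const B1 = copy( 'row-major', 2, 3, A, 3, new Float64Array( 6 ), 3 );

// Row-major input written to a column-major output via the ndarray-style API:
const B2 = copyNdarray( 2, 3, A, 3, 1, 0, new Float64Array( 6 ), 1, 2, 0 );
console.log( B2 );
```

Keeping all of the indexing logic in the base function means the two public APIs stay thin and share a single implementation.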

Testing and benchmarking these routines was a very important part of the project. For testing, I would compare the outputs of our implementation against those of the actual LAPACK routine across a variety of cases, storing them as JSON fixtures and using tape to write the tests. This process soon became very tedious, so I ended up writing a script to auto-generate the ndarray test fixtures for a routine given the standard inputs. This script can be found here. This saved us a lot of time and helped us move faster.
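The linked script is not reproduced here. As a hedged sketch of the general idea only, the hypothetical helper below (`toStridedFixture`, along with its parameters and the sentinel value, are all assumptions) lays a small matrix out in a flat buffer with arbitrary non-negative strides and an offset, filling unused slots with a sentinel so that a test can detect out-of-bounds writes.

```javascript
// Hypothetical helper: not the actual fixture-generation script.
// `rows` is an array of equal-length rows; strides are assumed non-negative.
function toStridedFixture( rows, strideA1, strideA2, offsetA, filler ) {
	const M = rows.length;
	const N = rows[ 0 ].length;

	// Index of the last matrix element determines the buffer length:
	const last = offsetA + ( ( M-1 )*strideA1 ) + ( ( N-1 )*strideA2 );
	const A = new Array( last + 1 ).fill( filler );

	for ( let i = 0; i < M; i++ ) {
		for ( let j = 0; j < N; j++ ) {
			A[ offsetA + ( i*strideA1 ) + ( j*strideA2 ) ] = rows[ i ][ j ];
		}
	}
	return {
		'M': M,
		'N': N,
		'A': A,
		'strideA1': strideA1,
		'strideA2': strideA2,
		'offsetA': offsetA
	};
}

// A 2x2 matrix stored with gaps between elements and a leading offset:
const fixture = toStridedFixture( [ [ 1, 2 ], [ 3, 4 ] ], 4, 2, 1, 9999 );
console.log( JSON.stringify( fixture ) );
```

The resulting object can be serialized to JSON and loaded from a test file in the same way as a hand-written fixture.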

The existing reference LAPACK implementation is Fortran-based and hence follows a column-major layout by default. In JavaScript, however, we can give the user the freedom to choose whether to pass a matrix in row-major or column-major order. This flexibility is important since matrices are represented as flat arrays in JavaScript, which keeps their elements contiguous in memory.

We did this by representing a matrix in linear memory using strides and offsets. For example:

  A   = [ 1, 2, 3 ]
        [ 4, 5, 6 ]
        [ 7, 8, 9 ] (3X3)
A_row = [ 1, 2, 3, 4, 5, 6, 7, 8, 9 ] (row-major)
A_col = [ 1, 4, 7, 2, 5, 8, 3, 6, 9 ] (column-major)

Here, we define two strides, strideA1 and strideA2, to iterate over the two dimensions of the matrix. For row-major matrices, strideA1 is greater than strideA2; for column-major matrices, the opposite holds. Note that swapping the strides and the dimensions of a matrix also gives us its transpose. This allows for various optimizations which I'll talk about later.
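A small sketch of how element (i, j) can be located from the strides and an offset, using the 3x3 matrix above. The helper name `get` and the parameter name `offsetA` are hypothetical and only serve the example.

```javascript
// Returns element (i, j) of a matrix stored in the linear array A.
function get( A, strideA1, strideA2, offsetA, i, j ) {
	return A[ offsetA + ( i*strideA1 ) + ( j*strideA2 ) ];
}

const A_row = new Float64Array( [ 1, 2, 3, 4, 5, 6, 7, 8, 9 ] ); // row-major
const A_col = new Float64Array( [ 1, 4, 7, 2, 5, 8, 3, 6, 9 ] ); // column-major

// Row-major: strideA1 = 3 (next row), strideA2 = 1 (next column).
const fromRow = get( A_row, 3, 1, 0, 1, 2 );

// Column-major: strideA1 = 1 (next row), strideA2 = 3 (next column).
const fromCol = get( A_col, 1, 3, 0, 1, 2 );

// Swapping the strides on the row-major buffer reads the transpose,
// so element (1, 2) of the transpose is element (2, 1) of A.
const fromTranspose = get( A_row, 1, 3, 0, 1, 2 );

console.log( fromRow, fromCol, fromTranspose ); // prints 6 6 8
```

Both layouts give the same element for the same (i, j), and no data is moved to obtain the transposed view.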

To ensure consistency with the LAPACK function signatures, we had to use single-element typed arrays to pass values by reference where needed. For example, in dlacn2 the LAPACK API is:

SUBROUTINE DLACN2( N, V, X, ISGN, EST, KASE, ISAVE )

where KASE is an integer value that changes between successive calls to dlacn2. Hence, to pass such variables by reference, we used a single-element typed array, which makes our JavaScript API:

function dlacn2( N, V, strideV, offsetV, X, strideX, offsetX, ISGN, strideISGN, offsetISGN, EST, offsetEST, KASE, offsetKASE, ISAVE, strideISAVE, offsetISAVE )

and KASE[ offsetKASE ] represents the element that we're passing by reference.
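To show why a single-element typed array behaves like a by-reference argument, here is a minimal sketch. The `advance` function is hypothetical and does not implement dlacn2; it only updates a state value through `KASE[ offsetKASE ]` so that the caller sees the change on the next call.

```javascript
// Hypothetical routine (not dlacn2): cycles a state value 0 -> 1 -> 2 -> 0.
// Writing through the array mutates the caller's storage, which mimics
// Fortran's pass-by-reference semantics for scalar arguments.
function advance( KASE, offsetKASE ) {
	if ( KASE[ offsetKASE ] < 2 ) {
		KASE[ offsetKASE ] += 1;
	} else {
		KASE[ offsetKASE ] = 0;
	}
}

const KASE = new Int32Array( [ 0 ] );
const seen = [];

// The caller keeps invoking the routine until the state returns to zero:
do {
	advance( KASE, 0 );
	seen.push( KASE[ 0 ] );
} while ( KASE[ 0 ] !== 0 );

console.log( seen ); // prints [ 1, 2, 0 ]
```

Had KASE been passed as a plain number, the routine would only have modified its own local copy, and the caller's loop would never observe the updated state.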

Completed work

Current state

TODO: add a summary of the current state of the project.

What remains

TODO: add a summary of what remains left to do. If there is a tracking issue, include a link to that issue.

Challenges and lessons learned

TODO: add a summary of any unexpected challenges that you faced, along with any lessons learned during the course of your project.

Conclusion

This was a very fruitful project overall. It taught me a lot about various optimizations and was a very nice introduction to high-performance numerical computing. It also helped me improve my problem-solving ability and step outside my comfort zone. I would like to thank the org admins, Athan Reines and Philipp Burckhardt, for the opportunity, guidance, and support throughout this journey. I would also like to thank Karan Anand, Gunj Joshi, and Gururaj Gurram for being wonderful people to talk to during the program. It's been almost a year since I opened my first PR to stdlib, and this is a wonderful reminder of how far I've come. stdlib is a project that I hold very close to my heart, the people I've met while contributing here are wonderful, and I will definitely stay active in the community after the GSoC period as well.