
GSoC 2024 ‐ Aman Bhansali

Aman Bhansali edited this page Aug 16, 2024 · 19 revisions

About me

I’m Aman Bhansali, an undergraduate at the Indian Institute of Technology Jodhpur. I am from Purnia, Bihar, India. I am passionate about mathematics and programming, with a strong interest in numerical linear algebra and scientific computing. My experience in areas like computer vision and NLP has deepened my understanding of the role mathematical models play in algorithm development. I am also fascinated by applications of machine learning to real-world problems.

Project overview

TODO: add a description of the project goals and technical approach.

Project recap

During the community bonding period, I began coding by focusing on the BLAS Level 1 routines outlined in tracking issue #2039. I started with the simpler real single- and double-precision routines before advancing to the complex ones. The complex routines were hard to grasp at first, but after working on a few packages I became comfortable with them. Writing the C and Fortran implementations was also challenging initially and became easier with practice. The real single- and double-precision routines were fully implemented within the planned timeframe. The complex single- and double-precision routines that do not depend on other BLAS routines have also been fully implemented, including their C and Fortran implementations. However, because the tooling required for integrating packages across C and Fortran is still under development, a few of the complex Level 1 routines remain.

Native Implementation Level 1 Signature:

sswap( N, x, strideX, y, strideY )

Ndarray Implementation Level 1 Signature:

sswap( N, x, strideX, offsetX, y, strideY, offsetY )

The additional offset parameters in the ndarray implementation let the user choose the starting index of each vector for the operation, in addition to its stride.
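To make the role of the stride and offset parameters concrete, here is a minimal sketch of a strided swap. The function name `swapStrided` is hypothetical and this is not the library's source; it only mirrors the parameter order of the ndarray signature above.

```javascript
// Hypothetical helper (not the library's implementation): swaps N elements of
// x and y, where each vector is walked from its own offset with its own stride.
function swapStrided( N, x, strideX, offsetX, y, strideY, offsetY ) {
    var tmp;
    var ix = offsetX;
    var iy = offsetY;
    var i;
    for ( i = 0; i < N; i++ ) {
        tmp = x[ ix ];
        x[ ix ] = y[ iy ];
        y[ iy ] = tmp;
        ix += strideX;
        iy += strideY;
    }
    return y;
}

var x = new Float32Array( [ 1, 2, 3, 4, 5, 6 ] );
var y = new Float32Array( [ 7, 8, 9, 10, 11, 12 ] );

// Swap every other element of x, starting at index 1, with the last three
// elements of y, starting at index 3:
swapStrided( 3, x, 2, 1, y, 1, 3 );
// x => [ 1, 10, 3, 11, 5, 12 ]
// y => [ 7, 8, 9, 2, 4, 6 ]
```

Because the starting index is explicit, a negative stride needs no special handling in this sketch: the caller simply passes the index of the last element to visit first.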

After completing the Level 1 routines, I moved on to the Level 2 routines, which involve matrix-vector operations. At this stage, the work became both more interesting and more challenging. My approach remained systematic: I started with the real single- and double-precision routines, beginning with their native JavaScript implementations based on the reference Fortran implementation. Code and R&D became equally important here, leading to an extensive round of refactoring and discussions with Athan, during which we worked out a way to implement modern BLAS routines. The existing reference LAPACK implementation is Fortran-based and hence follows a column-major layout by default. In JavaScript, however, we can let the user choose whether to pass the matrix in row-major or column-major order. This flexibility matters because matrices in JavaScript are represented as flat arrays stored in contiguous memory. The key innovation, however, comes after this.

Let's take a small example:

A = [ 1, 2, 3 ]
    [ 4, 5, 6 ] (2x3)
A = [ 1, 2, 3, 4, 5, 6 ] (row-major)
A = [ 1, 4, 2, 5, 3, 6 ] (column-major)
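The two flattened forms above follow directly from where element (i, j) of an M-by-N matrix is placed: at index `i*N + j` in row-major order and at index `i + j*M` in column-major order. The sketch below reproduces the example; the helper name `flatten` and its `order` strings are illustrative assumptions, not part of any library API.

```javascript
// Hypothetical helper: flattens a nested array of rows into a Float32Array
// using either row-major or column-major ordering.
function flatten( rows, order ) {
    var M = rows.length;
    var N = rows[ 0 ].length;
    var out = new Float32Array( M * N );
    var i;
    var j;
    for ( i = 0; i < M; i++ ) {
        for ( j = 0; j < N; j++ ) {
            if ( order === 'row-major' ) {
                // Elements of the same row are adjacent:
                out[ ( i*N ) + j ] = rows[ i ][ j ];
            } else {
                // Elements of the same column are adjacent:
                out[ i + ( j*M ) ] = rows[ i ][ j ];
            }
        }
    }
    return out;
}

var nested = [ [ 1, 2, 3 ], [ 4, 5, 6 ] ];

var rowMajor = flatten( nested, 'row-major' );
// => [ 1, 2, 3, 4, 5, 6 ]

var colMajor = flatten( nested, 'column-major' );
// => [ 1, 4, 2, 5, 3, 6 ]
```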

Native Implementation Signature:

sgemv( trans, M, N, alpha, A, lda, x, strideX, beta, y, strideY )
sgemv( order, trans, M, N, alpha, A, lda, x, strideX, beta, y, strideY )

As in the ndarray implementations of the Level 1 routines, the matrix and each vector have an associated offset parameter. Additionally, there are use cases where we need to perform operations on a specific submatrix within a larger global matrix. To accommodate this, our ndarray implementation includes the following parameters:

  • strideA1 (sa1): stride along the first dimension of matrix A.
  • strideA2 (sa2): stride along the second dimension of matrix A.

These parameters give users full flexibility to utilize the BLAS implementation as needed, even allowing for negative stride values depending on the use case. At each stage of implementation, the idea was to reduce code duplication and maintain cache optimality.
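As an illustration of how two strides and an offset can select a submatrix, the sketch below reads a 2x2 block out of a 4x4 matrix stored in row-major order, and then reads the same block with its rows reversed by using a negative first stride. The helper `viewToRows` is hypothetical and exists only to make the index arithmetic visible; element (i, j) of the view is assumed to live at `offsetA + i*strideA1 + j*strideA2`.

```javascript
// Hypothetical helper: materializes the M-by-N view described by two strides
// and an offset as a nested array of rows.
function viewToRows( M, N, A, strideA1, strideA2, offsetA ) {
    var out = [];
    var row;
    var i;
    var j;
    for ( i = 0; i < M; i++ ) {
        row = [];
        for ( j = 0; j < N; j++ ) {
            row.push( A[ offsetA + ( i*strideA1 ) + ( j*strideA2 ) ] );
        }
        out.push( row );
    }
    return out;
}

// A 4x4 global matrix stored in row-major order:
var G = new Float32Array( [
    1, 2, 3, 4,
    5, 6, 7, 8,
    9, 10, 11, 12,
    13, 14, 15, 16
] );

// The 2x2 block starting at row 1, column 1 (offset = 1*4 + 1 = 5):
var sub = viewToRows( 2, 2, G, 4, 1, 5 );
// => [ [ 6, 7 ], [ 10, 11 ] ]

// The same block with its rows visited in reverse order (negative stride,
// offset pointing at the block's last row):
var flipped = viewToRows( 2, 2, G, -4, 1, 9 );
// => [ [ 10, 11 ], [ 6, 7 ] ]
```

No data is copied to form either view in a real strided routine; the nested arrays here are built only so the selected elements can be inspected.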

Ndarray Implementation Signature:

sgemv( trans, M, N, alpha, A, strideA1, strideA2, offsetA, x, strideX, offsetX, beta, y, strideY, offsetY )
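A minimal sketch of what a routine with this signature computes, namely `y = alpha*op(A)*x + beta*y`, is shown below. It is not the library's source: the name `gemvStrided` is hypothetical, the `trans` values `'no-transpose'` and `'transpose'` are assumptions, intermediate sums are accumulated in double precision rather than rounded to single precision at each step, and no attempt is made to choose a cache-friendly loop order.

```javascript
// Hypothetical sketch: strided matrix-vector product where op(A) is either A
// (M-by-N) or its transpose, and every operand is addressed via strides and
// offsets.
function gemvStrided( trans, M, N, alpha, A, strideA1, strideA2, offsetA, x, strideX, offsetX, beta, y, strideY, offsetY ) {
    var outer; // stride of A between successive output elements
    var inner; // stride of A along the dot-product direction
    var ylen;
    var xlen;
    var sum;
    var ia;
    var ix;
    var iy;
    var i;
    var j;

    if ( trans === 'no-transpose' ) {
        ylen = M;
        xlen = N;
        outer = strideA1;
        inner = strideA2;
    } else {
        ylen = N;
        xlen = M;
        outer = strideA2;
        inner = strideA1;
    }
    iy = offsetY;
    for ( i = 0; i < ylen; i++ ) {
        sum = 0.0;
        ia = offsetA + ( i*outer );
        ix = offsetX;
        for ( j = 0; j < xlen; j++ ) {
            sum += A[ ia ] * x[ ix ];
            ia += inner;
            ix += strideX;
        }
        // When beta is zero, y is overwritten rather than scaled:
        y[ iy ] = ( beta === 0.0 ) ? alpha*sum : ( alpha*sum ) + ( beta*y[ iy ] );
        iy += strideY;
    }
    return y;
}

var ones = new Float32Array( [ 1, 1, 1 ] );

// Row-major 2x3 matrix: strideA1 = 3 (next row), strideA2 = 1 (next column).
var rowMajorA = new Float32Array( [ 1, 2, 3, 4, 5, 6 ] );
var y1 = gemvStrided( 'no-transpose', 2, 3, 1.0, rowMajorA, 3, 1, 0, ones, 1, 0, 0.0, new Float32Array( 2 ), 1, 0 );
// => [ 6, 15 ]

// Column-major storage of the same matrix: strideA1 = 1, strideA2 = 2.
var colMajorA = new Float32Array( [ 1, 4, 2, 5, 3, 6 ] );
var y2 = gemvStrided( 'no-transpose', 2, 3, 1.0, colMajorA, 1, 2, 0, ones, 1, 0, 0.0, new Float32Array( 2 ), 1, 0 );
// => [ 6, 15 ]
```

In this sketch the storage layout is expressed entirely through the two strides, so the same loop serves both row-major and column-major inputs without a separate `order` argument.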

Completed work

TODO: include a list of links to all relevant PRs along with a short description for each. For small bug fix PRs, you can group them together and describe them as, e.g., "various maintenance work".

Current state

TODO: add a summary of the current state of the project.

What remains

TODO: add a summary of what remains left to do. If there is a tracking issue, include a link to that issue.

Challenges and lessons learned

TODO: add a summary of any unexpected challenges that you faced, along with any lessons learned during the course of your project.

Conclusion

TODO: add a report summary and include any acknowledgments (e.g., shout outs to contributors/maintainers who helped out along the way, etc).
