Description
Full name
Aarya Balwadkar
University status
Yes
University name
Symbiosis Institute of Technology, Pune
University program
Computer Science
Expected graduation
2027
Short biography
I am a second-year student at Symbiosis Institute of Technology, Pune, pursuing a B.Tech in Computer Science. I know JavaScript, C, C++, and Python, all of which were covered through semester 3. I am strongest in JavaScript, since web development was taught in semester 3. Since then I have worked remotely at “Project Human City”, a non-profit Canada-based startup, first as a UI/UX designer and then as a full-stack developer, and I still work there. Through this I have gained knowledge and experience in JavaScript, TypeScript, React.js, MongoDB, GraphQL, Node.js, and Express.js.
I am also keen to contribute to open source with whatever knowledge I have. That journey started in my first year with Hacktoberfest 2023, followed by GirlScript Summer of Code 2024.
Timezone
Indian Standard Time ( IST ), UTC+5:30
Contact details
email: [email protected], github: AaryaBalwadkar, linkedin: AaryaBalwadkar
Platform
Windows
Editor
My preferred editor is VS Code, as its extensions make it easy to work across many different languages.
Programming experience
I have worked remotely at “Project Human City”, a non-profit Canada-based startup, first as a UI/UX designer and then as a full-stack developer, and I still work there. Through this I have gained knowledge and experience in JavaScript, TypeScript, React.js, MongoDB, GraphQL, Node.js, and Express.js.
JavaScript experience
I work a lot with JavaScript and TypeScript as a working professional at a startup. One of my projects is:
Shiksha Sankalp: A Collaborative Platform Inspired by Discord and Slack. Shiksha Sankalp is a communication and collaboration platform designed for educational purposes. It is a next-generation platform designed to empower users with seamless communication, collaboration, and creativity. Whether you're managing a community or building your own online presence, Shiksha Sankalp helps you stay connected with your team, friends, and other users. GitHub
Node.js experience
All of my web development projects use Node.js, so I am very familiar with npm, npx, and nvm. I have built a real-time messaging application using Express.js with databases such as MongoDB and MySQL.
C/Fortran experience
I have experience in C and C++. One of my projects is:
ExpenseEase: The expense tracker project is designed to simplify the process of managing group expenses and splitting costs among members. The goal is to provide an efficient way to keep track of individual expenditures, calculate who owes what, and present detailed analyses based on categories of spending. This project is especially useful in group activities such as trips, shared households, or social gatherings where multiple people contribute to shared expenses.
Interest in stdlib
The first and simplest assignments when learning any new programming language are usually matrix multiplication and complex number addition and subtraction. During that period, our professor asked us to research how existing libraries implement these operations and then consider whether they could be optimized further. While researching, I came across stdlib. This was during my first year of engineering.
I discovered its cross-platform capabilities and its ability to run seamlessly in both Node.js and the browser. This increased my interest in stdlib and in numerical libraries generally.
Version control
Yes
Contributions to stdlib
Merged
chore: fix EditorConfig lint errors (issue #6270) #6278
chore: fix C lint errors (issue #6272) #6279
Open
feat(stats): add nanmeanstdev package #6148
feat(stats): add nanmmidrange package #6263
stdlib showcase
In progress
Goals
-
Implementing StringArray
The plan is to add a dynamic string typed array backed by a Uint8Array. The strategy is as follows:
- We need to maintain 3 things:
- A buffer to store the string data.
- An index array to track each string’s start position. For this we will use a Uint32Array.
- A length array holding the length of each string. Again, this requires a Uint32Array.
- Whenever a string needs to be updated:
- Check whether the new string is shorter than or equal in length to the previous string. If so, we overwrite the previous string in place in the buffer.
- If it is longer, we add the string at the end of the buffer and update the buffer, index, and length arrays accordingly.
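The layout and update rule above can be sketched roughly as follows. This is only a minimal illustration, not the planned stdlib implementation: the class name `StringArraySketch`, its private fields, the UTF-8 encoding via `TextEncoder`/`TextDecoder`, and the capacity-doubling policy are all assumptions made for the example.

```javascript
// Hypothetical sketch; none of these names exist in stdlib.
const encoder = new TextEncoder();
const decoder = new TextDecoder();

class StringArraySketch {
    constructor( length, capacity = 64 ) {
        this._buffer = new Uint8Array( capacity ); // UTF-8 bytes of all strings
        this._offsets = new Uint32Array( length ); // start position of each string
        this._lengths = new Uint32Array( length ); // byte length of each string
        this._end = 0; // first unused byte at the end of the buffer
    }

    get( index ) {
        const start = this._offsets[ index ];
        const end = start + this._lengths[ index ];
        return decoder.decode( this._buffer.subarray( start, end ) );
    }

    set( index, value ) {
        const bytes = encoder.encode( value );
        if ( bytes.length <= this._lengths[ index ] ) {
            // Fits in the old slot: overwrite in place.
            this._buffer.set( bytes, this._offsets[ index ] );
        } else {
            // Too long: append at the end, growing the buffer if needed.
            this._ensureCapacity( this._end + bytes.length );
            this._buffer.set( bytes, this._end );
            this._offsets[ index ] = this._end;
            this._end += bytes.length;
        }
        this._lengths[ index ] = bytes.length;
    }

    _ensureCapacity( needed ) {
        if ( needed <= this._buffer.length ) {
            return;
        }
        let capacity = Math.max( this._buffer.length, 1 );
        while ( capacity < needed ) {
            capacity *= 2; // assumed growth policy
        }
        const grown = new Uint8Array( capacity );
        grown.set( this._buffer.subarray( 0, this._end ) );
        this._buffer = grown;
    }
}

const arr = new StringArraySketch( 2, 4 );
arr.set( 0, 'hello' );
arr.set( 1, 'hi' );
arr.set( 0, 'hey' );    // shorter: overwritten in place
arr.set( 1, 'world!' ); // longer: appended at the end
console.log( arr.get( 0 ), arr.get( 1 ) );
```

Note that lengths here are counted in bytes rather than characters, which is one possible reading of "length of each string" and matters for non-ASCII input.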
However, this leaves wasted bytes in both cases. When the new string is shorter, the trailing bytes of the old slot remain unused; when it is longer, the entire old slot is left behind and serves no purpose. In small arrays this is not a major concern, but in large arrays, frequent updates may cause serious memory wastage, which we can’t ignore. I therefore propose keeping track of these wasted bytes in a separate index array and length array. Whenever a new string is inserted, we first check whether a contiguous free region of the required size is available and, if so, reuse it. This will be somewhat challenging, but it lets us save as much memory as possible.
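One way to picture the bookkeeping for wasted bytes is a small free list with a first-fit search. The sketch below is an assumption-laden illustration: the name `FreeList`, its `release`/`acquire` methods, and the first-fit policy are hypothetical choices, and plain arrays stand in for the index and length arrays.

```javascript
// Hypothetical helper; not part of stdlib.
class FreeList {
    constructor() {
        this._offsets = []; // start of each free region, in bytes
        this._lengths = []; // size of each free region, in bytes
    }

    // Record a region of the buffer which no longer holds live string data.
    release( offset, length ) {
        if ( length > 0 ) {
            this._offsets.push( offset );
            this._lengths.push( length );
        }
    }

    // Return the offset of the first region able to hold `size` bytes,
    // or -1 if no contiguous region is large enough (first-fit).
    acquire( size ) {
        for ( let i = 0; i < this._offsets.length; i++ ) {
            if ( this._lengths[ i ] >= size ) {
                const offset = this._offsets[ i ];
                // Shrink the region by the amount handed out.
                this._offsets[ i ] += size;
                this._lengths[ i ] -= size;
                if ( this._lengths[ i ] === 0 ) {
                    this._offsets.splice( i, 1 );
                    this._lengths.splice( i, 1 );
                }
                return offset;
            }
        }
        return -1;
    }
}

const free = new FreeList();
free.release( 10, 5 );            // bytes 10..14 became unused
console.log( free.acquire( 8 ) ); // no region of 8 bytes: -1
console.log( free.acquire( 3 ) ); // reuses bytes 10..12: 10
```

This sketch does not merge adjacent free regions, so fragmentation can still build up; coalescing neighbors on `release` would be a natural refinement.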
For string input and output we will use a getter and setter, as in the boolean and complex arrays. Beyond that, all the typed array constructors and methods can be included, similar to the existing typed arrays. These include:
- Constructors:
- StringArray()
- StringArray(length)
- StringArray(typedArray)
- StringArray(object)
- Constructor Signatures:
- StringArray(buffer)
- StringArray(buffer, byteOffset)
- StringArray(buffer, byteOffset, length)
- Methods:
- get(index)
- set(index, value)
- buffer
- byteOffset
- byteLength
- from()
- of()
- at()
- copyWithin()
- entries()
- every()
- fill()
- filter()
- find()
- findIndex()
- findLast()
- findLastIndex()
- forEach()
- includes()
- indexOf()
- join()
- keys()
- lastIndexOf()
- length
- map()
- reduce()
- reduceRight()
- reverse()
- set()
- slice()
- some()
- sort()
- subarray()
- toLocaleString()
- toReversed()
- toSorted(): based on character order, a-z and z-a. Also, if numbers are present in the StringArray, they will be sorted lexicographically (dictionary order) rather than numerically.
- toString()
- values()
- with()
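To make the lexicographic ordering mentioned for toSorted() concrete, the following uses plain JavaScript arrays as a stand-in, since the StringArray itself does not exist yet. It copies with `slice()` before sorting to mimic the non-mutating behavior of toSorted(); the descending comparator is just one possible way to express z-a order.

```javascript
const values = [ '10', '9', '2', 'apple', 'Banana' ];

// The default sort compares strings by UTF-16 code units, so numeric strings
// are ordered like dictionary entries rather than by numeric value.
const ascending = values.slice().sort();
console.log( ascending );
// [ '10', '2', '9', 'Banana', 'apple' ]

// Reverse (z-a) ordering with an explicit comparator.
const descending = values.slice().sort( ( a, b ) => {
    if ( a < b ) {
        return 1;
    }
    if ( a > b ) {
        return -1;
    }
    return 0;
} );
console.log( descending );
// [ 'apple', 'Banana', '9', '2', '10' ]
```

Note that uppercase letters sort before lowercase ones under this comparison, which is something the final design may want to document or make configurable.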
-
Adding support for StringArray dtypes throughout @stdlib/*
To fulfill this, we need to update the following packages:
Need to add new packages
- @stdlib/strided/base/reinterpret-string
- @stdlib/stringarray: similar to @stdlib/complex and @stdlib/boolean (we already have a string package for handling all string manipulation functions, so I am not sure what to name this new package)
-
Needs updates along with functionality
- @stdlib/array/dtype
- @stdlib/array/typed-ctors
- @stdlib/types
- @stdlib/array/defaults
- @stdlib/strided/dtypes
-
Needs updates except functionality, like Tests, docs, etc.
Many more will be added to this list.
-
Why this project?
The StringArray project excites me because it fills a crucial gap in JavaScript's typed array ecosystem by enabling efficient handling of variable-length strings in a structured memory format. Unlike numbers (fixed size), strings are variable-length, making their storage in typed arrays challenging.
Questions like how to handle resizing, memory fragmentation, and efficient lookup make this problem interesting for me and will help me learn many new things.
Qualifications
I believe I am well-suited for this StringArray implementation in stdlib due to my prior experience in web development and open-source contributions.
Prior art
Some popular libraries that implement string arrays are:
NumPy: String arrays are fixed-width, so modifying a value does not cause reallocation, but a string longer than the fixed width is truncated. Memory is wasted if very short strings are stored.
Pandas: Strings are stored as variable-length, so modifying a string does not cause truncation. When the new string is longer than the previous one, it is stored at a new memory location and the reference is updated, leaving the previous string unmodified. (I proposed this solution for our project.)
TensorFlow: Strings are stored as variable-length, but a modification creates an entirely new array containing the updated string, with the rest of the previous array copied into it.
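For contrast with the variable-length approach, a fixed-width layout of the kind described for NumPy above can be imitated in a few lines of JavaScript. This is only an illustration of the trade-off, not a port of any library's code: the helper names `createFixedWidth`, `setFixed`, and `getFixed` are hypothetical, zero bytes are assumed as padding, and the example assumes ASCII input because byte-level truncation could otherwise split a multi-byte character.

```javascript
const encoder = new TextEncoder();
const decoder = new TextDecoder();

// Every element owns exactly `width` bytes, whether it needs them or not.
function createFixedWidth( length, width ) {
    return {
        buffer: new Uint8Array( length * width ),
        width: width
    };
}

function setFixed( arr, index, value ) {
    // Anything beyond the fixed width is silently dropped.
    const bytes = encoder.encode( value ).subarray( 0, arr.width );
    const start = index * arr.width;
    arr.buffer.fill( 0, start, start + arr.width ); // clear the old value
    arr.buffer.set( bytes, start );
}

function getFixed( arr, index ) {
    const start = index * arr.width;
    let end = start;
    // Stop at the first padding byte or at the end of the slot.
    while ( end < start + arr.width && arr.buffer[ end ] !== 0 ) {
        end += 1;
    }
    return decoder.decode( arr.buffer.subarray( start, end ) );
}

const fixed = createFixedWidth( 2, 4 );
setFixed( fixed, 0, 'hi' );        // uses 2 of its 4 bytes
setFixed( fixed, 1, 'truncated' ); // only the first 4 bytes are kept
console.log( getFixed( fixed, 0 ), getFixed( fixed, 1 ) );
```

Updates never move data or reallocate in this layout, which is the benefit; the cost is padding for short strings and truncation for long ones.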
Commitment
I am committed to delivering a valuable implementation, whatever time it takes during the GSoC period. That said, I can work on the project for 30 hours per week. After GSoC, if more needs to be implemented in the project, I will remain available.
Schedule
Assuming a 12 week schedule,
-
Community Bonding Period:
- Study the codebase to see how similar typed arrays have been implemented, e.g., Boolean and Complex.
- Attend meetings and refine the project plan.
-
Week 1: Implementing basic functionality of string array
- Implementing:
- constructor
- getter and setter
- update string method
Test these implementations until they satisfy our goals, since the update string method needs a careful implementation.
-
Week 2: Completing implementation
- As the largest task is done in Week 1, implementing the remaining functionality should be easier.
- Testing them
-
Week 3: Testing, benchmarking and documentation
- Comprehensive testing, benchmarking and all the required documentation of the @stdlib/array/string
- Implementing @stdlib/stringarray
-
Week 4 - 5: Implementing packages listed under "Needs updates along with functionality"
- Completing everything along with the tests, benchmarks, examples, and documentation related to this functionality
-
Week 6: (midterm)
- Midterm evaluation requirements if any
- Scanning the entire codebase to find code that needs to be updated. Most of it will be listed during the proposal phase, but we still need to ensure that none of it remains.
-
Week 7-8: Updating @stdlib/ndarray to ensure support for stringarray
- As the ndarray implementation requires a large amount of C, these weeks will be dedicated to it.
- All other related ndarray work would be done as listed in "Need to add new packages", "Needs updates along with functionality" and "Needs updates except functionality, like Tests, docs, etc."
-
Week 9-10: Support for stringarray in entire codebase
-
Week 11-12: Refinement
- Optimization of the implemented code
- Refining tests and documentation according to the feedback received at each stage.
-
Final Week:
-
Submit the project and document all the goals achieved.
Notes:
- The community bonding period is a 3 week period built into GSoC to help you get to know the project community and participate in project discussion. This is an opportunity for you to set up your local development environment, learn how the project's source control works, refine your project plan, read any necessary documentation, and otherwise prepare to execute on your project proposal.
- Usually, even week 1 deliverables include some code.
- By week 6, you need enough done at this point for your mentor to evaluate your progress and pass you. Usually, you want to be a bit more than halfway done.
- By week 11, you may want to "code freeze" and focus on completing any tests and/or documentation.
- During the final week, you'll be submitting your project.
Related issues
No response
Checklist
- I have read and understood the Code of Conduct.
- I have read and understood the application materials found in this repository.
- I understand that plagiarism will not be tolerated, and I have authored this application in my own words.
- I have read and understood the patch requirement which is necessary for my application to be considered for acceptance.
- I have read and understood the stdlib showcase requirement which is necessary for my application to be considered for acceptance.
- The issue name begins with [RFC]: and succinctly describes your proposal.
- I understand that, in order to apply to be a GSoC contributor, I must submit my final application to https://summerofcode.withgoogle.com/ before the submission deadline.