Skip to content

Conversation

adityacodes30
Copy link
Contributor

@adityacodes30 adityacodes30 commented Mar 9, 2024

Resolves #1729

Description

What is the purpose of this pull request?

This PR adds the package @stdlib/string/base/last-grapheme-cluster

This pull request:

  • Adds the package @stdlib/string/base/last-grapheme-cluster which returns the last n grapheme clusters (i.e., user-perceived characters) of a string.

Related Issues

Does this pull request have any related issues?

#854

Questions

Any questions for reviewers of this pull request?

No.

Other

Any other information relevant to this pull request? This may include screenshots, references, and/or implementation notes.

No.

Checklist

Please ensure the following tasks are completed before submitting this pull request.


@stdlib-js/reviewers

@adityacodes30
Copy link
Contributor Author

adityacodes30 commented Mar 9, 2024

I have implemented the last-grapheme-cluster package using prev-grapheme-cluster-break. I noticed that the implementation of prev-grapheme-cluster-break was rather contrary to pattern of other similar packages. On further research i found that according to #1062 prev-grapheme-cluster-break has to be rewritten.

I think before moving ahead with this pr we should address rewriting of prev-grapheme-cluster-break. Should i open an issue and start working on that?

@kgryte @steff456

edit prev-grapheme-cluster -> prev-grapheme-cluster-break

@kgryte
Copy link
Member

kgryte commented Mar 9, 2024

It doesn't need to be rewritten tmk, just fixed.

@kgryte
Copy link
Member

kgryte commented Mar 9, 2024

Relevant issues:

Now that I am jogging my memory, yes, prev-grapheme-cluster-break does need refactoring. But we should first get #1431 over the finish line before doing so.

@adityacodes30
Copy link
Contributor Author

adityacodes30 commented Mar 9, 2024

Yes, to be fixed

Also just to be clear the packages mentioned in #1062 next-grapheme-cluster and prev-grapheme-cluster refer to next-grapheme-cluster-break and prev-grapheme-cluster-break right.

What should be the further course of action, fixing prev-grapheme-cluster-break or should we move ahead as is?

@kgryte
Copy link
Member

kgryte commented Mar 9, 2024

@adityacodes30 I'd say go ahead and implement in terms of next-grapheme-cluster-break. It won't be as efficient, as it requires iterating over the entire string, but it's more important at this point to get something working.

@adityacodes30
Copy link
Contributor Author

Alright , ill edit this package to use next-grapheme-cluster-break instead of prev-grapheme-cluster-break

@adityacodes30
Copy link
Contributor Author

i have implemented this package with the logic

num = num of total grapheme strings 
n = num of strings to return 
no = num - n 

while ( no > 0 ) {
		i = nextGraphemeClusterBreak( str, i );
		no -= 1;
	}
	
return str.substring(  i  , len  )

i think this approach should suffice until prev-grapheme-cluster-break receives the fixes, will push code in a bit

@kgryte
Copy link
Member

kgryte commented Mar 10, 2024

Hmm...I wonder if this can be implemented in a single pass, even with next-grapheme-cluster-break.

@kgryte
Copy link
Member

kgryte commented Mar 10, 2024

In fact, I am fairly sure you don't need to first compute the number of grapheme clusters.

@adityacodes30
Copy link
Contributor Author

I thought about it as well, however nextgraphemeclusterbreak will always move us forward in the string, while we need to go backwards. Do you have a particular logic in mind ?

@kgryte
Copy link
Member

kgryte commented Mar 11, 2024

Yes, you can use a circular buffer (see https://github.com/stdlib-js/stdlib/tree/develop/lib/node_modules/%40stdlib/utils/circular-buffer), where the buffer size is n. Use nextGraphemeClusterBreak to populate with cluster breaks until the next break is -1 (i.e., no more breaks). You can then substring by using the first element (index) in the buffer.

@Planeshifter Planeshifter added Feature Issue or pull request for adding a new feature. JavaScript Issue involves or relates to JavaScript. Needs Changes Pull request which needs changes before being merged. labels Mar 12, 2024
@adityacodes30
Copy link
Contributor Author

adityacodes30 commented Mar 12, 2024

got it ! I thought of another approach as well , we could use the split-grapheme-cluster package and then join the last n grapheme clusters.

Edit - Thought a bit more on it, it would unnecessarily increase the space complexity. Circular buffer approach seems the best.

@adityacodes30
Copy link
Contributor Author

implementation updated !

@adityacodes30
Copy link
Contributor Author

adityacodes30 commented Mar 13, 2024

this pr is ready for review, after merging this we can perhaps move towards resolving #854
@kgryte

@adityacodes30
Copy link
Contributor Author

adityacodes30 commented Mar 17, 2024

@Planeshifter @kgryte
can we get a review on this one so work on #854 can begin !

Copy link
Member

@kgryte kgryte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. @adityacodes30 I recommend spending some time going over the changes I made to simplify your implementation and avoid unnecessary string concatenation.

@kgryte kgryte removed JavaScript Issue involves or relates to JavaScript. Needs Changes Pull request which needs changes before being merged. labels Mar 21, 2024
@kgryte kgryte added the Utilities Issue or pull request concerning general utilities. label Mar 21, 2024
@kgryte kgryte merged commit bbc3cfc into stdlib-js:develop Mar 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Feature Issue or pull request for adding a new feature. Utilities Issue or pull request concerning general utilities.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[RFC]: add @stdlib/string/base/last-grapheme-cluster

3 participants