All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog and this project adheres to Semantic Versioning.
- support for `#<=>` and `#join`, which were added to `set` in the meantime
- support for getting the (overall) character set of a Regexp with multiple expressions
- support for global and local case-insensitivity in Regexp inputs
- `Regexp#{covered_by_character_set?,uses_character_set?}` methods (if core ext is used)
- new codepoints for `::assigned` and `::emoji` predefined sets, as in Ruby 3.2.0
- fixed processing of Strings that are not ASCII- or UTF-8-encoded
- removed dependency on `set` and `sorted_set`
  - thanks to https://github.com/mikebaldry for reporting a related issue (#2)
- `::of` now supports both `String` and `Regexp` arguments
- fixed segfault during `String` manipulation on Ruby 3.2.0-dev
- improved performance for `String` manipulation
- allow usage in Ractors
  - predefined sets must be pre-initialized for this, though
  - e.g. `CharacterSet.ascii`, `keep_character_set(:ascii)` etc.
  - call them once in the main Ractor to trigger initialization
- new codepoints for `::assigned` and `::emoji` predefined sets, as in Ruby 3.1.0
- latest Unicode case-folding data (for `#case_insensitive`)
- support for passing any Enumerable to `#disjoint?`, `#intersect?`
  - this matches recent broadening of these methods in `ruby/set`
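
To illustrate what accepting any Enumerable means for `#disjoint?` and `#intersect?`, here is a minimal sketch using only the standard library. The helper name `intersects_with?` is hypothetical and is not the gem's or `ruby/set`'s implementation; it only shows the duck-typed idea.

```ruby
require 'set'

# Hypothetical helper for illustration, not part of character_set or ruby/set.
# It accepts any Enumerable (Array, Range, Set, ...) rather than only a Set.
def intersects_with?(set, other)
  raise ArgumentError, 'value must be enumerable' unless other.is_a?(Enumerable)

  # True as soon as one element of `other` is found in `set`.
  other.any? { |element| set.include?(element) }
end

codepoints = Set[97, 98, 99] # codepoints of a, b, c

array_overlaps = intersects_with?(codepoints, [99, 100]) # Array argument
range_overlaps = intersects_with?(codepoints, 100..120)  # Range argument
```

A disjointness check under the same assumptions is simply the negation of this result.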
- new instance method `#secure_token` (see README)
- class method `::of` now accepts more than one `String`
- `CharacterSet::ExpressionConverter` can now build output of any Set-like class
- `CharacterSet::Pure::of_expression` now returns a `CharacterSet::Pure`
  - it used to return a regular `CharacterSet`
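
The README is the authority on `#secure_token`. As a rough idea of what drawing a token from a set of characters can look like, the sketch below uses `SecureRandom` from the standard library. The helper name `random_token_from`, its default length, and its behavior are assumptions made for illustration, not the gem's code.

```ruby
require 'securerandom'

# Hypothetical stand-in for illustration; see the README for the real method.
# Picks `length` characters from the pool using a secure random source.
def random_token_from(characters, length = 32)
  pool = characters.to_a
  raise ArgumentError, 'character pool is empty' if pool.empty?

  Array.new(length) { pool[SecureRandom.random_number(pool.size)] }.join
end

alphanumerics = ('a'..'z').to_a + ('0'..'9').to_a
token = random_token_from(alphanumerics, 16)
```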
- multiple fixes for Ruby 3
  - fixed segfault for some `String` manipulation cases
  - added `sorted_set` as dependency, so `CharacterSet::Pure` (non-C fallback) works
- fixed error when parsing a `Regexp` with an empty intersection (e.g. `/[a&&]/`)
- `#to_s_with_surrogate_ranges` / `Writer::write_surrogate_ranges`
  - allows for much shorter astral plane representations e.g. in JavaScript
  - thanks to https://github.com/singpolyma for the suggestion and groundwork (#1)
- improved performance for `#to_s` / `Writer` by avoiding bugged `Range#minmax`
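
As background for the surrogate ranges entry: UTF-16 represents each astral codepoint (above U+FFFF) as a pair of surrogate code units. The sketch below applies the standard UTF-16 arithmetic to a single codepoint. The helper name `utf16_surrogate_pair` is hypothetical, and this is not the gem's range-writing code.

```ruby
# Illustrative helper (hypothetical name) following the UTF-16 encoding rules.
def utf16_surrogate_pair(codepoint)
  unless (0x10000..0x10FFFF).cover?(codepoint)
    raise ArgumentError, 'not an astral codepoint'
  end

  offset = codepoint - 0x10000
  high = 0xD800 + (offset >> 10)   # top 10 bits select the high surrogate
  low  = 0xDC00 + (offset & 0x3FF) # bottom 10 bits select the low surrogate
  [high, low]
end

high, low = utf16_surrogate_pair(0x1F600)
puts format('\u%04X\u%04X', high, low) # prints \uD83D\uDE00
```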
- '/' is now escaped by default when stringifying so as to work with //-regexp syntax
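
A small sketch of why escaping `/` matters: the stringified set can then be pasted between slashes without ending the literal early. The helper `bracket_expression` is hypothetical and much simpler than the gem's writer; `eval` is used only to demonstrate that the text is valid inside a `//` literal.

```ruby
# Hypothetical helper for illustration; not the gem's Writer.
# Builds a bracket expression, escaping '/' in addition to what
# Regexp.escape already handles (Regexp.escape leaves '/' untouched).
def bracket_expression(characters)
  escaped = characters.map { |char| char == '/' ? '\/' : Regexp.escape(char) }
  "[#{escaped.join}]"
end

source = bracket_expression(['a', '/', '-'])
regexp = eval("/#{source}/") # the text works between slashes
```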
- improved `String` manipulation speed
- improved initialization and `#merge` speed when passing a large `Range`
- reduced memory consumption by > 90% for most use cases via dynamic resizing
  - before, every set instance required 136 KB for codepoints
  - now, 16 bytes for a CharacterSet in ASCII space, 8 KB for one in BMP space etc.
- `#count_in` and `#scan` methods for `String` interaction
- new predefined sets `::any`/`::all`, `::assigned`, `::surrogate`
- conversion methods `#assigned_part`, `#valid_part`
- sectioning methods `#ascii_part`, `#plane(n)`
- section test methods `#ascii_part?`, `#ascii_ratio`, `#ascii_only?`, `#astral_only?`
- `#count` now supports passing an argument or block as usual
- `CharacterSet::Pure#keep_in`, `#delete_in` now preserve the original encoding
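
The memory figures above are consistent with storing one bit per codepoint, sized to the highest codepoint a set needs. That storage layout is an assumption inferred from the numbers, not something this file states; the arithmetic can be checked with a short sketch (the helper name `bitmap_bytes` is hypothetical):

```ruby
# Assumption: one bit per codepoint, rounded up to whole bytes.
def bitmap_bytes(codepoint_count)
  (codepoint_count + 7) / 8
end

full_unicode = bitmap_bytes(0x110000) # 139_264 bytes, i.e. 136 KB
ascii_only   = bitmap_bytes(0x80)     # 16 bytes
bmp_only     = bitmap_bytes(0x10000)  # 8_192 bytes, i.e. 8 KB
```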
- added latest Unicode casefold data (for `#case_insensitive`)
- restored `range_compressor` as a runtime dependency for JRuby only
- improved messages for missing optional dependencies
- made `range_compressor` an optional dependency as it is almost never needed
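
For readers unfamiliar with the idea behind `range_compressor`: judging by its name, it collapses runs of consecutive values into ranges. The sketch below shows that idea with the standard library only. The helper `compress_to_ranges` is hypothetical and does not reproduce that gem's API.

```ruby
# Hypothetical helper; the range_compressor gem has its own API.
# Expects integers sorted in ascending order without duplicates.
def compress_to_ranges(sorted_integers)
  sorted_integers
    .slice_when { |previous, current| current != previous + 1 }
    .map { |run| run.first..run.last }
end

ranges = compress_to_ranges([97, 98, 99, 101, 102, 120])
# => [97..99, 101..102, 120..120]
```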
- added option to reference a predefined set via Symbol in `String` extension methods
- added predefined sets `::ascii_alnum` and `::ascii_letters`
Initial release.