deps: Update fancy-regex to v0.16.2 #596

Merged
keith-hall merged 2 commits into trishume:master from CosmicHorrorDev:update-fancy-regex
Sep 21, 2025

Conversation

@CosmicHorrorDev
Contributor

Updates fancy-regex to the latest version. It's not part of the public API, so no breaking change 🥳

This change looks like it makes the following bat syntaxes compile successfully with fancy-regex as the backend (verified by generating two-face's dumps):

  • syntaxes/02_Extra/LiveScript.sublime-syntax
  • syntaxes/02_Extra/SCSS_Sass/Syntaxes/Sass.sublime-syntax
  • syntaxes/02_Extra/cmd-help/syntaxes/cmd-help.sublime-syntax

while the following still fail:

  • syntaxes/02_Extra/Assembly (ARM).sublime-syntax
  • syntaxes/02_Extra/Elixir/Regular Expressions (Elixir).sublime-syntax
  • syntaxes/02_Extra/JavaScript (Babel).sublime-syntax
  • syntaxes/02_Extra/PowerShell.sublime-syntax
  • syntaxes/02_Extra/SLS/SLS.sublime-syntax
  • syntaxes/02_Extra/VimHelp.sublime-syntax

@keith-hall
Collaborator

It seems like the syntest example is hanging 🤔 I will try to debug it later; at the very least, it would be nice to see whether it gets stuck on a specific regex.

@CosmicHorrorDev
Contributor Author

Ah yeah I'm seeing the hang too. It looks like it starts happening with fancy-regex v0.13.0 if that helps. Help pinning down the problematic regex would be greatly appreciated!

@CosmicHorrorDev
Contributor Author

CosmicHorrorDev commented Aug 13, 2025

Difference detected! It looks like the parser is getting stuck looping here forever without advancing anything

    while self.parse_next_token(
        line,
        syntax_set,
        &mut match_start,
        &mut search_cache,
        &mut regions,
        &mut non_consuming_push_at,
        &mut res,
    )? {}
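
For illustration, here is a minimal sketch of how a scanning loop like this can be protected against a matcher that keeps reporting an empty match at the cursor. It is not syntect's actual fix: `scan_with_progress_guard` and the `step` callback are hypothetical names, and the sketch assumes `step` only returns offsets at or after `from` that sit on character boundaries.

```rust
/// Drives `step` across `line`, guaranteeing that the cursor moves forward
/// even when `step` reports a zero-width match.
/// Hypothetical helper; `step(line, from)` is assumed to return byte offsets
/// `(start, end)` with `from <= start <= end`, both on char boundaries.
fn scan_with_progress_guard<F>(line: &str, mut step: F) -> Vec<(usize, usize)>
where
    F: FnMut(&str, usize) -> Option<(usize, usize)>,
{
    let mut matches = Vec::new();
    let mut pos = 0;
    while pos <= line.len() {
        let Some((start, end)) = step(line, pos) else {
            break;
        };
        matches.push((start, end));
        pos = if end > start {
            // A consuming match: resume right after it.
            end
        } else {
            // A zero-width match: skip one whole character so the same
            // empty match is not reported again on the next iteration.
            end + line[end..].chars().next().map_or(1, char::len_utf8)
        };
    }
    matches
}

fn main() {
    // A matcher that always "succeeds" with an empty match at the cursor.
    // Without the guard above, this would never terminate.
    let matches = scan_with_progress_guard("ab", |_, from| Some((from, from)));
    println!("{matches:?}");
}
```

The design choice is to advance by a full character rather than a single byte, so that slicing the line at the new cursor stays valid for non-ASCII input.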

I've got things minimized down to this difference at least

[package]
name = "fancy-bug"
version = "0.1.0"
edition = "2024"

[dependencies]
fancy-regex = "0.13.0"
fancy_regex_old = { package = "fancy-regex", version = "0.11.0" }

fn main() {
    let regex_str = r"\>";
    let haystack = "<T> void save(T obj);\n";

    let regex = fancy_regex::Regex::new(regex_str).unwrap();
    let captures = regex.captures(haystack);
    let region = Region::init_from_captures(&captures.unwrap().unwrap());

    let regex_old = fancy_regex_old::Regex::new(regex_str).unwrap();
    let captures_old = regex_old.captures(haystack);
    let region_old = Region::init_from_captures_old(&captures_old.unwrap().unwrap());
    println!("New: {region:?}\nOld: {region_old:?}");
}

#[derive(Clone, Debug, Eq, PartialEq)]
pub struct Region {
    positions: Vec<Option<(usize, usize)>>,
}

impl Region {
    fn init_from_captures(captures: &fancy_regex::Captures) -> Self {
        let mut positions = Vec::new();
        for i in 0..captures.len() {
            let pos = captures.get(i).map(|m| (m.start(), m.end()));
            positions.push(pos);
        }
        Self { positions }
    }

    fn init_from_captures_old(captures: &fancy_regex_old::Captures) -> Self {
        let mut positions = Vec::new();
        for i in 0..captures.len() {
            let pos = captures.get(i).map(|m| (m.start(), m.end()));
            positions.push(pos);
        }
        Self { positions }
    }
}

$ cargo r
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.01s
     Running `target/debug/fancy-bug`
New: Region { positions: [Some((2, 2))] }
Old: Region { positions: [Some((2, 3))] }
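
One reading of these two results, sketched below with the standard library only: the zero-width match at offset 2 is what you would get if `\>` were treated as an end-of-word assertion, while `(2, 3)` is what a literal `>` produces. That interpretation is an assumption drawn from the output above, and `first_word_end` and `first_literal_gt` are hypothetical helpers (ASCII-only word characters), not part of fancy-regex.

```rust
/// ASCII approximation of a regex "word" character.
fn is_word_byte(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b == b'_'
}

/// First offset where a word ends: the previous byte is a word character and
/// the next byte (if any) is not. The match is zero-width, so start == end.
fn first_word_end(haystack: &str) -> Option<(usize, usize)> {
    let bytes = haystack.as_bytes();
    (1..=bytes.len())
        .find(|&i| {
            is_word_byte(bytes[i - 1]) && bytes.get(i).map_or(true, |&b| !is_word_byte(b))
        })
        .map(|i| (i, i))
}

/// First literal `>` in the haystack, as a one-byte match.
fn first_literal_gt(haystack: &str) -> Option<(usize, usize)> {
    haystack.find('>').map(|i| (i, i + 1))
}

fn main() {
    let haystack = "<T> void save(T obj);\n";
    // The word `T` ends at offset 2, right before the `>` character.
    println!("end of word: {:?}", first_word_end(haystack));
    println!("literal `>`: {:?}", first_literal_gt(haystack));
}
```

Both helpers find their first hit at offset 2 for this haystack, but only the literal match consumes a character, which is consistent with the parser no longer advancing.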

@keith-hall
Collaborator

Nice find! I had also just come to the conclusion that the regex causing problems seemed to be \>, but it wasn't a case of catastrophic backtracking or anything. It's an interesting case: \> should actually be just > in the syntax definition, but it is escaped instead of quoted because > is a special character in YAML, and then that backslash is being taken literally...

@CosmicHorrorDev
Contributor Author

> isn't a special character for regexes, so I'm guessing it was erroneously escaped. I'll try fixing up the syntax to see if there are any deeper issues

@CosmicHorrorDev
Contributor Author

Jinx 😺

@CosmicHorrorDev
Contributor Author

@keith-hall since it looks like this is a bug in fancy-regex, would you be good with me moving things over to an issue in that repo? I can try to handle fixing things there if I get a bit of guidance (unless you'd rather take things on yourself of course)

@CosmicHorrorDev
Contributor Author

CosmicHorrorDev commented Aug 13, 2025

Alright, I switched Ruby's syntax to no longer try to escape > and <, and make syntest-fancy now finishes ( 🎉 ) with this diff

--- testdata/known_syntest_failures_fancy.txt   2025-05-16 15:47:51.312892736 -0600
+++ -   2025-08-12 22:06:47.954075324 -0600
@@ -1,5 +1,7 @@
 loading syntax definitions from testdata/Packages
 FAILED testdata/Packages/C#/tests/syntax_test_Strings.cs: 38
+FAILED testdata/Packages/Java/syntax_test_java.java: 75
 FAILED testdata/Packages/LaTeX/syntax_test_latex.tex: 1
+FAILED testdata/Packages/Lisp/syntax_test_lisp.lisp: 15
 FAILED testdata/Packages/Markdown/syntax_test_markdown.md: 11

So, it looks like there are potentially more failed tests for Java and Lisp now? Not sure what to make of that (update: it looks like both of their syntax files also try to escape >, so that's probably the cause)

@keith-hall
Collaborator

> @keith-hall since it looks like this is a bug in fancy-regex, would you be good with me moving things over to an issue in that repo? I can try to handle fixing things there if I get a bit of guidance (unless you'd rather take things on yourself of course)

Yes, we can consider it a bug in fancy-regex and open an issue there 👍
I do think it makes sense to prevent syntect from getting stuck, so I plan to look into that

@keith-hall
Collaborator

keith-hall commented Aug 13, 2025

As for a hint of where to look in fancy-regex: https://github.com/fancy-regex/fancy-regex/pull/121/files#diff-5583e398a8deec11b274d9965eff8b5ade5226c7a020ca214a27d7d07dcc8a29R372-R375 (it's a big diff which GitHub doesn't expand by default, so in case that link doesn't help, it's parse.rs line 372)

Meanwhile, \< and \> aren't mentioned at https://github.com/kkos/oniguruma/blob/master/doc/RE so they should probably be interpreted as literals like before... Maybe we need a flag for having the parser behave contrary to Oniguruma in some cases, but it shouldn't be the default at least

Just a flag/method for the RegexBuilder should suffice, as opposed to an inline flag controllable from the pattern itself.
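
To make the builder-level idea concrete, here is a rough standard-library sketch. Everything in it is hypothetical: `PatternBuilder` and `literal_angle_escapes` are invented names and are not fancy-regex's API. The sketch assumes the option would simply rewrite `\<` and `\>` to literal `<` and `>` before the pattern reaches the regex compiler.

```rust
/// Hypothetical builder, NOT fancy-regex's `RegexBuilder`.
struct PatternBuilder {
    pattern: String,
    literal_angle_escapes: bool,
}

impl PatternBuilder {
    fn new(pattern: &str) -> Self {
        Self {
            pattern: pattern.to_owned(),
            literal_angle_escapes: false, // off by default
        }
    }

    /// Hypothetical option: treat `\<` and `\>` as literal `<` and `>`.
    fn literal_angle_escapes(mut self, yes: bool) -> Self {
        self.literal_angle_escapes = yes;
        self
    }

    /// Returns the pattern text that would be handed to the regex compiler.
    fn build(self) -> String {
        if !self.literal_angle_escapes {
            return self.pattern;
        }
        let mut out = String::with_capacity(self.pattern.len());
        let mut chars = self.pattern.chars();
        while let Some(c) = chars.next() {
            if c != '\\' {
                out.push(c);
                continue;
            }
            // Consume the escaped character together with its backslash so
            // that an escaped backslash (`\\`) followed by `>` stays intact.
            match chars.next() {
                Some(next @ ('<' | '>')) => out.push(next),
                Some(next) => {
                    out.push('\\');
                    out.push(next);
                }
                None => out.push('\\'),
            }
        }
        out
    }
}

fn main() {
    let rewritten = PatternBuilder::new(r"\>")
        .literal_angle_escapes(true)
        .build();
    println!("{rewritten}");
}
```

Keeping the switch on the builder means the caller (here, the syntax highlighter) decides the behavior once, and patterns coming from syntax definitions cannot flip it themselves.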

CosmicHorrorDev changed the title from "deps: Update fancy-regex to v0.16.1" to "deps: Update fancy-regex to v0.16.2" on Sep 21, 2025
@CosmicHorrorDev
Contributor Author

Thanks for the fancy-regex release @keith-hall. I think things should be all good to go here now. It looks like the only CI failure is an existing issue 🎉

Collaborator

@keith-hall keith-hall left a comment

I haven't got round to making a patch for the jsonnet submodule in bat yet, so that CI would pass, sorry. Pretty sure we can merge this as is though 👍

@keith-hall keith-hall merged commit c0472fb into trishume:master Sep 21, 2025
4 of 5 checks passed
@CosmicHorrorDev CosmicHorrorDev deleted the update-fancy-regex branch September 21, 2025 02:52