Skip to content

Conversation

@Earlopain
Copy link
Collaborator

Skipping detecting the encoding is almost always right, just for binary it should actually happen.

A symbol containing escapes that are invalid in utf-8 would fail to parse since symbols must be valid in the script encoding.
Additionally, the parser gem would raise an exception somewhere during string handling.

The parser gem reject code where a string is invalid in the script encoding. RuboCop however monkey-patches that behaviour out, I have mirrored that logic for the tests here.

@kddnewton
Copy link
Collaborator

It seems like this is causing a bunch of new files to fail. I'm not sure if I understand fully what's going on there. Why are we adding a bunch of skips?

@Earlopain
Copy link
Collaborator Author

Earlopain commented Jan 10, 2025

Ah, yes. That is from the first commit, they were previously rejected by the parser gem as invalid syntax.

I made that change since I noticed that the encoding_shift_jis file I added supposedly passed tests even though I was certain that at least the tokens must not line up currently.

RuboCop accepts such code via monkeypatch here: https://github.com/rubocop/rubocop-ast/blob/02b8d0faeeebc3ba6b5284c9e66569bbae836941/lib/rubocop/ast/builder.rb#L112-L114
And this is the parser issue with a bit more context whitequark/parser#283

Edit: Let me just fix that minimal build failure... Doesn't really matter which encoding it is

@Earlopain Earlopain force-pushed the parser-translator-binary-encoding branch from 4892b05 to 69cb8b9 Compare January 10, 2025 20:35
@kddnewton
Copy link
Collaborator

@Earlopain if you don't mind just rebasing here I can merge this

Skipping detecting the encoding is almost always right, just for binary it should actually happen.

A symbol containing escapes that are invalid
in utf-8 would fail to parse since symbols must be valid in the script encoding.
Additionally, the parser gem would raise an exception somewhere during string handling
@Earlopain Earlopain force-pushed the parser-translator-binary-encoding branch from 69cb8b9 to fa0154d Compare January 11, 2025 21:25
@Earlopain
Copy link
Collaborator Author

Merged the two commits since all newly running tests from that monkeypatch pass now.

@kddnewton kddnewton merged commit 01cbec9 into ruby:main Jan 12, 2025
56 of 57 checks passed
@Earlopain Earlopain deleted the parser-translator-binary-encoding branch January 16, 2025 10:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants