Skip to content

Keep original line endings when reading expectation files#25

Open
xymaxim wants to merge 2 commits intozaufi:masterfrom
xymaxim:keep-original-eols
Open

Keep original line endings when reading expectation files#25
xymaxim wants to merge 2 commits intozaufi:masterfrom
xymaxim:keep-original-eols

Conversation

@xymaxim
Copy link
Collaborator

@xymaxim xymaxim commented Mar 5, 2024

Changes in this PR

Since Python by default opens files in the universal newlines mode (newline=None), line ending characters are translated. Which is not good for us. So, let's preserve original line endings while writing and reading expectation files.

@xymaxim xymaxim requested a review from zaufi as a code owner March 5, 2024 16:37
@xymaxim xymaxim force-pushed the keep-original-eols branch from 0d31584 to 1d636b5 Compare March 5, 2024 18:27
@zaufi
Copy link
Owner

zaufi commented Mar 5, 2024

Could you please elaborate and gimme some more details and/or an example? I just can't understand what problem you're trying to solve %)

@xymaxim
Copy link
Collaborator Author

xymaxim commented Mar 5, 2024

Could you please elaborate and gimme some more details and/or an example? I just can't understand what problem you're trying to solve %)

While I was working on #24, I noticed that regular tests (not from the previous issue) with \r\n would fail:

	def test_sample_out(capfd, expected_out):
    	print('Hello Africa!\r\n', end='')
    	stdout, stderr = capfd.readouterr()
>   	assert expected_out == stdout
E   	AssertionError: assert
E     	The test output doesn't equal to the expected
E     	(from `/tmp/pytest-of-ms/pytest-109/crlf_test0/crlf_test/test_sample_out.out`):
E     	---[BEGIN actual output]---
E     	Hello Africa!↵
E     	---[END actual output]---
E     	---[BEGIN expected output]---
E     	Hello Africa!↵
E     	---[END expected output]---

Here’s a quick demo of what’s happening:

# test: 00000000: 410d 0a42                            	A..B
#                   \r \n
with open("test", "w") as f:
    f.write("A\r\nB")

with open("test", "r") as f:
    print(repr(f.read()))  # “A\nB”

While this works as desired:

with open("test", "r", newline=””) as f:
    print(repr(f.read()))  # “A\r\nB”

@xymaxim xymaxim changed the title Keep original line endings while writing and reading expectation Keep original line endings when reading expectation files Mar 5, 2024

# Store!
self._pattern_filename.write_text(text)
with self._pattern_filename.open('w', newline='') as f:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, this one is unnecessary and needs to be removed: the conversion of newlines doesn't happen during writing (see https://peps.python.org/pep-0278/#specification).

@xymaxim xymaxim force-pushed the keep-original-eols branch from 46ffe41 to 2991c3e Compare March 5, 2024 20:30
@zaufi
Copy link
Owner

zaufi commented Mar 5, 2024

What makes me worried is that this will ruin (almost) everything! Indeed, for example, I have a trivial output matching test print('Hello Africa!') with the obviously trivial expectation. Whatever OS I use, thanks to Git and the auto CRLF option everything works fine on all platforms! -- cuz during checkout git will replace EOL(s) in the expectations file to the native format and Python's TextIO will handle \n properly...

If I needed smth system-specific, I add a f'-{platform.system()}' suffix to the expectation filename and told Git via .gitattributes to set whatever CRLF style I needed over system-dependent expectations.

Obviously, this patch will make this way broken...

I've taken a closed look at the test added by the #24 ... it looks incorrect to me %) makepatternfile() of the fixture behaves similarly to the pytester.makefile and does not write EOL at the last (and in this case the only) text line! So, strictly speaking, that test never checks for EOL styles...

@xymaxim
Copy link
Collaborator Author

xymaxim commented Mar 6, 2024

What makes me worried is that this will ruin (almost) everything! Indeed, for example, I have a trivial output matching test print('Hello Africa!') with the obviously trivial expectation. Whatever OS I use, thanks to Git and the auto CRLF option everything works fine on all platforms! -- cuz during checkout git will replace EOL(s) in the expectations file to the native format and Python's TextIO will handle \n properly...

Hmm, indeed, that’s something I hadn’t considered. My initial thoughts were that expectation texts should be treated as immutable, without any changes, and files are just an intermediate state of them, but, yes, Git couldn't be excluded from the story.

@xymaxim
Copy link
Collaborator Author

xymaxim commented Mar 6, 2024

If I needed smth system-specific, I add a f'-{platform.system()}' suffix to the expectation filename and told Git via .gitattributes to set whatever CRLF style I needed over system-dependent expectations.

I recently came across a binary file with an ASCII text header that needed to be parsed, and the header contains Windows-style line endings. The imaginary test case (just for an example) for some header extract function would be to output the header as is. This is not a platform-specific case.

As I understand, on Unix, it’s not possible to have a pattern with CRLF symbols right now (even if we add tests/data/expected/crlf_test.out -text to .gitattributes) because of how Python converts newlines during reading (CRLF -> LF).

So, the only way is to read it in a binary mode?

@xymaxim
Copy link
Collaborator Author

xymaxim commented Mar 6, 2024

I've taken a closed look at the test added by the #24 ... it looks incorrect to me %) makepatternfile() of the fixture behaves similarly to the pytester.makefile and does not write EOL at the last (and in this case the only) text line! So, strictly speaking, that test never checks for EOL styles...

Let's discuss it in #24, I'll answer there.

@zaufi
Copy link
Owner

zaufi commented Mar 9, 2024

As I understand, on Unix, it’s not possible to have a pattern with CRLF symbols right now (even if we add tests/data/expected/crlf_test.out -text to .gitattributes) because of how Python converts newlines during reading (CRLF -> LF).

I didn't get it... why not? The other question is if you want to make this test run on all platforms (w/ different native EOLs) and a test's input data file is "static" (in terms of EOL style in its beginning (I'm thinking about some "self-extract" archive w/ a shell script at the beginning %) obviously u need to preserve EOLs in the pattern file added to VCS (git I guess. so .gitattributes could help) and during the test tell to Python to use specific EOL style on read from file...

What could go wrong here? %)

@zaufi zaufi force-pushed the master branch 2 times, most recently from 0196775 to 68dc622 Compare March 11, 2024 03:36
@xymaxim
Copy link
Collaborator Author

xymaxim commented Mar 12, 2024

The problem is that the test from the PR doesn't pass without the proposed changes because of the missing CR symbol. This may look unexpected and confusing for users, since an expectation file actually contains the symbol, but not the fixture.

@zaufi zaufi force-pushed the master branch 6 times, most recently from ec9dfe3 to 8abc274 Compare July 17, 2025 22:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants