Skip to content

Add an option to build translation tables#220

Closed
ddeclerck wants to merge 8 commits intoOCamlPro:gcos4gnucobol-3.xfrom
ddeclerck:gen_tables
Closed

Add an option to build translation tables#220
ddeclerck wants to merge 8 commits intoOCamlPro:gcos4gnucobol-3.xfrom
ddeclerck:gen_tables

Conversation

@ddeclerck
Copy link
Collaborator

This PR adds a -gentable option to GnuCOBOL that allows to build EBCDIC/ASCII translation tables using iconv.
It outputs the table to stdout - it is expected that users will redirect that to an acutal file.

@ddeclerck ddeclerck force-pushed the gen_tables branch 4 times, most recently from a2b5ed2 to 5bd9321 Compare March 11, 2025 08:42
Copy link
Collaborator

@GitMensch GitMensch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall that is already a good implementation, I think the missing pieces would be done quickly; note that --gentable should likely be referenced in gnucobol.texi where the translation tables are documented

Comment on lines +46 to +49
#../configure --without-db --without-curses --without-xml2 --without-json \
# --without-iconv --disable-dependency-tracking
../configure --without-db --without-curses --without-xml2 --without-json \
--without-iconv --disable-dependency-tracking
--disable-dependency-tracking
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we shouldn't adjust the CI config - as noted above this is only for generating the intermediate tarball and that must work without iconv as well

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, but as far as I remember, something was failing (don't remember what). I'll revert this change so we can check that.

Copy link
Collaborator Author

@ddeclerck ddeclerck Mar 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See, for MSYS2, we have undefined references to libiconv in gentable.c - although all these calls are guarded by a HAVE_ICONV ifdef. And this only fails on MSYS2...

EDIT: apparently under MSYS2 configure insists on setting HAVE_ICONV 1 in config.h...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this means that it does have iconv in glibc - which symbols are undefined during link of cobc?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The undefined symbols are 'libiconv', 'libiconv_open' and 'libiconv_close'.
See this CI run.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you have any clue about this one ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that there's something broken in configure.ac, I'll try to have a look at this later.

Copy link
Collaborator

@GitMensch GitMensch Mar 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found it, the iconv stuff is included by gettext and because of the AC_REQUIRE those parts are all done even before the conditions are run. But I've found a solution by setting the iconv cache variable to "no", in case of explicit disabled iconv (which then can break gettext, but that should be explained by configure).

fixed upstream, should work fine now on MSYS2, both --with-iconv and --without-iconv

Copy link
Collaborator

@GitMensch GitMensch Mar 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ebcdic500_latin1.ttbl should still be adjusted to not include code but the reference to --gentable, also it should be able to recreate that exact table using --gentable, correct? If the output is somehow better formatted/commented, then it should be adjusted in that file as if it would be generated.

Copy link
Collaborator Author

@ddeclerck ddeclerck Mar 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also it should be able to recreate that exact table using --gentable, correct?

Yes it is. Except that in the "manual" file we named the encodings EBCDIC 500 and Latin-1, which are not recognized by iconv - we'd need to pass IBM500,ISO-8859-1 to generate this table.

If the output is somehow better formatted/commented, then it should be adjusted in that file as if it would be generated.

Appart from the issue above, and a license text added to the "manual" file, the tables are formatted identically.

@ddeclerck
Copy link
Collaborator Author

ddeclerck commented Mar 25, 2025

Note the unrelated MacOS failure - this has started to fail when they deployed a new MacOS image...
Don't know exactly what is going on ; the build step fails with No rule to make target '/config.status', needed by 'Makefile', although the file seems to have been created correctly in the previous step. The leading / is suspicious though, so maybe some kind of path issue...

EDIT: apparently Makevars is not correctly inserted in po/Makefile.in.in - maybe something related to the version of gettext...

Copy link
Collaborator

@GitMensch GitMensch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nearly done: the new option should be added to NEWS as new cobc option with a very short note what to use it for - you are free to directly do that in the upstream commit (note my minor changes to gentable.c); also build_windows and cobc and gnucobol.texi should get an update (possibly the Changelogs and the files touched need an update of the copyright year to 2025 as well)

... and a testcase be added (maybe just create the ebcdic500_ascii8bit.ttbl reversible/non-reversible, of course only if iconv is available)

Copy link
Collaborator

@GitMensch GitMensch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor change requests; note: the reversible option is definitely a good one

Copy link
Collaborator

@GitMensch GitMensch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good for upstream

@GitMensch
Copy link
Collaborator

The test 31 should really be adjusted so it takes both the "expected" amount of line breaks and the unexpected ones, if we don't see a way to fix presumably flush not happening...

Interesting msvc debug failures!

Copy link
Collaborator

@GitMensch GitMensch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think everything here is fine now... as soon as the new tests pass the MSVC debug builds.

@ddeclerck
Copy link
Collaborator Author

I think everything here is fine now... as soon as the new tests pass the MSVC debug builds.

If only...
For some reason, in debug builds, the iconv call on untranslatable characters does not set errno to EILSEQ - see the output I added to gentable: errno is 0 after the call (while res is = -1). Could that be an iconv bug ?

@ddeclerck ddeclerck force-pushed the gen_tables branch 2 times, most recently from ec18c58 to a416afb Compare March 30, 2025 11:04
@GitMensch
Copy link
Collaborator

So currently we always get ret==-1 in all cases we expect and get errno set; but with MSVC debug builds we have -1 and errno not set. That does neither match the Linux manpage nor the posix definition.

I'm tempted to say: revert the errno output, replacing it with a comment to the strange behaviour, and adjust the config script to not enable iconv in case of debug builds, then consider this finished and merge upstream.

Possibly retry later, depending on the state of vcpkg bug microsoft/vcpkg#44698.

Thoughts?

@ddeclerck
Copy link
Collaborator Author

So currently we always get ret==-1 in all cases we expect and get errno set; but with MSVC debug builds we have -1 and errno not set. That does neither match the Linux manpage nor the posix definition.

I'm tempted to say: revert the errno output, replacing it with a comment to the strange behaviour, and adjust the config script to not enable iconv in case of debug builds, then consider this finished and merge upstream.

Possibly retry later, depending on the state of vcpkg bug microsoft/vcpkg#44698.

Thoughts?

I found a comment online (see last comment here) that this could be caused by a mix of debug and release DLLs. But I don't know, really.

I guess disabling iconv for MSVC debug builds is okay for now.

@ddeclerck ddeclerck force-pushed the gen_tables branch 3 times, most recently from 517244d to 0645017 Compare March 31, 2025 14:43
@ddeclerck ddeclerck force-pushed the gen_tables branch 3 times, most recently from 8be6f0b to 6b973c9 Compare March 31, 2025 15:14
@GitMensch
Copy link
Collaborator

LGTM

@ddeclerck
Copy link
Collaborator Author

Merged @ SVN 5470.

@GitMensch
Copy link
Collaborator

Well done. So up to the dependency fixup next?

@ddeclerck
Copy link
Collaborator Author

Well done. So up to the dependency fixup next?

Yes.

@GitMensch
Copy link
Collaborator

Just to let you know (as the CI builds work again thanks to your test 31 adjustment): old MinGW fails with the iconv conversion and therefore the one testcase is broken:

error: conversion from IBM500 to ASCII is not supported by your iconv implementation

it would be possible to check the TESTMODE and if it is set don't exit with EXIT_FAILURE but 77 (skip tests)
What do you think?

@ddeclerck
Copy link
Collaborator Author

Just to let you know (as the CI builds work again thanks to your test 31 adjustment): old MinGW fails with the iconv conversion and therefore the one testcase is broken:

error: conversion from IBM500 to ASCII is not supported by your iconv implementation

it would be possible to check the TESTMODE and if it is set don't exit with EXIT_FAILURE but 77 (skip tests) What do you think?

That would be possible.

Alternatively, we could skip the test if the command iconv -f IBM500 -t ASCII fails (assuming both libiconv and iconv are installed).

P.S: which MinGW workflow are you using ? The one we have here does not even have iconv, hence the test is just skipped.

@GitMensch
Copy link
Collaborator

The binary and library are independent, especially as the iconv functions may be part of another library, including libc. Can you please implement the test within cobc to adjust the return code in this case?

This
Is is the CI definition used (the MinGW part depends on a tarball created by Ubuntu in another CI).

@ddeclerck
Copy link
Collaborator Author

Actually one part of the test is for checking the return code & error message when requesting an unsupported translation:

AT_CHECK([$COBC --gentable=EBCDIC_XXX,ASCII_XXX], [1], [],
[error: conversion from EBCDIC_XXX to ASCII_XXX is not supported by your iconv implementation
])

Just returning 77 in testmode would skip the whole test.
If we want to keep that check for unsupported conversions for the general case, then maybe we can skip the test specifically when the IBM500/ASCII conversion is not supported, for instance with something like:

AT_SKIP_IF([test "$($COBC --gentable=IBM500,ASCII 2>&1)" = "error: conversion from IBM500 to ASCII is not supported by your iconv implementation"])

@GitMensch
Copy link
Collaborator

You're right, that's a good idea in general.

I've also thought about the option that this code may just be different named, but after checking with the binary artifact - neither ibm500 nor cp500 is supported (ASCII is).

@ddeclerck
Copy link
Collaborator Author

You're right, that's a good idea in general.

I've also thought about the option that this code may just be different named, but after checking with the binary artifact - neither ibm500 nor cp500 is supported (ASCII is).

Yeah, I was wondering about that too ; it just seems these encodings are not supported by old versions of libiconv.

@GitMensch
Copy link
Collaborator

I'm checking that one-liner in, no need to rebase this.

@GitMensch GitMensch closed this Apr 7, 2025
@ddeclerck ddeclerck deleted the gen_tables branch July 29, 2025 07:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants