Skip to content

Patch: more useful error message on parse error [rt.cpan.org #99098] #70

@toddr

Description

@toddr

Migrated from rt.cpan.org#99098 (status was 'open')

Requestors:

From [email protected] on 2014-09-24 02:33:48
:

Currently some of the messages you get parsing are not useful.

e.g.

# No such file or directory at /usr/lib64/perl5/vendor_perl/XML/Parser/Expat.pm line 470.

Which file?

With the patch below:

#       No such file or directory at /usr/lib64/perl5/vendor_perl/XML/Parser/Expat.pm line 470.
#  at line 7:
# %DOCBOOK_ENTS;
# <!ENTITY % BOOK_ENTITIES SYSTEM "Test_DB5_Book.ent">
# %BOOK_ENTITIES;
# ^
# ]>
# <info version="5.0" xml:lang="en-US" xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink">

Oh it can't find one of the files referenced in the XML header, that is more useful information.


############ start patch ############

diff -ur a/Expat.pm b/Expat.pm
--- a/Expat.pm  2014-09-24 12:28:05.046517986 +1000
+++ b/Expat.pm  2014-09-24 12:28:53.637728671 +1000
@@ -466,7 +466,12 @@

     $prev_rs = $ioclass->input_record_separator("\n$delim\n")
       if defined($delim);
-    $result = ParseStream($parser, $ioref, $delim);
+    eval {
+      $result = ParseStream($parser, $ioref, $delim);
+    };
+    if($@) {
+      $self->xpcroak("$@");
+    }

     $ioclass->input_record_separator($prev_rs)
       if defined($delim);

############ end patch ############


From [email protected] on 2014-12-11 07:00:24
:

Thanks for the patch.



From [email protected] on 2015-01-01 18:15:26
:

On Thu Dec 11 02:00:24 2014, TODDR wrote:
> Thanks for the patch.

There is a problem with this patch: if the user wraps the parse in an eval, then dies in a handler, the process dies, and the eval cannot trap the error.

You can see this in XML::Twig, the tests now fail with this version of XML::Parser.

I am not sure how to fix this, replacing the xpcroak by a die does not improve things.
__
mirod

From [email protected] on 2015-01-01 18:27:43
:

On Thu Jan 01 13:15:26 2015, MIROD wrote:
> On Thu Dec 11 02:00:24 2014, TODDR wrote:
> > Thanks for the patch.
> 
> There is a problem with this patch: if the user wraps the parse in an
> eval, then dies in a handler, the process dies, and the eval cannot
> trap the error.
> 
> You can see this in XML::Twig, the tests now fail with this version of
> XML::Parser.
> 
> I am not sure how to fix this, replacing the xpcroak by a die does not
> improve things.
> __
> mirod

Sorry, actually replacing xpcroak by die fixes the problem.

If there is a way to detect that the error comes from the parsing itself and not from a handler, and call xpcrok only in the first case, then it would  at least fix the XML::Twig problem.
__
mirod

From [email protected] on 2015-01-03 15:43:32
:

On Thu Jan 01 13:27:43 2015, MIROD wrote:
> On Thu Jan 01 13:15:26 2015, MIROD wrote:
> > On Thu Dec 11 02:00:24 2014, TODDR wrote:
> > > Thanks for the patch.
> >
> > There is a problem with this patch: if the user wraps the parse in an
> > eval, then dies in a handler, the process dies, and the eval cannot
> > trap the error.
> >
> > You can see this in XML::Twig, the tests now fail with this version
> > of
> > XML::Parser.
> >
> > I am not sure how to fix this, replacing the xpcroak by a die does
> > not
> > improve things.
> > __
> > mirod
> 
> Sorry, actually replacing xpcroak by die fixes the problem.
> 
> If there is a way to detect that the error comes from the parsing
> itself and not from a handler, and call xpcrok only in the first case,
> then it would  at least fix the XML::Twig problem.
> __
> mirod


Looking closer into the patch, here is the error message I get with XML::Parser 2.41:

404 File `/home/mrodrigu/perl/XML-Parser-2.43.01/Test_DB5_Book.ent' does not exist file:///home/mrodrigu/perl/XML-Parser-2.43.01/Test_DB5_Book.ent
Handler couldn't resolve external entity at line 3, column 0, byte 70
error in processing external entity reference at line 3, column 0, byte 70 at /home/mrodrigu/perl5/perlbrew/perls/perl-5.20.1/lib/site_perl/5.20.1/x86_64-linux/XML/Parser.pm line 187

I get it by doing 

perl -MXML::Parser -E'XML::Parser->new(ParseParamEnt =>1)->parsefile( "bad.xml")'

where bad.xml is:

<!DOCTYPE doc [
<!ENTITY % BOOK_ENTITIES SYSTEM "Test_DB5_Book.ent"> 
%BOOK_ENTITIES; 
]>
<doc></doc>

All other parsing errors produce an error message that includes the same amount information as xpcroak (it actually doesn't look like xpcroak is used at all).

In any case stringifying exceptions when they can be generated by the user (through a die in a handler) doesn't look like a great idea.

I'd recomment rolling back this patch.

-- 
mirod

__
mirod

From [email protected] on 2015-01-12 07:03:23
:

> I'd recomment rolling back this patch.


Yeah, XML::Parser is a little too widely used to change it's behavior in this way. I'm open to a patch that will make this optionally behave this way but not a global change.

Reverted. https://github.com/toddr/XML-Parser/commit/96d50dfb59914fb55dd08da1a45a544ba470cbef

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions