Commit 56f75da

johanlundberg authored and c00kiemon5ter committed
Convert sign_statement result to native string
Using lxml.etree.tostring without an encoding in Python 3 results in an unparsable XML document. To fix this, we always set the encoding to UTF-8 and omit the XML declaration. We then convert the result to the native string type before returning it.

---

Our preferred encoding (in general) is `utf-8`. `lxml` defaults to `ASCII`, or expects us to provide an encoding. Given an encoding, `lxml` serializes the tree representation of the XML document using that encoding. If it is directed to include an XML declaration, it embeds that encoding in the declaration as the `encoding` attribute (for example, `<?xml version='1.0' encoding='iso-8859-7'?>`).

`lxml` allows some _special_ values as an encoding:

- In Python 2 those are `"unicode"` and `unicode`.
- In Python 3 those are `"unicode"` and `str`.

When one of those values is specified, the result is _decoded_ from bytes to unicode ("unicode" is not an actual encoding; the actual encoding will be utf-8). The encoding is already the _type_ of the result. This is why an XML declaration is not allowed in those cases: the result is not bytes that have to be read according to some encoding rules, but decoded data whose type dictates how it is handled.

With the latest changes, what we do is:

1. we always encode the result as UTF-8
2. we do not include an XML declaration (because of _(3)_)
3. we convert to the native string type (that is, `bytes`/`str` for Python 2, and `str` for Python 3, the equivalent of `unicode` in Python 2)

The consumer of the result should expect a UTF-8-encoded byte string in Python 2, and a decoded (unicode) string in Python 3.

Signed-off-by: Ivan Kanakarakis <[email protected]>
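The serialization behaviour described in the commit message can be sketched as follows. This is only an illustration under Python 3 with `lxml` installed; the `name` element and its Greek text are made-up sample data, not part of the commit.

```python
from lxml import etree

# Hypothetical sample element with non-ASCII text (not from the commit).
root = etree.Element("name")
root.text = "Ιβάν"

# No encoding given: lxml returns ASCII bytes and escapes
# non-ASCII characters as numeric character references.
ascii_bytes = etree.tostring(root)

# Encoding plus declaration: the encoding is embedded in the declaration.
declared = etree.tostring(root, encoding="UTF-8", xml_declaration=True)

# The approach taken by this commit: UTF-8 bytes without a declaration,
# then decoded so the caller receives a str on Python 3.
utf8_bytes = etree.tostring(root, encoding="UTF-8", xml_declaration=False)
native = utf8_bytes.decode("utf-8")

# Special value "unicode": the result is already a str,
# so asking for an XML declaration is rejected.
text = etree.tostring(root, encoding="unicode")
try:
    etree.tostring(root, encoding="unicode", xml_declaration=True)
    declaration_rejected = False
except ValueError:
    declaration_rejected = True
```

Here `native` and `text` hold the same string; the commit goes through UTF-8 bytes first so that the same code path also yields a byte string on Python 2.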
1 parent fbff99e commit 56f75da

1 file changed: +4 -1 lines changed

src/saml2/sigver.py

Lines changed: 4 additions & 1 deletion
@@ -957,7 +957,10 @@ def sign_statement(self, statement, node_name, key_file, node_id, id_attr):
 
         xml = xmlsec.parse_xml(statement)
         signed = xmlsec.sign(xml, key_file)
-        return lxml.etree.tostring(signed, xml_declaration=True)
+        signed_str = lxml.etree.tostring(signed, xml_declaration=False, encoding="UTF-8")
+        if not isinstance(signed_str, six.string_types):
+            signed_str = signed_str.decode("utf-8")
+        return signed_str
 
     def validate_signature(self, signedtext, cert_file, cert_type, node_name, node_id, id_attr):
         """

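The native-string conversion added in this hunk can be tried in isolation with a small sketch. The helper name `to_native_str` and the sample payload are hypothetical, and the built-in `str` stands in for `six.string_types`, so this assumes Python 3 and is not the code from `sigver.py`.

```python
def to_native_str(serialized, encoding="utf-8"):
    """Return `serialized` as the interpreter's native `str` type.

    Hypothetical helper mirroring the isinstance/decode check in the diff;
    `str` is used here in place of `six.string_types`.
    """
    if not isinstance(serialized, str):
        # On Python 3, tostring(..., encoding="UTF-8") yields bytes,
        # so decode them with the same encoding.
        serialized = serialized.decode(encoding)
    return serialized


# Made-up stand-in for the bytes returned by lxml.etree.tostring.
signed_bytes = "<Signed>Ιβάν</Signed>".encode("utf-8")
signed_str = to_native_str(signed_bytes)
```

A value that is already a `str` passes through unchanged, which is what happens on Python 2 where `bytes` and `str` are the same type.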