Request for feedback on handling whitespace in XML notices #779

bertrand-lorentz · 2023-12-06T14:43:47Z

bertrand-lorentz
Dec 6, 2023
Maintainer

We are thinking about changing the way we handle whitespace inside XML elements when validating XML notices.
We would like to get your input on the topic before we go ahead with any change.

Current situation

When a rule looks at the value of a field, any leading or trailing whitespace (space, tab, line break, etc.) is ignored.
That's done by using the normalize-space function in the XPath of the Schematron test.
For example:

<rule context="/*/cac:ProcurementProjectLot[cbc:ID/@schemeName='Lot']/cbc:ID">
	<assert id="BR-BT-00137-0155" role="ERROR" test="matches(normalize-space(.),'^LOT-\d{4}$')">rule|text|BR-BT-00137-0155</assert>
</rule>

This means that all of the following 3 elements are considered OK for the above rule:

<cbc:ID schemeName="Lot">LOT-1234</cbc:ID>
<cbc:ID schemeName="Lot">  LOT-1234  </cbc:ID>
<cbc:ID schemeName="Lot">
    LOT-1234
</cbc:ID>
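
To make the behaviour concrete, here is a rough Python sketch that imitates what the test above does. The helper names (`normalize_space`, `passes_current_rule`) are made up for illustration and are not part of the SDK or of any validator; the regular expression is the one from the rule.

```python
import re

# XML whitespace characters: space, tab, carriage return, line feed.
_XML_WS = re.compile(r"[ \t\r\n]+")

def normalize_space(value: str) -> str:
    """Approximation of XPath normalize-space(): collapse runs, trim both ends."""
    return _XML_WS.sub(" ", value).strip(" ")

def passes_current_rule(value: str) -> bool:
    # Mirrors matches(normalize-space(.), '^LOT-\d{4}$') from BR-BT-00137-0155.
    return re.search(r"^LOT-\d{4}$", normalize_space(value)) is not None

# Text content of the three example elements.
values = ["LOT-1234", "  LOT-1234  ", "\n    LOT-1234\n"]

for v in values:
    print(repr(v), passes_current_rule(v))
```

All three values pass, because the pattern is applied only after the whitespace has been normalized.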

Problem

As described above, we currently accept notices in which elements that contain a structured string (like an identifier or a code from a codelist) have leading or trailing whitespace.
This can cause problems for applications that read values from the XML by themselves by just using the XPath of the field: they will get a value with leading and/or trailing whitespace.
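
As an illustration of the problem, the sketch below reads a lot identifier with Python's standard `xml.etree.ElementTree`. The `<root>` wrapper and the `known_lots` set are invented for the example, and the namespace URI is assumed to be the usual UBL one for `cbc`; a real notice has a different structure.

```python
import xml.etree.ElementTree as ET

# Assumed UBL namespace for the cbc prefix.
CBC = "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2"
NS = {"cbc": CBC}

# Simplified, hypothetical document: only the element of interest.
sample = f'<root xmlns:cbc="{CBC}"><cbc:ID schemeName="Lot">  LOT-1234  </cbc:ID></root>'

root = ET.fromstring(sample)
raw = root.find("cbc:ID", NS).text   # value exactly as written in the XML

known_lots = {"LOT-1234"}            # hypothetical lookup on the reader's side

print(repr(raw))                           # surrounding spaces are still there
print(raw in known_lots)                   # naive lookup misses the lot
print(raw.strip(" \t\r\n") in known_lots)  # lookup works once trimmed
```

An application that does not trim the value itself ends up comparing `'  LOT-1234  '` with `'LOT-1234'`, even though the notice passed validation.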

Proposed change

For fields that correspond to a structured string (so fields that have a type of "id", "id-ref", or "code"), we would add rules to check that their values do not have leading or trailing whitespace. In XPath it would be something like: . = normalize-space(.)

This means that only the first element in the example above would be considered valid. A notice containing the second or third element would be rejected.
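
A small Python sketch of the proposed check, again with hypothetical helper names and an approximation of XPath `normalize-space()`. One thing worth noting: since `normalize-space()` also collapses runs of whitespace inside the string, a comparison of this form would presumably also reject values with repeated inner whitespace, not only leading or trailing whitespace.

```python
import re

# XML whitespace characters: space, tab, carriage return, line feed.
_XML_WS = re.compile(r"[ \t\r\n]+")

def normalize_space(value: str) -> str:
    """Approximation of XPath normalize-space(): collapse runs, trim both ends."""
    return _XML_WS.sub(" ", value).strip(" ")

def passes_proposed_rule(value: str) -> bool:
    # Mirrors the XPath test:  . = normalize-space(.)
    return value == normalize_space(value)

# Text content of the three example elements from "Current situation".
values = ["LOT-1234", "  LOT-1234  ", "\n    LOT-1234\n"]

for v in values:
    print(repr(v), passes_proposed_rule(v))
```

Only the first value passes; the other two would cause the notice to be rejected.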

This would not impact "free text" fields (like title, description, etc.) or fields that have a type corresponding to a number or a date.

Questions

In order to help guide this change, we would like to have some input on the following questions:

1/ When generating XML notices, how do you handle whitespace in the values?
2/ When reading data from XML notices, do you remove leading and trailing whitespace?
3/ Would you be impacted if we added the rules described above? How?

rkottmann · 2023-12-06T15:36:18Z

rkottmann
Dec 6, 2023

First thanks for discussing this in advance!

I fully support the proposed change.

Your question 3 fits in the context of a short discussion I had with Karl about “not knowing what effect a new or changed rule might have” and “one year transition period might be very short if severe conflicts have to be solved”.

Here I came up with the following proposal:

Problem statement: Introducing new or changed business rules into the wild (meaning you do not know all the systems that depend on them) might have severe and unknown impact. At the same time, nothing is a better test than real life in the wild.

Solution idea: Introduce new Schematron rules with a severity level which indicates that this rule will fire as an error, a warning or whatever in the future, e.g. “future-error”.

So one could introduce rule X with severity level “future-error” in the current release and change it to severity level “error” in the next release
The advantage would be that all unknown systems would see if this rule causes troubles in their system and give feedback based on the experience in the wild
This would allow keeping the one year life span of a release while being flexible in which release a “future-error” gets activated
- If no trouble: next release
- If trouble: change the rule based on feedback from the wild, then either:
  - Keep “future-error” for new version of rule to let the wild test one more release cycle
  - Or activate with changes applied

Another variant:

Create a specific Schematron phase (not enabled by default of course) with future rules and ask the wild to test it in their environment.
- This can be combined with above solution statement

1 reply

bertrand-lorentz Dec 7, 2023
Maintainer Author

Thanks for your feedback.

Concerning your proposal of having a "future-error" severity, I would suggest that you create a separate discussion about it, to avoid mixing up the topic here (whitespace handling) with your suggestion.
