DITA 1.3 Preprocessing Architecture proposal
In the course of trying to implement copy-to customization, I'm realizing that what I'm trying to do is similar to (and dependent on) the implementation of DITA 1.3 branch filtering and scoped keys. Thus I've created this page to capture the general requirements and implementation implications of DITA 1.3 processing.
The current OT preprocessing, as implemented, cannot support the DITA 1.3 requirements because it does topic copying and filtering too early in the process, before either the full set of effective topics or the actual filtering conditions for a given topic use instance are known. This will require significant changes to the current code, at least to the debug-and-filter process (and possibly to generate-lists, although the lists can be updated as needed).
As far as I can work out, implementing branch filtering and scoped keys has to be done as follows, because of the information required and available at any given step.
The process as described does filtering after conref resolution. This is to ensure that applicable conrefs to inapplicable elements can be detected and reported rather than causing the conref to fail because the target has already been filtered out (the main problem with the current filter-first approach).
However, it is wasteful and expensive to process elements that will subsequently be filtered out.
Thus, in order to implement this processing most efficiently there needs to be an "isEffective()" function that takes an element and its effective @props values (that is, potentially inherited from an ancestor) and determines if that element is effective based on the current active condition set. This shouldn't be hard to implement in XSLT or Java. It simply requires maintaining knowledge of the current effective conditions (reflecting any branch filtering, where applicable) and doing applicability evaluation on demand.
One way to do this would be a "filter" process that simply flags each element as effective or ineffective but retains the element itself. An isEffective() check would then be trivial and subsequent filtering would be very simple. It would even allow reporting of element applicability, since filtered-out elements are not actually removed and could be rendered as some sort of report.
This approach to filtering then avoids the current problem with early filtering while avoiding unnecessary processing.
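The isEffective() check described above can be sketched as follows. This is a minimal illustration, not DITA-OT code: the class name, the representation of the active condition set as an attribute-to-allowed-values map, and the treatment of attributes that are not filtered on are all assumptions.

```java
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of on-demand applicability evaluation. */
public class EffectivityCheck {

    /** Active conditions: @props attribute name -> set of values filtered IN.
     *  An attribute absent from this map is not filtered on at all. */
    private final Map<String, Set<String>> activeConditions;

    public EffectivityCheck(Map<String, Set<String>> activeConditions) {
        this.activeConditions = activeConditions;
    }

    /**
     * Evaluate effectivity from the element's *effective* @props values,
     * i.e. the values it declares merged with any inherited from ancestors
     * (including branch-level ditaval conditions, merged in by the caller).
     */
    public boolean isEffective(Map<String, Set<String>> effectiveProps) {
        for (Map.Entry<String, Set<String>> prop : effectiveProps.entrySet()) {
            Set<String> allowed = activeConditions.get(prop.getKey());
            if (allowed == null) continue; // attribute not filtered on
            // Element is excluded if none of its values is filtered in.
            boolean anyAllowed = false;
            for (String value : prop.getValue()) {
                if (allowed.contains(value)) { anyAllowed = true; break; }
            }
            if (!anyAllowed) return false;
        }
        return true;
    }
}
```

Because the check takes the element's already-merged effective values, the same function serves map-level branch checks and element-level checks alike.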
- Resolve the map using only direct map references and resolve any map-to-map direct-reference conrefs.
- Instantiate all filtered branches implied by ditavalrefs that are not themselves filtered out of the map (we can use the isEffective() method to determine whether a given filtered branch is itself effective and, if not, ignore it, avoiding branch generation on branches that will be filtered out later). We can't do filtering at this point because there might be key-based conref targets that would get filtered out in advance of final conref resolution.
- Expand all keys to reflect key scopes. Capture the original keyref value and the scope steps so that messages can reflect the key scope hierarchy and not just the expanded key (not all dot-separated tokens represent key scope names).
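A minimal sketch of capturing the scope steps alongside the original key name (the class and its data model are hypothetical, not OT code). Keeping the steps separate is what lets messages report the actual scope hierarchy, since a dot in the expanded name is not proof of a scope boundary:

```java
import java.util.List;
import java.util.StringJoiner;

/** Illustrative record of a scope-expanded key reference. */
public class ScopedKey {
    public final String originalKey;      // key name as authored
    public final List<String> scopeSteps; // enclosing scope names, outermost first

    public ScopedKey(String originalKey, List<String> scopeSteps) {
        this.originalKey = originalKey;
        this.scopeSteps = scopeSteps;
    }

    /** Fully-qualified form, e.g. scopes [a, b] + key "k" -> "a.b.k". */
    public String expandedName() {
        StringJoiner joiner = new StringJoiner(".");
        scopeSteps.forEach(joiner::add);
        joiner.add(originalKey);
        return joiner.toString();
    }
}
```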
- Construct the key spaces. We can choose to include only effective keys in the key space (using the isEffective() function to determine whether a given key definition is filtered in when determining key definition precedence) or to include all key definitions along with their applicability and whether or not they are effective.
For the OT, we would expect to normally use the pre-filtered key space, but other processes might want the full key space with conditions, so it needs to be an option. Within a component that provides a general API for managing key spaces, it must be possible to know the applicability of any key definition, or of a key space as a whole (where the key scope itself is conditional).
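The key-space option described above might look something like this sketch, where each key definition carries its effectivity and the caller chooses between the pre-filtered view and the full view (class and method names are illustrative, not an actual OT API):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical key-space sketch: definitions retain their applicability. */
public class KeySpace {

    public static final class KeyDef {
        public final String key;
        public final String target;      // resolved resource, e.g. an @href
        public final boolean effective;  // result of isEffective() for this def
        public KeyDef(String key, String target, boolean effective) {
            this.key = key; this.target = target; this.effective = effective;
        }
    }

    private final Map<String, List<KeyDef>> defs = new LinkedHashMap<>();

    /** Register a definition; earlier definitions take precedence. */
    public void add(KeyDef def) {
        defs.computeIfAbsent(def.key, k -> new ArrayList<>()).add(def);
    }

    /**
     * Resolve a key. With effectiveOnly=true (the expected OT default),
     * ineffective definitions are skipped when determining precedence;
     * with false, the first definition wins regardless of applicability.
     */
    public Optional<KeyDef> resolve(String key, boolean effectiveOnly) {
        for (KeyDef def : defs.getOrDefault(key, List.of())) {
            if (!effectiveOnly || def.effective) return Optional.of(def);
        }
        return Optional.empty();
    }
}
```

Note how the two views can give different answers for the same key: that is exactly why the choice has to be an explicit option rather than baked in.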
- Resolve key-based conrefs from maps to maps. (Map-to-topic conrefs can't necessarily be resolved yet because copy-to processing hasn't yet been applied to the topics, so we don't know what the target element details might be; in particular, use-context-specific filtering might make a conref target filtered out.) This should be filtering-aware so that conrefs to elements that would be filtered out are reported and not resolved, meaning that no elements are unnecessarily processed.
At this point we have a fully-resolved, unfiltered map.
- Filter the map.
At this point, we have the final map structure and content, with filtering applied.
- Do any metadata propagation within the map, or any other similar data denormalization (e.g., updating @href values to reflect resolved keyrefs).
- Add @copy-to attributes to all second and subsequent topicrefs to topics referenced within branches with different filtering conditions.
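As a rough illustration of this copy-to assignment, the sketch below generates a @copy-to value for second and subsequent references to the same topic from branches with different conditions. The naming pattern (`topic-2.dita`), the flat (href, branch label) input model, and the choice to let same-condition re-uses share the original file are all assumptions, not what the OT would necessarily do:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative copy-to assignment for duplicate topicrefs. */
public class CopyToAssigner {

    /**
     * Input: topicrefs as {href, branch-condition label} pairs, in map order.
     * Output: one entry per topicref; null means "no @copy-to needed",
     * otherwise the generated @copy-to filename.
     */
    public static List<String> assignCopyTo(List<String[]> topicrefs) {
        Map<String, String> firstBranch = new HashMap<>();
        Map<String, Integer> copies = new HashMap<>();
        List<String> copyTos = new ArrayList<>();
        for (String[] ref : topicrefs) {
            String href = ref[0], branch = ref[1];
            String first = firstBranch.putIfAbsent(href, branch);
            if (first == null || first.equals(branch)) {
                copyTos.add(null); // first use, or same conditions: share the file
            } else {
                // Different conditions: force a distinct copy, e.g. x-2.dita.
                int n = copies.merge(href, 1, Integer::sum) + 1;
                copyTos.add(href.replaceFirst("\\.dita$", "") + "-" + n + ".dita");
            }
        }
        return copyTos;
    }
}
```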
At this point the map is fully resolved and filtered and the key spaces have been constructed.
- Allow additional preprocessing of the map (e.g., copy-to adjustment).
At this point the set of effective resources (topics and non-DITA resources) is known (because the map now reflects all desired copy-to).
- Determine the set of resources required by the map and the file(s) they must be copied to.
- Make copies of topics reflecting @copy-to values. Can apply non-destructive filtering as described above.
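The non-destructive filtering idea can be sketched as a two-pass mark-then-prune over a toy element tree. The `Node` model and the `ditaot:effective` marker attribute are invented for illustration; the point is that marking keeps isEffective() checks and applicability reporting possible until the final filtering pass:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Sketch of non-destructive (flag-based) filtering. */
public class Marker {

    /** Toy element model standing in for a real DOM. */
    public static final class Node {
        public final String name;
        public final Map<String, String> atts = new HashMap<>();
        public final List<Node> children = new ArrayList<>();
        public Node(String name) { this.name = name; }
    }

    /** Pass 1: flag every element; the predicate stands in for the
     *  real active-condition evaluation. Nothing is removed. */
    public static void mark(Node node, Predicate<Node> effective) {
        node.atts.put("ditaot:effective", String.valueOf(effective.test(node)));
        for (Node child : node.children) mark(child, effective);
    }

    /** Pass 2 (final filtering): actually drop the flagged elements. */
    public static void prune(Node node) {
        node.children.removeIf(c -> "false".equals(c.atts.get("ditaot:effective")));
        node.children.forEach(Marker::prune);
    }
}
```

Between the two passes, an applicability report can be produced simply by walking the tree and listing the elements flagged `false`.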
- Resolve conrefs in topics. Again, can use isEffective() to avoid processing elements that will be filtered out and can report conrefs to inapplicable targets.
- Apply filtering to topics.
At this point all topic copies have been made, conrefs have been resolved, and filtering has been applied. The topics are ready for any additional processing, such as link generation, as well as final deliverable production.
The remaining steps of the current preprocess pipeline should work normally (that is, everything that follows mappull today).