FoLiA-correct: resolve HEMPs using FoLiA::Correction #47

@kosloot

This came up after issue #45.

When resolving a HEMP, FoLiA-correct currently just adds the resolved text to one of the string/word nodes.
I assume using a real Correction would be better.

For example:

    <p xml:id="mwsel.p.1">
      <t class="OCR">•c c•</t>
      <str xml:id="mwsel.p.1.str.1">
        <t class="OCR">•c</t>
      </str>
      <str xml:id="mwsel.p.1.str.2">
        <t class="OCR">c•</t>
      </str>
    </p>

Assuming •c c• is listed in the PUNCT file as •c c• cc, this HEMP is currently resolved as:

    <p xml:id="mwsel.p.1">
      <t>cc</t>
      <t class="OCR">•c c•</t>
      <str xml:id="mwsel.p.1.str.1">
        <t class="OCR">•c</t>
      </str>
      <str xml:id="mwsel.p.1.str.2">
        <t offset="0">cc</t>
        <t class="OCR">c•</t>
      </str>
    </p>

IMHO a much better solution would be:

    <p xml:id="mwsel.p.1">
      <t>cc</t>
      <t class="OCR">•c c•</t>
      <correction xml:id="mwsel.p.1.correction.1">
        <new>
          <str xml:id="mwsel.p.1.str.edit.1">
            <t>cc</t>
          </str>
        </new>
        <original>
          <str xml:id="mwsel.p.1.str.1">
            <t class="OCR">•c</t>
          </str>
          <str xml:id="mwsel.p.1.str.2">
            <t class="OCR">c•</t>
          </str>
        </original>
      </correction>
    </p>
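
To make the proposed transformation concrete, here is a minimal sketch in Python using only the standard library's `xml.etree.ElementTree`. It is not how FoLiA-correct (which is built on the C++ libfolia) would implement this. The function name `resolve_hemp_as_correction` is hypothetical, the ids follow the pattern used in the example above, and the sketch assumes that all `<str>` children of the paragraph belong to the HEMP and that the elements carry no FoLiA namespace.

```python
import xml.etree.ElementTree as ET

# The unresolved paragraph from the first example in this issue.
SOURCE = """<p xml:id="mwsel.p.1">
  <t class="OCR">•c c•</t>
  <str xml:id="mwsel.p.1.str.1">
    <t class="OCR">•c</t>
  </str>
  <str xml:id="mwsel.p.1.str.2">
    <t class="OCR">c•</t>
  </str>
</p>"""

# ElementTree exposes xml:id under the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def resolve_hemp_as_correction(p, resolved_text):
    """Hypothetical helper: move the <str> parts of a HEMP into
    <correction>/<original> and put the resolved text in <correction>/<new>.

    Assumption: every <str> child of `p` is part of the HEMP.
    """
    parts = [child for child in p if child.tag == "str"]
    if not parts:
        return p  # nothing to resolve

    pid = p.get(XML_ID)
    insert_at = list(p).index(parts[0])

    correction = ET.Element("correction", {XML_ID: f"{pid}.correction.1"})
    new = ET.SubElement(correction, "new")
    merged = ET.SubElement(new, "str", {XML_ID: f"{pid}.str.edit.1"})
    ET.SubElement(merged, "t").text = resolved_text

    # The original parts are kept untouched inside <original>.
    original = ET.SubElement(correction, "original")
    for part in parts:
        p.remove(part)
        original.append(part)
    p.insert(insert_at, correction)

    # Paragraph-level text without a class attribute, as in the proposal.
    top = ET.Element("t")
    top.text = resolved_text
    p.insert(0, top)
    return p


p = ET.fromstring(SOURCE)
resolve_hemp_as_correction(p, "cc")
# ET.tostring(p, encoding="unicode") now serializes the corrected paragraph.
```

The point of the sketch is that the OCR parts are moved, not modified: both original `<str>` nodes survive with their ids inside `<original>`, while the resolved text lives in a single new `<str>` under `<new>`.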

Interesting point: HEMP resolution is done before the other corrections. I assume that a real correction using the resulting cc will then not be performed.
