Skip to content

Conversation

jtat1062
Copy link

@jtat1062 jtat1062 commented Aug 5, 2025

Issues:

If a related issue doesn't exist, then create one first and assign it to yourself.

Related PRs:

Please review our Contributor Guide.


This change is Reviewable

@jtat1062 jtat1062 requested a review from brosenberg42 August 6, 2025 16:20
@jrobble jrobble moved this from To do to In Progress in OpenMPF: Development Sep 17, 2025
Copy link
Member

@brosenberg42 brosenberg42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I run the first pipeline example from #1719 , the language detection tracks do no appear in the output object.

@brosenberg42 reviewed 17 of 29 files at r1, 1 of 1 files at r2.
Reviewable status: 17 of 29 files reviewed, 15 unresolved discussions (waiting on @jtat1062)


a discussion (no related file):
When I run the job below, the output includes both fast text and argos, even though argos has SUPPRESS_TRACKS=true

{
  "algorithmProperties": {},
  "buildOutput": true,
  "jobProperties": {
  },
  "media": [
    {
      "mediaUri": "file:///home/mpf/sample-data/human-rights-text/chinese.txt",
      "properties": {}
    }
  ],
  "pipelineDefinition": {
    "pipeline": [
      "FASTTEXT LANGUAGE ID TEXT FILE TASK", 
      "ARGOS TRANSLATION TASK",
      "KEYWORD TAGGING (WITH FF REGION) TASK"
    ],
    "tasks": [
      {
        "name": "FASTTEXT LANGUAGE ID TEXT FILE TASK",
        "actions": ["FASTTEXT LANGUAGE ID TEXT FILE ACTION"]
      },
      {
        "name": "ARGOS TRANSLATION TASK",
        "actions": ["ARGOS TRANSLATION ACTION"]
      },
      {
        "name": "KEYWORD TAGGING (WITH FF REGION) TASK",
        "actions": ["KEYWORD TAGGING (WITH FF REGION) ACTION"]
      }
    ],
    "actions":[
      {
        "name": "FASTTEXT LANGUAGE ID TEXT FILE ACTION",
        "algorithm": "FASTTEXTLANGUAGE",
        "properties": [
          {
            "name": "SUPPRESS_TRACKS",
            "value": "TRUE"
          }
        ]
      },
      {
        "name": "ARGOS TRANSLATION ACTION",
        "algorithm": "ARGOSTRANSLATION",
        "properties": [
          {
            "name": "FEED_FORWARD_TYPE",
            "value": "REGION"
          },
          {
            "name": "IS_ANNOTATOR",
            "value": "FALSE"
          }
        ]
      },
      {
        "name": "KEYWORD TAGGING (WITH FF REGION) ACTION",
        "algorithm": "KEYWORDTAGGING",
        "properties": [
          {
            "name": "FEED_FORWARD_TYPE",
            "value": "REGION"
          },
          {
            "name": "IS_ANNOTATOR",
            "value": "TRUE"
          },
          
        ]
      }
    ]
  },
  "priority": 4
}

trunk/interop/src/main/java/org/mitre/mpf/interop/JsonActionOutputObject.java line 79 at r1 (raw file):

    public static JsonActionOutputObject factory(@JsonProperty("action") String action,
                                                 @JsonProperty("algorithm") String algorithm,
                                                 @JsonProperty("annotators") SortedSet<String> annotators,

This should be a List so we can put the annotators in the same order as they are in the pipeline.


trunk/interop/src/main/java/org/mitre/mpf/interop/JsonActionOutputObject.java line 43 at r2 (raw file):

    public static final String NO_TRACKS_TYPE = "NO TRACKS";
    public static final String TRACKS_SUPPRESSED_TYPE = "TRACKS SUPPRESSED";
    public static final String TRACKS_MERGED_TYPE = "TRACKS MERGED";

This field should be removed since "TRACKS MERGED" will no longer be in the output object.


trunk/mpf-system-tests/src/main/java/org/mitre/mpf/mst/TestSystem.java line 424 at r1 (raw file):

        log.info("Actual output for job {}: {}", jobId, objectMapper.writeValueAsString(actualOutputJson));
        log.info("Expected output for job {}: {}", jobId, objectMapper.writeValueAsString(expectedOutputJson));

Please remove these log lines. Some of the output objects are very large.


trunk/workflow-manager/src/main/java/org/mitre/mpf/wfm/camel/operations/detection/artifactextraction/ArtifactExtractionSplitterImpl.java line 125 at r2 (raw file):

            if (_taskAnnotatorService.taskHasAnnotator(job, media, taskIndex)) {
                LOG.info("ARTIFACT EXTRACTION IS SKIPPED for pipeline task {} and media {}" +
                                " due to being merged with a following task.",

Change "due to being merged with a following task." to "because it is annotated by a following task."


trunk/workflow-manager/src/main/java/org/mitre/mpf/wfm/camel/JobCompleteProcessorImpl.java line 166 at r1 (raw file):

    @Autowired
    private TaskAnnotatorService taskMergingManager;

Please change the name of this field to taskAnnotatorService.


trunk/workflow-manager/src/main/java/org/mitre/mpf/wfm/camel/JobCompleteProcessorImpl.java line 113 at r2 (raw file):

@Component(JobCompleteProcessorImpl.REF)
public class JobCompleteProcessorImpl extends WfmProcessor implements JobCompleteProcessor {

There are a bunch of references to task merging in this class. Please update any references to task merging to refer to annotation.


trunk/workflow-manager/src/main/java/org/mitre/mpf/wfm/service/TiesDbService.java line 283 at r2 (raw file):

    private SortedSet<String> getTrackTypes(BatchJob job, Media media) {
        var pipelineElements = job.getPipelineElements();
        Stream<Task> tasks = pipelineElements.getTaskStreamInOrder();

This list should only include tasks that were not suppressed and their annotators.


trunk/mpf-system-tests/src/test/java/org/mitre/mpf/mst/TestSystemOnDiff.java line 299 at r1 (raw file):

                "EAST TEXT DETECTION TASK",
                "TESSERACT OCR TEXT DETECTION (WITH FF REGION) TASK",
                "KEYWORD TAGGING (WITH FF REGION) TASK");

Please add the comment back in, but with the new property name:
// has IS_ANNOTATOR=TRUE


trunk/mpf-system-tests/src/test/java/org/mitre/mpf/mst/TestSystemOnDiff.java line 339 at r1 (raw file):

        addPipeline(pipelineName,
                "SPHINX SPEECH DETECTION TASK",
                "KEYWORD TAGGING (WITH FF REGION) TASK",

Please add the comment back in, but with the new property name:
// has IS_ANNOTATOR=TRUE


trunk/mpf-system-tests/src/test/java/org/mitre/mpf/mst/TestSystemOnDiff.java line 294 at r2 (raw file):

    @Test(timeout = 5 * MINUTES)
    public void runMergeWithPreviousTextTaskTest() {

Please choose a name that doesn't include "merge".


trunk/mpf-system-tests/src/test/java/org/mitre/mpf/mst/TestSystemOnDiff.java line 335 at r2 (raw file):

    @Test(timeout = 5 * MINUTES)
    public void runMergeWithPreviousSpeechTaskTest() {

Please choose a name that doesn't include "merge".


trunk/mpf-system-tests/src/test/resources/output/text/runJsonPathTest.json line 68 at r1 (raw file):

                        "properties": {
                            "FEED_FORWARD_TYPE": "REGION",
                            "IS_ANNOTATOR": "TRUE"

The test using this output file should be failing because this file is missing the annotator lists. Please add the code below to OutputChecker.compareJsonActionOutputObjectSets:

_errorCollector.checkThat(
        "Annotator List", actActionOutput.getAnnotators(),
        is(expActionOutput.getAnnotators()));

trunk/workflow-manager/src/main/java/org/mitre/mpf/wfm/util/AggregateJobPropertiesUtil.java line 508 at r2 (raw file):

    // Get map of tasks that need to be merged. Values are the previous tasks to merge with.
    public Map<Integer, Integer> getTasksToMerge(Media media, BatchJob job) {

This is no longer used.


trunk/workflow-manager/src/test/java/org/mitre/mpf/wfm/service/TestTiesDbService.java line 230 at r2 (raw file):

        //         .thenReturn(IntStream.of(2, 1));
        // when(_mockTaskAnnotatorManager.getTransitiveAnnotatedTasks(_job, _tiesDbParentMedia, 3, 1))
        //         .thenReturn(IntStream.of(2, 1));

These should be updated after addressing the issue I mentioned in the comment on TiesDbService.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: In Progress

Development

Successfully merging this pull request may close these issues.

2 participants