-
Notifications
You must be signed in to change notification settings - Fork 7
Add support for IS_ANNOTATORS and SUPPRESS_TRACKS Properties #1947
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
…e output last task only property
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I run the first pipeline example from #1719 , the language detection tracks do no appear in the output object.
@brosenberg42 reviewed 17 of 29 files at r1, 1 of 1 files at r2.
Reviewable status: 17 of 29 files reviewed, 15 unresolved discussions (waiting on @jtat1062)
a discussion (no related file):
When I run the job below, the output includes both fast text and argos, even though argos has SUPPRESS_TRACKS=true
{
"algorithmProperties": {},
"buildOutput": true,
"jobProperties": {
},
"media": [
{
"mediaUri": "file:///home/mpf/sample-data/human-rights-text/chinese.txt",
"properties": {}
}
],
"pipelineDefinition": {
"pipeline": [
"FASTTEXT LANGUAGE ID TEXT FILE TASK",
"ARGOS TRANSLATION TASK",
"KEYWORD TAGGING (WITH FF REGION) TASK"
],
"tasks": [
{
"name": "FASTTEXT LANGUAGE ID TEXT FILE TASK",
"actions": ["FASTTEXT LANGUAGE ID TEXT FILE ACTION"]
},
{
"name": "ARGOS TRANSLATION TASK",
"actions": ["ARGOS TRANSLATION ACTION"]
},
{
"name": "KEYWORD TAGGING (WITH FF REGION) TASK",
"actions": ["KEYWORD TAGGING (WITH FF REGION) ACTION"]
}
],
"actions":[
{
"name": "FASTTEXT LANGUAGE ID TEXT FILE ACTION",
"algorithm": "FASTTEXTLANGUAGE",
"properties": [
{
"name": "SUPPRESS_TRACKS",
"value": "TRUE"
}
]
},
{
"name": "ARGOS TRANSLATION ACTION",
"algorithm": "ARGOSTRANSLATION",
"properties": [
{
"name": "FEED_FORWARD_TYPE",
"value": "REGION"
},
{
"name": "IS_ANNOTATOR",
"value": "FALSE"
}
]
},
{
"name": "KEYWORD TAGGING (WITH FF REGION) ACTION",
"algorithm": "KEYWORDTAGGING",
"properties": [
{
"name": "FEED_FORWARD_TYPE",
"value": "REGION"
},
{
"name": "IS_ANNOTATOR",
"value": "TRUE"
},
]
}
]
},
"priority": 4
}
trunk/interop/src/main/java/org/mitre/mpf/interop/JsonActionOutputObject.java
line 79 at r1 (raw file):
public static JsonActionOutputObject factory(@JsonProperty("action") String action, @JsonProperty("algorithm") String algorithm, @JsonProperty("annotators") SortedSet<String> annotators,
This should be a List so we can put the annotators in the same order as they are in the pipeline.
trunk/interop/src/main/java/org/mitre/mpf/interop/JsonActionOutputObject.java
line 43 at r2 (raw file):
public static final String NO_TRACKS_TYPE = "NO TRACKS"; public static final String TRACKS_SUPPRESSED_TYPE = "TRACKS SUPPRESSED"; public static final String TRACKS_MERGED_TYPE = "TRACKS MERGED";
This field should be removed since "TRACKS MERGED" will no longer be in the output object.
trunk/mpf-system-tests/src/main/java/org/mitre/mpf/mst/TestSystem.java
line 424 at r1 (raw file):
log.info("Actual output for job {}: {}", jobId, objectMapper.writeValueAsString(actualOutputJson)); log.info("Expected output for job {}: {}", jobId, objectMapper.writeValueAsString(expectedOutputJson));
Please remove these log lines. Some of the output objects are very large.
trunk/workflow-manager/src/main/java/org/mitre/mpf/wfm/camel/operations/detection/artifactextraction/ArtifactExtractionSplitterImpl.java
line 125 at r2 (raw file):
if (_taskAnnotatorService.taskHasAnnotator(job, media, taskIndex)) { LOG.info("ARTIFACT EXTRACTION IS SKIPPED for pipeline task {} and media {}" + " due to being merged with a following task.",
Change "due to being merged with a following task." to "because it is annotated by a following task."
trunk/workflow-manager/src/main/java/org/mitre/mpf/wfm/camel/JobCompleteProcessorImpl.java
line 166 at r1 (raw file):
@Autowired private TaskAnnotatorService taskMergingManager;
Please change the name of this field to taskAnnotatorService
.
trunk/workflow-manager/src/main/java/org/mitre/mpf/wfm/camel/JobCompleteProcessorImpl.java
line 113 at r2 (raw file):
@Component(JobCompleteProcessorImpl.REF) public class JobCompleteProcessorImpl extends WfmProcessor implements JobCompleteProcessor {
There are a bunch of references to task merging in this class. Please update any references to task merging to refer to annotation.
trunk/workflow-manager/src/main/java/org/mitre/mpf/wfm/service/TiesDbService.java
line 283 at r2 (raw file):
private SortedSet<String> getTrackTypes(BatchJob job, Media media) { var pipelineElements = job.getPipelineElements(); Stream<Task> tasks = pipelineElements.getTaskStreamInOrder();
This list should only include tasks that were not suppressed and their annotators.
trunk/mpf-system-tests/src/test/java/org/mitre/mpf/mst/TestSystemOnDiff.java
line 299 at r1 (raw file):
"EAST TEXT DETECTION TASK", "TESSERACT OCR TEXT DETECTION (WITH FF REGION) TASK", "KEYWORD TAGGING (WITH FF REGION) TASK");
Please add the comment back in, but with the new property name:
// has IS_ANNOTATOR=TRUE
trunk/mpf-system-tests/src/test/java/org/mitre/mpf/mst/TestSystemOnDiff.java
line 339 at r1 (raw file):
addPipeline(pipelineName, "SPHINX SPEECH DETECTION TASK", "KEYWORD TAGGING (WITH FF REGION) TASK",
Please add the comment back in, but with the new property name:
// has IS_ANNOTATOR=TRUE
trunk/mpf-system-tests/src/test/java/org/mitre/mpf/mst/TestSystemOnDiff.java
line 294 at r2 (raw file):
@Test(timeout = 5 * MINUTES) public void runMergeWithPreviousTextTaskTest() {
Please choose a name that doesn't include "merge".
trunk/mpf-system-tests/src/test/java/org/mitre/mpf/mst/TestSystemOnDiff.java
line 335 at r2 (raw file):
@Test(timeout = 5 * MINUTES) public void runMergeWithPreviousSpeechTaskTest() {
Please choose a name that doesn't include "merge".
trunk/mpf-system-tests/src/test/resources/output/text/runJsonPathTest.json
line 68 at r1 (raw file):
"properties": { "FEED_FORWARD_TYPE": "REGION", "IS_ANNOTATOR": "TRUE"
The test using this output file should be failing because this file is missing the annotator lists. Please add the code below to OutputChecker.compareJsonActionOutputObjectSets
:
_errorCollector.checkThat(
"Annotator List", actActionOutput.getAnnotators(),
is(expActionOutput.getAnnotators()));
trunk/workflow-manager/src/main/java/org/mitre/mpf/wfm/util/AggregateJobPropertiesUtil.java
line 508 at r2 (raw file):
// Get map of tasks that need to be merged. Values are the previous tasks to merge with. public Map<Integer, Integer> getTasksToMerge(Media media, BatchJob job) {
This is no longer used.
trunk/workflow-manager/src/test/java/org/mitre/mpf/wfm/service/TestTiesDbService.java
line 230 at r2 (raw file):
// .thenReturn(IntStream.of(2, 1)); // when(_mockTaskAnnotatorManager.getTransitiveAnnotatedTasks(_job, _tiesDbParentMedia, 3, 1)) // .thenReturn(IntStream.of(2, 1));
These should be updated after addressing the issue I mentioned in the comment on TiesDbService.
Issues:
OUTPUT_LAST_TASK_ONLY
and supportIS_ANNOTATOR
andSUPPRESS_TRACKS
WFM properties #1719If a related issue doesn't exist, then create one first and assign it to yourself.
Related PRs:
Please review our Contributor Guide.
This change is