We've seen some more USFM parsing errors in three different projects. Contact me if you need the project names.
The latter two (XXA and XXB) are in non-Scripture books, which I don't believe we should be parsing. I think that when I reworked the preprocess logic, I may have undone this safeguard. The fix is probably as simple as passing the list of canonical books as text ids when processing the whole corpus.
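To make the proposed safeguard concrete, here is a minimal, language-neutral sketch of the filtering idea, written in Python rather than against the actual SIL.Machine API. The names `CANONICAL_BOOK_IDS` and `canonical_text_ids` are hypothetical, and the 66-book list is an assumption standing in for whatever "canonical" means in the real code (which may also include deuterocanonical books).

```python
from typing import Iterable, List

# Assumption: the 66-book canon stands in for "canonical books" here.
# The real safeguard may use a different (larger) list.
CANONICAL_BOOK_IDS = frozenset(
    """
    GEN EXO LEV NUM DEU JOS JDG RUT 1SA 2SA 1KI 2KI 1CH 2CH EZR NEH EST JOB
    PSA PRO ECC SNG ISA JER LAM EZK DAN HOS JOL AMO OBA JON MIC NAM HAB ZEP
    HAG ZEC MAL
    MAT MRK LUK JHN ACT ROM 1CO 2CO GAL EPH PHP COL 1TH 2TH 1TI 2TI TIT PHM
    HEB JAS 1PE 2PE 1JN 2JN 3JN JUD REV
    """.split()
)


def canonical_text_ids(text_ids: Iterable[str]) -> List[str]:
    """Return only the canonical book IDs, preserving the input order."""
    return [t for t in text_ids if t.upper() in CANONICAL_BOOK_IDS]


if __name__ == "__main__":
    # Extra-material books such as XXA and XXB are dropped before parsing.
    print(canonical_text_ids(["1KI", "XXA", "XXB", "MAT"]))
```

A filter like this would only skip the XXA and XXB texts. The 1KI failure is in a canonical book, so it would still need a fix in the parser handler itself.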
System.InvalidOperationException: An error occurred while parsing the text '1KI' in project ///. Verse: 1KI 1:0, line: 3, character: 1, error: 'Stack empty.'
---> System.InvalidOperationException: Stack empty.
at System.Collections.Generic.Stack`1.ThrowForEmptyStack()
at System.Collections.Generic.Stack`1.Pop()
at SIL.Machine.Corpora.ScriptureRefUsfmParserHandlerBase.NextElement(String marker)
at SIL.Machine.Corpora.ScriptureRefUsfmParserHandlerBase.StartParentElement(String marker)
at SIL.Machine.Corpora.ScriptureRefUsfmParserHandlerBase.StartSidebar(UsfmParserState state, String marker, String category)
at SIL.Machine.Corpora.UsfmParser.ProcessToken()
at SIL.Machine.Corpora.UsfmTextBase.GetVersesInDocOrder()
--- End of inner exception stack trace ---
at SIL.Machine.Corpora.UsfmTextBase.GetVersesInDocOrder()
at SIL.Machine.Corpora.ScriptureText.GetRows()
at System.Linq.Enumerable.SelectManySingleSelectorIterator`2.MoveNext()
at System.Linq.Enumerable.SelectEnumerableIterator`2.MoveNext()
at System.Linq.Enumerable.WhereIterator[TSource](IEnumerable`1 source, Func`3 predicate)+MoveNext()
at SIL.Machine.Corpora.TextCorpusEnumerator.MoveNext()
at SIL.Machine.Corpora.NParallelTextCorpus.GetRows(IList`1 enumerators)+MoveNext()
at SIL.Machine.Corpora.NParallelTextCorpus.GetRows(IEnumerable`1 textIds)+MoveNext()
at SIL.Machine.Corpora.MergedTextCorpus.GetRows(IEnumerable`1 textIds)+MoveNext()
at SIL.Machine.Corpora.TextCorpusEnumerator.CollectVerses()
at SIL.Machine.Corpora.TextCorpusEnumerator.MoveNext()
at SIL.Machine.Corpora.NParallelTextCorpus.GetRows(IList`1 enumerators)+MoveNext()
at SIL.Machine.Corpora.NParallelTextCorpus.GetRows(IEnumerable`1 textIds)+MoveNext()
at System.Collections.Generic.LargeArrayBuilder`1.AddRange(IEnumerable`1 items)
at System.Collections.Generic.EnumerableHelpers.ToArray[T](IEnumerable`1 source)
at SIL.ServiceToolkit.Services.ParallelCorpusPreprocessingService.PreprocessAsync(IReadOnlyList`1 corpora, Func`2 train, Func`4 inference, Boolean useKeyTerms, HashSet`1 ignoreUsfmMarkers) in /app/src/ServiceToolkit/src/SIL.ServiceToolkit/Services/ParallelCorpusPreprocessingService.cs:line 165
at Serval.Machine.Shared.Services.TranslationPreprocessBuildJob.WriteDataFilesAsync(String buildId, IReadOnlyList`1 corpora, String buildOptions, CancellationToken cancellationToken) in /app/src/Machine/src/Serval.Machine.Shared/Services/TranslationPreprocessBuildJob.cs:line 47
at Serval.Machine.Shared.Services.TranslationPreprocessBuildJob.WriteDataFilesAsync(String buildId, IReadOnlyList`1 corpora, String buildOptions, CancellationToken cancellationToken) in /app/src/Machine/src/Serval.Machine.Shared/Services/TranslationPreprocessBuildJob.cs:line 83
at Serval.Machine.Shared.Services.TranslationPreprocessBuildJob.WriteDataFilesAsync(String buildId, IReadOnlyList`1 corpora, String buildOptions, CancellationToken cancellationToken) in /app/src/Machine/src/Serval.Machine.Shared/Services/TranslationPreprocessBuildJob.cs:line 83
at Serval.Machine.Shared.Services.TranslationPreprocessBuildJob.WriteDataFilesAsync(String buildId, IReadOnlyList`1 corpora, String buildOptions, CancellationToken cancellationToken) in /app/src/Machine/src/Serval.Machine.Shared/Services/TranslationPreprocessBuildJob.cs:line 83
at Serval.Machine.Shared.Services.TranslationPreprocessBuildJob.WriteDataFilesAsync(String buildId, IReadOnlyList`1 corpora, String buildOptions, CancellationToken cancellationToken) in /app/src/Machine/src/Serval.Machine.Shared/Services/TranslationPreprocessBuildJob.cs:line 83
at Serval.Machine.Shared.Services.PreprocessBuildJob`1.DoWorkAsync(String engineId, String buildId, IReadOnlyList`1 data, String buildOptions, CancellationToken cancellationToken) in /app/src/Machine/src/Serval.Machine.Shared/Services/PreprocessBuildJob.cs:line 44
at Serval.Machine.Shared.Services.HangfireBuildJob`2.RunAsync(String engineId, String buildId, TData data, String buildOptions, CancellationToken cancellationToken) in /app/src/Machine/src/Serval.Machine.Shared/Services/HangfireBuildJob.cs:line 56
Failed to process the job '69206425ae2c0992fc32e756': an exception occurred.
System.InvalidOperationException: An error occurred while parsing the text 'XXA' in project ///'. Verse: XXA 1:0, line: 2, character: 1, error: 'Stack empty.'
---> System.InvalidOperationException: Stack empty.
at System.Collections.Generic.Stack`1.ThrowForEmptyStack()
at System.Collections.Generic.Stack`1.Pop()
at SIL.Machine.Corpora.ScriptureRefUsfmParserHandlerBase.NextElement(String marker)
at SIL.Machine.Corpora.ScriptureRefUsfmParserHandlerBase.StartParentElement(String marker)
at SIL.Machine.Corpora.ScriptureRefUsfmParserHandlerBase.StartSidebar(UsfmParserState state, String marker, String category)
at SIL.Machine.Corpora.UsfmParser.ProcessToken()
at SIL.Machine.Corpora.UsfmTextBase.GetVersesInDocOrder()
--- End of inner exception stack trace ---
at SIL.Machine.Corpora.UsfmTextBase.GetVersesInDocOrder()
at SIL.Machine.Corpora.ScriptureText.GetRows()
at System.Linq.Enumerable.SelectManySingleSelectorIterator`2.MoveNext()
at System.Linq.Enumerable.SelectEnumerableIterator`2.MoveNext()
at SIL.Machine.Corpora.TextCorpusEnumerator.MoveNext()
at SIL.Machine.Corpora.NParallelTextCorpus.GetRows(IList`1 enumerators)+MoveNext()
at SIL.Machine.Corpora.NParallelTextCorpus.GetRows(IEnumerable`1 textIds)+MoveNext()
at SIL.Machine.Corpora.MergedTextCorpus.GetRows(IEnumerable`1 textIds)+MoveNext()
at System.Linq.Enumerable.WhereIterator[TSource](IEnumerable`1 source, Func`3 predicate)+MoveNext()
at SIL.Machine.Corpora.TextCorpusEnumerator.CollectVerses()
at SIL.Machine.Corpora.TextCorpusEnumerator.MoveNext()
at SIL.Machine.Corpora.NParallelTextCorpus.GetRows(IList`1 enumerators)+MoveNext()
at SIL.Machine.Corpora.NParallelTextCorpus.GetRows(IEnumerable`1 textIds)+MoveNext()
at SIL.Machine.Corpora.ParallelTextCorpus.GetRows(IEnumerable`1 textIds)+MoveNext()
at System.Collections.Generic.LargeArrayBuilder`1.AddRange(IEnumerable`1 items)
at System.Collections.Generic.EnumerableHelpers.ToArray[T](IEnumerable`1 source)
at SIL.ServiceToolkit.Services.ParallelCorpusPreprocessingService.PreprocessAsync(IReadOnlyList`1 corpora, Func`2 train, Func`4 inference, Boolean useKeyTerms, HashSet`1 ignoreUsfmMarkers) in /app/src/ServiceToolkit/src/SIL.ServiceToolkit/Services/ParallelCorpusPreprocessingService.cs:line 120
at Serval.Machine.Shared.Services.TranslationPreprocessBuildJob.WriteDataFilesAsync(String buildId, IReadOnlyList`1 corpora, String buildOptions, CancellationToken cancellationToken) in /app/src/Machine/src/Serval.Machine.Shared/Services/TranslationPreprocessBuildJob.cs:line 47
at Serval.Machine.Shared.Services.TranslationPreprocessBuildJob.WriteDataFilesAsync(String buildId, IReadOnlyList`1 corpora, String buildOptions, CancellationToken cancellationToken)
at Serval.Machine.Shared.Services.TranslationPreprocessBuildJob.WriteDataFilesAsync(String buildId, IReadOnlyList`1 corpora, String buildOptions, CancellationToken cancellationToken) in /app/src/Machine/src/Serval.Machine.Shared/Services/TranslationPreprocessBuildJob.cs:line 83
at Serval.Machine.Shared.Services.TranslationPreprocessBuildJob.WriteDataFilesAsync(String buildId, IReadOnlyList`1 corpora, String buildOptions, CancellationToken cancellationToken) in /app/src/Machine/src/Serval.Machine.Shared/Services/TranslationPreprocessBuildJob.cs:line 83
at Serval.Machine.Shared.Services.TranslationPreprocessBuildJob.WriteDataFilesAsync(String buildId, IReadOnlyList`1 corpora, String buildOptions, CancellationToken cancellationToken) in /app/src/Machine/src/Serval.Machine.Shared/Services/TranslationPreprocessBuildJob.cs:line 83
at Serval.Machine.Shared.Services.PreprocessBuildJob`1.DoWorkAsync(String engineId, String buildId, IReadOnlyList`1 data, String buildOptions, CancellationToken cancellationToken) in /app/src/Machine/src/Serval.Machine.Shared/Services/PreprocessBuildJob.cs:line 44
at Serval.Machine.Shared.Services.HangfireBuildJob`2.RunAsync(String engineId, String buildId, TData data, String buildOptions, CancellationToken cancellationToken) in /app/src/Machine/src/Serval.Machine.Shared/Services/HangfireBuildJob.cs:line 56
at Serval.Machine.Shared.Services.HangfireBuildJob`2.RunAsync(String engineId, String buildId, TData data, String buildOptions, CancellationToken cancellationToken) in /app/src/Machine/src/Serval.Machine.Shared/Services/HangfireBuildJob.cs:line 120
at Serval.Machine.Shared.Services.HangfireBuildJob`2.RunAsync(String engineId, String buildId, TData data, String buildOptions, CancellationToken cancellationToken) in /app/src/Machine/src/Serval.Machine.Shared/Services/HangfireBuildJob.cs:line 124
Failed to process the job '692e9619ae2c0992fc33faad': an exception occurred.
System.InvalidOperationException: An error occurred while parsing the text 'XXB' in project ///. Verse: XXB 1:0, line: 2, character: 1, error: 'Stack empty.'
---> System.InvalidOperationException: Stack empty.
at System.Collections.Generic.Stack`1.ThrowForEmptyStack()
at System.Collections.Generic.Stack`1.Pop()
at SIL.Machine.Corpora.ScriptureRefUsfmParserHandlerBase.NextElement(String marker)
at SIL.Machine.Corpora.ScriptureRefUsfmParserHandlerBase.StartParentElement(String marker)
at SIL.Machine.Corpora.ScriptureRefUsfmParserHandlerBase.StartSidebar(UsfmParserState state, String marker, String category)
at SIL.Machine.Corpora.UsfmParser.ProcessToken()
at SIL.Machine.Corpora.UsfmTextBase.GetVersesInDocOrder()
--- End of inner exception stack trace ---
at SIL.Machine.Corpora.UsfmTextBase.GetVersesInDocOrder()
at SIL.Machine.Corpora.ScriptureText.GetRows()
at System.Linq.Enumerable.SelectManySingleSelectorIterator`2.MoveNext()
at System.Linq.Enumerable.SelectEnumerableIterator`2.MoveNext()
at SIL.Machine.Corpora.TextCorpusEnumerator.MoveNext()
at SIL.Machine.Corpora.NParallelTextCorpus.GetRows(IList`1 enumerators)+MoveNext()
at SIL.Machine.Corpora.NParallelTextCorpus.GetRows(IEnumerable`1 textIds)+MoveNext()
at SIL.Machine.Corpora.MergedTextCorpus.GetRows(IEnumerable`1 textIds)+MoveNext()
at System.Linq.Enumerable.WhereIterator[TSource](IEnumerable`1 source, Func`3 predicate)+MoveNext()
at SIL.Machine.Corpora.TextCorpusEnumerator.CollectVerses()
at SIL.Machine.Corpora.TextCorpusEnumerator.MoveNext()
at SIL.Machine.Corpora.NParallelTextCorpus.GetRows(IList`1 enumerators)+MoveNext()
at SIL.Machine.Corpora.NParallelTextCorpus.GetRows(IEnumerable`1 textIds)+MoveNext()
at SIL.Machine.Corpora.ParallelTextCorpus.GetRows(IEnumerable`1 textIds)+MoveNext()
at System.Collections.Generic.LargeArrayBuilder`1.AddRange(IEnumerable`1 items)
at System.Collections.Generic.EnumerableHelpers.ToArray[T](IEnumerable`1 source)
at SIL.ServiceToolkit.Services.ParallelCorpusPreprocessingService.PreprocessAsync(IReadOnlyList`1 corpora, Func`2 train, Func`4 inference, Boolean useKeyTerms, HashSet`1 ignoreUsfmMarkers) in /app/src/ServiceToolkit/src/SIL.ServiceToolkit/Services/ParallelCorpusPreprocessingService.cs:line 124
at Serval.Machine.Shared.Services.TranslationPreprocessBuildJob.WriteDataFilesAsync(String buildId, IReadOnlyList`1 corpora, String buildOptions, CancellationToken cancellationToken) in /app/src/Machine/src/Serval.Machine.Shared/Services/TranslationPreprocessBuildJob.cs:line 47
at Serval.Machine.Shared.Services.TranslationPreprocessBuildJob.WriteDataFilesAsync(String buildId, IReadOnlyList`1 corpora, String buildOptions, CancellationToken cancellationToken)
at Serval.Machine.Shared.Services.TranslationPreprocessBuildJob.WriteDataFilesAsync(String buildId, IReadOnlyList`1 corpora, String buildOptions, CancellationToken cancellationToken) in /app/src/Machine/src/Serval.Machine.Shared/Services/TranslationPreprocessBuildJob.cs:line 83
at Serval.Machine.Shared.Services.TranslationPreprocessBuildJob.WriteDataFilesAsync(String buildId, IReadOnlyList`1 corpora, String buildOptions, CancellationToken cancellationToken) in /app/src/Machine/src/Serval.Machine.Shared/Services/TranslationPreprocessBuildJob.cs:line 83
at Serval.Machine.Shared.Services.TranslationPreprocessBuildJob.WriteDataFilesAsync(String buildId, IReadOnlyList`1 corpora, String buildOptions, CancellationToken cancellationToken) in /app/src/Machine/src/Serval.Machine.Shared/Services/TranslationPreprocessBuildJob.cs:line 83
at Serval.Machine.Shared.Services.PreprocessBuildJob`1.DoWorkAsync(String engineId, String buildId, IReadOnlyList`1 data, String buildOptions, CancellationToken cancellationToken) in /app/src/Machine/src/Serval.Machine.Shared/Services/PreprocessBuildJob.cs:line 44