RegexSpanFinder
Class that should be used to find text spans using regular expressions. It runs {@link Matcher#find()} in a separate thread so that the search can be interrupted when a set timeout elapses. This guards against searches that effectively never finish, which can be caused by poorly built expressions or unexpected text contents. The timeout is specified in milliseconds and must be between 100 and 10,000. Large timeouts are inadvisable: if a large amount of text needs to be parsed, it is better to split the text logically and use smaller timeouts. The default timeout is 1000 milliseconds.
Proper usage is:
   try ( RegexSpanFinder finder = new RegexSpanFinder( "\\s+" ) ) {
      final List<Pair<Integer>> spans = finder.findSpans( "Hello World !" );
      // ... do something with the discovered spans ...
   } catch ( IllegalArgumentException iaE ) {
      // ... do something with the Exception ...
   }
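The mechanism described above (running the find in a separate thread and giving up at a timeout) can be sketched with the standard java.util.concurrent classes. This is a minimal illustration only, not the cTAKES source: the class name TimeoutFindSketch is hypothetical, spans are held as int[] pairs instead of Pair<Integer>, and returning an empty list on timeout is an assumption made for the sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; not the cTAKES RegexSpanFinder implementation.
final class TimeoutFindSketch implements AutoCloseable {

   private final ExecutorService executor = Executors.newSingleThreadExecutor();
   private final Pattern pattern;
   private final int timeoutMillis;

   TimeoutFindSketch( final String regex, final int timeoutMillis ) {
      // A malformed regex throws PatternSyntaxException, a subclass of IllegalArgumentException.
      this.pattern = Pattern.compile( regex );
      this.timeoutMillis = timeoutMillis;
   }

   // Returns {begin, end} offsets; an empty list if the search timed out or failed (sketch assumption).
   List<int[]> findSpans( final String text ) {
      final Callable<List<int[]>> task = () -> {
         final List<int[]> spans = new ArrayList<>();
         final Matcher matcher = pattern.matcher( text );
         // Checking the interrupt flag between finds lets cancel( true ) stop the loop.
         while ( !Thread.currentThread().isInterrupted() && matcher.find() ) {
            spans.add( new int[]{ matcher.start(), matcher.end() } );
         }
         return spans;
      };
      final Future<List<int[]>> future = executor.submit( task );
      try {
         // Wait for the worker thread, but no longer than the timeout.
         return future.get( timeoutMillis, TimeUnit.MILLISECONDS );
      } catch ( TimeoutException | ExecutionException e ) {
         future.cancel( true );
         return new ArrayList<>();
      } catch ( InterruptedException e ) {
         future.cancel( true );
         Thread.currentThread().interrupt();
         return new ArrayList<>();
      }
   }

   @Override
   public void close() {
      // Shut down the executor so its worker thread does not linger.
      executor.shutdownNow();
   }

   public static void main( final String[] args ) {
      try ( TimeoutFindSketch finder = new TimeoutFindSketch( "\\s+", 1000 ) ) {
         for ( int[] span : finder.findSpans( "Hello World !" ) ) {
            System.out.println( span[ 0 ] + "," + span[ 1 ] );   // prints 5,6 then 11,12
         }
      }
   }
}
```

In this sketch the interrupt flag is only checked between matches, so a single find() call that is stuck inside the regex engine would keep running after the timeout; handling that case needs extra measures, such as a CharSequence wrapper that checks for interruption in charAt.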
- Author: SPF, chip-nlp
- Version: %I%
- Since: 11/5/2016
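The advice to split large text logically and search the pieces separately can be sketched as follows. This is an illustration under stated assumptions: blank lines are taken as the logical boundaries, the class and method names are hypothetical, and a plain Matcher stands in for RegexSpanFinder so that the example is self-contained. The key step is adding each chunk's start offset back to the spans found inside it, so that the results still refer to positions in the full text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of chunked searching; not part of cTAKES.
final class ChunkedSearchSketch {

   // Assumption: blank lines mark logical boundaries in the text.
   private static final Pattern BOUNDARY = Pattern.compile( "\\n\\s*\\n" );

   // Searches each chunk on its own and returns {begin, end} offsets relative to the full text.
   static List<int[]> findSpansByChunk( final Pattern pattern, final String text ) {
      final List<int[]> spans = new ArrayList<>();
      final Matcher boundary = BOUNDARY.matcher( text );
      int chunkBegin = 0;
      boolean more = true;
      while ( more ) {
         more = boundary.find();
         final int chunkEnd = more ? boundary.start() : text.length();
         addSpans( pattern, text.substring( chunkBegin, chunkEnd ), chunkBegin, spans );
         if ( more ) {
            chunkBegin = boundary.end();
         }
      }
      return spans;
   }

   // In real use, each chunk would be handed to a finder with a small timeout.
   private static void addSpans( final Pattern pattern, final String chunk,
                                 final int offset, final List<int[]> spans ) {
      final Matcher matcher = pattern.matcher( chunk );
      while ( matcher.find() ) {
         // Shift chunk-relative offsets back into full-text coordinates.
         spans.add( new int[]{ offset + matcher.start(), offset + matcher.end() } );
      }
   }

   public static void main( final String[] args ) {
      final String text = "fever today\n\nno fever";
      for ( int[] span : findSpansByChunk( Pattern.compile( "fever" ), text ) ) {
         System.out.println( span[ 0 ] + "," + span[ 1 ] );   // prints 0,5 then 16,21
      }
   }
}
```

A match that would straddle a chunk boundary is not found by this approach, which is why the split should follow the logical structure of the text.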
Uses the default timeout of 1000 milliseconds.
Parameters:
- regex: regular expression
Exceptions:
- IllegalArgumentException: if the regular expression is null or malformed
public RegexSpanFinder( final String regex, final int flags, final int timeoutMillis ) throws IllegalArgumentException
Uses the given pattern flags and timeout instead of the default timeout of 1000 milliseconds.
Parameters:
- regex: regular expression
- flags: pattern flags; CASE_INSENSITIVE, etc.
- timeoutMillis: milliseconds at which the regex match should abort, between 100 and 10000
Exceptions:
- IllegalArgumentException: if the regular expression is null or malformed
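The flags parameter presumably takes the int constants defined on java.util.regex.Pattern, as the mention of CASE_INSENSITIVE suggests. Assuming so, several flags can be combined with a bitwise OR before being passed in. The sketch below uses only the standard Pattern class; the class name, the method name, and the example regex are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical illustration of combining pattern flags.
final class PatternFlagsSketch {

   // Flags are bit masks, so several can be combined with bitwise OR.
   static final int FLAGS = Pattern.CASE_INSENSITIVE | Pattern.MULTILINE;

   // Compiles a section-header regex that ignores case and matches at any line start.
   static Pattern headerPattern() {
      return Pattern.compile( "^impression:", FLAGS );
   }

   public static void main( final String[] args ) {
      final String text = "History: none\nIMPRESSION: stable";
      System.out.println( headerPattern().matcher( text ).find() );   // prints true
   }
}
```

The same int value could then be given as the flags argument, for example new RegexSpanFinder( "^impression:", flags, 500 ), on the assumption that the constructor forwards it to Pattern.compile.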
public RegexSpanFinder( final String regex, final int timeoutMillis ) throws IllegalArgumentException
Parameters:
- regex: regular expression
- timeoutMillis: milliseconds at which the regex match should abort, between 100 and 10000
Exceptions:
- IllegalArgumentException: if the regular expression is null or malformed
Uses the default timeout of 1000 milliseconds.
Parameters:
- pattern: Pattern compiled from a regular expression
Exceptions:
- IllegalArgumentException: if the pattern is null or malformed
public RegexSpanFinder( final Pattern pattern, final int timeoutMillis ) throws IllegalArgumentException
Uses the given timeout instead of the default timeout of 1000 milliseconds.
Parameters:
- pattern: Pattern compiled from a regular expression
- timeoutMillis: milliseconds at which the regex match should abort, between 100 and 10000
Exceptions:
- IllegalArgumentException: if the pattern is null or malformed
Parameters:
- text: text in which a find should be conducted
Returns: List of Integer Pairs representing text span begin and end offsets
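To show how begin and end offsets relate to the searched text, the sketch below reads them back with String.substring. It assumes the offsets follow the Matcher.start() and Matcher.end() convention (begin inclusive, end exclusive), which the description above does not state explicitly. A plain Matcher stands in for findSpans so that the example is self-contained, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of reading begin and end offsets back against the text.
final class SpanOffsetsSketch {

   // Returns the text covered by each match of the regex.
   static List<String> coveredText( final String regex, final String text ) {
      final List<String> covered = new ArrayList<>();
      final Matcher matcher = Pattern.compile( regex ).matcher( text );
      while ( matcher.find() ) {
         final int begin = matcher.start();   // index of the first matched character
         final int end = matcher.end();       // index just past the last matched character
         covered.add( text.substring( begin, end ) );
      }
      return covered;
   }

   public static void main( final String[] args ) {
      System.out.println( coveredText( "\\w+", "Hello World !" ) );   // prints [Hello, World]
   }
}
```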
Shuts down the executor. {@inheritDoc}
Simple Callable that runs a {@link Matcher} on text to find text span begin and end offsets
{@inheritDoc}
- Returns: text span begin and end offsets
