Extract substrings matching a lexical pattern.
Install and load the paclet from the Wolfram Paclet Repository:
PacletInstall[ResourceObject["FaizonZaman/LexicalCases"]]
Needs["LexicalCases`"]
Supports v14.0+
Search strings, files, or Wikipedia articles for a lexical pattern:
oosp = ExampleData[{"Text", "OriginOfSpecies"}];
oospPattern = BoundToken[WordToken[2], BoundToken["specie"|"species"]];
oospResults = LexicalCases[oosp, oospPattern]
All Text Content Types can be used; however, some take unreasonably long to expand, especially those meant to represent a hefty piece of text, such as a topic type. The basic part-of-speech types are good ones to start with:
alice = ExampleData[{"Text", "AliceInWonderland"}];
alicePattern = "Alice" ~~ TypeToken["Verb"] ~~ TypeToken["Adverb"];
aliceResults = LexicalCases[alice, alicePattern]
Use lexical patterns in StringCases, StringPosition, and StringMatchQ by wrapping the pattern with LexicalPattern.
Here's an example that creates an operator form of StringCases:
aliceOp = StringCases[LexicalPattern["Alice" ~~ TypeToken["Verb"] ~~ TypeToken["Adverb"]]];
The paclet documentation includes additional examples, or visit LexicalCases on the Wolfram Paclet Repository.