Can you give us some more information about your patterns? The main thing that would affect size is the raw number of patterns and the complexity of each pattern.

For a large number of patterns, large memory use is kind of unavoidable, but 3GB does sound like a whole lot. The Matcher does use a trie, but tries are usually more memory-efficient than raw lists. On the other hand, we don't regularly test with anywhere near as many patterns as you're using.

It depends on your patterns, but if you have a bunch of patterns matching on literal terms, you might be able to reduce memory usage by compiling them to regex matches. For example, if you have matches for a single token like a, aa, aaa, and …
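A minimal sketch of that idea, assuming the patterns are single-token exact-text matches (the terms and rule names below are placeholders for illustration, not from the original thread): instead of adding one rule per literal string, equivalent strings can be collapsed into a single token pattern using the Matcher's REGEX operator.

```python
import spacy
from spacy.matcher import Matcher

nlp = spacy.blank("en")

# Memory-heavy shape: one rule per literal term, so the Matcher
# stores a separate pattern for every string.
# (Placeholder terms; a real list would be much longer.)
literal_terms = ["a", "aa", "aaa", "aaaa"]
matcher_literal = Matcher(nlp.vocab)
for term in literal_terms:
    matcher_literal.add(f"LIT_{term}", [[{"TEXT": term}]])

# Equivalent single pattern: the literals are collapsed into one
# regex, so only one pattern needs to be stored.
matcher_regex = Matcher(nlp.vocab)
matcher_regex.add("A_RUN", [[{"TEXT": {"REGEX": r"^a{1,4}$"}}]])

doc = nlp("a aa aaa aaaa b")
print([doc[start:end].text for _, start, end in matcher_regex(doc)])
# ['a', 'aa', 'aaa', 'aaaa']
```

Note that spaCy's REGEX operator uses a search over the token text, so the pattern is anchored with ^ and $ here to keep exact-token semantics. How much memory this actually saves depends on how many literals collapse into each regex, so it's worth measuring on the real pattern set.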

Answer selected by Pandalei97