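The noun chunks functionality depends on the UPOS tags (token.pos), not the fine-grained tags (token.tag). In the pretrained English pipelines, the AttributeRuler is the component that maps the Tagger's fine-grained tags to UPOS tags, so you need to include it to get them. You can do that by sourcing the AttributeRuler from en_core_web_sm like you did the Tagger. Note that it doesn't need replace_listeners, since it is rule-based and has no tok2vec listener to replace.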
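As a rough sketch, the relevant component blocks might look like the fragment below. The tagger block, including its replace_listeners value, is an assumption about how the Tagger was sourced in the question; only the attribute_ruler block reflects the change described here.

```ini
# Assumed: the tagger was sourced like this in the original config.
[components.tagger]
source = "en_core_web_sm"
replace_listeners = ["model.tok2vec"]

# Sourcing the AttributeRuler; no replace_listeners entry is needed.
[components.attribute_ruler]
source = "en_core_web_sm"
```

The attribute_ruler also has to appear in the pipeline list under [nlp], after the tagger, so that it can read the fine-grained tags the tagger sets.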

For example, this config, which just sources components from the pretrained pipeline, works:

[paths]
train = null
dev = null
vectors = null
init_tok2vec = null
raw_text = null

[system]
gpu_allocator = null
seed = 0

[nlp]
lang = "en"
pipeline = ["tagger", "parser", "attribute_ruler"]
batch_size = 1000
disabled = []
before_creation = null
after_creation = null
after_pipeline_creation = null
tokenizer…

Answer selected by polm
Labels
feat / pipeline Feature: Processing pipeline and components
2 participants