Where are the attribute_ruler patterns defined for each language? #8508

bryant1410 · 2021-06-26T06:16:04Z

The different language models have an attribute_ruler, which is populated from a binary patterns file (serialized patterns). This makes it possible, for example, to populate pos based on tag. I was wondering where the source file for this patterns file is.

Answered by adrianeboyd

adrianeboyd · 2021-06-26T10:48:51Z

Right now the pretrained pipelines are the main reference. You can see the patterns using:

nlp.get_pipe("attribute_ruler").patterns
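A minimal sketch of what this returns. It assumes a blank English pipeline with one hand-written, purely illustrative rule, so that it runs without a downloaded model; with a pretrained pipeline you would load that pipeline instead and skip the ruler.add call.

```python
import spacy

# A blank pipeline stands in for a pretrained one here (an assumption made so
# the example runs without downloading a model). With a trained pipeline
# installed, you would load it with spacy.load(...) instead.
nlp = spacy.blank("en")
ruler = nlp.add_pipe("attribute_ruler")

# Illustrative rule: tokens tagged "NN" get the coarse-grained POS "NOUN".
ruler.add(patterns=[[{"TAG": "NN"}]], attrs={"POS": "NOUN"})

patterns = nlp.get_pipe("attribute_ruler").patterns
for entry in patterns:
    # Each entry is a dict with "patterns", "attrs" and "index" keys.
    print(entry["patterns"], entry["attrs"], entry["index"])
```

Each entry pairs one or more Matcher-style token patterns with the attributes to set on the matched token at the given index.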

The patterns file is just this same data in msgpack format, which you can read directly with srsly.read_msgpack (or other msgpack libraries) if you'd prefer.
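As a rough illustration of reading that file, the sketch below writes a small attribute ruler to a temporary directory and reads the patterns back with srsly.read_msgpack. The rule itself is made up, and the file name "patterns" inside the component directory is an assumption based on how the component serializes itself; in an installed pipeline package you would point at the corresponding file in its attribute_ruler directory.

```python
import tempfile
from pathlib import Path

import spacy
import srsly

# Build a tiny attribute ruler with one illustrative rule.
nlp = spacy.blank("en")
ruler = nlp.add_pipe("attribute_ruler")
ruler.add(patterns=[[{"TAG": "NN"}]], attrs={"POS": "NOUN"})

with tempfile.TemporaryDirectory() as tmp_dir:
    component_dir = Path(tmp_dir) / "attribute_ruler"
    # Serialize the component the same way a saved pipeline would.
    ruler.to_disk(component_dir)
    # Assumption: the serialized patterns live in a file named "patterns".
    loaded = srsly.read_msgpack(component_dir / "patterns")

# The msgpack content has the same shape as ruler.patterns.
print(loaded)
```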

You can also always source an existing attribute ruler into a new pipeline and all the patterns will be copied with it.
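Sourcing can be sketched as follows. Both pipelines here are blank and the rule is illustrative; in practice source_nlp would be a loaded pretrained pipeline. Depending on the spaCy version, a warning about differing vectors may be emitted when sourcing between pipelines.

```python
import spacy

# Stand-in for a pretrained pipeline that already has an attribute ruler.
source_nlp = spacy.blank("en")
source_ruler = source_nlp.add_pipe("attribute_ruler")
source_ruler.add(patterns=[[{"TAG": "NN"}]], attrs={"POS": "NOUN"})

# New pipeline that reuses the existing component, patterns included.
new_nlp = spacy.blank("en")
new_nlp.add_pipe("attribute_ruler", source=source_nlp)

copied = new_nlp.get_pipe("attribute_ruler").patterns
print(copied)
```

In a training config, the equivalent is a components block for the attribute ruler that sets source to the name of the pipeline to copy from.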

We've considered having a separate repo where some of the pipeline-specific settings like this are stored, so it would be easy to refer to them in new configs, but this doesn't exist yet.

bryant1410 (Author) · 2021-06-26T19:22:45Z

Thanks for the answer! Yeah, it'd be cool if the pipeline-specific settings were open source, in case people want to contribute to them.

This question came up purely out of curiosity, as I was going through the code, trying to understand its different parts.

Relatedly, are the commands used to create the different language models publicly available somewhere? I mean the training step, where parts are sourced from other pipelines, as well as packaging, etc.
