Document extracting data from text files that contain unformatted text #20

@yasonk

For the following pipeline:

- implementation: nodestream.pipeline.extractors:FileExtractor
  arguments:
    globs:
    - data/nodes.txt
- implementation: nodestream.interpreting:Interpreter
  arguments:
    interpretations:
    - type: source_node
      node_type: MyNode
      key:
        node_name: !regex
          regex: '^(?P<node_name>.*)'
          data: !jmespath 'line'
          group: node_name

Because the file name ends with .txt, each line of the file is extracted into an object that has a property named "line", which is why the !jmespath 'line' expression above resolves to the raw text.
However, this behavior is not documented.
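
To make the behavior concrete for whoever writes the docs, here is a minimal Python sketch of what the pipeline above appears to do. It is not nodestream's implementation: `iter_line_records`, `extract_node_names`, and `demo` are hypothetical helpers, and whether the trailing newline is kept in "line" is an assumption. The sketch only mirrors the observed shape of the record and applies the same regex to it.

```python
import re
from pathlib import Path
from tempfile import TemporaryDirectory

# Same pattern as in the pipeline; "." does not match a newline,
# so the captured group stops before any trailing "\n".
NODE_NAME = re.compile(r"^(?P<node_name>.*)")


def iter_line_records(path):
    """Hypothetical stand-in for the .txt handling: one record per line.

    Assumption: the raw line (including its newline) is stored under "line".
    """
    with open(path, encoding="utf-8") as handle:
        for raw in handle:
            yield {"line": raw}


def extract_node_names(path):
    """Apply the pipeline's regex to the "line" property of each record."""
    names = []
    for record in iter_line_records(path):
        match = NODE_NAME.match(record["line"])
        if match:
            names.append(match.group("node_name"))
    return names


def demo():
    """Write a small nodes.txt to a temp directory and extract its names."""
    with TemporaryDirectory() as tmp:
        path = Path(tmp) / "nodes.txt"
        path.write_text("alpha\nbeta\n", encoding="utf-8")
        return extract_node_names(path)
```

Calling `demo()` writes a two-line file and returns the captured `node_name` for each record, which corresponds to the key each source node would receive in the pipeline.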

Labels

documentation: Improvements or additions to documentation
