Can not access _ingest metadata from painless scripting #60470

@jakelandis

Each processor can declare an "if" condition, a Painless script that decides whether the processor should run. The script processor also executes a Painless script. Currently, the only information available to Painless (via ctx) is the _source and the document metadata (e.g. _index) of the ingest document, not the _ingest metadata. The _ingest metadata holds the ingestion time, the pipeline used, etc. This information is only available through Mustache templates, which cannot be called from Painless.

It is arguably a bug that templates (Mustache) and Painless scripts resolve _ingest differently. For example:

DELETE foo

PUT _ingest/pipeline/foo_pipeline
{
  "processors": [
    {
      "set": {
        "field": "x",
        "value": "{{_ingest}}"
      }
    },
    {
      "script": {
        "source": "ctx.y = ctx._ingest"
      }
    }
  ]
}

POST foo/_doc?pipeline=foo_pipeline
{
  "a" : 1,
  "_ingest" : "blah"
}

GET foo/_search

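results in the following hit, where the template resolved {{_ingest}} to the ingest metadata while the script read the document's own _ingest field: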

{
  "_index" : "foo",
  "_type" : "_doc",
  "_id" : "4Z0MKoEBL4S8u-TTDs0h",
  "_score" : 1.0,
  "_source" : {
    "a" : 1,
    "_ingest" : "blah",
    "x" : "{pipeline=foo_pipeline, timestamp=2022-06-03T14:50:41.970148170Z}",
    "y" : "blah"
  }
}

If the _ingest metadata is exposed to Painless, it should be read-only, since internal processing should be the only thing writing to it. (All values are already read-only in the "if" condition.)

Perhaps the most impactful area is the foreach processor. The foreach processor uses a special _ingest metadata entry to track which element is currently being processed. Since _ingest metadata is not exposed to Painless in either the script processor or the "if" condition, using Painless inside a foreach processor is not really possible: the script cannot read the current element from _ingest._value.

We should consider exposing the _ingest metadata to Painless for the script processor and the "if" condition. The workaround is to forgo the foreach processor and do the necessary looping and processing in a top-level (i.e. not inside a foreach) script processor. Additionally, using the set and remove processors to read data from {{_ingest._value}}, set a temporary field in the main document, and remove it at the end of the pipeline may allow for some (awkward) usage of Painless within a foreach.
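
A minimal sketch of the first workaround, assuming a hypothetical array field named tags whose elements should be upper-cased. The field name, the transformation, and the sample document are illustrative only and not part of the original report. The loop runs inside a single top-level script processor, so it never needs the per-element _ingest metadata:

```console
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "script": {
          "lang": "painless",
          "source": """
            // 'tags' is a hypothetical field; iterate it directly
            // instead of relying on a foreach processor.
            if (ctx.tags instanceof List) {
              for (int i = 0; i < ctx.tags.size(); i++) {
                ctx.tags[i] = ctx.tags[i].toString().toUpperCase();
              }
            }
          """
        }
      }
    ]
  },
  "docs": [
    {
      "_index": "foo",
      "_id": "1",
      "_source": { "tags": ["a", "b"] }
    }
  ]
}
```

The instanceof check skips documents where the field is missing or not an array, which is roughly the case a foreach processor would otherwise have to handle.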

Labels: :Core/Infra/Scripting, :Data Management/Ingest Node, >bug, Team:Core/Infra, Team:Data Management, help wanted, triaged