Incremental ETL not obeyed #50

@andyadamides

Steps taken

  1. Start from this state.json:
{
    "bookmarks": {
        "gl_accounts": {
            "ASSET": "2020-01-01T17:33:32.000000Z",
            "LIABILITY": "2020-01-01T12:31:58.000000Z",
            "EQUITY": "2020-01-01T00:00:00Z",
            "INCOME": "2020-01-01T13:53:29.000000Z",
            "EXPENSE": "2020-01-01T13:53:13.000000Z"
        },
        "gl_journal_entries": "2020-06-09T00:00:00.000000Z",
        "deposit_transactions": "2020-06-03T00:00:05.000000Z"
    }
}
  2. Run the tap with that state:
tap-mambu --config tap_config.json --catalog catalog.json --state state.json | target-jsonl > state_new.json
  3. This produces my streams as JSON, as expected.
  4. It also produces a new state file, state_new.json:
{
    "bookmarks": {
        "gl_accounts": {
            "ASSET": "2021-06-17T13:23:02.000000Z",
            "LIABILITY": "2021-06-17T13:23:18.000000Z",
            "EQUITY": "2020-01-01T00:00:00Z",
            "INCOME": "2021-05-04T13:53:29.000000Z",
            "EXPENSE": "2021-05-04T13:53:13.000000Z"
        },
        "gl_journal_entries": "2021-07-01T00:00:03.000000Z",
        "deposit_transactions": "2021-07-01T00:00:03.000000Z"
    }
}
  5. Rerun the command immediately afterwards, this time with the new state file:
tap-mambu --config tap_config.json --catalog catalog.json --state state_new.json | target-jsonl > state_new_2.json

This run, and every later run that uses the "new" state, brings back the same records all over again, which results in a massive set of duplicate records.

Can you suggest what can fix the above?
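
One way to narrow this down is to check whether the bookmarks really moved between runs before looking at the tap itself. The following is a minimal sketch, not part of tap-mambu: the helper names (`load_state`, `flatten`, `compare_states`) are hypothetical, and it assumes the state file is either a single JSON document, as shown above, or one JSON state per line with the most recent state last.

```python
import json
import sys
from datetime import datetime


def parse_ts(value):
    # The bookmarks above use both "...T00:00:00Z" and "...T17:33:32.000000Z".
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def load_state(path):
    """Read a state file written by redirecting the target's stdout."""
    with open(path, encoding="utf-8") as handle:
        text = handle.read().strip()
    if not text:
        raise ValueError(f"{path} is empty, so it carries no bookmarks")
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Assumption: one JSON state per line, most recent state last.
        return json.loads(text.splitlines()[-1])


def flatten(bookmarks, prefix=""):
    """Turn nested bookmarks into flat keys such as 'gl_accounts.ASSET'."""
    flat = {}
    for key, value in bookmarks.items():
        name = prefix + key
        if isinstance(value, dict):
            flat.update(flatten(value, name + "."))
        else:
            flat[name] = value
    return flat


def compare_states(old_state, new_state):
    """Label each bookmark as advanced, unchanged, moved back, added or missing."""
    old = flatten(old_state.get("bookmarks", {}))
    new = flatten(new_state.get("bookmarks", {}))
    report = {}
    for name in sorted(set(old) | set(new)):
        if name not in old:
            report[name] = "added"
        elif name not in new:
            report[name] = "missing"
        elif parse_ts(new[name]) > parse_ts(old[name]):
            report[name] = "advanced"
        elif parse_ts(new[name]) == parse_ts(old[name]):
            report[name] = "unchanged"
        else:
            report[name] = "moved back"
    return report


if __name__ == "__main__" and len(sys.argv) == 3:
    # Example: python compare_states.py state.json state_new.json
    result = compare_states(load_state(sys.argv[1]), load_state(sys.argv[2]))
    for name, status in result.items():
        print(f"{name}: {status}")
```

Applied to the two states listed in the steps above, every bookmark comes out as advanced except `gl_accounts.EQUITY`, which is unchanged. If the same comparison between state_new.json and state_new_2.json shows no movement while records are still being returned, that would suggest the duplicates come from how the bookmarks are applied rather than from how the state file is written, though this sketch cannot confirm that on its own.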
