With the introduction of GraphQL (currently in beta) to easily explore dataset and job metadata collected by Marquez, there has been a considerable drift in the REST API spec and the schema for GraphQL. Keeping both specs aligned will be addressed in a separate proposal, but for now, we'd like to propose a simple solution for reading / writing metadata to / from the Marquez database.
What to consider in our design:
- Dataset and job metadata collected via either the Dataset and Job APIs or the Lineage API (used to collect OpenLineage events) should be stored using a common interface to avoid drift in logic
- When reading collected dataset and job metadata using either the REST API or GraphQL, a common interface should be used to avoid both drift in logic and code duplication
With a common read / write interface to access metadata, there's also the added benefit of maintainability and testability.
How will dataset and job metadata be written / read?
We propose a common DAO class MetadataDao (defined below) that would delegate writes to tables using specific underlying DAOs, but also encapsulate any pre-processing steps:
public interface MetadataDao {
  BagOfJobInfo upsertLineageEvent(LineageEvent event);
  Namespace upsertNamespaceMeta(NamespaceName namespaceName, NamespaceMeta namespaceMeta);
  Dataset upsertDatasetMeta(NamespaceName namespaceName, DatasetName datasetName, DatasetMeta datasetMeta);
  Job upsertJob(NamespaceName namespaceName, JobName jobName, JobMeta jobMeta);
  // ...
}

What are the benefits?
- Clear interface for inserting rows into the database
- Improved maintainability and testability
- Avoids duplicating row-insertion logic
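To illustrate the delegation pattern, here is a minimal sketch of one write path through such a DAO. This is not actual Marquez code: the `NamespaceDao` interface and the simplified model types below are assumed stand-ins, used only to show how pre-processing happens once in the common layer before delegating the row insertion to an underlying DAO, regardless of whether the call originated from the REST API or the Lineage API.

```java
public class MetadataDaoSketch {
  // Simplified stand-ins for Marquez model types (assumptions, for illustration only).
  record NamespaceName(String value) {}
  record NamespaceMeta(String ownerName, String description) {}
  record Namespace(String name, String ownerName) {}

  // Hypothetical underlying DAO that performs the actual row upsert.
  interface NamespaceDao {
    Namespace upsert(String name, String ownerName);
  }

  // The common write path: validate and pre-process once, then delegate.
  static Namespace upsertNamespaceMeta(
      NamespaceDao namespaceDao, NamespaceName name, NamespaceMeta meta) {
    if (name.value().isBlank()) {
      throw new IllegalArgumentException("namespace name must not be blank");
    }
    // Example pre-processing step (normalization) shared by all callers.
    String normalized = name.value().trim().toLowerCase();
    return namespaceDao.upsert(normalized, meta.ownerName());
  }

  public static void main(String[] args) {
    NamespaceDao dao = (n, o) -> new Namespace(n, o);
    Namespace ns = upsertNamespaceMeta(
        dao, new NamespaceName("  MyTeam "), new NamespaceMeta("data-eng", "team namespace"));
    System.out.println(ns.name() + " " + ns.ownerName());
  }
}
```

Because the underlying DAO is an interface, tests can substitute an in-memory implementation, which is one way the common layer improves testability.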