Collaborator

@nikgraf nikgraf commented Dec 9, 2024

EDIT: Added a schema-v2.md

This is a first proposal of what a local schema definition including a mapping structure could look like. I guess this will require a couple of iterations to get right, but it would be great to have a lot of input before we even build the first version that gets the relations right. We have a prototype of the initial schema exploration in the code-base, but it lacks proper handling of relations.

I think we need to design all three parts (schema definition, mappings and create/query API) together, since a change in one area can have quite an impact on the others.
And please don't hesitate to propose radically different ideas.

Note: I haven't included an option where schema and mappings would be located in the same structure, mostly because I couldn't come up with a mental model of what this would look like for the different sync use-cases.

Todo

  • Add API for syncing public data (download and publish)
  • Add API for mapping relation attributes

@nikgraf nikgraf force-pushed the schema-and-mapping-design branch from 9467e03 to 4155681 Compare December 9, 2024 17:15
@nikgraf nikgraf changed the title add schema and mapping file doc add schema definition, mapping file and api proposal Dec 9, 2024
docs/schema.md Outdated
## Rules

- Each entity can have multiple types.
- All fields can be undefined
Contributor

I'd prefer user-defined strictness configuration that gets read by the codegen queries. e.g., if you want all fields to be required, the codegen can read that configuration and filter/validate results from the query to only return entities where all the fields exist. We could start with full optionality but eventually we want better strictness or the framework will be annoying to use.
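
To make the strictness idea concrete, here is a minimal sketch of a filter that generated query code could apply. `Strictness`, `RawEntity` and `applyStrictness` are hypothetical names for illustration, not part of the proposal or of any existing codegen.

```typescript
// Hypothetical shapes; none of these names come from the proposal.
type Strictness = 'optional' | 'required';

type RawEntity = { id: string } & Record<string, unknown>;

// With 'required', only entities where every requested field exists are returned.
function applyStrictness(
  entities: RawEntity[],
  fields: string[],
  strictness: Strictness,
): RawEntity[] {
  if (strictness === 'optional') return entities;
  return entities.filter((entity) =>
    fields.every((field) => entity[field] !== undefined && entity[field] !== null),
  );
}

const events: RawEntity[] = [
  { id: '1', name: 'Meetup', date: '2024-12-09' },
  { id: '2', name: 'Workshop' }, // `date` is missing
];

// Only the entity with both `name` and `date` survives the strict filter.
const strictEvents = applyStrictness(events, ['name', 'date'], 'required');
```

Starting with `'optional'` as the default would match the "full optionality first" suggestion while leaving room to tighten it later.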

Collaborator Author

We definitely can make it strict like that and I actually would prefer it like that. I think the ergonomics are much better.

We were going back and forth on this, but I think this might be the best option. What do you think @yanivtal?

Collaborator Author

One interesting implication regarding queries: if we require you to define the fields that you request, do we check only for those to be present, or for all the fields of a type?

Contributor

@baiirun baiirun Dec 12, 2024

I'd expect that the codegen'd queries only query for the fields defined by the mapping. We will likely have a GraphQL API in front of the main data service, which should make that simpler.

Collaborator Author

@nikgraf nikgraf Dec 13, 2024

Yeah, that's true for remote objects for sure, but here I was thinking only about the local objects.


### Design A

An object-based schema definition where relations must be defined on the entity type. This design assumes that the local schema doesn't deal with entities having multiple types. That is handled only in the mappings file, and therefore the local schema stays a lot simpler.
Contributor

Definitely prefer Design A over the others.

I also prefer A! This maps very closely with how relations are represented in the Neo4j implementation

Comment on lines +124 to +138
    types: {
      User: {
        name: type.Text,
        email: type.Text,
      },
      Event: {
        name: type.Text,
      },
    },
    relations: {
      ownedEvents: {
        from: 'User',
        to: 'Event',
      }
    },
Contributor

@baiirun baiirun Dec 9, 2024

IMO this is just extra steps to do the same thing as Design A. I think developers should think of relations as the same "class" of data as the triple fields and separating them like this just creates an extra level of indirection.

Comment on lines +171 to +191
@Entity('Person')
class Person {
  @Attribute()
  name: string;

  @Attribute()
  age: number;

  @Relationship('worksAt')
  employers: Organization[];
}

@Entity('Organization')
class Organization {
  @Attribute()
  name: string;

  @Relationship('worksAt')
  employees: Person[];
}
```
Contributor

Not a fan of this either haha

Can I ask someone to explain why this wouldn't be preferred? Just for my own understanding as I'm not familiar enough to understand what the downsides would be. Cheers.

Contributor

@baiirun baiirun Dec 13, 2024

Decorators aren't really idiomatic in JavaScript-land. They exist and some applications use them, but they're not widespread. A more idiomatic way to apply decorator-style behavior would be with higher-order functions, but that's also not super common.

More common are typed module exports that get read by some consumer, which is what A and B use. Not sure why this became common, but probably because compile-time tooling is better around modules than it is for decorators, and decorators are a somewhat recent addition to the ecosystem (last few years).
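
For comparison, a rough sketch of the higher-order-function style mentioned above. `withEntityType` is a made-up helper for illustration only; it is not part of the proposal.

```typescript
// Hypothetical helper: wraps a factory and tags its result with a type name,
// roughly what an @Entity('Person') decorator would express.
function withEntityType<Args extends unknown[], Result extends object>(
  typeName: string,
  create: (...args: Args) => Result,
): (...args: Args) => Result & { __type: string } {
  return (...args: Args) => ({ ...create(...args), __type: typeName });
}

// The wrapped factory keeps its original parameter types.
const createPerson = withEntityType('Person', (name: string, age: number) => ({ name, age }));

const person = createPerson('Ada', 36);
```

No class or decorator syntax is needed here, and the compiler still infers the argument and result types from the plain function.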

Right got it. Nice one thanks

Comment on lines +296 to +298
How to handle mappings which are conflicting? e.g.
- attribute `name` on `Person` is a string (id: `asd`) and
- attribute `name` on `User` is a number (id: `lal`).
Contributor

@baiirun baiirun Dec 9, 2024

Can you think of a situation where a developer might choose to do this in their schema? I can't think of anything off the top of my head. One way this can happen is if one of the id value types changes. I think Yaniv is proposing to not allow that in the GRC-20 spec though.

Unless I'm missing something I think to start we shouldn't allow people to define multiple ids for the same alias. e.g., you can't have a name field that points to two different attributes. This would remove all the complexity of merging as well as type coercion. If developers eventually find a good reason to do this then we can revisit in the future.

Collaborator Author

@nikgraf nikgraf Dec 9, 2024

This one is very interesting and could possibly simplify the whole mapping. I wonder if we can even prevent the case?

In a space I can have multiple attributes with the same name, but different value type.

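Attributes

| ID  | Type ID                         | Name |
| --- | ------------------------------- | ---- |
| abc | LBdMpTNyycNffsF51t2eSp (number) | temp |
| cde | LckSTmjBrYAJaFcDs89am5 (text)   | temp |

Entity Type: Sensor with ID xyz

| ID  | Attribute ID |
| --- | ------------ |
| xyz | abc          |

Entity Type: Location with ID wxz

| ID  | Attribute ID |
| --- | ------------ |
| wxz | cde          |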

Now if you have an entity which references both types Sensor (xyz) and Location (wxz) they are conflicting.

We could disallow it for one space, but what if you include/reference other spaces? Can we avoid such conflicts?

Contributor

@baiirun baiirun Dec 9, 2024

So the Sensor and Location name attributes would only be conflicting if there is an entity that is both a Sensor and a Location? Shouldn't a developer choose which name attribute to use? I can't imagine they'd want to merge them if they have different attributes and value types. I think we should only merge if they're the same attribute ID, and the developer should potentially have to specify it.

Collaborator Author

@nikgraf nikgraf Dec 9, 2024

Exactly, I think we should not allow merging locally if they don't have the same type. With this example, though, they could map to two different aliases:

    attributes: {
      tempText: [{
        spaceId: 'uio',
        typeId: 'xyz', // public schema type: `Sensor`
        attributeId: 'abc',
      }],
      temp: [{
        spaceId: 'uio',
        typeId: 'wxz', // public schema type: `Location`
        attributeId: 'cde',
      }],
    }

Is this a good idea? Is it a bad idea? I mean I'm all in for making it even simpler.

Contributor

I think we should design constraints in the developer API to make sure developers fall into the pit of success. IMO declaring an alias to read from potentially two different sources is something we should avoid allowing users to do, even accidentally. And if they really want to do it for some reason then creating a second alias like in your previous example should work.

I tend to agree. If @yanivtal is equally planning to disallow this kind of thing in the spec I'd agree it would improve the DX as well.

When the developer goes to publish, conflicts can throw an error to that effect. That reduces complexity, improves DX and avoids merge issues.


## Mappings

- Mappings are optional. Context: So developers can start with a local schema and only create the mappings when they feel confident about the schema and publish it.
Member

also: some fields in a schema may always be private-only, so would never map to a public knowledge graph entity

Collaborator Author

@nikgraf nikgraf Dec 10, 2024

Agreed! What do you think? Should we have an entry, e.g. `private`, that can be added to the mappings to ensure a field is never published, or should we just omit it?

My current assumption is that without such an entry every publishing process would propose again to publish the field.

Contributor

Do we have an example of a private field? Just curious.

Collaborator Author

For the ReactVienna meetup for example:

  • On every event it would be great to have a private note for organizers to write down specifics for the event, e.g. when the food should be ordered.
  • In addition, we replicate every event + speakers for planning in a private repository so we can write down who had contact with the speaker in which channel (link/email), so others can contact them in case the person is not available.

I'm not understanding the examples I guess, so I might be off here. But I'll chime in anyway, feel free to disregard if it's irrelevant or off-track.

So from my perspective I wonder if this breaches the scope of the schema and should be solved appropriately at the app-level?

I can imagine certain properties being useful as local-only, but generally I'd also not want those values persisted at all.

In SwiftData for example we have @Transient which is an attribute we can prepend to a property to tell SwiftData to ignore it. We can then assign an in-memory value or make it computed which is useful in many cases.

If I do also want to cache that value for some reason, I would use an external mechanism. For example we have preference storage, etc.


- Does every mapping include the space ID as well or only the type ID, attribute ID and relation ID? This depends on whether we want to sync entities and relations from only one or multiple spaces.
- Can a local schema type/field map to multiple public schema types/fields? e.g. local `User` maps to the public schema types `User` and `Person`. Therefore the email field should be mapped to the attribute `email` of the `User` type and the attribute `email` of the `Person` type, which both might have the same ID, but can be two different fields as well.
- What to do in case there are conflicting attributes, e.g. `Location` and `Sensor` both have a `temperature` field and one is a number and the other a string? Do we merge them, or how do we map them to different fields? The most reasonable solution I can think of is to map them manually in the mappings file.

Assuming that it is only the name of the attributes that is the same for both but their IDs are different, then they are considered different attributes and both could be aliased in the mappings to differentiable names.

In the case where it is the same attribute (the attributes have the same ID) then they should have the same value type and instances which implement both types would only have one value for that attribute.

Collaborator Author

Spot on! I think there are 3 cases to cover:

  • same attribute -> identical id,name and value type
    • map it to the same alias
  • different attribute, but same name and value type -> different id, identical name
    • map it to the same alias probably still makes sense, but there might be cases where you want to have them separate
  • different attribute with different value type, but identical name -> different id and value type, but same name
    • here we should disallow mapping it to the same alias, but allow to map it to different aliases.

The current model supports all 3 cases. That said I wonder if there are simpler ways of structuring it or if I missed something.
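
The three cases above can be sketched as a small decision function. This is only an illustration under the assumption that both attributes already share a name; `PublicAttribute`, `AliasRule` and `aliasRule` are hypothetical names.

```typescript
type ValueType = 'text' | 'number';

interface PublicAttribute {
  id: string;
  name: string;
  valueType: ValueType;
}

// 'must-share': same attribute, one alias.
// 'may-share':  different attributes with the same value type; sharing is allowed but optional.
// 'must-split': different value types; only separate aliases are allowed.
type AliasRule = 'must-share' | 'may-share' | 'must-split';

// Assumes `a.name === b.name`; only ID and value type are compared.
function aliasRule(a: PublicAttribute, b: PublicAttribute): AliasRule {
  if (a.id === b.id) return 'must-share';
  if (a.valueType === b.valueType) return 'may-share';
  return 'must-split';
}

// The Sensor/Location `temp` attributes from the earlier example in this thread.
const sensorTemp: PublicAttribute = { id: 'abc', name: 'temp', valueType: 'number' };
const locationTemp: PublicAttribute = { id: 'cde', name: 'temp', valueType: 'text' };

const rule = aliasRule(sensorTemp, locationTemp);
```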


## Open Questions

- Does every mapping include the space ID as well or only the type ID, attribute ID and relation ID? This depends on whether we want to sync entities and relations from only one or multiple spaces.

You would have to include the space id since the same entity can exist in different spaces and have different attributes/values. Moreover, the API for querying the knowledge graph would require the client to specify the space ID since not all indexers will index all spaces and we need to be able to route the query to an indexer which indexes the space.

You could even add an optional version ID in cases where a user wants to import past (or future) entities, although the implementation details will have to wait until we figure out versioning on our end.

Collaborator Author

@nikgraf nikgraf Dec 11, 2024

Thanks for the clarification 👍 Then we go with including the space ID every time in the mapping. Will update the spec.

Contributor

@baiirun baiirun Dec 12, 2024

One addendum here is that this gets tricky when it comes to querying multi-space entities and any relations coming from an entity.

By default a query for an entity is expected to be scoped to a set of spaces (either one space or many). However it will be common for entities to have relations pointing to entities that only exist in other spaces. e.g., You query for Entity A in Space A, and Entity A has Relation A that points to Entity B. Entity B only has data in Space B. So if you queried for Entity B scoped to Space A you would receive an empty entity.
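
A toy in-memory version of the scenario above, to show where the empty entity comes from. The `Triple` shape, the sample data and `queryEntity` are assumptions for illustration and do not reflect the real data service.

```typescript
type Triple = { spaceId: string; entityId: string; attributeId: string; value: string };

// Entity A only has data in Space A, Entity B only in Space B.
const triples: Triple[] = [
  { spaceId: 'space-a', entityId: 'entity-a', attributeId: 'name', value: 'Entity A' },
  { spaceId: 'space-b', entityId: 'entity-b', attributeId: 'name', value: 'Entity B' },
];

// Returns the attribute values of an entity, restricted to the given spaces.
function queryEntity(entityId: string, spaceIds: string[]): Record<string, string> {
  const result: Record<string, string> = {};
  for (const triple of triples) {
    const inScope = spaceIds.some((spaceId) => spaceId === triple.spaceId);
    if (triple.entityId === entityId && inScope) {
      result[triple.attributeId] = triple.value;
    }
  }
  return result;
}

// Following a relation from Entity A to Entity B while staying scoped to Space A
// yields an entity without any attributes.
const scopedToA = queryEntity('entity-b', ['space-a']);
const scopedToBoth = queryEntity('entity-b', ['space-a', 'space-b']);
```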

Contributor

@baiirun baiirun Dec 12, 2024

One other interesting use case is that you might want to query specific attributes on an entity from specific spaces. e.g., I want Attribute A, B, and C from Space A, but Attribute D from Space D.

One example is a labeler. Maybe there's a space that labels all entities in the knowledge graph based on some dimension. All it does is "tag" entities. You might want to query Entity A's attribute schemas in Space A, but also query Space D for what the labeler says about the entity.

The simplest thing to start is probably assuming you can only query one space at a time though.


### Design A

An object-based schema definition where relations must be defined on the entity type. This design assumes that the local schema doesn't deal with entities having multiple types. That is handled only in the mappings file, and therefore the local schema stays a lot simpler.

In this design we can map a local entity to multiple public schema types. We only define the attributes that are mapped to the public schema.

We indicate which local type maps to which public schema type with the `types` field.
In addition we define the attributes that are mapped to the public schema. Each attribute maps to a space ID, a type ID and an attribute ID to clearly identify the attribute.

I think it is redundant to include the type ID when defining mappings for attributes, since attributes exist in the knowledge graph independently of types. E.g.: I can create an attribute "foo" that has a specific meaning and value type without necessarily associating it with types in the property graph.

Collaborator Author

@nikgraf nikgraf Dec 11, 2024

Interesting … does it make sense to have attributes that aren't connected to a specific type? Another question: when I create an actual attribute triple with a value, can I connect it to multiple entity IDs, or can an attribute triple only be connected to one entity? /cc @baiirun

Contributor

@baiirun baiirun Dec 12, 2024

> does it make sense to have attributes that aren't connected to a specific type?

A major reason we'd want this behavior is the case where a type's schema changes over time. Existing applications that depend on elements of the changed schema need to be resilient to those changes. e.g., if Person has Birthdate in its schema, any application might expect Birthdate to exist. Later on Birthdate is removed from the schema for Person. Any applications consuming Birthdate should continue to work.

This is possible if we decouple the local schema from the public types themselves, and use the public types schema more as hints as to what could exist on a type in the public graph.

The schema for types in the public knowledge graph can be thought of as "consensus" over a type and its schema. Not every developer will agree on this consensus, but still might want to consume the type from the space. Since they can define their own local schema as an extension of the public one this is possible.

> Also another question is when I create an actual attribute triple with a value, can I connect it to multiple entity IDs or can an attribute triple only be connected to one entity

An individual triple can only be linked to one entity ID at a time. The uniqueness of a triple is the tuple of its space id, entity id, and attribute id.
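
A minimal sketch of that uniqueness rule, assuming a plain in-memory store. `tripleKey` and `upsertTriple` are hypothetical helpers, and the `/` separator assumes IDs never contain a slash.

```typescript
type Triple = {
  spaceId: string;
  entityId: string;
  attributeId: string;
  value: string | number;
};

// The identity of a triple is (space ID, entity ID, attribute ID); the value is not part of it.
function tripleKey(triple: Pick<Triple, 'spaceId' | 'entityId' | 'attributeId'>): string {
  return [triple.spaceId, triple.entityId, triple.attributeId].join('/');
}

// Writing a triple with an existing key replaces the previous value.
function upsertTriple(store: Record<string, Triple>, triple: Triple): void {
  store[tripleKey(triple)] = triple;
}

const store: Record<string, Triple> = {};
upsertTriple(store, { spaceId: 'space-a', entityId: 'person-1', attributeId: 'birthdate', value: '1990-01-01' });
// Same space, entity and attribute: replaces the value above.
upsertTriple(store, { spaceId: 'space-a', entityId: 'person-1', attributeId: 'birthdate', value: '1990-01-02' });
// Same entity and attribute in another space: a separate triple.
upsertTriple(store, { spaceId: 'space-b', entityId: 'person-1', attributeId: 'birthdate', value: '1990-01-03' });
```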

In addition we define the attributes that are mapped to the public schema. Each attribute maps to a space ID, a type ID and an attribute ID to clearly identify the attribute.

```ts
const mappings = {
Collaborator

I'm not sure I understand the need for all this information. In my mind this would be easier to understand if it was just a flat mapping. For types it would be a mapping from Type name to Type entity ID, and for attributes it would be a mapping from Type name[.]Attribute name to Attribute ID. Could we just put that in a flat object and call that the mapping?

I suppose that assumes that the schema we define is for a single local space. Is that correct and/or do we need the space ID?
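
For reference, the flat mapping described here could look roughly like the following. This is a sketch for a single space without space IDs; the IDs are placeholders and `resolveId` is a hypothetical helper.

```typescript
// Flat mapping sketch: "Type" -> type entity ID, "Type.attribute" -> attribute ID.
// All IDs are placeholders.
const flatMapping: Record<string, string> = {
  Location: 'xyz',
  'Location.temperature': 'gfd',
  Sensor: 'wzx',
  'Sensor.temperature': 'asd',
};

// Looks up a type ID, or an attribute ID when an attribute name is given.
function resolveId(typeName: string, attributeName?: string): string | undefined {
  const key = attributeName === undefined ? typeName : `${typeName}.${attributeName}`;
  return flatMapping[key];
}

const sensorTypeId = resolveId('Sensor');
const sensorTemperatureId = resolveId('Sensor', 'temperature');
```

A flat object like this is easy to read, but it only allows one public ID per local name, so one local type cannot map to several public types.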

Collaborator Author

@nikgraf nikgraf Dec 13, 2024

good questions

> I suppose that assumes that the schema we define is for a single local space. Is that correct and/or do we need the space ID?

After thinking about it, I agree that the spaceID should not be in there. The more I think about it, the more I'm getting confused about the mapping between types, spaces and an app.

There is the use-case where you build one app for one public space. There can be one or multiple local private spaces. They all use the same schema defined in the public space.

If you want to build an app where different teams can sign up and each of them has their own public space with its own schema, then a hard-coded mapping per space makes little sense. There could be a "base space" the app is using for the types, but then the actual data is published to the respective public space of a team.

Curious about how you think this should work?

Note: here are a couple examples of querying multi-space entities: #71 (comment)

> I'm not sure I understand the need for all this information. In my mind this would be easier to understand if it was just a flat mapping. For types it would be a mapping from Type name to Type entity ID, and for attributes it would be a mapping from Type name[.]Attribute name to Attribute ID. Could we just put that in a flat object and call that the mapping?

In the past couple of weeks I heard more often that we want to have a strict type system for the app framework. Based on that, the idea here was that we abstract the type conflicts into the mapping so that local development can be smooth. In this case you also wouldn't need to deal locally with entities having multiple types. One local type would map to one or multiple public types via the mapping.

What do I mean by type conflicts? E.g. you have the types

  • Sensor with temperature attribute as a number &
  • Location with temperature attribute as a string

let's create an example with a one to one mapping:

const mappings = {
  Location: {
    typeID: 'xyz',
    attributes: {
      temperature: 'gfd', // attribute `temperature: string` on the public schema with the ID `gfd`
    },
  },
  Sensor: {
    typeID: 'wzx',
    attributes: {
      temperature: 'asd', // attribute `temperature: number` on the public schema with the ID `asd`
    },
  },
};

If we now want to create an entity that is of type Location & Sensor, we can't create a type-safe API. If we provide a number, the type is wrong for Location, and if we provide a string, it's wrong for Sensor.
In the current design we could resolve this by mapping the conflicting attributes to two different fields.
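
A sketch of what "two different fields" could mean on the local side. The local type `LocationSensor`, the field name `temperatureText` and the helper `toPublicValues` are made up for illustration; the IDs are the placeholder IDs from the example above.

```typescript
// Hypothetical local type that combines the public types Location and Sensor.
type LocationSensor = {
  temperatureText: string; // Location's `temperature: string`, public ID `gfd`
  temperature: number; // Sensor's `temperature: number`, public ID `asd`
};

const locationSensorMapping: {
  typeIds: string[];
  attributes: Record<keyof LocationSensor, string>;
} = {
  typeIds: ['xyz', 'wzx'],
  attributes: { temperatureText: 'gfd', temperature: 'asd' },
};

// Translates local field names into public attribute IDs.
function toPublicValues(entity: LocationSensor): Record<string, string | number> {
  const result: Record<string, string | number> = {};
  const fields = Object.keys(locationSensorMapping.attributes) as (keyof LocationSensor)[];
  for (const field of fields) {
    result[locationSensorMapping.attributes[field]] = entity[field];
  }
  return result;
}

const publicValues = toPublicValues({ temperatureText: 'warm', temperature: 21 });
```

Because each conflicting attribute gets its own local field, both values keep their own type and the compiler can check them independently.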

```tsx
import { GraphProvider } from '@graph-protocol/graph-framework';

<GraphProvider defaultSpaceId={'abc'} spaces={['cde', 'fgh']}>
Collaborator Author

@baiirun @yanivtal This would be one way of defining which spaces should be synced in order to query all the necessary data to be shown in an application. That said, I wonder where this data might come from in an application. It could be stored locally, but if I log in on another device this must be synced. The naive approach would be to sync all, but a user could have hundreds of private spaces across hundreds of apps and we probably only want the relevant ones.

I was thinking that if apps can decide where to read/write data, then we could maybe store information in the personal space (private, so it's not public info) about which private spaces this specific app used. So after I log in to an app, it can read which spaces are relevant and should be synced.

Then the next question is if every app can access all my information or if we want some kind of mechanism to provide permissions.

This is a bit of a missing puzzle piece to me. Have you had more ideas here?

Collaborator

Yeah this is super interesting and part of what we need to think about. I do think it could make sense for a user to have to grant permissions to an app to access a space. I'm not sure exactly how we should enforce that, using something that looks like a wallet UI (can be done later).

I'm up for the idea of having apps write to the user's personal space with configuration data that it needs to do things like sync.

)
```

Version 2:
Collaborator Author

@baiirun @yanivtal In the call we talked about an API like Version 1. Personally I like Version 2. Which one should we go for?

Collaborator

I personally like one better. I don't mind having the types key but the attributes key feels verbose to me.

Collaborator

If we did for some reason want to do 2, I think "data" could be better than "attributes"

Contributor

I prefer the second one because it's more explicit, even if it's more verbose.

Collaborator Author

Synced w/ Yaniv and we are going with the second one and use data instead of attributes.

Collaborator

@yanivtal yanivtal left a comment

Added some comments

## Glossary

- **Space**: A place for grouping information.
- **Private Space**: Information of a space that is not publicly accessible, but only accessible to members of a space.
Collaborator

I additionally distinguish between Private spaces, which are private to a single individual, and Shared spaces, which are private spaces that are shared and synchronized.

- **Space**: A place for grouping information.
- **Private Space**: Information of a space that is not publicly accessible, but only accessible to members of a space.
- **Public Space**: Information of a space that is publicly accessible.
- **Personal Space**: A space controlled by a single person.
Collaborator

Controlled by a single person or organization.

- We want immediate feedback on invalid relations.
- Handling invalid Relations
- In case the from or to is missing we ignore the relation completely.
- In case the index is missing, we set an index at the end of the list. Later we can provide a callback to choose different behavior. Note: the data service should be validating this already, but it can happen in case of end-to-end encrypted sync.
Collaborator

I would remove the part from this and the next item that changes the state of the data. I would rather have bad behavior than rewriting of state.

Collaborator Author

@nikgraf nikgraf Dec 18, 2024

How should the bad behaviour behave in this case? e.g. for React it would make sense to have at least predictable bad behaviour. Otherwise this could lead to endless loops, and suddenly one piece of bad data can crash other people's apps.
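
One possible middle ground, sketched under assumptions: relations missing `from` or `to` are skipped when reading, and relations without an index are ordered last at read time, without writing anything back to the stored data. `StoredRelation`, `UsableRelation` and `readRelations` are hypothetical names, and ordering unindexed relations last is just one predictable choice.

```typescript
type StoredRelation = { id: string; from?: string; to?: string; index?: string };
type UsableRelation = StoredRelation & { from: string; to: string };

// Read-time handling only: the stored relations are never modified.
function readRelations(stored: StoredRelation[]): UsableRelation[] {
  // Relations without `from` or `to` are ignored completely.
  const usable = stored.filter(
    (relation): relation is UsableRelation =>
      relation.from !== undefined && relation.to !== undefined,
  );
  const indexed = usable.filter((relation) => relation.index !== undefined);
  const unindexed = usable.filter((relation) => relation.index === undefined);
  // Relations with an index are ordered by it; the rest keep their stored order at the end.
  indexed.sort((a, b) => (a.index! < b.index! ? -1 : a.index! > b.index! ? 1 : 0));
  return indexed.concat(unindexed);
}

const storedRelations: StoredRelation[] = [
  { id: 'r1', from: 'a', to: 'b', index: 'b' },
  { id: 'r2', from: 'a' }, // missing `to`: ignored
  { id: 'r3', from: 'a', to: 'c' }, // missing index: ordered last
  { id: 'r4', from: 'a', to: 'd', index: 'a' },
];

const readable = readRelations(storedRelations);
```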

This will automatically retrieve the latest public version of the entity, create a diff and based on that create an OPS to publish the difference.
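
The diff step could be sketched roughly as follows. The op shapes (`set`/`delete`) and `diffToOps` are hypothetical and not the GRC-20 op format; whether locally missing attributes should produce deletes is also an assumption.

```typescript
type Values = Record<string, string | number>;

// Hypothetical op shapes for illustration only.
type Op =
  | { type: 'set'; attributeId: string; value: string | number }
  | { type: 'delete'; attributeId: string };

// Compares the latest public values with the local values and returns the ops to publish.
function diffToOps(publicValues: Values, localValues: Values): Op[] {
  const ops: Op[] = [];
  for (const attributeId of Object.keys(localValues)) {
    // New or changed locally: publish the local value.
    if (publicValues[attributeId] !== localValues[attributeId]) {
      ops.push({ type: 'set', attributeId, value: localValues[attributeId] });
    }
  }
  for (const attributeId of Object.keys(publicValues)) {
    // Present publicly but removed locally: publish a delete.
    if (!(attributeId in localValues)) {
      ops.push({ type: 'delete', attributeId });
    }
  }
  return ops;
}

const ops = diffToOps(
  { name: 'Meetup', city: 'Vienna' },
  { name: 'React Meetup', seats: 80 },
);
```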

```ts
publishEdit(entity.id);
Collaborator

Should we rename this to just publish?

)
```

It's possible to set an attribute on an entity and publish it directly. Since there is no entry in the schema or mappings file we need to set and publish the attribute directly. If we only store the attribute locally on the next publish the attribute would not be taken into account since we don't know what's the public attribute ID.

```ts
setAndPublishEntityAttribute({
Collaborator

I wouldn't call this an "entity attribute". Maybe we use "triple"? And we can just call the parts "entity", "attribute" and "value". Do we need the key?

Contributor

We also probably want to support publishing relations in the same way and not just triples. So it would be a setAndPublishProperty if we're calling data that is either a triple or a relation a "Property."


1. Run sync via a CLI command to create a local schema incl. mappings

### Initial development on a fresh schema and publish it
Collaborator

I would want to do a design sprint on the development environment Chris will be working on before committing to any flows, but these seem plausible.

)
```

Version 2:
Contributor

I prefer the second one because it's more explicit, even if it's more verbose.

This will automatically retrieve the latest public version of the entity, compute a diff against it, and create the ops needed to publish the difference.

```ts
publish(entity.id);
Contributor

@baiirun baiirun Dec 17, 2024

I'm assuming this is publishing to the public graph?

There are other fields in an EDIT that a user might want to specify, like the name, additional authors who contributed to the EDIT, etc. GRC-20 outlines which fields are allowed on an EDIT, so we probably want to expose all of those in the publish API.

Collaborator Author

@nikgraf nikgraf Dec 17, 2024

Spot on! I also synced with Yaniv, and it makes more sense to make this a multi-step process:

Step 1: createEdit calculates the ops based on the diff and allows adding metadata, e.g. name and additional authors
Step 2: the user can review the ops and remove parts that should not be published
Step 3: publish submits the edit

In theory, if the app knows your exact intentions, the manual review could be skipped, but this might not always be the case.

It's possible to set an attribute on an entity and publish it directly. Since there is no entry for it in the schema or mappings file, we need to set and publish the attribute in one step. If we only stored the attribute locally, it would not be taken into account on the next publish, since we would not know its public attribute ID.

```ts
setAndPublishEntityAttribute({
Contributor

We also probably want to support publishing relations in the same way and not just triples. So it would be a setAndPublishProperty if we're calling data that is either a triple or a relation a "Property."
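
One way to read this suggestion is as a discriminated union. The sketch below assumes a "Property" is either a triple or a relation. All type names, field names, the op labels and the `toOp` helper are hypothetical and are not taken from graph-framework or GRC-20; `toOp` only stands in for what a setAndPublishProperty call might build before sending anything.

```ts
// Hypothetical shapes, for illustration only.
type TripleProperty = {
  kind: 'triple';
  entityId: string;
  attributeId: string; // public attribute ID
  value: string | number | boolean;
};

type RelationProperty = {
  kind: 'relation';
  fromEntityId: string;
  relationTypeId: string; // public relation type ID
  toEntityId: string;
};

type Property = TripleProperty | RelationProperty;

// Builds the op that would be published. The op labels are made up;
// nothing is sent anywhere, so the example stays self-contained.
function toOp(property: Property): { type: string; payload: Property } {
  switch (property.kind) {
    case 'triple':
      return { type: 'SET_TRIPLE', payload: property };
    case 'relation':
      return { type: 'CREATE_RELATION', payload: property };
  }
}

const tripleOp = toOp({
  kind: 'triple',
  entityId: 'entity-1',
  attributeId: 'gfd',
  value: 'John Doe',
});

const relationOp = toOp({
  kind: 'relation',
  fromEntityId: 'entity-1',
  relationTypeId: 'opi',
  toEntityId: 'entity-2',
});
```

With a single `kind` field, one publishing function can accept both cases, and the compiler checks that every case is handled.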


### Querying entities

Here we want to match the SDK for the public GraphQL. Still in progress here: https://www.notion.so/Data-block-query-strings-152273e214eb808898dac2d6b1b3820c
Contributor

This is up for debate IMO. My perspective is that the graph-framework should have the same constraints on which operators are allowed (e.g., is, isNot, >=, etc.) as the filter string spec. But the query format can be optimized to be idiomatic for JS and React developers. I posted in Slack a bit more on this as well.
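
As a rough illustration of that idea, the sketch below keeps a small, fixed operator set but expresses it as a plain JS object instead of a filter string. The `Condition` and `Where` types, the `gte` spelling for `>=` and the `matches` helper are assumptions made for this example; the actual operator list would come from the filter string spec.

```ts
type Scalar = string | number | boolean;

// Hypothetical operator set: `is`, `isNot` and `gte` (standing in for >=).
type Condition = { is: Scalar } | { isNot: Scalar } | { gte: number };

// One condition per field name.
type Where = Record<string, Condition>;

// Returns true when the entity satisfies every condition in `where`.
function matches(
  entity: Record<string, Scalar | undefined>,
  where: Where,
): boolean {
  return Object.entries(where).every(([field, condition]) => {
    const value = entity[field];
    if ('is' in condition) return value === condition.is;
    if ('isNot' in condition) return value !== condition.isNot;
    return typeof value === 'number' && value >= condition.gte;
  });
}

const people = [
  { name: 'John Doe', age: 42 },
  { name: 'Jane Doe', age: 17 },
];

const adults = people.filter((person) => matches(person, { age: { gte: 18 } }));
const notJohn = people.filter((person) =>
  matches(person, { name: { isNot: 'John Doe' } }),
);
```

Because the operators are keys of a typed object, an unsupported operator becomes a compile-time error rather than a runtime parse failure.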

data: {
name: "John Doe",
},
spaceId: 'abc', // optional space ID where the entity should be published in
Collaborator Author

Change spaceId to space.

})
```

### Publishing an entity
Collaborator Author

Rename: Publishing edits

```ts
setAndPublishEntityAttribute({
id: entity.id,
key: 'name',
Collaborator Author

Remove key and use the attributeId directly. This way we also don't need to publish it right away and we can include it in the createEdit step.

Think of a query API style to also retrieve all the properties for an entity.
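
To make this concrete, here is a minimal sketch of the createEdit, review and publish steps outlined earlier in this thread, with values keyed by public attribute ID instead of a local key. Every type and function signature here is hypothetical, and `publish` only returns a summary instead of submitting anything.

```ts
type Value = string | number | boolean;
type Values = Record<string, Value>; // public attribute ID -> value

type Op =
  | { type: 'set'; entityId: string; attributeId: string; value: Value }
  | { type: 'delete'; entityId: string; attributeId: string };

type Edit = { name: string; authors: string[]; ops: Op[] };

// Step 1: compute the ops from the diff between the public and local version.
function createEdit(
  entityId: string,
  publicValues: Values,
  localValues: Values,
  meta: { name: string; authors?: string[] },
): Edit {
  const ops: Op[] = [];
  for (const [attributeId, value] of Object.entries(localValues)) {
    if (publicValues[attributeId] !== value) {
      ops.push({ type: 'set', entityId, attributeId, value });
    }
  }
  for (const attributeId of Object.keys(publicValues)) {
    if (!(attributeId in localValues)) {
      ops.push({ type: 'delete', entityId, attributeId });
    }
  }
  return { name: meta.name, authors: meta.authors ?? [], ops };
}

// Step 2: the user keeps only the ops that should be published.
function reviewEdit(edit: Edit, keep: (op: Op) => boolean): Edit {
  return { ...edit, ops: edit.ops.filter(keep) };
}

// Step 3: stand-in for submitting the edit; it only reports what would be sent.
function publish(edit: Edit): { name: string; opCount: number } {
  return { name: edit.name, opCount: edit.ops.length };
}

const edit = createEdit(
  'entity-1',
  { gfd: 'John', asd: 'john@example.com' }, // latest public version
  { gfd: 'John Doe', opi: true }, // local version
  { name: 'Update John' },
);
const reviewed = reviewEdit(edit, (op) => op.type !== 'delete');
const receipt = publish(reviewed);
```

Since an attribute set by ID is just another entry in the local values, it shows up in the diff like any other change and does not need a separate set-and-publish call.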

@nikgraf nikgraf force-pushed the schema-and-mapping-design branch from 2b757f9 to 0e09855 Compare December 18, 2024 15:55
@nikgraf nikgraf mentioned this pull request Dec 28, 2024
Collaborator Author

nikgraf commented Jan 29, 2025

More explorations/thoughts on the mapping file:

const mappings = {
  Person: { // matches the local type name
    id: 'xyz', // matches the public type ID
    spaceId: 'abc', // matches the public space ID
    properties: {
      name: 'gfd',
      email: 'asd',
      isAttending: 'opi',
    }
  },
  Event: {
    id: 'ytu', // matches the public type ID
    spaceId: 'abc', // matches the public space ID
    properties: {
      name: 'asd',
      attendees: 'opi',
    },
  },
  isAttending: {
    id: 'opi', // matches the public type ID
    spaceId: 'abc', // matches the public space ID
    properties: {
      from: 'xyz', // type ID of the person
      to: 'ytu', // type ID of the event
    },
  },
  attendees: {
    id: 'opi', // matches the public type ID
    spaceId: 'abc', // matches the public space ID
    properties: {
      from: 'ytu', // type ID of the event
      to: 'xyz', // type ID of the person
    },
  },
}

// misses one-way relations in the definition
const mappings = {
  Person: { // matches the local type name
    id: 'xyz', // matches the public type ID
    spaceId: 'abc', // matches the public space ID
    properties: {
      name: 'gfd',
      email: 'asd',
      isAttending: { id: 'opi', field: 'from' },
    }
  },
  Event: {
    id: 'ytu', // matches the public type ID
    spaceId: 'abc', // matches the public space ID
    properties: {
      name: 'asd',
      attendees: { id: 'opi', field: 'to' },
    },
  },
};



// relation + reverse relation with implicit definition
const mappings = {
  Person: { // matches the local type name
    id: 'xyz', // matches the public type ID
    spaceId: 'abc', // matches the public space ID
    properties: {
      name: 'gfd',
      email: 'asd',
      isAttending: { id: 'opi', reverseId: 'opi2', to: 'Event' },
    }
  },
  Event: {
    id: 'ytu', // matches the public type ID
    spaceId: 'abc', // matches the public space ID
    properties: {
      name: 'asd',
      attendees: { id: 'opi2', reverseId: 'opi', to: 'Person' },
    },
  },
};

// relation without reverse relation
const mappings2 = {
  Person: { // matches the local type name
    id: 'xyz', // matches the public type ID
    spaceId: 'abc', // matches the public space ID
    properties: {
      name: 'gfd',
      email: 'asd',
      isAttending: { id: 'opi', to: 'Event' },
    }
  },
  Event: {
    id: 'ytu', // matches the public type ID
    spaceId: 'bbb', // matches the public space ID
    properties: {
      name: 'asd',
    },
  },
};




// relation without reverse relation
const mappings2 = {
  Person: { // matches the local type name
    id: 'xyz', // matches the public type ID
    spaceId: 'abc', // matches the public space ID
    properties: {
      name: 'gfd',
      email: 'asd',
      isAttending: { id: 'opi', to: 'Event' },
    }
  },
  Event: {
    id: 'ytu', // matches the public type ID
    spaceId: 'bbb', // matches the public space ID
    properties: {
      name: 'asd',
      isTakingPlaceIn: { id: 'opi', to: 'City' },
    },
  },
  Tour: {
    properties: {
      name: 'asd',
      isTakingPlaceIn: { id: 'opi', to: 'City' },
    },
  }
};
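
For comparison, here is a sketch of how the `{ id, to }` variant could be typed and resolved at runtime. The type names and the `resolveProperty` helper are hypothetical. The sketch also assumes every type entry has an `id` and a `spaceId`, which the `Tour` entry above does not.

```ts
type RelationMapping = { id: string; reverseId?: string; to: string };
type PropertyMapping = string | RelationMapping;

type TypeMapping = {
  id: string; // public type ID
  spaceId: string; // public space ID
  properties: Record<string, PropertyMapping>;
};

type Mappings = Record<string, TypeMapping>; // keyed by local type name

const exampleMappings: Mappings = {
  Person: {
    id: 'xyz',
    spaceId: 'abc',
    properties: {
      name: 'gfd',
      email: 'asd',
      isAttending: { id: 'opi', to: 'Event' },
    },
  },
  Event: {
    id: 'ytu',
    spaceId: 'bbb',
    properties: { name: 'asd' },
  },
};

// Resolves a local property name to its public ID and, for relations,
// to the public type ID of the relation target.
function resolveProperty(
  mappings: Mappings,
  typeName: string,
  propertyName: string,
): { publicId: string; targetTypeId?: string } {
  const typeMapping = mappings[typeName];
  if (!typeMapping) throw new Error(`Unknown type: ${typeName}`);

  const property = typeMapping.properties[propertyName];
  if (property === undefined) {
    throw new Error(`Unknown property: ${typeName}.${propertyName}`);
  }
  if (typeof property === 'string') return { publicId: property };

  const target = mappings[property.to];
  if (!target) throw new Error(`Unknown relation target: ${property.to}`);
  return { publicId: property.id, targetTypeId: target.id };
}

const nameProperty = resolveProperty(exampleMappings, 'Person', 'name');
const attendingProperty = resolveProperty(exampleMappings, 'Person', 'isAttending');
```

Referring to the target by its local type name (`to: 'Event'`) keeps the public type ID in one place, at the cost of a lookup that can fail if the target type is missing from the mappings.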

@nikgraf nikgraf closed this May 12, 2025
@nikgraf nikgraf deleted the schema-and-mapping-design branch August 20, 2025 12:40