Add lightweight Schema.org-based validation for entity properties #6

@villegar

In the current version of rocrateR, entities are considered valid as long as they contain @id and @type. This means it is currently possible for users to define entities with arbitrary properties, for example:

person <- rocrateR::entity(
  x = "#person:ab761662ca15f3f7658a0b3adeaae564",
  type = "Person",
  name = "xxx",
  sub = "xxxx",
  email_verified = TRUE,
  given_name = "xxx",
  family_name = "xxx",
  picture = "xxx",
  email = "xxx"
)

In the above example, sub, email_verified and picture are not standard properties defined by **Schema.org** for a Person (e.g., givenName, familyName and image would be the expected properties).

While an RO-Crate itself does not strictly forbid additional properties, it would be useful for rocrateR to provide a more robust validation mechanism that can flag non-standard properties based on an entity’s @type. This does not need to be a full RO-Crate validator (as those are already under development elsewhere), but could take the form of optional warnings or a lightweight validation function to help users identify potential issues early.

Possible implementation approach

A potential approach would be to automatically derive allowed properties per @type from the official Schema.org JSON-LD definitions, and use this to validate entity fields. Validation could be:

  • optional (opt-in or enabled via a function argument),
  • warning-based by default (rather than stopping execution), and
  • scoped to property names only (without validating value types).
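
As a rough illustration of the approach above, the sketch below derives the allowed property names for a type from a vocabulary graph and warns about anything else. It makes several assumptions: the graph is a small hand-written stand-in shaped like the official Schema.org JSON-LD file (which could presumably be loaded with jsonlite::fromJSON(..., simplifyVector = FALSE)), an entity is treated as a plain named list, and the helper names (ref_ids, ancestors, allowed_properties, check_properties) are hypothetical and not part of rocrateR.

```r
# Tiny stand-in for the parsed Schema.org vocabulary (assumption: same node
# shape as the official JSON-LD file, trimmed to a handful of terms).
ref <- function(id) list(`@id` = id)
prop <- function(id, ...) {
  list(`@id` = id, `@type` = "rdf:Property",
       `schema:domainIncludes` = list(...))
}
graph <- list(
  list(`@id` = "schema:Thing", `@type` = "rdfs:Class"),
  list(`@id` = "schema:Person", `@type` = "rdfs:Class",
       `rdfs:subClassOf` = ref("schema:Thing")),
  prop("schema:name", ref("schema:Thing")),
  prop("schema:image", ref("schema:Thing")),
  prop("schema:email", ref("schema:Person"), ref("schema:Organization")),
  prop("schema:givenName", ref("schema:Person")),
  prop("schema:familyName", ref("schema:Person"))
)

# A JSON-LD reference is either one {"@id": ...} object or a list of them.
ref_ids <- function(x) {
  if (is.null(x)) return(character(0))
  if (!is.null(x[["@id"]])) return(x[["@id"]])
  vapply(x, function(r) r[["@id"]], character(1))
}

# Collect a type and all of its ancestors by following rdfs:subClassOf.
ancestors <- function(type_id, graph) {
  ids <- vapply(graph, function(n) n[["@id"]], character(1))
  seen <- character(0)
  queue <- type_id
  while (length(queue) > 0) {
    current <- queue[1]
    queue <- queue[-1]
    if (current %in% seen) next
    seen <- c(seen, current)
    node <- graph[ids == current]
    if (length(node) == 1) {
      queue <- c(queue, ref_ids(node[[1]][["rdfs:subClassOf"]]))
    }
  }
  seen
}

# Property names whose domain includes the type or one of its ancestors.
allowed_properties <- function(type, graph) {
  types <- ancestors(paste0("schema:", type), graph)
  is_match <- vapply(graph, function(n) {
    "rdf:Property" %in% unlist(n[["@type"]]) &&
      any(ref_ids(n[["schema:domainIncludes"]]) %in% types)
  }, logical(1))
  ids <- vapply(graph[is_match], function(n) n[["@id"]], character(1))
  sub("^schema:", "", ids)
}

# Warn (never stop) about property names not known for the entity's @type.
# Only names are checked; value types are ignored.
check_properties <- function(entity, graph) {
  allowed <- allowed_properties(entity[["@type"]], graph)
  unknown <- setdiff(names(entity), c("@id", "@type", allowed))
  if (length(unknown) > 0) {
    warning("Non-standard properties for @type '", entity[["@type"]], "': ",
            paste(unknown, collapse = ", "), call. = FALSE)
  }
  invisible(unknown)
}

# The entity from the example above, written as a plain list (assumption).
person <- list(
  `@id` = "#person:ab761662ca15f3f7658a0b3adeaae564", `@type` = "Person",
  name = "xxx", sub = "xxxx", email_verified = TRUE, given_name = "xxx",
  family_name = "xxx", picture = "xxx", email = "xxx"
)
check_properties(person, graph)
```

Because the lookup walks rdfs:subClassOf, properties declared on Thing (such as name and image) are accepted for Person without being listed twice. Note that a strict name check like this would also flag given_name and family_name, since only the camelCase forms exist in the vocabulary.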

This functionality could be exposed either as:

  • a standalone validation function (e.g., validate_entity() or validate_rocrate()), or
  • an optional validation step during entity or crate creation.
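
The two options could look roughly like the sketch below. Everything in it is hypothetical: new_entity() is a stand-in constructor rather than rocrateR::entity(), the validate argument does not exist in the package today, and the allow-list is hard-coded for brevity instead of being derived from Schema.org. It also assumes each entity has a single @type.

```r
# Hypothetical allow-list; in practice this would be derived from Schema.org.
allowed_by_type <- list(
  Person = c("name", "email", "givenName", "familyName", "image")
)

# Option 1: standalone validator. Warns about unknown property names and
# returns them invisibly; types without an allow-list are skipped.
validate_entity <- function(entity, allowed = allowed_by_type) {
  type <- entity[["@type"]]
  if (is.null(allowed[[type]])) return(invisible(character(0)))
  unknown <- setdiff(names(entity), c("@id", "@type", allowed[[type]]))
  if (length(unknown) > 0) {
    warning("Non-standard properties for @type '", type, "': ",
            paste(unknown, collapse = ", "), call. = FALSE)
  }
  invisible(unknown)
}

# Option 2: opt-in validation step at creation time (off by default, so
# existing code keeps its current behaviour).
new_entity <- function(x, type, ..., validate = FALSE) {
  entity <- c(list(`@id` = x, `@type` = type), list(...))
  if (validate) validate_entity(entity)
  entity
}

# No validation requested: the entity is created silently.
quiet <- new_entity("#alice", "Person", name = "Alice", picture = "a.png")

# Validation requested: same entity, plus a warning about 'picture'.
checked <- new_entity("#alice", "Person", name = "Alice", picture = "a.png",
                      validate = TRUE)
```

Keeping the validator standalone and having the constructor call it means both entry points share one code path, and a crate-level check could simply loop over its entities.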

Labels

enhancement: New feature or request
