Rework as the Linked Data Formats WG #10

Merged
merged 15 commits into from
Aug 13, 2025

Conversation

BigBlueHat
Member

@BigBlueHat BigBlueHat commented Jul 14, 2025

This is an early stage reworking of the charter to try to address the group's interest in a more inclusive name indicative of our interest in publishing YAML-LD and CBOR-LD alongside a new/updated JSON-LD.

Much more editing to do, but I wanted to share this early to get feedback.

Cheers!
🎩


Preview | Diff

@gkellogg
Member

  1. Please add me as co-chair.
  2. Both Manu, and I think Ivan, suggested trimming down the explicit issues in the In Scope section. The RDF & SPARQL WG has the following in the Scope section: "The Working Group will also consider allowing new features in these recommendations, according to Section 6.3.11.4 of the W3C process, in order to render future evolutions easier."
  3. Is this the current version of the document template for charters? Does it matter?
  4. To get ahead of the game, the new specs should probably be "CBOR-LD 1.0" and "YAML-LD 1.0".
  5. For liaison:
  • The RDF Dataset Canonicalization and Hash WG is in maintenance mode, and probably doesn't have any coordination requirements.
  • The RDF-Star WG is now called the RDF & SPARQL WG.
  • I don't think we need to explicitly coordinate with Schema.org CG, RDF-DEV CG, or the RDF JavaScript Libraries CG.

@iherman

iherman commented Jul 15, 2025

  • Both Manu, and I think Ivan, suggested trimming down the explicit issues in the In Scope section. The RDF & SPARQL WG has the following in the Scope section: "The Working Group will also consider allowing new features in these recommendations, according to Section 6.3.11.4 of the W3C process, in order to render future evolutions easier."

Indeed. Having such a long and detailed laundry list would make it too difficult for the AC to review.

As I said before on calls, I believe the scope section should pick 2-3 very high priority items, and commit the full recommendation work for the present charter only for those. All the other items should be formulated as possible rec-track work, but only if the high priority items are completed; otherwise, their completion should be postponed to a later charter.

In my view, the high priority items (in terms of public need) are:

  1. Handling the main objection to JSON-LD, namely the security questions raised vis-à-vis the reference and access to a @context; this is, I believe, "Consider context by reference with metadata" (w3c/json-ld-syntax#108)
  2. Compatibility with RDF 1.2
  3. RDF 1.2 compatible CBOR-LD

All other work items, including YAML-LD, should be considered "wouldn't it be nice" conditional items, as referred to above.

@@ -3,7 +3,7 @@
<head>
<meta charset="utf-8">

-<title>JSON-LD Working Group Charter</title>
+<title>Linked Data Formats Working Group Charter</title>
Collaborator

As discussed during our last meeting, I believe that "JSON-LD" should still appear in the name of the WG. @BigBlueHat proposed "JSON-LD data formats" (or simply "JSON-LD formats", as D already means "data" 😉 )

Upon first reading the name "Linked Data Formats", I thought the goal was also to bring other formats such as Turtle, RDF/XML, and so on under the scope of this WG. But this doesn't seem to be the case. I suspect similar confusion would arise outside of the WG.

What brings these formats under the same roof is the fact that they add an RDF (a.k.a. linked data) layer on top of an existing and widespread data syntax. One could argue that this is also the case for RDF/XML, and that would be true, except that people hardly care about that format anyway.

I am sure someone will find a nice term for that.

What brings these formats under the same roof is the fact that these formats add an rdf (a.k.a. linked data) layer on top of an existing and widespread data syntax.

Perhaps something along the direction of Embedded Linked Data Formats?

Member

We discussed this on the last call. We need to find a good name, but basically it’s appropriate for polyglot formats that can be reduced to some variation of our internal representation. This explicitly does not include other established RDF formats.

“Embedded” is a loaded term, and also describes how scripts may be embedded in HTML using the script element.

Collaborator

formats that can be reduced to some variation of our internal representation

I believe this is the core criterion, and that's one of my arguments for keeping JSON-LD in the name (in addition to showing continuity and avoiding confusion).

@TallTed

TallTed commented Jul 17, 2025

For your consideration...

Linked Data Formats Working Group, shorthand LDF-WG.

JSON-LD, CBOR-LD, etc., can be a subheading on all relevant pages/documents. (I considered a parenthetical after "Formats" but that doesn't read well.)

@TallTed

TallTed commented Jul 24, 2025

For your consideration...

JSON-LD and Related Formats Working Group, shorthand JSON-LD-WG.

@BigBlueHat
Member Author

For your consideration...

JSON-LD and Related Formats Working Group, shorthand JSON-LD-WG.

Maybe...
JSON-LD Linked Data Formats Working Group

@niklasl
Member

niklasl commented Aug 6, 2025

One thing that's bugged me is that, hitherto, JSON-LD allows you to use only JSON tooling, but if a future processor is expected to be able to ingest all formats (e.g. a JSON document referencing a context expressed in YAML), this is no longer the case (and AFAIK, a fully conforming YAML parser is (way) more complex than a JSON parser).

This depends on expected usage contexts of course; we may need to detail some scenarios to draw usable lines around them.

@TallTed

TallTed commented Aug 6, 2025

if a future processor is expected to be able to ingest all formats (e.g. a JSON document referencing a context expressed in YAML

I think a reasonable restriction would be that any external context file must be in the same serialization as the document that references that context. A YAML-based stack should not need to understand JSON or CBOR, etc., but a tool that can translate a JSON-LD document to YAML or CBOR (or vice versa) must obviously understand both serializations.

@pchampin
Collaborator

pchampin commented Aug 7, 2025

if a future processor is expected to be able to ingest all formats (e.g. a JSON document referencing a context expressed in YAML

I think a reasonable restriction would be that any external context file must be in the same serialization as the document that references that context. A YAML-based stack should not need to understand JSON or CBOR, etc., but a tool that can translate a JSON-LD document to YAML or CBOR (or vice versa) must obviously understand both serializations.

This would strongly reduce the usability of YAML-LD if only contexts published as YAML-LD could be used... I would suggest that support for JSON-LD be a baseline, and that additional syntaxes be optional.

@gkellogg
Member

gkellogg commented Aug 7, 2025

if a future processor is expected to be able to ingest all formats (e.g. a JSON document referencing a context expressed in YAML

I think a reasonable restriction would be that any external context file must be in the same serialization as the document that references that context. A YAML-based stack should not need to understand JSON or CBOR, etc., but a tool that can translate a JSON-LD document to YAML or CBOR (or vice versa) must obviously understand both serializations.

This would strongly reduce the usability of YAML-LD if only contexts published as YAML-LD could be used... I would suggest that support for JSON-LD be a baseline, and that additional syntaxes be optional.

My feeling is that it depends on what specs an application depends on; if just JSON-LD, then it is only expecting JSON-LD content and contexts. If YAML-LD, then either JSON-LD or YAML-LD content or contexts (or a mixture). If CBOR-LD, then probably only JSON-LD contexts, but could interpret YAML-LD contexts if specifically included by the app, but this could vary between implementations, so not too good for interoperability.

My own default document loader handles formats based on the specific libraries that are loaded alongside the core JSON-LD implementation.
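
The layering described above could be sketched roughly as follows. This is only an illustration of the idea, not code from any existing JSON-LD library: `createDocumentLoader`, `registerParser`, and `load` are hypothetical names, and `application/ld+yaml` is assumed here as the YAML-LD media type.

```javascript
// Hypothetical sketch: a document loader whose supported syntaxes depend on
// which parser plug-ins the application chose to register.
function createDocumentLoader() {
  // JSON-LD is treated as the baseline that every processor understands.
  const parsers = new Map([
    ["application/ld+json", (text) => JSON.parse(text)],
  ]);

  return {
    // Applications opt in to extra syntaxes (for example YAML-LD) explicitly.
    registerParser(mediaType, parse) {
      parsers.set(mediaType, parse);
    },
    // Parse an already-fetched body according to its declared media type.
    load(mediaType, body) {
      const parse = parsers.get(mediaType);
      if (!parse) {
        throw new Error(`Unsupported context media type: ${mediaType}`);
      }
      return parse(body);
    },
    // List the media types this loader instance can currently handle.
    supported() {
      return [...parsers.keys()];
    },
  };
}
```

With this shape, a JSON-LD-only application never touches a YAML parser, while a YAML-LD application registers one and can then load contexts in either syntax.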

@niklasl
Member

niklasl commented Aug 7, 2025

I agree that YAML-LD processors should also handle JSON-LD. That's probably a minor addition to implementations, but the converse is not generally true. (Aside: JSON is mostly a subset of YAML, but with some potentially gnarly edge-cases like 1e0.)
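
To make the `1e0` aside concrete, here is a small sketch. The JSON half uses the standard `JSON.parse`. The YAML half does not call a YAML parser; it only tests the scalar against two float patterns that are transcribed here, as an assumption to be checked against the specs, from the YAML 1.1 float type and the YAML 1.2 core schema. `resolvesAsFloat` is a hypothetical helper name.

```javascript
// JSON side: `1e0` is a valid JSON number literal.
const asJson = JSON.parse("1e0"); // the number 1

// YAML side (assumption: patterns transcribed from the YAML 1.1 float type
// and the YAML 1.2 core schema; verify against the specifications).
const YAML_1_1_FLOAT = /^[-+]?([0-9][0-9_]*)?\.[0-9.]*([eE][-+][0-9]+)?$/;
const YAML_1_2_FLOAT = /^[-+]?(\.[0-9]+|[0-9]+(\.[0-9]*)?)([eE][-+]?[0-9]+)?$/;

// True when a plain scalar would be resolved as a float under `pattern`;
// otherwise a parser following that pattern would keep it as a string.
function resolvesAsFloat(scalar, pattern) {
  return pattern.test(scalar);
}

console.log(typeof asJson);                          // "number"
console.log(resolvesAsFloat("1e0", YAML_1_1_FLOAT)); // false
console.log(resolvesAsFloat("1e0", YAML_1_2_FLOAT)); // true
```

Under these patterns, a YAML 1.1 style parser would hand back the string `"1e0"` where a JSON parser returns the number 1, which is the kind of divergence the aside warns about.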

@iherman

iherman commented Aug 8, 2025

As far as I am concerned, one of the major advantages of YAML-LD is the ability to produce a much more readable (with the possibility for comments!) version of the context files. The complexity is usually in these context files, and they are also difficult to maintain. Compared to the context, the real data itself is, usually, way simpler.

What this tells me is that, ideally, a JSON-LD processor not only SHOULD but MUST be able to import YAML-LD files. (I say "ideally" because I am aware of the difficulties.)

One way of making things a bit simpler is to refrain from using the non-JSON features of YAML. For me, YAML-LD should just be a straight syntax alternative to JSON-LD (comments put aside). In my (limited) experience, this'd mean that an off-the-shelf YAML parser would compile into the same data structure as what a JSON parser would produce, and that would greatly reduce the load.

@niklasl
Member

niklasl commented Aug 8, 2025

JSON is about simplicity, YAML is about convenience. If we mix up these concerns and mandate the latter, the tooling simplicity is lost. Others might claim that something like JSON5 strikes a hypothetically better balance, or TOML or Pkl. (Which of course are way younger than YAML, which is older than JSON; though the latter "won" during the XML backlash. S-expressions are still patiently waiting for a comeback.) At that point I'd argue more strongly for Turtle, to get rid of all the complex indirection for getting triples out (and that the real(er) problem is vocabulary choice).

I know developers tend to push these boundaries (and IM(NS?)HO sometimes to an alarming fault), and this warrants a discussion about principles, technology stacks and ecosystems. I don't know in what ways the existing deployments of JSON-LD (ActivityStreams, Schema.org, etc.) are consumed, and which of their stacks would appreciate other serializations coming back. There are strong opinions here towards both ends of the spectrum (and beyond). My own work on JSON-LD has always been about finding a workable intersection of these (which has been hard enough).

Without further research, I'd claim that JSON must remain the baseline, and that if others want to author and publish contexts in e.g. YAML, that's fine as long as the serving of that supports Content Negotiation, and serves the JSON form (also implying that you mustn't reference the YAML representation directly as a context, but the information resource you can negotiate on). One argument for that is: if converting between formats is easy and you're willing to mandate all your consumers to do so, lead by example and do so upon publication (which of course spares the consumers from doing it).
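
A rough sketch of what "supports Content Negotiation, and serves the JSON form" could look like on the publishing side. This is a deliberately simplified reading of the HTTP `Accept` header (it ignores media range specificity and parameters other than `q`); `negotiate` and `AVAILABLE` are hypothetical names, and `application/ld+yaml` is assumed as the YAML-LD media type.

```javascript
// Hypothetical sketch: pick a representation for a context that was authored
// in YAML but is published with JSON-LD as the default form.
const AVAILABLE = ["application/ld+json", "application/ld+yaml"]; // first entry is the default

function negotiate(acceptHeader) {
  // No Accept header: serve the JSON baseline.
  if (!acceptHeader) return AVAILABLE[0];

  const ranges = acceptHeader.split(",").map((part) => {
    const [type, ...params] = part.trim().split(";").map((s) => s.trim());
    const qParam = params.find((p) => p.startsWith("q="));
    const q = qParam ? parseFloat(qParam.slice(2)) : 1;
    return { type, q: Number.isNaN(q) ? 0 : q };
  });

  // Highest q first; Array.prototype.sort is stable, so ties keep header order.
  ranges.sort((a, b) => b.q - a.q);

  for (const { type, q } of ranges) {
    if (q <= 0) continue;
    if (type === "*/*" || type === "application/*") return AVAILABLE[0];
    if (AVAILABLE.includes(type)) return type;
  }
  return null; // the caller would answer 406 Not Acceptable
}
```

A consumer that only knows JSON-LD never sees the YAML form, while a YAML-aware client can still ask for it explicitly on the same information resource.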

(Inserting mandatory principle references for good measure.)

@gkellogg
Member

gkellogg commented Aug 8, 2025

As far as I am concerned, one of the major advantages of YAML-LD is the ability to produce a much more readable (with the possibility for comments!) version of the context files. The complexity is usually in these context files, and they are also difficult to maintain. Compared to the context, the real data itself is, usually, way simpler.

Yes, this is a good use case for contexts in YAML-LD.

What this tells me is that, ideally, a JSON-LD processor not only SHOULD but MUST be able to import YAML-LD files. (I say "ideally" because I am aware of the difficulties.)

That upsets layering; I don't think we can require that all JSON-LD processors handle YAML-LD, but as designed, all YAML-LD processors must support JSON-LD (including application/ld+json). The community can work to make sure this becomes the norm.

One way of making things a bit simpler is to refrain from using the non-JSON features of YAML. For me, YAML-LD should just be a straight syntax alternative to JSON-LD (comments put aside). In my (limited) experience, this'd mean that an off-the-shelf YAML parser would compile into the same datastructure as what a JSON parser would do, and that would greatly reduce the load.

That's pretty much what's in YAML-LD now, although we've explored "extended" profiles that can use other YAML features, just not currently in the document. CBOR comes with its own set of features.

@anatoly-scherbakov

Agreed with what has been said above. Let me restate that to be sure that I understand everyone correctly.

  • x-LD toolkit should be compliant iff it supports JSON-LD contexts;
  • x-LD toolkit is not required to support y-LD context if y ≠ JSON ∧ x ≠ y;
  • I propose to address the argument about whether TOML-LD, S-Expression-LD, etc. are preferable over YAML-LD by the statement that, in the ideal world, for each JSON-like language x there should be an x-LD standard.

They do not compete with each other. Linked Data can be thought of as a lingua franca, and all x-LD formats contribute to that purpose.
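
The first two bullets can be restated as a tiny predicate. This is an interpretation, not agreed text: `contextSupport` is a hypothetical name, and the sketch assumes that a toolkit supports contexts written in its own syntax, which the list implies but does not state.

```javascript
// Hypothetical restatement of the proposed conformance rule.
// toolkitSyntax / contextSyntax are short labels such as "json", "yaml", "cbor".
function contextSupport(toolkitSyntax, contextSyntax) {
  // Every x-LD toolkit must accept JSON-LD contexts.
  if (contextSyntax === "json") return "required";
  // Assumption: a toolkit handles contexts in its own syntax.
  if (contextSyntax === toolkitSyntax) return "required";
  // y ≠ JSON and x ≠ y: support is not required.
  return "optional";
}

console.log(contextSupport("yaml", "json")); // "required"
console.log(contextSupport("json", "yaml")); // "optional"
console.log(contextSupport("cbor", "yaml")); // "optional"
```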

Unfortunately, quoting of @-keywords in YAML-LD contexts presently cannot be avoided.

"@context":
    owl: http://www.w3.org/2002/07/owl#

    # If we import another ontology, its address is a URI
    owl:imports:
        "@type": "@id"

That's a nuisance. But even with that, writing contexts with YAML-LD is (IMHO) much more pleasant than writing these things by hand in JSON-LD.
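
The reason the quotes are needed is that `@` is one of YAML's indicator characters, so a plain (unquoted) scalar cannot begin with it. A minimal sketch of that check follows; `needsQuotes` is a hypothetical helper that looks only at the leading character and ignores the other reasons a YAML key may need quoting (for example a value such as `true`, or an embedded `: `).

```javascript
// Indicator characters as listed in the YAML 1.2 specification.
const YAML_INDICATORS = new Set("-?:,[]{}#&*!|>'\"%@`");

// Hypothetical helper: does this key need quotes because of its first character?
function needsQuotes(key) {
  if (key.length === 0) return true; // an empty key must be written as ""
  const first = key[0];
  if (!YAML_INDICATORS.has(first)) return false;
  // "-", "?" and ":" may start a plain scalar when followed by a non-space character.
  if ("-?:".includes(first) && key.length > 1 && !/\s/.test(key[1])) return false;
  return true;
}

console.log(needsQuotes("@context"));    // true
console.log(needsQuotes("@type"));       // true
console.log(needsQuotes("owl:imports")); // false
```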

@@ -164,7 +181,7 @@ <h2>Scope</h2>
It will also develop new Recommendation Track deliverables, based on work incubated by the <a href="https://www.w3.org/community/json-ld/">JSON for Linking Data Community Group</a>,
specifying the use of JSON-LD algorithms with similar formats (YAML, CBOR, and more).
</p>
-<p>The Working group is expected to coordinate with the <a href='https://www.w3.org/community/json-ld/'>JSON for Linking Data Community Group</a> on consensus-based proposals related to content changes for the JSON-LD Working Group Deliverables. The Chairs of this group may choose to reject proposals that are incompatible with this Charter.</p>
+<p>The Working group is expected to coordinate with the <a href='https://www.w3.org/community/json-ld/'>JSON for Linking Data Community Group</a> on consensus-based proposals related to content changes for the Linked Data Formats Working Group Deliverables. The Chairs of this group may choose to reject proposals that are incompatible with this Charter.</p>

I like the Linked Data Formats WG name very much.

Should the Community Group also be renamed? Does it somehow relate to the rechartering, and renaming, of the Working Group?

@iherman

iherman commented Aug 11, 2025

But even with that, writing contexts with YAML-LD is (IMHO) much more pleasant than writing these things by hand in JSON-LD.

We are in agreement on this. But, according to your equation (x-LD toolkit is not required to support y-LD context if y ≠ JSON ∧ x ≠ y;), the requirement is such that a JSON-LD tool is not required to support a YAML-LD context, but a YAML-LD tool is required to support a JSON-LD context. That means that, in practice, very few people will write contexts in YAML.

It may be improved if we are very strict about the syntactic interoperability of YAML-LD and JSON-LD, i.e., if (putting it in JS code) something like

JSON.stringify(yaml.parse(yaml_content))

produces a valid and semantically equivalent JSON-LD context. But if we do so, then directly importing a YAML context file into a JSON-LD tool becomes trivial, meaning that we can safely expect a JSON-LD tool to import a YAML context...
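
One way to sketch the "strict syntactic interoperability" condition is a check that whatever a YAML parser returned stays inside the JSON data model, since `JSON.stringify` would otherwise silently alter it (for example, `NaN` becomes `null`). `isJsonDataModel` is a hypothetical helper, and the YAML parser itself is deliberately left out so the example has no external dependency.

```javascript
// Hypothetical helper: true when `value` uses only the JSON data model
// (null, booleans, finite numbers, strings, arrays, plain objects).
function isJsonDataModel(value) {
  if (value === null) return true;
  switch (typeof value) {
    case "string":
    case "boolean":
      return true;
    case "number":
      // YAML's .inf and .nan have no JSON representation.
      return Number.isFinite(value);
    case "object": {
      if (Array.isArray(value)) return value.every(isJsonDataModel);
      // Reject Date, Map, Set and other non-plain objects a parser might build.
      const proto = Object.getPrototypeOf(value);
      if (proto !== Object.prototype && proto !== null) return false;
      return Object.values(value).every(isJsonDataModel);
    }
    default:
      return false; // undefined, bigint, function, symbol
  }
}
```

A JSON-LD tool that accepts YAML contexts could run such a check on the parsed value and reject anything that falls outside it, rather than serializing a subtly different context.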

I do not expect anyone writing context files in CBOR. CBOR-LD, in my view, is very different from YAML-LD in its audience and usage, and the comparisons are misleading. Also, I do not want to go down the hypothetical x-LD road for other than YAML and CBOR. I am not convinced that any of the other formats would respond to a real need at this point (except maybe JSON5, but that is, from our point of view, trivial).

@anatoly-scherbakov

very few people will write contexts in YAML

This makes sense, but I personally would rather write a context in YAML-LD but convert it to JSON-LD (which is trivial) for publication. I very well know that JSON-LD toolkits are out there, and they do not support YAML-LD.

Well, I probably would have acted differently if I knew that up-to-date toolkits do support it.

hypothetical x-LD road

Yes, it is very much hypothetical; I do not propose to start working on any such format before YAML-LD and CBOR-LD come to fruition. It is just to keep the possibilities open and to avoid cutting that road off, especially bearing in mind how generic the new WG name is.

@pchampin
Collaborator

That means that, in practice, very few people will write contexts in YAML.

I disagree. What that implies is that people will not publish contexts in YAML only.
I suspect that many will use YAML-LD to author contexts, then convert them to JSON-LD or set up content negotiation when publishing them.
(that's what I would do, anyway)

BigBlueHat and others added 4 commits August 12, 2025 16:56
Co-authored-by: Pierre-Antoine Champin <[email protected]>
Mostly removing the web-scale/web-developer specificity as JSON-LD reaches
well beyond the browser-based Web to bots, IoT, big data, AI, and LLMs.
@BigBlueHat BigBlueHat marked this pull request as ready for review August 13, 2025 14:19
@BigBlueHat
Member Author

@gkellogg @iherman @pchampin this is starting to feel pretty close to ready for heavy critique. 😉 We'll be talking about it today and the rest of August (if/as needed), but let me know if you see anything glaring I may have missed.

Cheers!
🎩

Member

@gkellogg gkellogg left a comment

This is looking good, at least good enough to merge.

@BigBlueHat BigBlueHat merged commit d541f3d into json-ld-wg Aug 13, 2025
BigBlueHat added a commit that referenced this pull request Aug 13, 2025
Also add note about new features referencing the W3C process.

Based on @gkellogg suggestion in #10 (comment)
@BigBlueHat
Member Author

Merged based on reviews here and confirmation on today's WG call.


8 participants