Rules Document content #637

afs · 2025-11-04T17:15:39Z

afs
Nov 4, 2025
Collaborator

1: Introduction

Terminology
Terminology from RDF Concepts, SPARQL Query
Document Conventions
Namespaces - including shnex:, shrl:, sparql:, xsd:, rdf:
Conformance
Test suite

2: Shape Rules

Informative section (that is, non-normative) explaining the main features

Basic outline, key features, not all features.
Audience: rules engineer

What are "rules"? A set of conditions on the shape of the data that infer new information from existing data.

Apply rules: rules+data -> new information

A rule set is a collection of rules. Unit of execution.

An execution is rule set + data (base data)

2.1 Structure of a Rule

Head-Body: match the body (the "if") and generate new information ("triples") from the head.
Or, read declaratively: H is true when B.

Body is a pattern, with value restriction.

Example of a rule that does not depend on other rules.
Example of a rule with a filter.

2.2 Evaluation

"match body, get variables, use head as a template"

Makes new information available to other rules.

Execute until no change.

Operations - infer(), query()

A rules system can provide one or both of these operations.

"infer" - creates all new information
"query" - given a simple pattern, can it be inferred from the rules+data?

2.3 Rule Dependencies

Rules can depend on rules.

Including recursion.

Example: A rule that depends on another rule. :ancestor
Example: :ancestor of :ancestor
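The evaluation loop outlined in 2.2 and the recursive :ancestor example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the proposed semantics: rules are written as (head, body) pairs of triple patterns, terms starting with "?" are variables, and the :parent data is invented. The names infer and query echo the two operations listed above; query is implemented here by materialising everything first, which a real engine would not have to do.

```python
def is_var(term):
    return term.startswith("?")

def match_pattern(pattern, triple, binding):
    """Extend binding so that pattern matches triple, or return None."""
    result = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            if result.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return result

def match_body(body, data):
    """Return every variable binding that satisfies all body patterns."""
    bindings = [{}]
    for pattern in body:
        bindings = [b2 for b in bindings for t in data
                    if (b2 := match_pattern(pattern, t, b)) is not None]
    return bindings

def instantiate(head, binding):
    """Use the head as a template: replace variables by their bound values."""
    return tuple(binding.get(term, term) for term in head)

def infer(rules, data):
    """Naive forward chaining: apply all rules until nothing changes."""
    known = set(data)
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            for binding in match_body(body, known):
                triple = instantiate(head, binding)
                if triple not in known:
                    known.add(triple)
                    changed = True
    return known - set(data)

def query(rules, data, pattern):
    """Return the triples matching pattern in data plus inferred triples."""
    closure = set(data) | infer(rules, data)
    return [t for t in closure if match_pattern(pattern, t, {}) is not None]

# Hypothetical rule set: the second rule depends on the first and on itself.
rules = [
    (("?x", ":ancestor", "?y"), [("?x", ":parent", "?y")]),
    (("?x", ":ancestor", "?z"), [("?x", ":parent", "?y"), ("?y", ":ancestor", "?z")]),
]
data = {(":a", ":parent", ":b"), (":b", ":parent", ":c")}

inferred = infer(rules, data)
answers = sorted(query(rules, data, (":a", ":ancestor", "?who")))
```

The loop terminates here because the rules only recombine terms already present in the data, so the set of possible triples is finite.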

2.4 Rule Set Stratification

"We say that rule1 depends on rule2 if ..."

2.5 Negation-as-failure

Shape Rules also supports negation-as-failure, for example:

{ :s :p ?z . NOT EXISTS { ?z :q ?v } ... }
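As a rough illustration of what the pattern above selects, here is a minimal Python sketch over a set of triples. It ignores the elided remainder of the pattern ("...") and uses invented data; the function name is hypothetical.

```python
def not_exists_matches(triples):
    """Values of ?z for { :s :p ?z . NOT EXISTS { ?z :q ?v } }."""
    # Subjects that have at least one :q triple fail the NOT EXISTS test.
    has_q = {s for (s, p, o) in triples if p == ":q"}
    return sorted(o for (s, p, o) in triples
                  if s == ":s" and p == ":p" and o not in has_q)

# Hypothetical data: :b has a :q triple, :a does not.
data = {
    (":s", ":p", ":a"),
    (":s", ":p", ":b"),
    (":b", ":q", ":c"),
}
matches = not_exists_matches(data)
```

Adding a triple such as (":a", ":q", ":d") removes :a from the result, which is why the answer depends on all :q triples having been produced before the test is made.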

2.6 Assignment

and why it is beyond datalog

"safe assignment" - previous stratification - is it enough?

3. Shape Rules Abstract Syntax

Normative.

This section is the formal definition of rules and rule set execution.

3.1 Well-formedness Condition for a rule body

Conditions on FILTERs and assignments so that variables are defined before use.

3.2 Dependency Relationship

Definition: rule A depends on rule B if ...

3.3 Stratification

Definition

... and well-formedness conditions (NAF and recursion)

NB EXISTS - with body pattern (a limited graph pattern)

4. Concrete Syntax forms for Shape Rules

4.1 RDF Rules Syntax

4.2 Compact Rules Syntax

4.2.1 Compact Syntax Abbreviations

4.3 SPARQL function restrictions

No BOUND, FILTER (NOT) EXISTS (available as a pattern)

Explain pattern of EXISTS/NOT EXISTS is a body pattern

5. Shape Rules Evaluation

Define evaluation

Evaluation of an expression
Evaluation of a rule
Evaluation of a Rule Set

Stratification.

Necessary if NAF.

6. Workspace named tuples

Possible addition. Tuples as working space during execution. Avoids repeating pattern fragments.

Syntax:

TUPLE(name/string, varTerm1, varTerm2, ...) -- include this and maybe a short form.
?? &name(varTerm1, varTerm2, ...)
other character: % (caveat %xx), ^, *, \name, |name|(varTerm1, varTerm2, ...)

7. Attaching Rules to Shapes

Does (some) targets as patterns work for a definition?

Appendix A: Shape Rules Grammar

Appendix B: Relationship to SHACL-AF

SHACL-AF Triple rules
SHACL-AF SPARQL rules

Appendix C: Relationship to node expressions

liviorobaldo · 2025-11-06T16:48:16Z

liviorobaldo
Nov 6, 2025
Collaborator

Hello!

I don’t remember if we are expected to comment on the provisional table of contents above. If not, I’ll just delete this comment 🙂

Anyway, I have three comments, which I also discussed at the latest meeting.

============================

Attaching Rules to Shapes
Does (some) targets as patterns work for a definition?

As I mentioned at the meeting, I think this should be specified in 2. Packaging SHACL. In my view, as I proposed in the other discussion group, there should be a "cluster" (or "bundle," or any other suitable name) that groups together some data, a set of shapes, and a set of rules. We should also decide whether shapes are applied before the rules or vice versa, with respect to the two operations infer() and query().

I understood that you and the others were considering this a viable idea, but it is definitely something to discuss with the SHACL Profiling task force. How should we proceed? I can raise the point at the next WG meeting on Monday, or I could open an issue directly in the Packaging SHACL section (once I figure out how 😅), but I’m not sure if that’s the proper procedure. By the way, the SHACL 1.2 Editor Draft now displays entirely black in my browser, and the githack URLs give a 404, so I’m not sure what’s going on.

============================
We know that:

Apply rules: rules+data -> new information

could also mean creating new blank nodes or literals, which could lead to infinite loops. We discussed at the meeting that we should explain at some point how to prevent these infinite loops (but not in Section 2!).

It's not fully clear to me how to formally prevent them in the grammar, because infinite loops are triggered by rules whose antecedents are always satisfied (or satisfied, then not, then satisfied again, and so on, infinitely). Is there a way to constrain the grammar to avoid this? I can't think of one, but I'm happy to learn 😊 Alternatively, we could simply add a disclaimer noting the risk of infinite loops and that it's the user's responsibility to ensure their rule set does not generate them.

============================
I notice that the table of contents above no longer mentions aggregate functions. I think these should be enabled by the grammar, as they are needed in many use cases.

For example, with a PhD student of mine, I am developing a system to check compliance with Ghana Petroleum Commission regulations (see this paper). One regulation requires companies operating in Ghana to employ at least 80% Ghanaians among the technical staff after 5 years. To infer whether a company complies, we must: (1) count the total employees; (2) count the Ghanaian employees; (3) check that (2) >= 0.8 * (1).

This obviously requires aggregate functions (COUNT).
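The three steps can be sketched in Python as follows. This is an illustration only: the property and class names (:employs, :has-nationality, :Compliant), the data, and the function name are assumptions made for the example, "at least 80%" is read as count of Ghanaian employees >= 0.8 * total, and the "technical staff" and "after 5 years" conditions are left out. Integer arithmetic is used to avoid floating-point rounding at the threshold.

```python
def compliant_companies(triples, min_percent=80):
    """Infer (company, rdf:type, :Compliant) from two counts per company."""
    staff = {}        # company -> set of employees
    ghanaian = set()  # employees whose nationality is :Ghana
    for s, p, o in triples:
        if p == ":employs":
            staff.setdefault(s, set()).add(o)
        elif p == ":has-nationality" and o == ":Ghana":
            ghanaian.add(s)
    inferred = set()
    for company, employees in staff.items():
        total = len(employees)                # step (1)
        local = len(employees & ghanaian)     # step (2)
        if local * 100 >= min_percent * total:  # step (3)
            inferred.add((company, "rdf:type", ":Compliant"))
    return inferred

# Hypothetical data: :acme has 4 of 5 Ghanaian employees, :globex 1 of 2.
DATA = {
    (":acme", ":employs", ":e1"), (":e1", ":has-nationality", ":Ghana"),
    (":acme", ":employs", ":e2"), (":e2", ":has-nationality", ":Ghana"),
    (":acme", ":employs", ":e3"), (":e3", ":has-nationality", ":Ghana"),
    (":acme", ":employs", ":e4"), (":e4", ":has-nationality", ":Ghana"),
    (":acme", ":employs", ":e5"), (":e5", ":has-nationality", ":UK"),
    (":globex", ":employs", ":e6"), (":e6", ":has-nationality", ":Ghana"),
    (":globex", ":employs", ":e7"), (":e7", ":has-nationality", ":UK"),
}
result = compliant_companies(DATA)
```

The output is a new triple rather than a validation report, which is the sense in which this is treated as inference here.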

I've read more carefully how aggregates are used in SHACL 1.2 Node Expressions, but I don’t think they can be directly used here. This isn't a validation problem: the data are valid, we must infer whether the data comply with the regulations or not. Also, these regulations can include exceptions, e.g., companies may be exempt under certain conditions, but evaluating these conditions may require several additional operations, potentially including more aggregates. Therefore, I don't think this is a validation issue, it looks like an inference one.

The problem is indeed more general: some inferences are required only when certain quotas or thresholds are met, or when the sum of some values exceeds a given limit, or in similar situations. Think, for example, of applications in finance.

Nevertheless, I’m not enough of an expert in stratification to know whether the basic stratification method used for negation-as-failure can be easily extended to aggregates as well, e.g., by evaluating aggregate rules only after all non-aggregate rules have been evaluated.

Cheers,
Livio

1 reply

afs Nov 6, 2025
Collaborator Author

============================

Attaching Rules to Shapes
Does (some) targets as patterns work for a definition?

As I mentioned at the meeting, I think this should be specified in 2. Packaging SHACL.

Maybe raise a github issue and label it appropriately?

We should also decide whether shapes are applied before the rules or vice versa, with respect to the two operations infer() and query().

If shapes are triggered by targets, including the targets in the pattern (targets are ways to find nodes in the data graph) may be the way to treat them as rules like any other.

[ ] sh:targetClass :SomeClass .

is

. . . WHERE { ?x rdf:type :SomeClass .  ... }

Something to explore.

Does packaging change that?

============================
We know that:

Apply rules: rules+data -> new information

could also mean creating new blank nodes or literals, which could lead to infinite loops. We discussed at the meeting that we should explain at some point how to prevent these infinite loops (but not in Section 2!).

It's not fully clear to me how to formally prevent them in the grammar,

Not by syntax. The unbounded outputs come from dependency loops.

What could be done is ensuring that there aren't dependency loops (same situation as NAF).
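A dependency-loop check of this kind can be sketched in Python. The outline leaves the definition of "depends on" open, so the sketch assumes a deliberately simple one: rule A depends on rule B if B's head predicate appears in A's body. Each rule is reduced to (head predicate, body predicates), and the rule names and predicates are hypothetical.

```python
# Hypothetical rules, summarised as (head predicate, body predicates).
RULES = {
    "grandparent": (":grandparent", [":parent", ":parent"]),
    # e.g. { ?x :next ?y } -> { ?y :next _:new }, which mints a new blank node
    "next-node": (":next", [":next"]),
}

def dependency_graph(rules):
    """Map each rule to the rules whose head predicate occurs in its body."""
    return {a: {b for b, (head, _) in rules.items() if head in body}
            for a, (_, body) in rules.items()}

def find_cycle(graph):
    """Return one dependency loop as a list of rule names, or None."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {node: WHITE for node in graph}
    stack = []

    def visit(node):
        colour[node] = GREY
        stack.append(node)
        for nxt in sorted(graph[node]):
            if colour[nxt] == GREY:          # back edge: loop found
                return stack[stack.index(nxt):]
            if colour[nxt] == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        stack.pop()
        colour[node] = BLACK
        return None

    for node in sorted(graph):
        if colour[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None

loop = find_cycle(dependency_graph(RULES))
```

A single rule that depends on itself is reported as a loop of length one. This blunt check also flags harmless recursion such as :ancestor; restricting it to loops that pass through a rule creating new blank nodes or literals is one possible refinement, not something settled in this thread.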

============================
I notice that the table of contents above no longer mentions aggregate functions. I think these should be enabled by the grammar, as they are needed in many use cases.

"many" needs qualifying.

This obviously requires aggregate functions (COUNT).

I've read more carefully how aggregates are used in SHACL 1.2 Node Expressions, but I don’t think they can be directly used here.

What's the problem with node expression aggregates?

liviorobaldo · 2025-11-06T21:16:37Z

liviorobaldo
Nov 6, 2025
Collaborator

Maybe raise a github issue and label it appropriately?

Ok, let me try... tomorrow :-)

If shapes are triggered by targets, including the targets in the pattern (targets are ways to find nodes in the data graph) may be the way to treat them as rules like any other.

Indeed. I was already told that shapes are like rules, with the main difference being that they produce an error message rather than new triples. So, in principle, everything could be modelled as rules. As you say, it’s something worth exploring. However, even if it works technically, I’m not sure it’s a good idea conceptually. Perhaps we should instead maintain a clear conceptual distinction between validation and inference, using two separate constructs to better emphasize this difference.

Okay about infinite loops. I also thought about the parallel with NAF, but the difference is that an infinite loop can be triggered even by a single rule. However, I can now see that the rule would depend on itself, so what you propose would still work.

"many" needs qualifying.
What's the problem with node expression aggregates?

As I mentioned in my previous reply, we might need rules to infer new values when certain quotas or thresholds are met, or when the sum of some values exceeds a given limit. I haven’t conducted empirical analyses to determine how often this need arises, but intuitively, it seems like it could be fairly common. I actually worked on a small use case with my PhD student and already ran into this need. Maybe I was "unlucky" and found a rare case, but even if it is seldom needed, I don’t see why we should prevent aggregates in the bodies. Are they really that much harder to handle than negation-as-failure?

Perhaps the problem (my problem) is that I haven’t fully understood node expressions. Can they also be used for rules? From the current draft, I understood that they can only be used for shapes, i.e., for validating the data.

How would that work, for example, in the use case I described earlier? Suppose we have two classes: Company and Employee. Individuals of type Company are linked to individuals of type Employee via the property employs, while another property, has-nationality, links employees to their nationality, e.g., UK or Ghana.

Can we use node expressions to state that if at least 80% of a company’s employees are from Ghana, then the company belongs to the class Compliant? In my experience, this shouldn't be modeled as a validation problem. Compliance with regulations is an inference problem. In fact, determining whether a norm applies can require complex reasoning, because exceptions may exist. Similarly, if we infer that a company is not compliant, we may need to trigger additional rules to determine the sanctions or compensations the company must provide for non-compliance.

With SHACL-SPARQL rules, we still had to specify sh:targetClass. Node expression aggregates could then be used in this sh:targetClass. But with the new SHACL 1.2 format, we no longer have sh:targetClass. Can we use node expression aggregates in the bodies of rules?

As I understand it, with the current grammar we cannot. That’s why I was proposing allowing aggregates in the bodies and then extending stratification to them (evaluating aggregate rules only after all non-aggregate rules have been evaluated). However, there may be some reason (which I don’t know) why stratification works for negation-as-failure but not for aggregates.
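To make the proposal concrete, here is a minimal Python sketch of evaluating rules stratum by stratum, with the aggregate rule placed after the rule that feeds it. It is an illustration under assumptions, not a claim about how Shape Rules would define this: rules are plain Python functions from a set of triples to new triples, :worksFor is an invented property, and the 80% threshold is taken from the example above.

```python
def derive_employs(triples):
    """Non-aggregate rule: { ?e :worksFor ?c } -> { ?c :employs ?e }."""
    return {(o, ":employs", s) for (s, p, o) in triples if p == ":worksFor"}

def derive_compliant(triples, min_percent=80):
    """Aggregate rule: at least min_percent of employees are Ghanaian."""
    staff = {}
    for s, p, o in triples:
        if p == ":employs":
            staff.setdefault(s, set()).add(o)
    ghanaian = {s for (s, p, o) in triples
                if p == ":has-nationality" and o == ":Ghana"}
    return {(c, "rdf:type", ":Compliant") for c, emps in staff.items()
            if len(emps & ghanaian) * 100 >= min_percent * len(emps)}

def run_strata(data, strata):
    """Run each stratum to a fixpoint before starting the next one."""
    known = set(data)
    for stratum in strata:
        changed = True
        while changed:
            new = set().union(*(rule(known) for rule in stratum)) - known
            known |= new
            changed = bool(new)
    return known

# Hypothetical data: :e2's employment is only stated via :worksFor.
DATA = {
    (":acme", ":employs", ":e1"), (":e1", ":has-nationality", ":Ghana"),
    (":e2", ":worksFor", ":acme"), (":e2", ":has-nationality", ":UK"),
}
COMPLIANT = (":acme", "rdf:type", ":Compliant")

stratified = run_strata(DATA, [[derive_employs], [derive_compliant]])
wrong_order = run_strata(DATA, [[derive_compliant], [derive_employs]])
```

With the aggregate in the later stratum, :acme has two employees, one of them Ghanaian, and is not inferred to be compliant. Running the aggregate first counts only :e1 and infers compliance, which cannot be withdrawn once :e2 is added. That is the same kind of ordering sensitivity that stratification handles for negation-as-failure; whether the formal treatment carries over unchanged is left open here.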
