Commit e61595b

committed
Restyle RBAC limitations and warn about use of fail-open DENY semantics (neo4j#2488)
1 parent a0d7ec4 commit e61595b

File tree

1 file changed

+124
-50
lines changed

modules/ROOT/pages/authentication-authorization/limitations.adoc

Lines changed: 124 additions & 50 deletions
Original file line number | Diff line number | Diff line change
@@ -14,26 +14,33 @@ CREATE ROLE unrestricted;
1414
[[access-control-limitations]]
1515
= Limitations
1616

17-
The known limitations and implications of Neo4j's role-based access control security are described in this section.
17+
It is very important to apply the principle of least privilege when defining user roles and privileges.
18+
Further to that, Neo4j's role-based access control has some limitations and implications that users should be aware of, such as:
19+
20+
* Impact on query results regardless of whether indexes are used.
21+
* Impact on query results when nodes have multiple labels.
22+
* The need for careful management of user roles and privileges to avoid unintended data exposure.
23+
* Potential performance impacts when querying large graphs with complex security rules.
1824

1925
[[access-control-limitations-indexes]]
2026
== Security and indexes
2127

22-
As described in link:{neo4j-docs-base-uri}/cypher-manual/current/indexes/search-performance-indexes/overview/[Cypher Manual -> Indexes for search performance], Neo4j {neo4j-version} supports the creation and use of indexes to improve the performance of Cypher queries.
28+
Neo4j lets you create and use indexes to speed up Cypher queries.
29+
See the link:{neo4j-docs-base-uri}/cypher-manual/current/indexes/search-performance-indexes/[Cypher Manual -> Indexes] for more details on the different types of indexes available in Neo4j.
2330

24-
Note that the Neo4j security model impacts the results of queries, regardless if the indexes are used or not.
25-
When using non full-text Neo4j indexes, a Cypher query will always return the same results it would have if no index existed.
26-
This means that, if the security model causes fewer results to be returned due to restricted read access in xref:authentication-authorization/manage-privileges.adoc[Graph and sub-graph access control],
31+
However, Neo4j’s security model still controls what results you see, regardless of whether or not you use indexes.
32+
For example, when you use link:{neo4j-docs-base-uri}/cypher-manual/current/indexes/search-performance-indexes/overview/[search-performance indexes] (non-full-text), queries return the same results they would without any index.
33+
This means that, if the security model causes fewer results to be returned due to restricted read access in xref:authentication-authorization/manage-privileges.adoc[graph and sub-graph access control],
2734
the index will also return the same fewer results.
2835

29-
However, this rule is not fully obeyed by link:{neo4j-docs-base-uri}/cypher-manual/current/indexes/semantic-indexes/full-text-indexes/[Cypher Manual -> Indexes for full-text search].
30-
These specific indexes are backed by _Lucene_ internally.
31-
It is therefore not possible to know for certain whether a security violation has affected each specific entry returned from the index.
32-
In face of this, Neo4j will return zero results from full-text indexes in case it is determined that any result might be violating the security privileges active for that query.
36+
link:{neo4j-docs-base-uri}/cypher-manual/current/indexes/semantic-indexes/full-text-indexes/[Full-text indexes] work differently.
37+
These indexes use Lucene under the hood.
38+
Because of that, Neo4j cannot check whether a security violation has affected each specific entry returned from the index.
39+
So, if there is any chance a result might violate active security privileges for a query, Neo4j returns zero results from the full-text indexes.
3340

34-
Since full-text indexes are not automatically used by Cypher, they do not lead to the case where the same Cypher query would return different results simply because such an index was created.
35-
Users need to explicitly call procedures to use these indexes.
36-
The problem is only that, if this behavior is not known by the user, they might expect the full-text index to return the same results that a different, but semantically similar, Cypher query does.
41+
Also, Cypher does not use full-text indexes automatically; you have to explicitly call procedures to use them.
42+
This avoids a situation where the same Cypher query would return different results simply because such an index exists.
43+
The problem is that if you do not know this behavior, you might expect the full-text index to return the same results that a different but semantically similar Cypher query does.
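
As an illustration, the two queries below look for similar data but go through different mechanisms.
This is only a sketch: the `User` label and `name` property are assumptions made for illustration, and `userNames` is the full-text index name used in the example later in this section.

[source, cypher]
----
// Plain Cypher: results are the same whether or not a search-performance index is used.
MATCH (n:User)
WHERE n.name CONTAINS 'ndy'
RETURN n.name;

// Full-text index: only consulted because the procedure is called explicitly.
CALL db.index.fulltext.queryNodes("userNames", "ndy") YIELD node, score
RETURN node.name;
----

For a user with restricted read privileges, the first query returns the rows that user is allowed to see, while the second may return zero results.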
3744

3845
=== Example with denied properties
3946

@@ -54,16 +61,16 @@ Full-text indexes support multiple labels.
5461
See link:{neo4j-docs-base-uri}/cypher-manual/current/indexes/semantic-indexes/full-text-indexes//[Cypher Manual -> Indexes for full-text search] for more details on creating and using full-text indexes.
5562
====
5663

57-
After creating these indexes, it would appear that the latter two indexes accomplish the same thing.
64+
After creating these indexes, it may look as though the latter two indexes accomplish the same thing.
5865
However, this is not completely accurate.
5966
The composite and full-text indexes behave in different ways and are focused on different use cases.
6067
A key difference is that full-text indexes are backed by _Lucene_, and will use the _Lucene_ syntax for querying.
6168

6269
This has consequences for users restricted on the labels or properties involved in the indexes.
6370
Ideally, if the labels and properties in the index are denied, they can correctly return zero results from both native indexes and full-text indexes.
64-
However, there are borderline cases where this is not as simple.
71+
However, there are borderline cases where this is not that simple.
6572

66-
Imagine the following nodes were added to the database:
73+
Imagine the following nodes are added to the database:
6774

6875
[source, cypher]
6976
----
@@ -120,7 +127,7 @@ CALL db.index.fulltext.queryNodes("userNames", "ndy") YIELD node, score
120127
RETURN node.name
121128
----
122129

123-
The problem now is that it is not certain whether the results provided by the index were achieved due to a match to the `name` or the `surname` property.
130+
The problem now is that it is not certain whether the results provided by the index are achieved due to a match to the `name` or the `surname` property.
124131
The steps taken by the query engine would be:
125132

126133
* Run a _Lucene_ query on the full-text index to produce results containing `ndy` in either property, leading to five results.
@@ -180,60 +187,127 @@ Otherwise, it will process as described before.
180187

181188
In this case, the query will return zero results rather than simply returning the results `Andy` and `Sandy`, which might have been expected.
182189

190+
=== Avoiding fail-open `DENY` behavior
191+
192+
A `DENY` rule fails open when its criteria are not met: Neo4j does not apply the restriction, so access is granted whenever a broader `GRANT` exists.
193+
This can lead to unintended data exposure if the `DENY` rule is not carefully crafted.
194+
To avoid this, you can apply the principle of least privilege and allow access only to the specific data that the user should see.
195+
196+
For example, consider the following scenarios:
197+
198+
.Example of an un-met `DENY` failing open with property-based RBAC
199+
====
200+
You grant a user access to a property and try to restrict it with a `DENY` rule.
201+
However, if the `DENY` rule does not match any data, for example, if the property is null or misspelled, the `DENY` rule will not apply, and the user can still access the property.
202+
[source, cypher]
203+
----
204+
GRANT READ {salary} ON GRAPH * NODES Employee TO myRole
205+
DENY READ {salary} ON GRAPH * FOR (e:Employee) WHERE e.position = 'CEO' TO myRole
206+
----
207+
In this case, if the `e.position` property is null or misspelled, the `DENY` rule will not apply, and `myRole` will see the `salary` property.
208+
209+
A better way is to apply the principle of least privilege and only grant access to the `salary` property for employees whose position is not 'CEO'.
210+
[source, cypher]
211+
----
212+
GRANT READ {salary} ON GRAPH * FOR (e:Employee) WHERE e.position <> 'CEO' TO myRole
213+
----
214+
215+
Or, if for some reason using `DENY` is unavoidable, the problem can be mitigated by adding an additional `DENY` to cover the case where `e.position` is null:
216+
[source, cypher]
217+
----
218+
DENY READ {salary} ON GRAPH * FOR (e:Employee) WHERE e.position IS NULL TO myRole
219+
----
220+
This way, if `e.position` is null, the additional `DENY` applies and the user will not see the `salary` property.
221+
222+
Alternatively, you can add a constraint to ensure that the `e.position` property cannot be null, so the `DENY` condition is always checkable:
223+
[source, cypher]
224+
----
225+
CREATE CONSTRAINT FOR (e:Employee) REQUIRE e.position IS NOT NULL;
226+
----
227+
This way, the `DENY` will never fail open due to null values, and the user will not see the `salary` property for employees whose position is 'CEO'.
228+
229+
====
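
Before relying on either mitigation, it can help to check how many nodes the equality-based `DENY` would miss and to review what the role is actually allowed to do.
The following is a minimal sketch, assuming the `Employee` label, the `position` property, and the `myRole` role from the example above, and that the statements are run by an administrator.
Creating a property existence constraint fails while nodes without the property exist, so the count should be zero before the constraint is added.

[source, cypher]
----
// Count Employee nodes without a position; a DENY on e.position = 'CEO' cannot match these.
MATCH (e:Employee)
WHERE e.position IS NULL
RETURN count(e) AS employeesWithoutPosition;

// List the privileges currently assigned to the role, shown as Cypher commands.
SHOW ROLE myRole PRIVILEGES AS COMMANDS;
----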
230+
231+
.Example of an un-met `DENY` failing open with label-based RBAC
232+
====
233+
234+
In a similar way, a `DENY` rule does not apply to data it does not match, so an overly broad `GRANT` remains in effect for that data.
235+
[source, cypher]
236+
----
237+
GRANT READ {salary} ON GRAPH * NODES * TO myRole;
238+
----
239+
240+
This grants read access to the `salary` property on all nodes, including those that should not be accessible.
241+
242+
Then, you try to restrict it with a `DENY` rule to prevent access to the `salary` property on nodes labeled `Management`:
243+
[source, cypher]
244+
----
245+
DENY READ {salary} ON GRAPH * NODES Management TO myRole;
246+
----
247+
In this case, if the `Management` label is not present on a node that has the `salary` property, the `DENY` rule will not apply, and `myRole` will still see the `salary` property on that node.
248+
249+
A better way is to apply the principle of least privilege and only grant access to the `salary` property for nodes that have a specific label, such as `IndividualContributor`:
250+
[source, cypher]
251+
----
252+
GRANT READ {salary} ON GRAPH * NODES IndividualContributor TO myRole;
253+
----
254+
This way, the user will only see the `salary` property on nodes that have the `IndividualContributor` label, and not on any other nodes.
255+
====
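
If a broad grant such as the one on `NODES *` is already in place, one possible cleanup is to find the nodes it exposes and then remove it in favor of the narrower grant.
This is a sketch under the assumptions of the example above (the `salary` property, the `IndividualContributor` label, and the `myRole` role), run by an administrator:

[source, cypher]
----
// Find nodes that carry a salary but lack the label the narrower grant relies on.
MATCH (n)
WHERE n.salary IS NOT NULL AND NOT n:IndividualContributor
RETURN labels(n) AS labels, count(*) AS nodes;

// Remove the broad grant so that only label-specific grants remain.
REVOKE GRANT READ {salary} ON GRAPH * NODES * FROM myRole;
----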
183256

184257
[[access-control-limitations-labels]]
185258
== Security and labels
186259

187260
=== Traversing the graph with multi-labeled nodes
188261

189-
The general influence of access control privileges on graph traversal is described in detail in xref:authentication-authorization/manage-privileges.adoc[Graph and sub-graph access control].
190-
The following section will only focus on nodes due to their ability to have multiple labels.
191-
Relationships can only have one type of label and thus they do not exhibit the behavior this section aims to clarify.
192-
While this section will not mention relationships further, the general function of the traverse privilege also applies to them.
262+
In Neo4j, nodes can have multiple labels, but relationships only have one type.
263+
This is important when it comes to controlling who can see what.
264+
265+
The following section only focuses on nodes because they can have multiple labels.
266+
The same general rules apply to relationships, but they are simpler.
193267

194-
For any node that is traversable, due to `GRANT TRAVERSE` or `GRANT MATCH`,
195-
the user can get information about the attached labels by calling the built-in `labels()` function.
196-
In the case of nodes with multiple labels, they can be returned to users that weren't directly granted access to.
268+
For details on the general influence of access control privileges on graph traversal, see xref:authentication-authorization/manage-privileges.adoc[Graph and sub-graph access control].
197269

198-
To give an illustrative example, imagine a graph with three nodes: one labeled `:A`, another labeled `:B` and one with the labels `:A` and `:B`.
199-
In this case, there is a user with the role `custom` defined by:
200270

271+
If a user is granted access to a traversable node using `GRANT TRAVERSE` or `GRANT MATCH`, they will be able to get information about the attached labels by calling the built-in `labels()` function.
272+
In the case of nodes with multiple labels, this means that the user will be able to see all labels attached to the node, even if they were not granted access to traverse on some of those labels.
273+
274+
For example, if a user has the following role:
201275
[source, cypher]
202276
----
203277
GRANT TRAVERSE ON GRAPH * NODES A TO custom
204278
----
205279

206-
If that user were to execute
207-
280+
And the graph contains three nodes: one labeled `:A`, another labeled `:B`, and one with both labels `:A` and `:B`.
281+
If the user executes the following query:
208282
[source, cypher]
209283
----
210284
MATCH (n:A)
211285
RETURN n, labels(n)
212286
----
287+
They will get a result with two nodes: the node with label `:A` and the node with labels `:A :B`.
213288

214-
They would get a result with two nodes: the node that was labeled with `:A` and the node with labels `:A :B`.
215-
216-
In contrast, executing
289+
In contrast, if the user executes:
217290

218291
[source, cypher]
219292
----
220293
MATCH (n:B)
221294
RETURN n, labels(n)
222295
----
223296

224-
This will return only the one node that has both labels: `:A` and `:B`.
225-
Even though `:B` did not have access to traversals, there is one node with that label accessible in the dataset due to the allow-listed label `:A` that is attached to the same node.
297+
They will get only the node that has both labels: `:A` and `:B`.
298+
Even though `:B` does not have access to traversals, there is one node with that label accessible in the dataset due to the allow-listed label `:A` that is attached to the same node.
226299

227-
If a user is denied to traverse on a label they will never get results from any node that has this label attached to it.
300+
If a user is denied traversal on a label, they will never get results from any node that has this label attached to it.
228301
Thus, the label name will never show up for them.
229-
As an example, this can be done by executing:
302+
For example, if the user has the following role:
230303

231304
[source, cypher]
232305
----
233306
DENY TRAVERSE ON GRAPH * NODES B TO custom
234307
----
235308

236-
The query
309+
And the graph contains the same three nodes as before, the user will not be able to traverse any node that has the label `:B`.
310+
Thus, the query
237311

238312
[source, cypher]
239313
----
@@ -257,25 +331,22 @@ In contrast to the normal graph traversal described in the previous section, the
257331
That means:
258332

259333
* If a label is explicitly whitelisted (granted), it will be returned by this procedure.
260-
* If a label is denied or isn't explicitly allowed, it will not be returned by this procedure.
261-
262-
Reusing the previous example, imagine a graph with three nodes: one labeled `:A`, another labeled `:B` and one with the labels `:A` and `:B`.
263-
In this case, there is a user with the role `custom` defined by:
334+
* If a label is denied or is not explicitly allowed, it will not be returned by this procedure.
264335

336+
For example, if a user has the following role:
265337
[source, cypher]
266338
----
267339
GRANT TRAVERSE ON GRAPH * NODES A TO custom
268340
----
269341

270-
This means that only label `:A` is explicitly allow-listed.
271-
Thus, executing
272-
342+
and the graph contains three nodes: one labeled `:A`, another labeled `:B`, and one with both labels `:A` and `:B`,
343+
the user will be able to execute the following query:
273344
[source, cypher]
274345
----
275346
CALL db.labels()
276347
----
277-
278-
will only return label `:A`, because that is the only label for which traversal was granted.
348+
This will return a list of labels, which in this case will only include the label `:A`.
349+
The label `:B` will not be returned, because the user does not have access to traverse on it.
279350

280351
[[access-control-limitations-non-existing-labels]]
281352
=== Privileges for non-existing labels, relationship types, and property names
@@ -332,15 +403,17 @@ To ensure success on the first attempt, when setting up the privileges for the `
332403
In this example, when creating the custom role, connect to `testing` and run `CALL db.createLabel('A')` to ensure Alice creates the node successfully on her first attempt.
333404

334405

335-
336406
[[access-control-limitations-db-operations]]
337407
== Security and performance
338408

339-
The rules of a security model may impact the performance of some database operations.
340-
This is because extra security checks are necessary, and they require additional data access.
409+
=== Security rules and database operations
410+
411+
The rules of a security model may impact the performance of some database operations, because Neo4j has to do extra security checks, which require additional data access.
341412
For example, count store operations, which are usually fast lookups, may experience notable differences in performance.
342413

343-
The following example shows how the database behaves when adding security rules to roles `restricted` and `unrestricted`:
414+
Let's take the following example.
415+
The database has two roles defined: `restricted` and `unrestricted`.
416+
The `restricted` role has limited access to traversals, while the `unrestricted` role has no restrictions.
344417

345418
[source, cypher]
346419
----
@@ -389,10 +462,11 @@ So due to the additional data access required by the security checks, this opera
389462
|===
390463

391464
[[property-based-access-control-limitations]]
392-
=== Property-based access control limitations
465+
=== Security rules based on property rules and performance
466+
393467
Extra node or relationship-level security checks are necessary when adding security rules based on property rules, and these can have a significant performance impact.
394468

395-
The following example shows how the database behaves when adding security rules for nodes to roles `restricted` and `unrestricted`.
469+
The following example shows how the database behaves when adding security rules for nodes to roles `restricted` and `unrestricted`.
396470
The same limitations apply to relationships.
397471

398472
[source, cypher]
