diff --git a/modules/ROOT/images/with_clause.svg b/modules/ROOT/images/with_clause.svg
new file mode 100644
index 000000000..f77938445
--- /dev/null
+++ b/modules/ROOT/images/with_clause.svg
@@ -0,0 +1 @@
+[SVG schema diagram; text labels: BOUGHT (date: DATE), SUPPLIES, Customer (firstName: STRING, lastName: STRING, email: STRING, discount: FLOAT), Product (name: STRING, price: INTEGER), Supplier (name: STRING, email: STRING)]
\ No newline at end of file
diff --git a/modules/ROOT/pages/clauses/with.adoc b/modules/ROOT/pages/clauses/with.adoc
index 693bd4002..370c14ac1 100644
--- a/modules/ROOT/pages/clauses/with.adoc
+++ b/modules/ROOT/pages/clauses/with.adoc
@@ -1,289 +1,532 @@
-:description: The `WITH` clause allows query parts to be chained together, piping the results from one to be used as starting points or criteria in the next.
+:description: Information about Cypher's `WITH` clause, which allows query parts to be chained together, piping the results from one part to be used as the starting point of the next.
+:table-caption!:
-[[query-with]]
 = WITH
-The `WITH` clause allows query parts to be chained together, piping the results from one to be used as starting points or criteria in the next.
+The `WITH` clause serves multiple purposes in Cypher:
-[NOTE]
-====
-It is important to note that `WITH` affects variables in scope.
-Any variables not included in the `WITH` clause are not carried over to the rest of the query.
-The wildcard `*` can be used to include all variables that are currently in scope.
-====
+* xref:clauses/with.adoc#create-new-variables[Create new variables]
+* xref:clauses/with.adoc#variable-scope[Control variables in scope]
+* xref:clauses/with.adoc#bind-values-to-variables[Bind the results of expressions to new variables]
+* xref:clauses/with.adoc#aggregations[Perform aggregations]
+* xref:clauses/with.adoc#remove-duplicate-values[Remove duplicate values]
+* xref:clauses/with.adoc#ordering-and-pagination[Order and paginate results]
+* xref:clauses/with.adoc#filter-results[Filter results]
-Using `WITH`, you can manipulate the output before it is passed on to the following query parts.
-Manipulations can be done to the shape and/or number of entries in the result set.
+[[example-graph]]
+== Example graph
+A graph with the following schema is used for the examples below:
-One common usage of `WITH` is to limit the number of entries passed on to other `MATCH` clauses.
-By combining `ORDER BY` and `LIMIT`, it is possible to get the top X entries by some criteria and then bring in additional data from the graph.
+image::with_clause.svg[width="600",role="middle"]
-`WITH` can also be used to introduce new variables containing the results of expressions for use in the following query parts (see xref::clauses/with.adoc#with-introduce-variables[Introducing variables for expressions]).
-For convenience, the wildcard `*` expands to all variables that are currently in scope and carries them over to the next query part (see xref::clauses/with.adoc#with-wildcard[Using the wildcard to carry over variables]).
+To recreate the graph, run the following query against an empty Neo4j database.
-Another use is to filter on aggregated values.
-`WITH` is used to introduce aggregates which can then be used in predicates in `WHERE`.
-These aggregate expressions create new bindings in the results.
+[source, cypher, role=test-setup]
+----
+CREATE (techCorp:Supplier {name: 'TechCorp', email: 'contact@techcorp.com'}),
+       (foodies:Supplier {name: 'Foodies Inc.', email: 'info@foodies.com'}),
+
+       (laptop:Product {name: 'Laptop', price: 1000}),
+       (phone:Product {name: 'Phone', price: 500}),
+       (headphones:Product {name: 'Headphones', price: 250}),
+       (chocolate:Product {name: 'Chocolate', price: 5}),
+       (coffee:Product {name: 'Coffee', price: 10}),
+
+       (amir:Customer {firstName: 'Amir', lastName: 'Rahman', email: 'amir.rahman@example.com', discount: 0.1}),
+       (keisha:Customer {firstName: 'Keisha', lastName: 'Nguyen', email: 'keisha.nguyen@example.com', discount: 0.2}),
+       (mateo:Customer {firstName: 'Mateo', lastName: 'Ortega', email: 'mateo.ortega@example.com', discount: 0.05}),
+       (hannah:Customer {firstName: 'Hannah', lastName: 'Connor', email: 'hannah.connor@example.com', discount: 0.15}),
+       (leila:Customer {firstName: 'Leila', lastName: 'Haddad', email: 'leila.haddad@example.com', discount: 0.1}),
+       (niko:Customer {firstName: 'Niko', lastName: 'Petrov', email: 'niko.petrov@example.com', discount: 0.25}),
+       (yusuf:Customer {firstName: 'Yusuf', lastName: 'Abdi', email: 'yusuf.abdi@example.com', discount: 0.1}),
+
+       (amir)-[:BUYS {date: date('2024-10-09')}]->(laptop),
+       (amir)-[:BUYS {date: date('2025-01-10')}]->(chocolate),
+       (keisha)-[:BUYS {date: date('2023-07-09')}]->(headphones),
+       (mateo)-[:BUYS {date: date('2025-03-05')}]->(chocolate),
+       (mateo)-[:BUYS {date: date('2025-03-05')}]->(coffee),
+       (mateo)-[:BUYS {date: date('2024-04-11')}]->(laptop),
+       (hannah)-[:BUYS {date: date('2023-12-11')}]->(coffee),
+       (hannah)-[:BUYS {date: date('2024-06-02')}]->(headphones),
+       (leila)-[:BUYS {date: date('2023-05-17')}]->(laptop),
+       (niko)-[:BUYS {date: date('2025-02-27')}]->(phone),
+       (niko)-[:BUYS {date: date('2024-08-23')}]->(headphones),
+       (niko)-[:BUYS {date: date('2024-12-24')}]->(coffee),
+       (yusuf)-[:BUYS {date: date('2024-12-24')}]->(chocolate),
+       (yusuf)-[:BUYS {date: date('2025-01-02')}]->(laptop),
+
+       (techCorp)-[:SUPPLIES]->(laptop),
+       (techCorp)-[:SUPPLIES]->(phone),
+       (techCorp)-[:SUPPLIES]->(headphones),
+       (foodies)-[:SUPPLIES]->(chocolate),
+       (foodies)-[:SUPPLIES]->(coffee)
+----
-image:graph_with_clause.svg[]
+[[create-new-variables]]
+== Create new variables
-////
-[source, cypher, role=test-setup]
+`WITH` can be used in combination with the `AS` keyword to bind new variables which can then be passed to subsequent clauses.
+
+.Create a new variable
+[source, cypher]
 ----
-CREATE
-  (a {name: 'Anders'}),
-  (b {name: 'Bossman'}),
-  (c {name: 'Caesar'}),
-  (d {name: 'David'}),
-  (e {name: 'George'}),
-  (a)-[:KNOWS]->(b),
-  (a)-[:BLOCKS]->(c),
-  (d)-[:KNOWS]->(a),
-  (b)-[:KNOWS]->(e),
-  (c)-[:KNOWS]->(e),
-  (b)-[:BLOCKS]->(d)
+WITH [1, 2, 3] AS list
+RETURN list
 ----
-////
+.Result
+[role="queryresult",options="header,footer",cols="1*(:Product {name: 'Chocolate'})
+WITH c AS customer
+RETURN customer.firstName AS chocolateCustomer
 ----
-This query returns the name of persons connected to *'George'* whose name starts with a `C`, regardless of capitalization.
-
 .Result
 [role="queryresult",options="header,footer",cols="1*(p:Product {name: 'Chocolate'})
+WITH c.firstName AS chocolateCustomers
+RETURN chocolateCustomers,
+       p.price AS chocolatePrice
+----
-.Query
-[source, cypher, indent=0]
+.Error message
+[source, error]
 ----
-MATCH (person)-[r]->(otherPerson)
-WITH *, type(r) AS connectionType
-RETURN person.name, otherPerson.name, connectionType
+Variable `p` not defined
 ----
-This query returns the names of all related persons and the type of relationship between them.
+.Retain all variables with `WITH *`
+[source, cypher]
+----
+MATCH (supplier:Supplier)-[r]->(product:Product)
+WITH *
+RETURN supplier.name AS company,
+       type(r) AS relType,
+       product.name AS product
+----
 .Result
 [role="queryresult",options="header,footer",cols="3* Import variables].
-.Query
-[source, cypher, indent=0]
+[[bind-values-to-variables]]
+== Bind values to variables
+
+`WITH` can be used to assign the values of expressions to variables.
+In the query below, the value of the xref:expressions/string-operators.adoc[`STRING` concatenation] expression is bound to a new variable `customerFullName`, and the value of the expression `chocolate.price * (1 - customer.discount)` is bound to `chocolateNetPrice`. Both variables are then available in the `RETURN` clause.
+
+.Bind values to variables
+[source, cypher]
 ----
-MATCH (david {name: 'David'})--(otherPerson)-->()
-WITH otherPerson, count(*) AS foaf
-WHERE foaf > 1
-RETURN otherPerson.name
+MATCH (customer:Customer)-[:BUYS]->(chocolate:Product {name: 'Chocolate'})
+WITH customer.firstName || ' ' || customer.lastName AS customerFullName,
+     chocolate.price * (1 - customer.discount) AS chocolateNetPrice
+RETURN customerFullName,
+       chocolateNetPrice
 ----
-The name of the person connected to *'David'* with the at least more than one outgoing relationship will be returned by the query.
+.Result
+[role="queryresult",options="header,footer",cols="2*= 500 AS isExpensive
+WITH p, isExpensive, NOT isExpensive AS isAffordable
+WITH p, isExpensive, isAffordable,
+     CASE
+       WHEN isExpensive THEN 'High-end'
+       ELSE 'Budget'
+     END AS discountCategory
+RETURN p.name AS product,
+       p.price AS price,
+       isAffordable,
+       discountCategory
+ORDER BY price
+----
 .Result
-[role="queryresult",options="header,footer",cols="1* Chaining expressions].
-[[with-sort-results-before-using-collect-on-them]]
-== Sort results before using collect on them
+[[aggregations]]
+== Aggregations
-You can sort your results before passing them to collect, thus sorting the resulting list.
+The `WITH` clause can perform aggregations and bind the results to new variables.
+In this example, the xref:functions/aggregating.adoc#functions-sum[`sum()`] function is used to calculate the total spent by each customer, and the value for each is bound to the new variable `totalSpent`.
+The xref:functions/aggregating.adoc#functions-collect[`collect()`] function is used to collect the names of the products each customer bought into a `LIST` value bound to the `productsBought` variable.
-.Query
-[source, cypher, indent=0]
+.`WITH` performing aggregations
+[source, cypher]
 ----
-MATCH (n)
-WITH n
-ORDER BY n.name DESC
-LIMIT 3
-RETURN collect(n.name)
+MATCH (c:Customer)-[:BUYS]->(p:Product)
+WITH c.firstName AS customer,
+     sum(p.price) AS totalSpent,
+     collect(p.name) AS productsBought
+RETURN customer,
+       totalSpent,
+       productsBought
+ORDER BY totalSpent DESC
 ----
-A list of the names of people in reverse order, limited to 3, is returned in a list.
-
 .Result
-[role="queryresult",options="header,footer",cols="1* 2
-RETURN x
+MATCH (c:Customer)-[:BUYS]->(p:Product)
+WITH c,
+     sum(p.price) AS totalSpent
+     ORDER BY totalSpent DESC
+RETURN c.firstName AS customer, totalSpent
 ----
-The limit is first applied, reducing the rows to the first 5 items in the list. The filter is then applied, reducing the final result as seen below:
-
 .Result
-[role="queryresult",options="header,footer",cols="1* 2
-WITH x
-LIMIT 5
-RETURN x
+MATCH (c:Customer)-[:BUYS]->(p:Product)
+WITH c,
+     sum(p.price) AS totalSpent
+     ORDER BY totalSpent DESC
+     LIMIT 3
+SET c.topSpender = true
+RETURN c.firstName AS customer,
+       totalSpent,
+       c.topSpender AS topSpender
 ----
-This time the filter is applied first, reducing the rows to consist of the list `[3, 4, 5, 6]`.
-Then the limit is applied.
-As the limit is larger than the total number of remaining rows, all rows are returned.
+[role="queryresult",options="header,footer", cols="3*(p:Product)
+WITH c,
+     sum(p.price) AS totalSpent
+     ORDER BY totalSpent DESC
+     SKIP 3
+SET c.topSpender = false
+RETURN c.firstName AS customer,
+       totalSpent,
+       c.topSpender AS topSpender
+----
+
+[role="queryresult",options="header,footer", cols="3*(p)
-WITH DISTINCT p.name AS name
-RETURN name
+MATCH (:Supplier {name: 'Foodies Inc.'})-[:SUPPLIES]->(p:Product)
+WITH p
+     ORDER BY p.price DESC
+     LIMIT 1
+MATCH (p)<-[:BUYS]-(c:Customer)
+RETURN p.name AS product,
+       p.price AS price,
+       collect(c.firstName) AS customers
 ----
-`'George'` is returned by the query, but only once (without the `DISTINCT` operator it would have been returned twice because there are two `KNOWS` relationships going to it):
-.Result
-[role="queryresult",options="header,footer",cols="1*(p)
-WITH ALL p.name AS name
-RETURN name
+UNWIND [1, 2, 3, 4, 5, 6] AS x
+WITH x
+     WHERE x > 2
+RETURN x
 ----
-The same name is returned twice, as there are two `KNOWS` relationships connecting to it.
+[role="queryresult",options="header,footer", cols="1*(p:Product)<-[:BUYS]-(c:Customer)
+WITH s,
+     sum(p.price) AS totalSales,
+     count(DISTINCT c) AS uniqueCustomers
+     WHERE totalSales > 1000
+RETURN s.name AS supplier,
+       totalSales,
+       uniqueCustomers
+----
+
+[role="queryresult",options="header,footer", cols="3*` constructs.
+For more information, see xref:clauses/filter.adoc#filter-with-where[`FILTER` as a substitute for `WITH * WHERE`].
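+
+As a minimal sketch of the `WITH * WHERE` construct that the cross-reference refers to, the query below reuses the `UNWIND` list from the filtering example. It is an illustration rather than part of the original change: the wildcard keeps every variable in scope while the predicate filters rows, so it should return the values 3, 4, 5, and 6.
+
+.`WITH * WHERE` sketch
+[source, cypher]
+----
+// Illustrative only: carry over all variables in scope, then filter rows
+UNWIND [1, 2, 3, 4, 5, 6] AS x
+WITH *
+     WHERE x > 2
+RETURN x
+----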