diff --git a/data-explorer/.openpublishing.redirection.json b/data-explorer/.openpublishing.redirection.json index 6bbe301fbd..3380d7608a 100644 --- a/data-explorer/.openpublishing.redirection.json +++ b/data-explorer/.openpublishing.redirection.json @@ -10,6 +10,21 @@ "redirect_url": "/azure/data-explorer/integrate-overview", "redirect_document_id": false }, + { + "source_path": "graph-best-practices.md", + "redirect_url": "/kusto/query/graph-best-practices?view=azure-data-explorer&preserve-view=true", + "redirect_document_id": false + }, + { + "source_path": "graph-overview.md", + "redirect_url": "/kusto/query/graph-semantics-overview?view=azure-data-explorer&preserve-view=true", + "redirect_document_id": false + }, + { + "source_path": "graph-scenarios.md", + "redirect_url": "/kusto/query/graph-scenarios?view=azure-data-explorer&preserve-view=true", + "redirect_document_id": false + }, { "source_path": "net-standard-ingest-data.md", "redirect_url": "/azure/data-explorer/net-sdk-ingest-data", diff --git a/data-explorer/delete-cluster.md b/data-explorer/delete-cluster.md index 00a9111925..42a6c19d80 100644 --- a/data-explorer/delete-cluster.md +++ b/data-explorer/delete-cluster.md @@ -2,7 +2,7 @@ title: Delete an Azure Data Explorer cluster description: Learn how to delete an Azure Data Explorer cluster. ms.topic: how-to -ms.date: 10/08/2024 +ms.date: 05/28/2025 --- # Delete an Azure Data Explorer cluster @@ -32,11 +32,11 @@ To delete your Azure Data Explorer cluster: 1. In the **Delete cluster** window, type in the name of the cluster. Then, select **Delete**. > [!CAUTION] - > Deleting a cluster is a permanent action and cannot be undone. All cluster content will be lost. To recover the cluster in the initial 14 days soft-delete period, please open a support ticket. + > Deleting a cluster is a permanent action and can't be undone. All cluster content is lost. To recover the cluster in the initial 14 days soft-delete period, open a support ticket. 
## Opt out of soft delete -You can opt out of the soft delete period by setting a tag on your cluster. Once you've set the tag, a deleted cluster won't enter the soft delete period and is permanently deleted immediately. +You can opt out of the soft delete period by setting a tag on your cluster. Once you set the tag, a deleted cluster doesn't enter the soft delete period and is permanently deleted immediately. Tags can be set on Azure resources through the portal, PowerShell, Azure CLI, or ARM templates. For more information on these different methods, see [Use tags to organize your Azure resources](/azure/azure-resource-manager/management/tag-resources). @@ -46,3 +46,4 @@ The key for the tag is `opt-out-of-soft-delete` and the value is `true`. * [Troubleshoot: Failure to create or delete a database or table](troubleshoot-database-table.md) * [Quickstart: Create a cluster and database](create-cluster-and-database.md) +* [Create Support Ticket: Create an Azure support request](/azure/azure-portal/supportability/how-to-create-azure-support-request?branch=main) \ No newline at end of file diff --git a/data-explorer/graph-best-practices.md b/data-explorer/graph-best-practices.md deleted file mode 100644 index 103bf1c6fa..0000000000 --- a/data-explorer/graph-best-practices.md +++ /dev/null @@ -1,284 +0,0 @@ ---- -title: Best practices for Kusto Query Language (KQL) graph semantics -description: Learn about the best practices for Kusto Query Language (KQL) graph semantics. -ms.reviewer: herauch -ms.topic: conceptual -ms.date: 09/03/2023 -# Customer intent: As a data analyst, I want to learn about best practices for KQL graph semantics. ---- - -# Best practices for Kusto Query Language (KQL) graph semantics - -This article explains how to use the graph semantics feature in KQL effectively and efficiently for different use cases and scenarios. 
It shows how to create and query graphs with the syntax and operators, and how to integrate them with other KQL features and functions. It also helps users avoid common pitfalls or errors, such as creating graphs that exceed memory or performance limits, or applying unsuitable or incompatible filters, projections, or aggregations. - -## Size of graph - -The [make-graph operator](/kusto/query/make-graph-operator?view=azure-data-explorer&preserve-view=true) creates an in-memory representation of a graph. It consists of the graph structure itself and its properties. When making a graph, use appropriate filters, projections, and aggregations to select only the relevant nodes and edges and their properties. - -The following example shows how to reduce the number of nodes and edges and their properties. In this scenario, Bob changed manager from Alice to Eve and the user only wants to see the latest state of the graph for their organization. To reduce the size of the graph, the nodes are first filtered by the organization property and then the property is removed from the graph using the [project-away operator](/kusto/query/project-away-operator?view=azure-data-explorer&preserve-view=true). The same happens for edges. Then [summarize operator](/kusto/query/summarize-operator?view=azure-data-explorer&preserve-view=true) together with [arg_max](/kusto/query/arg-max-aggregation-function?view=azure-data-explorer&preserve-view=true) is used to get the last known state of the graph. 
- -```kusto -let allEmployees = datatable(organization: string, name:string, age:long) -[ - "R&D", "Alice", 32, - "R&D","Bob", 31, - "R&D","Eve", 27, - "R&D","Mallory", 29, - "Marketing", "Alex", 35 -]; -let allReports = datatable(employee:string, manager:string, modificationDate: datetime) -[ - "Bob", "Alice", datetime(2022-05-23), - "Bob", "Eve", datetime(2023-01-01), - "Eve", "Mallory", datetime(2022-05-23), - "Alice", "Dave", datetime(2022-05-23) -]; -let filteredEmployees = - allEmployees - | where organization == "R&D" - | project-away age, organization; -let filteredReports = - allReports - | summarize arg_max(modificationDate, *) by employee - | project-away modificationDate; -filteredReports -| make-graph employee --> manager with filteredEmployees on name -| graph-match (employee)-[hasManager*2..5]-(manager) - where employee.name == "Bob" - project employee = employee.name, topManager = manager.name -``` - -**Output** - -| employee | topManager | -| -------- | ---------- | -| Bob | Mallory | - -## Last known state of the graph - -The [Size of graph](#size-of-graph) example demonstrated how to get the last known state of the edges of a graph by using `summarize` operator and the `arg_max` aggregation function. Obtaining the last known state is a compute-intensive operation. - -Consider creating a materialized view to improve the query performance, as follows: - -1. Create tables that have some notion of version as part of their model. We recommend using a `datetime` column that you can later use to create a graph time series. - - ```kusto - .create table employees (organization: string, name:string, stateOfEmployment:string, properties:dynamic, modificationDate:datetime) - - .create table reportsTo (employee:string, manager:string, modificationDate: datetime) - ``` - -1. 
Create a materialized view for each table and use the [arg_max aggregation](/kusto/query/arg-max-aggregation-function?view=azure-data-explorer&preserve-view=true) function to determine the *last known state* of employees and the *reportsTo* relation. - - ```kusto - .create materialized-view employees_MV on table employees - { - employees - | summarize arg_max(modificationDate, *) by name - } - - .create materialized-view reportsTo_MV on table reportsTo - { - reportsTo - | summarize arg_max(modificationDate, *) by employee - } - ``` - -1. Create two functions that ensure that only the materialized component of the materialized view is used and additional filters and projections are applied. - - ```kusto - .create function currentEmployees () { - materialized_view('employees_MV') - | where stateOfEmployment == "employed" - } - - .create function reportsTo_lastKnownState () { - materialized_view('reportsTo_MV') - | project-away modificationDate - } - ``` - -The resulting query using materialized makes the query faster and more efficient for larger graphs. It also enables higher concurrency and lower latency queries for the latest state of the graph. The user can still query the graph history based on the employees and *reportsTo* tables, if needed - -```kusto -let filteredEmployees = - currentEmployees - | where organization == "R&D" - | project-away organization; -reportsTo_lastKnownState -| make-graph employee --> manager with filteredEmployees on name -| graph-match (employee)-[hasManager*2..5]-(manager) - where employee.name == "Bob" - project employee = employee.name, reportingPath = hasManager.manager -``` - -## Graph time travel - -Some scenarios require you to analyze data based on the state of a graph at a specific point in time. Graph time travel uses a combination of time filters and summarizes using the arg_max aggregation function. - -The following KQL statement creates a function with a parameter that defines the interesting point in time for the graph. 
It returns a ready-made graph. - -```kusto -.create function graph_time_travel (interestingPointInTime:datetime ) { - let filteredEmployees = - employees - | where modificationDate < interestingPointInTime - | summarize arg_max(modificationDate, *) by name; - let filteredReports = - reportsTo - | where modificationDate < interestingPointInTime - | summarize arg_max(modificationDate, *) by employee - | project-away modificationDate; - filteredReports - | make-graph employee --> manager with filteredEmployees on name -} -``` - -With the function in place, the user can craft a query to get the top manager of Bob based on the graph in June 2022. - -```kusto -graph_time_travel(datetime(2022-06-01)) -| graph-match (employee)-[hasManager*2..5]-(manager) - where employee.name == "Bob" - project employee = employee.name, reportingPath = hasManager.manager -``` - -**Output** - -| employee | topManager | -| -------- | ---------- | -| Bob | Dave | - -## Dealing with multiple node and edge types - -Sometimes it's required to contextualize time series data with a graph that consists of multiple node types. One way of handling this scenario is creating a general-purpose property graph that is represented by a canonical model. - -Occasionally, you may need to contextualize time series data with a graph that has multiple node types. You could approach the problem by creating a general-purpose property graph that is based on a canonical model, such as the following. - -- nodes - - nodeId (string) - - label (string) - - properties (dynamic) -- edges - - source (string) - - destination (string) - - label (string) - - properties (dynamic) - -The following example shows how to transform the data into a canonical model and how to query it. The base tables for the nodes and edges of the graph have different schemas. - -This scenario involves a factory manager who wants to find out why equipment isn't working well and who is responsible for fixing it. 
The manager decides to use a graph that combines the asset graph of the production floor and the maintenance staff hierarchy which changes every day. - -The following graph shows the relations between assets and their time series, such as speed, temperature, and pressure. The operators and the assets, such as *pump*, are connected via the *operates* edge. The operators themselves report up to management. - -:::image type="content" source="media/graph/graph-property-graph.png" alt-text="Infographic on the property graph scenario." lightbox="media/graph/graph-property-graph.png"::: - -The data for those entities can be stored directly in your cluster or acquired using query federation to a different service, such as Azure Cosmos DB, Azure SQL, or Azure Digital Twin. To illustrate the example, the following tabular data is created as part of the query: - -```kusto -let sensors = datatable(sensorId:string, tagName:string, unitOfMeasuree:string) -[ - "1", "temperature", "°C", - "2", "pressure", "Pa", - "3", "speed", "m/s" -]; -let timeseriesData = datatable(sensorId:string, timestamp:string, value:double, anomaly: bool ) -[ - "1", datetime(2023-01-23 10:00:00), 32, false, - "1", datetime(2023-01-24 10:00:00), 400, true, - "3", datetime(2023-01-24 09:00:00), 9, false -]; -let employees = datatable(name:string, age:long) -[ - "Alice", 32, - "Bob", 31, - "Eve", 27, - "Mallory", 29, - "Alex", 35, - "Dave", 45 -]; -let allReports = datatable(employee:string, manager:string) -[ - "Bob", "Alice", - "Alice", "Dave", - "Eve", "Mallory", - "Alex", "Dave" -]; -let operates = datatable(employee:string, machine:string, timestamp:datetime) -[ - "Bob", "Pump", datetime(2023-01-23), - "Eve", "Pump", datetime(2023-01-24), - "Mallory", "Press", datetime(2023-01-24), - "Alex", "Conveyor belt", datetime(2023-01-24), -]; -let assetHierarchy = datatable(source:string, destination:string) -[ - "1", "Pump", - "2", "Pump", - "Pump", "Press", - "3", "Conveyor belt" -]; -``` - -The *employees*, 
*sensors*, and other entities and relationships don't share a canonical data model. You can use the [union operator](/kusto/query/union-operator?view=azure-data-explorer&preserve-view=true) to combine and canonize the data. - -The following query joins the sensor data with the time series data to find the sensors that have abnormal readings. Then, it uses a projection to create a common model for the graph nodes. - -```kusto -let nodes = - union - ( - sensors - | join kind=leftouter - ( - timeseriesData - | summarize hasAnomaly=max(anomaly) by sensorId - ) on sensorId - | project nodeId = sensorId, label = "tag", properties = pack_all(true) - ), - ( employees | project nodeId = name, label = "employee", properties = pack_all(true)); -``` - -The edges are transformed in a similar way. - -```kusto -let edges = - union - ( assetHierarchy | extend label = "hasParent" ), - ( allReports | project source = employee, destination = manager, label = "reportsTo" ), - ( operates | project source = employee, destination = machine, properties = pack_all(true), label = "operates" ); -``` - -With the canonized nodes and edges data, you can create a graph using the [make-graph operator](/kusto/query/make-graph-operator?view=azure-data-explorer&preserve-view=true), as follows: - -```kusto -let graph = edges -| make-graph source --> destination with nodes on nodeId; -``` - -Once created, define the path pattern and project the information required. The pattern starts at a tag node followed by a variable length edge to an asset. That asset is operated by an operator that reports to a top manager via a variable length edge, called *reportsTo*. The constraints section of the [graph-match operator](/kusto/query/graph-match-operator?view=azure-data-explorer&preserve-view=true), in this instance **where**, reduces the tags to the ones that have an anomaly and were operated on a specific day. 
- -```kusto -graph -| graph-match (tag)-[hasParent*1..5]->(asset)<-[operates]-(operator)-[reportsTo*1..5]->(topManager) - where tag.label=="tag" and tobool(tag.properties.hasAnomaly) and - startofday(todatetime(operates.properties.timestamp)) == datetime(2023-01-24) - and topManager.label=="employee" - project - tagWithAnomaly = tostring(tag.properties.tagName), - impactedAsset = asset.nodeId, - operatorName = operator.nodeId, - responsibleManager = tostring(topManager.nodeId) -``` - -**Output** - -| tagWithAnomaly | impactedAsset | operatorName | responsibleManager | -| -------------- | ------------- | ------------ | ------------------ | -| temperature | Pump | Eve | Mallory | - -The projection in graph-match outputs the information that the temperature sensor showed an anomaly on the specified day. It was operated by Eve who ultimately reports to Mallory. With this information, the factory manager can reach out to Eve and potentially Mallory to get a better understanding of the anomaly. - -## Related content - -* [Graph operators](/kusto/query/graph-operators?view=azure-data-explorer&preserve-view=true) diff --git a/data-explorer/graph-overview.md b/data-explorer/graph-overview.md deleted file mode 100644 index 72da3a2b4b..0000000000 --- a/data-explorer/graph-overview.md +++ /dev/null @@ -1,60 +0,0 @@ ---- -title: Kusto Query Language (KQL) graph semantics overview -description: Learn about how to contextualize data in queries using KQL graph semantics -ms.reviewer: herauch -ms.topic: conceptual -ms.date: 09/03/2023 -# Customer intent: As a data analyst, I want to learn about how to contextualize data in queries using KQL graph semantics ---- - -# Kusto Query Language (KQL) graph semantics overview - - - -Graph semantics in Kusto Query Language (KQL) allows you to model and query data as graphs. The structure of a graph comprises nodes and edges that connect them. Both nodes and edges can have properties that describe them. 
- -Graphs are useful for representing complex and dynamic data that involve many-to-many, hierarchical, or networked relationships, such as social networks, recommendation systems, connected assets, or knowledge graphs. -For example, the following graph illustrates a social network that consists of four nodes and three edges. Each node has a property for its name, such as *Bob*, and each edge has a property for its type, such as *reportsTo*. - -:::image type="content" source="media/graph/graph-social-network.png" alt-text="Diagram that shows a social network as a graph."::: - -Graphs store data differently from relational databases, which use tables and need indexes and joins to connect related data. In graphs, each node has a direct pointer to its neighbors (adjacency), so there's no need to index or join anything, making it easy and fast to traverse the graph. Graph queries can use the graph structure and meaning to do complex and powerful operations, such as finding paths, patterns, shortest distances, communities, or centrality measures. - -You can create and query graphs using KQL graph semantics, which has a simple and intuitive syntax that works well with the existing KQL features. You can also mix graph queries with other KQL features, such as time-based, location-based, and machine-learning queries, to do more advanced and powerful data analysis. By using KQL with graph semantics, you get the speed and scale of KQL queries with the flexibility and expressiveness of graphs. - -For example, you can use: - -- Time-based queries to analyze the evolution of a graph over time, such as how the network structure or the node properties change -- Geospatial queries to analyze the spatial distribution or proximity of nodes and edges, such as how the location or distance affects the relationship -- Machine learning queries to apply various algorithms or models to graph data, such as clustering, classification, or anomaly detection - -## How does it work? 
- -Every query of the graph semantics in Kusto requires creating a new graph representation. You use a graph operator that converts tabular expressions for edges and optionally nodes into a graph representation of the data. Once the graph is created, you can apply different operations to further enhance or examine the graph data. - -The graph semantics extension uses an in-memory graph engine that works on the data in the memory of your cluster, making graph analysis interactive and fast. The memory consumption of a graph representation is affected by the number of nodes and edges and their respective properties. The graph engine uses a property graph model that supports arbitrary properties for nodes and edges. It also integrates with all the existing scalar operators of KQL, which gives users the ability to write expressive and complex graph queries that can use the full power and functionality of KQL. - -## Why use graph semantics in KQL? - -There are several reasons to use graph semantics in KQL, such as the following examples: - -- KQL doesn't support recursive joins, so you have to explicitly define the traversals you want to run (see [Scenario: Friends of a friend](graph-scenarios.md#friends-of-a-friend)). You can use the [make-graph operator](/kusto/query/make-graph-operator?view=azure-data-explorer&preserve-view=true) to define hops of variable length, which is useful when the relationship distance or depth isn't fixed. For example, you can use this operator to find all the resources that are connected in a graph or all the places you can reach from a source in a transportation network. - -- Time-aware graphs are a unique feature of graph semantics in KQL that allow users to model graph data as a series of graph manipulation events over time. Users can examine how the graph evolves over time, such as how the graph's network structure or the node properties change, or how the graph events or anomalies happen. 
For example, users can use time series queries to discover trends, patterns, or outliers in the graph data, such as how the network density, centrality, or modularity change over time - -- The intellisense feature of the KQL query editor assists users in writing and executing queries in the query language. It provides syntax highlighting, autocompletion, error checking, and suggestions. It also helps users with the graph semantics extension by offering graph-specific keywords, operators, functions, and examples to guide users through the graph creation and querying process. - -## Limits - -The following are some of the main limits of the graph semantics feature in KQL: - -- You can only create or query graphs that fit into the memory of one cluster node. -- Graph data isn't persisted or distributed across cluster nodes, and is discarded after the query execution. - -Therefore, When using the graph semantics feature in KQL, you should consider the memory consumption and performance implications of creating and querying large or dense graphs. Where possible, you should use filters, projections, and aggregations to reduce the graph size and complexity. - -## Related content - -- [Graph operators](/kusto/query/graph-operators?view=azure-data-explorer&preserve-view=true) -- [Scenarios](graph-scenarios.md) -- [Best practices](graph-best-practices.md) diff --git a/data-explorer/graph-scenarios.md b/data-explorer/graph-scenarios.md deleted file mode 100644 index 364f784678..0000000000 --- a/data-explorer/graph-scenarios.md +++ /dev/null @@ -1,115 +0,0 @@ ---- -title: Scenarios for using Kusto Query Language (KQL) graph semantics -description: Learn about common scenarios for using Kusto Query Language (KQL) graph semantics. -ms.reviewer: herauch -ms.topic: conceptual -ms.date: 09/03/2023 -# Customer intent: As a data analyst, I want to learn about common scenarios for using Kusto Query Language (KQL) graph semantics. 
---- - -# What are common scenarios for using Kusto Query Language (KQL) graph semantics? - - - - -Graph semantics in Kusto Query Language (KQL) allows you to model and query data as graphs. There are many scenarios where graphs are useful for representing complex and dynamic data that involve many-to-many, hierarchical, or networked relationships, such as social networks, recommendation systems, connected assets, or knowledge graphs. - -In this article, you learn about the following common scenarios for using KQL graph semantics: - -- [Friends of a friend](#friends-of-a-friend) -- [Insights from log data](#insights-from-log-data) - -## Friends of a friend - -One common use case for graphs is to model and query social networks, where nodes are users and edges are friendships or interactions. For example, imagine we have a table called *Users* that has data about users, such as their name and organization, and a table called *Knows* that has data about the friendships between users as shown in the following diagram: - -:::image type="content" source="media/graph/graph-friends-of-a-friend.png" alt-text="Diagram that shows a graph of friends of a friend."::: - -Without using graph semantics in KQL, you could create a graph to find friends of a friend by using multiple joins, as follows: - -```kusto -let Users = datatable (UserId: string, name: string, org: string)[]; // nodes -let Knows = datatable (FirstUser: string, SecondUser: string)[]; // edges -Users -| where org == "Contoso" -| join kind=inner (Knows) on $left.UserId == $right.FirstUser -| join kind=innerunique(Users) on $left.SecondUser == $right.UserId -| join kind=inner (Knows) on $left.SecondUser == $right.FirstUser -| join kind=innerunique(Users) on $left.SecondUser1 == $right.UserId -| where UserId != UserId1 -| project name, name1, name2 -``` - -You can use graph semantics in KQL to perform the same query in a more intuitive and efficient way. 
The following query uses the [make-graph operator](/kusto/query/make-graph-operator?view=azure-data-explorer&preserve-view=true) to create a directed graph from *FirstUser* to *SecondUser* and enriches the properties on the nodes with the columns provided by the *Users* table. Once the graph is instantiated, the [graph-match operator](/kusto/query/graph-match-operator?view=azure-data-explorer&preserve-view=true) provides the friend-of-a-friend pattern including filters and a projection that results in a tabular output. - -```kusto -let Users = datatable (UserId:string , name:string , org:string)[]; // nodes -let Knows = datatable (FirstUser:string , SecondUser:string)[]; // edges -Knows -| make-graph FirstUser --> SecondUser with Users on UserId -| graph-match (user)-->(middle_man)-->(friendOfAFriend) - where user.org == "Contoso" and user.UserId != friendOfAFriend.UserId - project contoso_person = user.name, middle_man = middle_man.name, kusto_friend_of_friend = friendOfAFriend.name -``` - -## Insights from log data - -In some use cases, you want to gain insights from a simple flat table containing time series information, such as log data. The data in each row is a string that contains raw data. To create a graph from this data, you must first identify the entities and relationships that are relevant to the graph analysis. For example, suppose you have a table called *rawLogs* from a web server that contains information about requests, such as the timestamp, the source IP address, the destination resource, and much more. 
- -The following table shows an example of the raw data: - -```kusto -let rawLogs = datatable (rawLog: string) [ - "31.56.96.51 - - [2019-01-22 03:54:16 +0330] \"GET /product/27 HTTP/1.1\" 200 5379 \"https://www.contoso.com/m/filter/b113\" \"some client\" \"-\"", - "31.56.96.51 - - [2019-01-22 03:55:17 +0330] \"GET /product/42 HTTP/1.1\" 200 5667 \"https://www.contoso.com/m/filter/b113\" \"some client\" \"-\"", - "54.36.149.41 - - [2019-01-22 03:56:14 +0330] \"GET /product/27 HTTP/1.1\" 200 30577 \"-\" \"some client\" \"-\"" -]; -``` - -One possible way to model a graph from this table is to treat the source IP addresses as nodes and the web requests to resources as edges. You can use the [parse operator](/kusto/query/parse-operator?view=azure-data-explorer&preserve-view=true) to extract the columns you need for the graph and then you can create a graph that represents the network traffic and interactions between different sources and destinations. To create the graph, you can use the [make-graph operator](/kusto/query/make-graph-operator?view=azure-data-explorer&preserve-view=true) specifying the source and destination columns as the edge endpoints, and optionally providing additional columns as edge or node properties. 
- -The following query creates a graph from the raw logs: - -```kusto -let parsedLogs = rawLogs - | parse rawLog with ipAddress: string " - - [" timestamp: datetime "] \"" httpVerb: string " " resource: string " " * - | project-away rawLog; -let edges = parsedLogs; -let nodes = - union - (parsedLogs - | distinct ipAddress - | project nodeId = ipAddress, label = "IP address"), - (parsedLogs | distinct resource | project nodeId = resource, label = "resource"); -let graph = edges - | make-graph ipAddress --> resource with nodes on nodeId; -``` - -This query parses the raw logs and creates a directed graph where the nodes are either IP addresses or resources and each edge is a request from the source to the destination, with the timestamp and HTTP verb as edge properties. - -:::image type="content" source="media/graph/graph-recommendation.png" alt-text="Diagram that shows a graph of the parsed log data."::: - -Once the graph is created, you can use the [graph-match operator](/kusto/query/graph-match-operator?view=azure-data-explorer&preserve-view=true) to query the graph data using patterns, filters, and projections. For example, you can create a pattern that makes a simple recommendation based on the resources that other IP addresses requested within the last five minutes, as follows: - -```kusto -graph -| graph-match (startIp)-[request]->(resource)<--(otherIP)-[otherRequest]->(otherResource) - where startIp.label == "IP address" and //start with an IP address - resource.nodeId != otherResource.nodeId and //recommending a different resource - startIp.nodeId != otherIP.nodeId and //only other IP addresses are interesting - (request.timestamp - otherRequest.timestamp < 5m) //filter on recommendations based on the last 5 minutes - project Recommendation=otherResource.nodeId -``` - -**Output** - -| Recommendation | -| -------------- | -| /product/42 | - -The query returns "/product/42" as a recommendation based on a raw text-based log. 
- -## Related content - -- [Best practices](graph-best-practices.md) -- [Graph operators](/kusto/query/graph-operators?view=azure-data-explorer&preserve-view=true) diff --git a/data-explorer/kusto-tocs/query/toc.yml b/data-explorer/kusto-tocs/query/toc.yml index 9c7a4c6273..f3b5bb9182 100644 --- a/data-explorer/kusto-tocs/query/toc.yml +++ b/data-explorer/kusto-tocs/query/toc.yml @@ -496,6 +496,8 @@ items: - name: external_table() href: /kusto/query/external-table-function?view=azure-data-explorer&preserve-view=true displayName: external table external-table + - name: graph() + href: /kusto/query/graph-function?view=azure-data-explorer&preserve-view=true - name: materialize() href: /kusto/query/materialize-function?view=azure-data-explorer&preserve-view=true - name: materialized_view() @@ -1466,8 +1468,8 @@ items: href: /kusto/query/variancepif-aggregation-function?view=azure-data-explorer&preserve-view=true - name: Graph items: - - name: Graph overview - href: /kusto/query/graph-overview?view=azure-data-explorer&preserve-view=true + - name: Graph semantics overview + href: /kusto/query/graph-semantics-overview?view=azure-data-explorer&preserve-view=true - name: Graph best practices href: /kusto/query/graph-best-practices?view=azure-data-explorer&preserve-view=true - name: Graph scenarios @@ -1476,6 +1478,8 @@ items: items: - name: Graph operators overview href: /kusto/query/graph-operators?view=azure-data-explorer&preserve-view=true + - name: graph + href: /kusto/query/graph-operator?view=azure-data-explorer&preserve-view=true - name: make-graph href: /kusto/query/make-graph-operator?view=azure-data-explorer&preserve-view=true - name: graph-match diff --git a/data-explorer/kusto/.openpublishing.redirection.json b/data-explorer/kusto/.openpublishing.redirection.json index d08ac36328..8c7b723cda 100644 --- a/data-explorer/kusto/.openpublishing.redirection.json +++ b/data-explorer/kusto/.openpublishing.redirection.json @@ -10,6 +10,11 @@ "redirect_url": 
"/kusto/management/alter-database-prettyname", "redirect_document_id": false }, + { + "source_path": "query/graph-overview.md", + "redirect_url": "/kusto/query/graph-semantics-overview", + "redirect_document_id": true + }, { "source_path": "management/show-cluster-database.md", "redirect_url": "/kusto/management/show-databases", diff --git a/data-explorer/kusto/api/get-started/app-basic-query.md b/data-explorer/kusto/api/get-started/app-basic-query.md index 552300e5a7..f7edccb0e2 100644 --- a/data-explorer/kusto/api/get-started/app-basic-query.md +++ b/data-explorer/kusto/api/get-started/app-basic-query.md @@ -3,7 +3,7 @@ title: Create an app to run basic queries description: Learn how to create an app to run basic queries using Kusto client libraries. ms.reviewer: yogilad ms.topic: how-to -ms.date: 08/11/2024 +ms.date: 05/28/2025 monikerRange: "azure-data-explorer" #customer intent: To learn about creating an app to run basic queries using Kusto client libraries. --- @@ -515,7 +515,7 @@ using (var response = kustoClient.ExecuteQuery(database, query, crp)) { ```python from azure.kusto.data import ClientRequestProperties -from datetime import datetime +import datetime import uuid; crp = ClientRequestProperties() diff --git a/data-explorer/kusto/functions-library/graph-blast-radius-fl.md b/data-explorer/kusto/functions-library/graph-blast-radius-fl.md index 1a89867a9d..1bd53a4fbf 100644 --- a/data-explorer/kusto/functions-library/graph-blast-radius-fl.md +++ b/data-explorer/kusto/functions-library/graph-blast-radius-fl.md @@ -3,7 +3,7 @@ title: graph_blast_radius_fl() description: Learn how to use the graph_blast_radius_fl() function to calculate the Blast Radius of source nodes over path or edge data. 
ms.reviewer: andkar ms.topic: reference -ms.date: 03/03/2025 +ms.date: 05/25/2025 monikerRange: "microsoft-fabric || azure-data-explorer || azure-monitor || microsoft-sentinel" --- # graph_blast_radius_fl() @@ -28,13 +28,12 @@ The function outputs a list of connected targets for each source and also a scor | Name | Type | Required | Description | |--|--|--|--| -| *sourceIdColumnName* | `string` | :heavy_check_mark: | The name of the column containing the source node Ids (either for edges or paths). | -| *targetIdColumnName* | `string` | :heavy_check_mark: | The name of the column containing the target node Ids (either for edges or paths). | +| *sourceIdColumnName* | `string` | :heavy_check_mark: | The name of the column containing the source node IDs (either for edges or paths). | +| *targetIdColumnName* | `string` | :heavy_check_mark: | The name of the column containing the target node IDs (either for edges or paths). | | *targetWeightColumnName* | `string` | | The name of the column containing the target nodes' weights (such as criticality). If no relevant weights are present, the weighted score is equal to 0. The default column name is *noWeightsColumn*. | | *resultCountLimit* | `long` | | The maximum number of returned rows (sorted by descending score). The default value is 100000. | | *listedIdsLimit* | `long` | | The maximum number of targets listed for each source. The default value is 50. | - ## Function definition You can define the function by either embedding its code as a query-defined function, or creating it as a stored function in your database, as follows: @@ -173,30 +172,30 @@ connections > For this example to run successfully, you must first run the [Function definition](#function-definition) code to store the function. 
```kusto -let connections = datatable (SourceNodeName:string, TargetNodeName:string, TargetNodeCriticality:int)[ - 'vm-work-1', 'webapp-prd', 3, - 'vm-custom', 'webapp-prd', 3, - 'webapp-prd', 'vm-custom', 1, - 'webapp-prd', 'test-machine', 1, - 'vm-custom', 'server-0126', 1, - 'vm-custom', 'hub_router', 2, - 'webapp-prd', 'hub_router', 2, - 'test-machine', 'vm-custom', 1, - 'test-machine', 'hub_router', 2, - 'hub_router', 'remote_DT', 1, - 'vm-work-1', 'storage_main_backup', 5, - 'hub_router', 'vm-work-2', 1, - 'vm-work-2', 'backup_prc', 3, - 'remote_DT', 'backup_prc', 3, - 'backup_prc', 'storage_main_backup', 5, - 'backup_prc', 'storage_DevBox', 1, - 'device_A1', 'sevice_B2', 2, - 'sevice_B2', 'device_A1', 2 +let connections = datatable (SourceNodeName:string, TargetNodeName:string, TargetNodeCriticality:int)[ + 'vm-work-1', 'webapp-prd', 3, + 'vm-custom', 'webapp-prd', 3, + 'webapp-prd', 'vm-custom', 1, + 'webapp-prd', 'test-machine', 1, + 'vm-custom', 'server-0126', 1, + 'vm-custom', 'hub_router', 2, + 'webapp-prd', 'hub_router', 2, + 'test-machine', 'vm-custom', 1, + 'test-machine', 'hub_router', 2, + 'hub_router', 'remote_DT', 1, + 'vm-work-1', 'storage_main_backup', 5, + 'hub_router', 'vm-work-2', 1, + 'vm-work-2', 'backup_prc', 3, + 'remote_DT', 'backup_prc', 3, + 'backup_prc', 'storage_main_backup', 5, + 'backup_prc', 'storage_DevBox', 1, + 'device_A1', 'sevice_B2', 2, + 'sevice_B2', 'device_A1', 2 ]; connections -| invoke graph_blast_radius_fl(sourceIdColumnName = 'SourceNodeName' - , targetIdColumnName = 'TargetNodeName' - , targetWeightColumnName = 'TargetNodeCriticality' +| invoke graph_blast_radius_fl(sourceIdColumnName = 'SourceNodeName' + , targetIdColumnName = 'TargetNodeName' + , targetWeightColumnName = 'TargetNodeCriticality' ) ``` @@ -224,7 +223,7 @@ Running the function aggregates the connections or paths between sources and tar Each row in the output contains the following fields: * `sourceId`: ID of the source node taken from relevant 
column. -* `blastRadiusList`: a list of target nodes Ids (taken from relevant column) that the source node is connected to. The list is capped to maximum length limit of listedIdsLimit parameter. +* `blastRadiusList`: a list of target nodes IDs (taken from relevant column) that the source node is connected to. The list is capped to maximum length limit of listedIdsLimit parameter. * `blastRadiusScore`: the score is the count of target nodes that the source is connected to. High Blast Radius score indicates that the source node can potentially access lots of targets, and should be treated accordingly. * `blastRadiusScoreWeighted`: the weighted score is the sum of the optional target nodes' weight column, representing their value - such as criticality or cost. If such weight exists, weighted Blast Radius score might be a more accurate metric of source node value due to potential access to high value targets. * `isBlastRadiusListCapped`: boolean flag whether the list of targets was capped by listedIdsLimit parameter. If it's true, then other targets can be accessed from the source in addition to the listed one (up to the number of blastRadiusScore). 
@@ -240,8 +239,8 @@ The function `graph_blast_radius_fl()` can be used to calculate the Blast Radius ## Related content * [Functions library](functions-library.md) -* [Kusto Query Language (KQL) graph semantics overview](../query/graph-overview.md) -* [Graph operators](../query/graph-operators.md) -* [Scenarios](../query/graph-scenarios.md) +* [Graph semantics overview](../query/graph-semantics-overview.md) +* [Graph operators](../query/graph-function.md) +* [Graph scenarios](../query/graph-scenarios.md) * [Best practices](../query/graph-best-practices.md) -* [graph_path_discovery_fl()](graph-path-discovery-fl.md) +* [graph_path_discovery_fl()](graph-path-discovery-fl.md) diff --git a/data-explorer/kusto/functions-library/graph-exposure-perimeter-fl.md b/data-explorer/kusto/functions-library/graph-exposure-perimeter-fl.md index 188451d7e1..a958bdd9ca 100644 --- a/data-explorer/kusto/functions-library/graph-exposure-perimeter-fl.md +++ b/data-explorer/kusto/functions-library/graph-exposure-perimeter-fl.md @@ -28,8 +28,8 @@ The function outputs a list of connected sources that can reach each target and | Name | Type | Required | Description | |--|--|--|--| -| *sourceIdColumnName* | `string` | :heavy_check_mark: | The name of the column containing the source node Ids (either for edges or paths). | -| *targetIdColumnName* | `string` | :heavy_check_mark: | The name of the column containing the target node Ids (either for edges or paths). | +| *sourceIdColumnName* | `string` | :heavy_check_mark: | The name of the column containing the source node IDs (either for edges or paths). | +| *targetIdColumnName* | `string` | :heavy_check_mark: | The name of the column containing the target node IDs (either for edges or paths). | | *sourceWeightColumnName* | `string` | | The name of the column containing the source nodes' weights (such as vulnerability). If no relevant weights are present, the weighted score is equal to 0. The default column name is 'noWeightsColumn'.
| | *resultCountLimit* | `long` | | The maximum number of returned rows (sorted by descending score). The default value is 100000. | | *listedIdsLimit* | `long` | | The maximum number of targets listed for each source. The default value is 50. | @@ -228,7 +228,7 @@ Running the function aggregates the connections or paths between sources and tar Each row in the output contains the following fields: * `targetId`: ID of the target node taken from relevant column. -* `exposurePerimeterList`: a list of source nodes Ids (taken from relevant column) that can connect to the target node. The list is capped to maximum length limit of listedIdsLimit parameter. +* `exposurePerimeterList`: a list of source nodes IDs (taken from relevant column) that can connect to the target node. The list is capped to maximum length limit of listedIdsLimit parameter. * `exposurePerimeterScore`: the score is the count of source nodes that can connect to the target. High Exposure Perimeter score indicates that the target node can be potentially accessed from lots of sources, and should be treated accordingly. * `exposurePerimeterScoreWeighted`: the weighted score is the sum of the optional source nodes' weight column, representing their value - such as vulnerability or exposedness. If such weight exists, weighted Exposure Perimeter score might be a more accurate metric of target node value due to potential access from highly vulnerable or exposed sources. * `isExposurePerimeterCapped`: boolean flag whether the list of sources was capped by listedIdsLimit parameter. If it's true, then other sources can access the target in addition to the listed ones (up to the number of exposurePerimeterScore). @@ -239,13 +239,13 @@ In case the multi-hop paths aren't available, we can build multi-hop paths betwe The output looks similar, but represents Exposure Perimeter calculated over multi-hop paths, thus being a better indicator of target nodes true accessibility from relevant sources. 
In order to find the full paths between source and target scenarios (for example, for disruption), [graph_path_discovery_fl()](graph-path-discovery-fl.md) function can be used with filters on relevant source and target nodes. -The function `graph_exposure_perimeter_fl()` can be used to calculate the Exposure Perimeter of target nodes, either over direct edges or longer paths. In the cybersecurity domain, it can be used for several insights. Exposure Perimeter scores (regular and weighted), represent target node's importance both from defenders' and attackers' perspectives. Nodes with high Exposure Perimeter, especially critical ones, should be protected accordingly. For example, in terms of access monitoring and hardening. Security signals, such as alerts, should be prioritized on sources that can access these nodes. The Exposure Perimeter list should be monitored for undesired connections between sources and targets and used in disruption scenarios. For example, if some of the sources were comrpomised, connections between them and the target should be broken. +The function `graph_exposure_perimeter_fl()` can be used to calculate the Exposure Perimeter of target nodes, either over direct edges or longer paths. In the cybersecurity domain, it can be used for several insights. Exposure Perimeter scores (regular and weighted), represent target node's importance both from defenders' and attackers' perspectives. Nodes with high Exposure Perimeter, especially critical ones, should be protected accordingly. For example, in terms of access monitoring and hardening. Security signals, such as alerts, should be prioritized on sources that can access these nodes. The Exposure Perimeter list should be monitored for undesired connections between sources and targets and used in disruption scenarios. For example, if some of the sources were compromised, connections between them and the target should be broken. 
## Related content * [Functions library](functions-library.md) -* [Kusto Query Language (KQL) graph semantics overview](../query/graph-overview.md) -* [Graph operators](../query/graph-operators.md) -* [Scenarios](../query/graph-scenarios.md) +* [Graph semantics overview](../query/graph-semantics-overview.md) +* [Graph operators](../query/graph-function.md) +* [Graph scenarios](../query/graph-scenarios.md) * [Best practices](../query/graph-best-practices.md) -* [graph_path_discovery_fl()](graph-path-discovery-fl.md) +* [graph_path_discovery_fl()](graph-path-discovery-fl.md) diff --git a/data-explorer/kusto/functions-library/graph-node-centrality-fl.md b/data-explorer/kusto/functions-library/graph-node-centrality-fl.md index a9ce300b01..5bd3c697a4 100644 --- a/data-explorer/kusto/functions-library/graph-node-centrality-fl.md +++ b/data-explorer/kusto/functions-library/graph-node-centrality-fl.md @@ -3,7 +3,7 @@ title: graph_node_centrality_fl() description: Learn how to use the graph_node_centrality_fl() function to calculate metrics of node centrality over graph data.
ms.reviewer: andkar ms.topic: reference -ms.date: 03/25/2025 +ms.date: 05/25/2025 monikerRange: "microsoft-fabric || azure-data-explorer || azure-monitor || microsoft-sentinel" --- # graph_node_centrality_fl() @@ -492,7 +492,8 @@ The function `graph_node_centrality_fl()` can be used in the cybersecurity domai ## Related content * [Functions library](functions-library.md) -* [Kusto Query Language (KQL) graph semantics overview](../query/graph-overview.md) -* [Graph operators](../query/graph-operators.md) -* [Scenarios](../query/graph-scenarios.md) +* [Graph semantics overview](../query/graph-semantics-overview.md) +* [Graph operators](../query/graph-function.md) +* [Graph scenarios](../query/graph-scenarios.md) * [Best practices](../query/graph-best-practices.md) +* [graph_path_discovery_fl()](graph-path-discovery-fl.md) diff --git a/data-explorer/kusto/functions-library/graph-path-discovery-fl.md b/data-explorer/kusto/functions-library/graph-path-discovery-fl.md index e512f846e7..a901ef695d 100644 --- a/data-explorer/kusto/functions-library/graph-path-discovery-fl.md +++ b/data-explorer/kusto/functions-library/graph-path-discovery-fl.md @@ -24,7 +24,7 @@ We make several assumptions: These assumptions can be adapted as needed by changing the internal logic of the function. -The function discovers all possible paths between valid sources to valid targets, under optional constraints such as path length limits, maximum output size, etc. The output is a list of discovered paths with source and target Ids, as well as list of connecting edges and nodes. The function uses only the required fields, such as node Ids and edge Ids. In case other relevant fields - such as types, property lists, security-related scores, or external signals - are available in input data, they can be added to logic and output by changing the function definition.
+The function discovers all possible paths between valid sources to valid targets, under optional constraints such as path length limits, maximum output size, etc. The output is a list of discovered paths with source and target IDs, as well as list of connecting edges and nodes. The function uses only the required fields, such as node IDs and edge IDs. In case other relevant fields - such as types, property lists, security-related scores, or external signals - are available in input data, they can be added to logic and output by changing the function definition. ## Syntax @@ -393,7 +393,7 @@ The function `graph_path_discovery_fl()` can be used in cybersecurity domain to ## Related content * [Functions library](functions-library.md) -* [Kusto Query Language (KQL) graph semantics overview](../query/graph-overview.md) -* [Graph operators](../query/graph-operators.md) -* [Scenarios](../query/graph-scenarios.md) +* [Graph semantics overview](../query/graph-semantics-overview.md) +* [Graph function](../query/graph-function.md) +* [Graph scenarios](../query/graph-scenarios.md) * [Best practices](../query/graph-best-practices.md) diff --git a/data-explorer/kusto/management/graph/graph-model-create-or-alter.md b/data-explorer/kusto/management/graph/graph-model-create-or-alter.md new file mode 100644 index 0000000000..3653de515f --- /dev/null +++ b/data-explorer/kusto/management/graph/graph-model-create-or-alter.md @@ -0,0 +1,123 @@ +--- +title: .create-or-alter graph_model command +description: Learn how to create or alter a graph model using the .create-or-alter graph_model command with syntax, parameters, and examples. 
+ms.reviewer: herauch +ms.topic: reference +ms.date: 05/24/2025 +--- + +# .create-or-alter graph_model (preview) + +> [!INCLUDE [applies](../../includes/applies-to-version/applies.md)] [!INCLUDE [fabric](../../includes/applies-to-version/fabric.md)] [!INCLUDE [azure-data-explorer](../../includes/applies-to-version/azure-data-explorer.md)] + +> [!NOTE] +> This feature is currently in public preview. Functionality and syntax are subject to change before General Availability. + +Creates a new graph model or alters an existing one using the provided model definition payload. + +## Permissions + +To run this command, the user needs [Database Admin permissions](../../access-control/role-based-access-control.md). + +## Syntax + +`.create-or-alter` `graph_model` *GraphModelName* *GraphModelDefinitionPayload* + +## Parameters + +|Name|Type|Required|Description| +|--|--|--|--| +|*GraphModelName*|String|✅|The name of the graph model to create or alter. The name must be unique within the database and follow the [entity naming rules](../../query/schema-entities/entity-names.md).| +|*GraphModelDefinitionPayload*|String|✅|A valid JSON document that defines the graph model. See [Graph model definition payload](#graph-model-definition-payload).| + +### Graph model definition payload + +The graph model definition payload is a JSON document that defines the structure and processing steps for the graph model. For detailed information about the graph model definition format, see [Graph model in Kusto - Overview](graph-model-overview.md). 
+ +## Returns + +This command returns a table with the following columns: + +|Column|Type|Description| +|--|--|--| +|*Name*|String|The name of the graph model that was created or altered.| +|*CreationTime*|DateTime|The timestamp when the graph model was created or altered.| +|*Id*|String|The unique identifier of the graph model.| +|*SnapshotsCount*|Int|The number of snapshots created from this graph model.| +|*Model*|String (JSON)|The JSON definition of the graph model, including schema and processing steps.| +|*AuthorizedPrincipals*|String (JSON)|Array of principals that have access to the graph model, including their identifiers and role assignments.| +|*RetentionPolicy*|String (JSON)|The retention policy configured for the graph model.| + +## Examples + +### Create a new graph model + +````kusto +.create-or-alter graph_model SocialNetwork ``` +{ + "Schema": { + "Nodes": { + "User": { + "UserId": "string", + "Username": "string", + "JoinDate": "datetime", + "IsActive": "bool" + } + }, + "Edges": { + "Follows": { + "Since": "datetime" + }, + "Likes": { + "Timestamp": "datetime", + "Rating": "int" + } + } + }, + "Definition": { + "Steps": [ + { + "Kind": "AddNodes", + "Query": "Users | project UserId, Username, JoinDate, IsActive", + "NodeIdColumn": "UserId", + "Labels": ["User"] + }, + { + "Kind": "AddEdges", + "Query": "FollowEvents | project SourceUser, TargetUser, CreatedAt", + "SourceColumn": "SourceUser", + "TargetColumn": "TargetUser", + "Labels": ["Follows"] + }, + { + "Kind": "AddEdges", + "Query": "LikeEvents | project UserId, ContentId, Timestamp, Score", + "SourceColumn": "UserId", + "TargetColumn": "ContentId", + "Labels": ["Likes"] + } + ] + } +} +``` +```` + +**Output** + +|Name|CreationTime|ID|SnapshotsCount|Model|AuthorizedPrincipals|RetentionPolicy| +|---|---|---|---|---|---|---| +|SocialNetwork|2025-05-23 14:42:37.5128901|aaaaaaaa-0b0b-1c1c-2d2d-333333333333|0|model from above|[
{
"Type": "AAD User",
"DisplayName": "Alex Johnson (upn: alex.johnson@contoso.com)",
"ObjectId": "aaaaaaaa-bbbb-cccc-1111-222222222222",
"FQN": "aaduser=aaaaaaaa-bbbb-cccc-1111-222222222222;aaaabbbb-0000-cccc-1111-dddd2222eeee",
"Notes": "",
"RoleAssignmentIdentifier": "a0a0a0a0-bbbb-cccc-dddd-e1e1e1e1e1e1"
}
]|{
"SoftDeletePeriod": "3650.00:00:00"
}| + +## Notes + +* If a graph model with the specified name doesn't exist, a new one is created when using `.create-or-alter graph_model`. If one already exists, it's updated with the new definition. +* Each time a graph model is altered, a new version is created, allowing you to track changes over time and revert to previous versions if needed. +* To generate a graph snapshot from the model, use the [.make graph_snapshot](graph-snapshot-make.md) command. + +## Related content + +- [Graph model overview](graph-model-overview.md) +- [.show graph_model](graph-model-show.md) +- [.show graph_models](graph-models-show.md) +- [.drop graph_model](graph-model-drop.md) +- [.make graph_snapshot](graph-snapshot-make.md) diff --git a/data-explorer/kusto/management/graph/graph-model-drop.md b/data-explorer/kusto/management/graph/graph-model-drop.md new file mode 100644 index 0000000000..9d63d3e29e --- /dev/null +++ b/data-explorer/kusto/management/graph/graph-model-drop.md @@ -0,0 +1,56 @@ +--- +title: .drop graph_model command +description: Learn how to delete an existing graph model and all its versions using the .drop graph_model command. +ms.reviewer: herauch +ms.topic: reference +ms.date: 05/24/2025 +--- + +# .drop graph_model (preview) + +> [!INCLUDE [applies](../../includes/applies-to-version/applies.md)] [!INCLUDE [fabric](../../includes/applies-to-version/fabric.md)] [!INCLUDE [azure-data-explorer](../../includes/applies-to-version/azure-data-explorer.md)] + +> [!NOTE] +> This feature is currently in public preview. Functionality and syntax are subject to change before General Availability. + +Deletes an existing graph model and all its versions from the database, including any associated snapshots. + +## Permissions + +To run this command, you need [Database admin permissions](../../access-control/role-based-access-control.md). 
+ +## Syntax + +`.drop` `graph_model` *GraphModelName* + +## Parameters + +|Name|Type|Required|Description| +|--|--|--|--| +|*GraphModelName*|String|✅|The name of the graph model to drop.| + +## Returns + +This command doesn't return any output. + +## Examples + +### Drop a graph model + +```kusto +.drop graph_model SocialNetwork +``` + +## Notes + +- The `.drop graph_model` command permanently deletes the graph model and all its versions. This operation can't be undone. +- This command also deletes all snapshots associated with the graph model. +- Dropping a graph model doesn't affect the source data that was used to create it. + +## Related content + +- [Graph model overview](graph-model-overview.md) +- [.create-or-alter graph_model](graph-model-create-or-alter.md) +- [.show graph_model](graph-model-show.md) +- [.show graph_models](graph-models-show.md) +- [.drop graph_snapshot](graph-snapshot-drop.md) diff --git a/data-explorer/kusto/management/graph/graph-model-overview.md b/data-explorer/kusto/management/graph/graph-model-overview.md new file mode 100644 index 0000000000..9bea057a0f --- /dev/null +++ b/data-explorer/kusto/management/graph/graph-model-overview.md @@ -0,0 +1,325 @@ +--- +title: Graph models in Azure Data Explorer - Overview and usage +description: Learn how to define, manage, and query persistent graph structures in Kusto +ms.reviewer: herauch +ms.topic: reference +ms.date: 05/24/2025 +--- + +# Graph models overview (preview) + +> [!INCLUDE [applies](../../includes/applies-to-version/applies.md)] [!INCLUDE [fabric](../../includes/applies-to-version/fabric.md)] [!INCLUDE [azure-data-explorer](../../includes/applies-to-version/azure-data-explorer.md)] + +> [!NOTE] +> This feature is currently in public preview. Functionality and syntax are subject to change before General Availability. + +Graph models in Azure Data Explorer enable you to define, manage, and efficiently query persistent graph structures within your database.
Unlike transient graphs created using the [make-graph](../../query/make-graph-operator.md) operator, graph models are stored representations that can be queried repeatedly without rebuilding the graph for each query, significantly improving performance for complex relationship-based analysis. + +## Overview + +A graph model is a database object that represents a labeled property graph (LPG) within Azure Data Explorer. It consists of nodes (vertices) and edges (relationships), both of which can have properties that describe them. The model defines both the schema of the graph (node and edge types with their properties) and the process for constructing the graph from tabular data stored in Kusto tables. + +## Key characteristics + +Graph models in Kusto offer: + +- **Metadata persistence**: Store graph specifications in database metadata for durability and reusability +- **Materialized snapshots**: Eliminate the need to rebuild graphs for each query, dramatically improving query performance +- **Schema definition**: Support optional but recommended defined schemas for nodes and edges, ensuring data consistency +- **Deep KQL integration**: Seamlessly integrate with Kusto Query Language (KQL) graph semantics +- **Optimized traversals**: Include specialized indexing for efficient graph traversal operations, making complex pattern matching and path-finding queries significantly faster + +## When to use graph models + +Graph models provide significant advantages for relationship-based analysis but require additional setup compared to ad-hoc graph queries. 
Consider using graph models when: + +- **Performance is critical**: You repeatedly run graph queries on the same underlying data and need optimized performance +- **Complex relationship data**: You have data with many interconnected relationships that benefit from a graph representation +- **Stable structure**: Your graph structure is relatively stable, with periodic but not constant updates +- **Advanced graph operations**: You need to perform complex traversals, path finding, pattern matching, or community detection on your data +- **Consistent schema**: Your graph analysis requires a well-defined structure with consistent node and edge types + +For simpler, one-time graph analysis on smaller datasets, the [make-graph](../../query/make-graph-operator.md) operator might be more appropriate. + +## Graph model components + +A graph model consists of two main components: + +### Schema (optional) + +The schema defines the structure of the nodes and edges in the graph: + +- **Nodes**: Defines the types of nodes in the graph and their properties +- **Edges**: Defines the types of relationships between nodes and their properties + +### Definition + +The Definition specifies how to build the graph from tabular data: + +* **Steps**: A sequence of operations to add nodes and edges to the graph + * **AddNodes**: Steps that define how to create nodes from tabular data + * **AddEdges**: Steps that define how to create edges from tabular data + +## Labels in Graph models + +Labels are critical identifiers that categorize nodes and edges in the graph, enabling efficient filtering and pattern matching. 
Azure Data Explorer graph models support two complementary types of labels: + +### Static labels + +* Defined explicitly in the Schema section of the graph model +* Represent node or edge types with predefined properties +* Provide a consistent schema for the graph elements +* Referenced in the "Labels" array in AddNodes and AddEdges steps +* Ideal for well-known, stable entity and relationship types + +### Dynamic labels + +* Not predefined in the Schema section +* Generated at runtime from data in the underlying tables +* Specified using "LabelsColumn" in the AddNodes or AddEdges steps +* Can be a single label (string column) or multiple labels (dynamic array column) +* Allow for more flexible graph structures that adapt to your data +* Useful for systems where node/edge types evolve over time + +> [!TIP] +> You can combine static and dynamic labels to get the benefits of both approaches: schema validation for core entity types while maintaining flexibility for evolving classifications. + +## Definition steps in detail + +The Definition section of a graph model contains steps that define how to construct the graph from tabular data. Each step has specific parameters depending on its kind. + +### AddNodes steps + +AddNodes steps define how to create nodes in the graph from tabular data: + +| Parameter | Required | Description | +|-----------|----------|-------------| +| Kind | Yes | Must be set to "AddNodes" | +| Query | Yes | A KQL query that retrieves the data for nodes. The query result must include all columns required for node properties and identifiers | +| NodeIdColumn | Yes | The column from the query result used as the unique identifier for each node | +| Labels | No | An array of static label names defined in the Schema section to apply to these nodes | +| LabelsColumn | No | A column from the query result that provides dynamic labels for each node. 
Can be a string column (single label) or dynamic array column (multiple labels) | + +### AddEdges steps + +AddEdges steps define how to create relationships between nodes in the graph: + +| Parameter | Required | Description | +|-----------|----------|-------------| +| Kind | Yes | Must be set to "AddEdges" | +| Query | Yes | A KQL query that retrieves the data for edges. The query result must include source and target node identifiers and any edge properties | +| SourceColumn | Yes | The column from the query result that contains the source node identifiers | +| TargetColumn | Yes | The column from the query result that contains the target node identifiers | +| Labels | No | An array of static label names defined in the Schema section to apply to these edges | +| LabelsColumn | No | A column from the query result that provides dynamic labels for each edge. Can be a string column (single label) or dynamic array column (multiple labels) | + +## Graph model examples + +### Basic example with both static and dynamic labels + +The following example creates a professional network graph model that combines static schema definitions with dynamic labeling: + +````kusto +.create-or-alter graph_model ProfessionalNetwork ``` +{ + "Schema": { + "Nodes": { + "Person": {"Name": "string", "Age": "long", "Title": "string"}, + "Company": {"Name": "string", "Industry": "string", "FoundedYear": "int"} + }, + "Edges": { + "WORKS_AT": {"StartDate": "datetime", "Position": "string", "Department": "string"}, + "KNOWS": {"ConnectionDate": "datetime", "ConnectionStrength": "int"} + } + }, + "Definition": { + "Steps": [ + { + "Kind": "AddNodes", + "Query": "Employees | project Id, Name, Age, Title, NodeType", + "NodeIdColumn": "Id", + "Labels": ["Person"], + "LabelsColumn": "NodeType" + }, + { + "Kind": "AddNodes", + "Query": "Organizations | project Id, Name, Industry, FoundedYear", + "NodeIdColumn": "Id", + "Labels": ["Company"] + }, + { + "Kind": "AddEdges", + "Query": "EmploymentRecords 
| project EmployeeId, CompanyId, StartDate, Position, Department", + "SourceColumn": "EmployeeId", + "TargetColumn": "CompanyId", + "Labels": ["WORKS_AT"] + }, + { + "Kind": "AddEdges", + "Query": "Connections | project PersonA, PersonB, ConnectionDate, ConnectionType, ConnectionStrength", + "SourceColumn": "PersonA", + "TargetColumn": "PersonB", + "Labels": ["KNOWS"], + "LabelsColumn": "ConnectionType" + } + ] + } +} +``` +```` + +This model would enable queries such as finding colleagues connected through multiple degrees of separation, identifying people working in the same industry, or analyzing organizational relationships. + +## Creating and managing Graph models + +Azure Data Explorer provides a comprehensive set of management commands for working with graph models throughout their lifecycle. + +### Command summary + +| Command | Purpose | Key parameters | +|---------|---------|---------------| +| [.create-or-alter graph_model](graph-model-create-or-alter.md) | Create a new graph model or modify an existing one | Database, Name, Schema, Definition | +| [.drop graph_model](graph-model-drop.md) | Remove a graph model | Database, Name | +| [.show graph_models](graph-models-show.md) | List available graph models | Database [optional] | + +### Graph model lifecycle + +A typical workflow for managing graph models involves: + +1. **Development** - Create an initial graph model with a schema and definition that maps to your data +2. **Validation** - Query the model to verify correct structure and expected results +3. **Maintenance** - Periodically update the model as your data structure evolves +4. **Snapshot management** - Create and retire snapshots to balance performance and freshness + +> [!TIP] +> When starting with graph models, begin with a small subset of your data to validate your design before scaling to larger datasets. + +## Graph snapshots + +Graph snapshots are database entities that represent instances of graph models at specific points in time.
While a graph model defines the structure and data sources for a graph, a snapshot is the actual materialized graph that can be queried. + +Key aspects of graph snapshots: + +* Each snapshot is linked to a specific graph model +* A single graph model can have multiple snapshots associated with it +* Snapshots are created with the `.make graph_snapshot` command +* Snapshots include metadata such as creation time and the source graph model +* Snapshots enable querying the graph as it existed at a specific point in time + +For more detailed information about working with graph snapshots, see [Graph snapshots in Kusto](graph-snapshot-overview.md). + +## Querying Graph models + +Graph models are queried using the `graph()` function, which provides access to the graph entity. This function supports retrieving either the most recent snapshot of the graph or creating the graph at query time if snapshots aren't available. + +### Basic query structure + +```kusto +graph("GraphModelName") +| graph-match + where + project +``` + +### Query examples + +#### 1. Basic node-edge-node pattern + +```kusto +// Find people who commented on posts by employees in the last week +graph("SocialNetwork") +| graph-match (person)-[comments]->(post)<-[authored]-(employee) + where person.age > 30 + and comments.createTime > ago(7d) + project person.name, post.title, employee.userName +``` + +#### 2. Multiple relationship patterns + +```kusto +// Find people who both work with and are friends with each other +graph("ProfessionalNetwork") +| graph-match (p1:Person)-[:WORKS_WITH]->(p2:Person)-[:FRIENDS_WITH]->(p1) + project p1.name, p2.name, p1.department +``` + +#### 3. 
Variable-length paths + +```kusto +// Find potential influence paths up to 3 hops away +graph("InfluenceNetwork") +| graph-match (influencer)-[:INFLUENCES*1..3]->(target) + where influencer.id == "user123" + project influencePath = INFLUENCES, + pathLength = array_length(INFLUENCES), + target.name +``` + +The `graph()` function provides a consistent way to access graph data without needing to explicitly construct the graph for each query. + +> [!NOTE] +> See [Graph operators](../../query/graph-operators.md) for the complete reference on graph query syntax and capabilities. + +## Frequently Asked Questions + +### Who is responsible for refreshing the graph? + +Users or processes must refresh the graph themselves. Initially, no automatic refresh policies exist for new graph entities. However, the graph remains queryable even while a snapshot is being created or when no snapshot has been created yet. + +### How can a graph be refreshed? + +To refresh a graph: + +1. Create a new snapshot using an asynchronous operation (`.make graph_snapshot`) +1. Once created, incoming graph queries automatically use the new snapshot +1. Optional: Drop the old snapshot to free up resources (`.drop graph_snapshot`) + +### What if different steps create duplicate edges or nodes? + +- **Edges**: Duplicates remain as duplicates by default (edges don't have unique identifiers) +- **Nodes**: "Duplicates" are merged - the system assumes they represent the same entity. If there are conflicting property values, the last value processed takes precedence + +### How do graph models handle schema changes? + +When the schema of your underlying data changes: + +1. Alter your graph model using the `.create-or-alter graph_model` command to update its schema or definition +1. To materialize these changes, create a new snapshot +1. Older snapshots remain accessible until explicitly dropped + +### Can I query across multiple graph models?
+ +Yes, you can query multiple graph models within a single query using composition: + +- Use the output of one `graph()` query as input to another `graph()` query +- Process and transform results from one graph before feeding into another graph query +- Chain multiple graph operations for cross-domain analysis without creating a unified model + +Example: + +```kusto +// Query the first graph model +graph("EmployeeNetwork") +| graph-match (person:Employee)-[:MANAGES]->(team) +| project manager=person.name, teamId=team.id +// Use these results to query another graph model +| join ( + graph("ProjectNetwork") + | graph-match (project)-[:ASSIGNED_TO]->(team) + | project projectName=project.name, teamId=team.id +) on teamId +``` + +### What's the difference between labels and properties? + +- **Labels**: Categorize nodes and edges for structural pattern matching +- **Properties**: Store data values associated with nodes and edges (used in filtering and output) + +## Related content + +* [.create-or-alter graph_model](graph-model-create-or-alter.md) +* [.drop graph_model](graph-model-drop.md) +* [.show graph_models](graph-models-show.md) +* [Key considerations](graph-persistent-overview.md#key-considerations) +* [Graph operators](../../query/graph-operators.md) +* [Graph best practices](../../query/graph-best-practices.md) diff --git a/data-explorer/kusto/management/graph/graph-model-show.md b/data-explorer/kusto/management/graph/graph-model-show.md new file mode 100644 index 0000000000..158194d3dc --- /dev/null +++ b/data-explorer/kusto/management/graph/graph-model-show.md @@ -0,0 +1,134 @@ +--- +title: .show graph_model command +description: Learn how to display specific graph model versions using the .show graph_model command with syntax and examples.
+ms.reviewer: herauch +ms.topic: reference +ms.date: 05/24/2025 +--- + +# .show graph_model (preview) + +> [!INCLUDE [applies](../../includes/applies-to-version/applies.md)] [!INCLUDE [fabric](../../includes/applies-to-version/fabric.md)] [!INCLUDE [azure-data-explorer](../../includes/applies-to-version/azure-data-explorer.md)] + +> [!NOTE] +> This feature is currently in public preview. Functionality and syntax are subject to change before General Availability. + +Shows the details of a specific graph model, including its versions. + +## Permissions + +To run this command, the user needs [Database viewer permissions](../../access-control/role-based-access-control.md). + +## Syntax + +`.show` `graph_model` *GraphModelName* [`with` `(`*Property* `=` *Value* [`,` ...]`)`] + +`.show` `graph_model` *GraphModelName* `details` + +## Parameters + +|Name|Type|Required|Description| +|--|--|--|--| +|*GraphModelName*|String|✅|The name of the graph model to show.| +|*Property*|String|❌|A property to control which version(s) to show. See [Properties](#properties).| +|*Value*|String|❌|The value of the corresponding property.| +|`details`|Keyword|❌|When specified, returns more detailed information about the graph model.| + +### Properties + +|Name|Type|Required|Description| +|--|--|--|--| +|`id`|String|❌|The specific version identifier (GUID) of the graph model to show. Use `*` to show all versions. 
If not specified, the latest version is shown.| + +## Returns + +The command returns different output depending on which syntax is used: + +### Basic command (without details) + +When using the basic `.show graph_model` command (without the `details` keyword), the command returns a table with the following columns: + +|Column|Type|Description| +|--|--|--| +|Name|String|The name of the graph model.| +|CreationTime|DateTime|The date and time when the graph model was created.| +|Id|String|The identifier of the graph model version (a GUID).| + +### Details command + +When using the `.show graph_model` command with the `details` keyword, the command returns a more detailed table with the following columns: + +|Column|Type|Description| +|--|--|--| +|Name|String|The name of the graph model.| +|CreationTime|DateTime|The date and time when the graph model was created.| +|Id|String|The identifier of the graph model version (a GUID).| +|SnapshotCount|Long|The number of snapshots in the graph model.| +|Model|Dynamic|A JSON object containing the complete model definition and properties.| + +## Examples + +### Show the latest version of a graph model + +```kusto +.show graph_model SocialNetwork +``` + +**Output** + +|Name|CreationTime|Id| +|---|---|---| +|SocialNetwork|2025-04-23T14:32:18Z|aaaaaaaa-0b0b-1c1c-2d2d-333333333333| + +### Show a specific version of a graph model + +```kusto +.show graph_model ProductRecommendations with (id = "cccccccc-2d2d-3e3e-4f4f-555555555555") +``` + +**Output** + +|Name|CreationTime|Id| +|---|---|---| +|ProductRecommendations|2025-04-15T09:45:12Z|cccccccc-2d2d-3e3e-4f4f-555555555555| + +### Show all versions of a graph model + +```kusto +.show graph_model NetworkTraffic with (id = "*") +``` + +**Output** + +|Name|CreationTime|Id| +|---|---|---| +|NetworkTraffic|2025-03-10T13:24:56Z|bbbbbbbb-1c1c-2d2d-3e3e-444444444444| +|NetworkTraffic|2025-03-25T10:15:22Z|cccccccc-2d2d-3e3e-4f4f-555555555555| 
+|NetworkTraffic|2025-04-12T15:42:18Z|dddddddd-3e3e-4f4f-5a5a-666666666666| + +### Show detailed information about a graph model + +```kusto +.show graph_model SocialNetwork details +``` + +**Output** + +|Name|CreationTime|Id|SnapshotCount|Model| +|---|---|---|---|---| +|SocialNetwork|2025-04-23T14:32:18Z|aaaaaaaa-0b0b-1c1c-2d2d-333333333333|12|{}| + +## Notes + +- The `.show graph_model` command is useful for examining the history and evolution of a specific graph model. +- When showing all versions of a graph model, the results are ordered by creation date, with the oldest version first. +- Use the `details` keyword when you need to see the complete model definition and additional metadata. +- The `Model` column in the detailed output contains the complete JSON definition of the graph model, which might be large for complex models. +- Use the basic command (without `details`) for a more concise overview when you don't need the full model definition. + +## Related content + +* [Graph model overview](graph-model-overview.md) +* [.create-or-alter graph_model](graph-model-create-or-alter.md) +* [.show graph_models](graph-models-show.md) +* [.drop graph_model](graph-model-drop.md) diff --git a/data-explorer/kusto/management/graph/graph-models-show.md b/data-explorer/kusto/management/graph/graph-models-show.md new file mode 100644 index 0000000000..cafe207109 --- /dev/null +++ b/data-explorer/kusto/management/graph/graph-models-show.md @@ -0,0 +1,146 @@ +--- +title: .show graph_models command +description: Learn how to list all graph models in a database using the .show graph_models command with syntax, parameters, and examples. 
+ms.reviewer: herauch +ms.topic: reference +ms.date: 05/24/2025 +--- + +# .show graph_models (preview) + +> [!INCLUDE [applies](../../includes/applies-to-version/applies.md)] [!INCLUDE [fabric](../../includes/applies-to-version/fabric.md)] [!INCLUDE [azure-data-explorer](../../includes/applies-to-version/azure-data-explorer.md)] + +> [!NOTE] +> This feature is currently in public preview. Functionality and syntax are subject to change before General Availability. + +Lists all graph models in the database, showing the latest version for each model by default. + +## Permissions + +To run this command, the user needs [Database viewer permissions](../../access-control/role-based-access-control.md). + +## Syntax + +`.show` `graph_models` [`with` `(`*Property* `=` *Value* [`,` ...]`)`] + +`.show` `graph_models` `details` [`with` `(`*Property* `=` *Value* [`,` ...]`)`] + +## Parameters + +|Name|Type|Required|Description| +|--|--|--|--| +|*Property*|String|❌|A property to control which versions to show. See [Properties](#properties).| +|*Value*|String|❌|The value of the corresponding property.| + +### Properties + +|Name|Type|Required|Description| +|--|--|--|--| +|`showAll`|Boolean|❌|If set to `true`, returns all versions of every graph model. 
If set to `false` or not specified, returns only the latest version of each graph model.| + +## Returns + +### Basic output format + +When using `.show graph_models` without the `details` parameter, the command returns a table with the following columns: + +|Column|Type|Description| +|--|--|--| +|Name|String|The name of the graph model.| +|CreationTime|DateTime|The date and time when the graph model was created.| +|Id|Guid|A unique identifier (GUID) for the graph model version.| + +### Detailed output format + +When using `.show graph_models details`, the command returns a table with the following columns: + +|Column|Type|Description| +|--|--|--| +|Name|String|The name of the graph model.| +|CreationTime|DateTime|The date and time when the graph model was created.| +|Id|Guid|A unique identifier (GUID) for the graph model version.| +|SnapshotsCount|Long|The number of snapshots available for this graph model.| +|Model|Dynamic|A JSON object containing the graph model definition and properties.| +|AuthorizedPrincipals|Dynamic|A JSON array of principals authorized to access this graph model.| +|RetentionPolicy|Dynamic|A JSON object defining the retention policy for this graph model.| + +## Examples + +### Show the latest version of all graph models + +```kusto +.show graph_models +``` + +**Output** + +|Name|CreationTime|Id| +|---|---|---| +|SocialNetwork|2025-04-23T14:32:18Z|bbbbbbbb-1c1c-2d2d-3e3e-444444444444| +|ProductRecommendations|2025-04-15T09:45:12Z|cccccccc-2d2d-3e3e-4f4f-555555555555| +|NetworkTraffic|2025-04-12T15:42:18Z|dddddddd-3e3e-4f4f-5a5a-666666666666| + +### Show all versions of all graph models + +```kusto +.show graph_models with (showAll = true) +``` + +**Output** + +|Name|CreationTime|Id| +|---|---|---| +|SocialNetwork|2025-03-05T11:23:45Z|eeeeeeee-4f4f-5a5a-6b6b-777777777777| +|SocialNetwork|2025-03-28T09:18:32Z|ffffffff-5a5a-6b6b-7c7c-888888888888| +|SocialNetwork|2025-04-23T14:32:18Z|bbbbbbbb-1c1c-2d2d-3e3e-444444444444| 
+|ProductRecommendations|2025-03-10T14:25:38Z|aaaaaaaa-6b6b-7c7c-8d8d-999999999999| +|ProductRecommendations|2025-04-15T09:45:12Z|cccccccc-2d2d-3e3e-4f4f-555555555555| +|NetworkTraffic|2025-03-10T13:24:56Z|aaaaaaaa-0b0b-1c1c-2d2d-333333333333| +|NetworkTraffic|2025-03-25T10:15:22Z|bbbbbbbb-1c1c-2d2d-3e3e-444444444444| +|NetworkTraffic|2025-04-12T15:42:18Z|dddddddd-3e3e-4f4f-5a5a-666666666666| + +### Show detailed information for the latest version of all graph models + +```kusto +.show graph_models details +``` + +**Output** + +|Name|CreationTime|Id|SnapshotsCount|Model|AuthorizedPrincipals|RetentionPolicy| +|---|---|---|---|---|---|---| +|SocialNetwork|2025-04-23T14:32:18Z|bbbbbbbb-1c1c-2d2d-3e3e-444444444444|3|{}|[{"Type":"AAD User", "DisplayName":"Alex Johnson (upn: alex.johnson@contoso.com)", "ObjectId":"aaaaaaaa-0000-1111-2222-bbbbbbbbbbbb", "FQN":"aaduser=aaaaaaaa-0000-1111-2222-bbbbbbbbbbbb;aaaabbbb-0000-cccc-1111-dddd2222eeee", "Notes":"", "RoleAssignmentIdentifier":"a0a0a0a0-bbbb-cccc-dddd-e1e1e1e1e1e1"}]|{"SoftDeletePeriod":"365000.00:00:00"}| +|ProductRecommendations|2025-04-15T09:45:12Z|cccccccc-2d2d-3e3e-4f4f-555555555555|2|{}|[{"Type":"AAD User", "DisplayName":"Maria Garcia (upn: maria.garcia@contoso.com)", "ObjectId":"bbbbbbbb-1111-2222-3333-cccccccccccc", "FQN":"aaduser=bbbbbbbb-1111-2222-3333-cccccccccccc;aaaabbbb-0000-cccc-1111-dddd2222eeee", "Notes":"", "RoleAssignmentIdentifier":"a0a0a0a0-bbbb-cccc-dddd-e1e1e1e1e1e1"}]|{"SoftDeletePeriod":"365000.00:00:00"}| +|NetworkTraffic|2025-04-12T15:42:18Z|dddddddd-3e3e-4f4f-5a5a-666666666666|3|{}|[{"Type":"AAD User", "DisplayName":"Sam Wilson (upn: sam.wilson@contoso.com)", "ObjectId":"cccccccc-2222-3333-4444-dddddddddddd", "FQN":"aaduser=cccccccc-2222-3333-4444-dddddddddddd;aaaabbbb-0000-cccc-1111-dddd2222eeee", "Notes":"", "RoleAssignmentIdentifier":"a0a0a0a0-bbbb-cccc-dddd-e1e1e1e1e1e1"}]|{"SoftDeletePeriod":"365000.00:00:00"}| + +### Show detailed information for all versions of all graph models + 
+```kusto +.show graph_models details with (showAll = true) +``` + +**Output** + +|Name|CreationTime|Id|SnapshotsCount|Model|AuthorizedPrincipals|RetentionPolicy| +|---|---|---|---|---|---|---| +|SocialNetwork|2025-03-05T11:23:45Z|eeeeeeee-4f4f-5a5a-6b6b-777777777777|3|{}|[{"Type":"AAD User", "DisplayName":"Alex Johnson (upn: alex.johnson@contoso.com)", "ObjectId":"aaaaaaaa-0000-1111-2222-bbbbbbbbbbbb", "FQN":"aaduser=aaaaaaaa-0000-1111-2222-bbbbbbbbbbbb;aaaabbbb-0000-cccc-1111-dddd2222eeee", "Notes":"", "RoleAssignmentIdentifier":"a0a0a0a0-bbbb-cccc-dddd-e1e1e1e1e1e1"}]|{"SoftDeletePeriod":"365000.00:00:00"}| +|SocialNetwork|2025-03-28T09:18:32Z|ffffffff-5a5a-6b6b-7c7c-888888888888|3|{}|[{"Type":"AAD User", "DisplayName":"Alex Johnson (upn: alex.johnson@contoso.com)", "ObjectId":"aaaaaaaa-0000-1111-2222-bbbbbbbbbbbb", "FQN":"aaduser=aaaaaaaa-0000-1111-2222-bbbbbbbbbbbb;aaaabbbb-0000-cccc-1111-dddd2222eeee", "Notes":"", "RoleAssignmentIdentifier":"a0a0a0a0-bbbb-cccc-dddd-e1e1e1e1e1e1"}, {"Type":"AAD Group", "DisplayName":"Data Scientists (upn: data.scientists@contoso.com)", "ObjectId":"dddddddd-3333-4444-5555-eeeeeeeeeeee", "FQN":"aadgroup=dddddddd-3333-4444-5555-eeeeeeeeeeee;aaaabbbb-0000-cccc-1111-dddd2222eeee", "Notes":"Read-only access", "RoleAssignmentIdentifier":"a0a0a0a0-bbbb-cccc-dddd-e1e1e1e1e1e1"}]|{"SoftDeletePeriod":"365000.00:00:00"}| +|SocialNetwork|2025-04-23T14:32:18Z|bbbbbbbb-1c1c-2d2d-3e3e-444444444444|3|{}|[{"Type":"AAD User", "DisplayName":"Alex Johnson (upn: alex.johnson@contoso.com)", "ObjectId":"aaaaaaaa-0000-1111-2222-bbbbbbbbbbbb", "FQN":"aaduser=aaaaaaaa-0000-1111-2222-bbbbbbbbbbbb;aaaabbbb-0000-cccc-1111-dddd2222eeee", "Notes":"", "RoleAssignmentIdentifier":"a0a0a0a0-bbbb-cccc-dddd-e1e1e1e1e1e1"}, {"Type":"AAD Group", "DisplayName":"Data Scientists (upn: data.scientists@contoso.com)", "ObjectId":"dddddddd-3333-4444-5555-eeeeeeeeeeee", "FQN":"aadgroup=dddddddd-3333-4444-5555-eeeeeeeeeeee;aaaabbbb-0000-cccc-1111-dddd2222eeee", 
"Notes":"Read-only access", "RoleAssignmentIdentifier":"a0a0a0a0-bbbb-cccc-dddd-e1e1e1e1e1e1"}]|{"SoftDeletePeriod":"365000.00:00:00"}| +|ProductRecommendations|2025-03-10T14:25:38Z|aaaaaaaa-6b6b-7c7c-8d8d-999999999999|2|{}|[{"Type":"AAD User", "DisplayName":"Maria Garcia (upn: maria.garcia@contoso.com)", "ObjectId":"bbbbbbbb-1111-2222-3333-cccccccccccc", "FQN":"aaduser=bbbbbbbb-1111-2222-3333-cccccccccccc;aaaabbbb-0000-cccc-1111-dddd2222eeee", "Notes":"", "RoleAssignmentIdentifier":"a0a0a0a0-bbbb-cccc-dddd-e1e1e1e1e1e1"}]|{"SoftDeletePeriod":"365000.00:00:00"}| +|ProductRecommendations|2025-04-15T09:45:12Z|cccccccc-2d2d-3e3e-4f4f-555555555555|2|{}|[{"Type":"AAD User", "DisplayName":"Maria Garcia (upn: maria.garcia@contoso.com)", "ObjectId":"bbbbbbbb-1111-2222-3333-cccccccccccc", "FQN":"aaduser=bbbbbbbb-1111-2222-3333-cccccccccccc;aaaabbbb-0000-cccc-1111-dddd2222eeee", "Notes":"", "RoleAssignmentIdentifier":"a0a0a0a0-bbbb-cccc-dddd-e1e1e1e1e1e1"}, {"Type":"AAD Service Principal", "DisplayName":"ProductAnalytics App (app: product.analytics@contoso.com)", "ObjectId":"eeeeeeee-4444-5555-6666-ffffffffffff", "FQN":"aadapp=eeeeeeee-4444-5555-6666-ffffffffffff;aaaabbbb-0000-cccc-1111-dddd2222eeee", "Notes":"Automated reporting", "RoleAssignmentIdentifier":"a0a0a0a0-bbbb-cccc-dddd-e1e1e1e1e1e1"}]|{"SoftDeletePeriod":"365000.00:00:00"}| + +## Notes + +- By default, this command returns only the latest version of each graph model. +- When using the `showAll` parameter set to `true`, you can see the complete version history of all graph models in your database. +- Use the `details` keyword to get detailed information about graph models, including the model definition, authorized principals, and retention policy. +- The basic output format (without `details`) provides a quick overview of available graph models. +- The detailed output format (with `details`) provides comprehensive information about the graph models, useful for administrative and audit purposes. 
+- The results are ordered alphabetically by graph model name, and then by creation date within each model when showing all versions. + +## Related content + +* [Graph model overview](graph-model-overview.md) +* [.show graph_model](graph-model-show.md) +* [.create-or-alter graph_model](graph-model-create-or-alter.md) +* [.drop graph_model](graph-model-drop.md) diff --git a/data-explorer/kusto/management/graph/graph-persistent-overview.md b/data-explorer/kusto/management/graph/graph-persistent-overview.md new file mode 100644 index 0000000000..c524d2609d --- /dev/null +++ b/data-explorer/kusto/management/graph/graph-persistent-overview.md @@ -0,0 +1,127 @@ +--- +title: Persistent graphs in Kusto - Overview +description: Learn about persistent graphs in Kusto, including graph models, snapshots, and management commands for scalable graph analytics. +ms.reviewer: herauch +ms.topic: reference +ms.date: 05/24/2025 +--- + +# Persistent graphs overview (preview) + +> [!INCLUDE [applies](../../includes/applies-to-version/applies.md)] [!INCLUDE [fabric](../../includes/applies-to-version/fabric.md)] [!INCLUDE [azure-data-explorer](../../includes/applies-to-version/azure-data-explorer.md)] + +> [!NOTE] +> This feature is currently in public preview. Functionality and syntax are subject to change before General Availability. + +Persistent graphs in Kusto enable you to store, manage, and query graph data structures at scale. Unlike transient graphs created with the [make-graph](../../query/make-graph-operator.md) operator, persistent graphs are durable database objects that persist beyond individual query executions, providing enterprise-grade graph analytics capabilities. 
+ +## Overview + +Persistent graphs consist of two primary components: + +- **[Graph models](graph-model-overview.md)**: Define the structure and schema of your graph +- **[Graph snapshots](graph-snapshot-overview.md)**: Persistent instances of graph models that you can query + +This architecture provides both flexibility in defining graph schemas and efficiency in querying graph data at scale. + +## Key benefits + +Persistent graphs offer significant advantages for enterprise-scale graph analytics: + +- **Durable storage**: Graph models and snapshots persist in database metadata for long-term availability +- **Scalability**: Handle large graphs that exceed single-node memory limitations +- **Reusability**: Multiple users and applications can query the same graph structure without reconstruction +- **Performance optimization**: Eliminate graph construction overhead for repeated queries +- **Schema support**: Structured definitions for different node and edge types with their properties +- **Version control**: Multiple snapshots enable representation of graphs at different points in time + +## Graph models + +A graph model defines the specifications of a graph stored in your database metadata. It includes: + +- **Schema definition**: Node and edge types with their properties +- **Data source mappings**: Instructions for building the graph from tabular data +- **Labels**: Both static (predefined) and dynamic (generated at runtime) labels for nodes and edges + +Graph models contain the blueprint for creating graph snapshots, not the actual graph data. 
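+
+To make the three parts concrete, here's a minimal sketch of a graph model with one node type and one edge type. The model name `MinimalNetwork` and the `Employees` and `Connections` tables are hypothetical; the property names follow the graph model definition format, and the feature is in preview, so the format might change.
+
+````kusto
+.create-or-alter graph_model MinimalNetwork ```
+{
+  "Schema": {
+    "Nodes": { "Person": {"Name": "string"} },
+    "Edges": { "KNOWS": {"ConnectionDate": "datetime"} }
+  },
+  "Definition": {
+    "Steps": [
+      {
+        "Kind": "AddNodes",
+        "Query": "Employees | project Id, Name",
+        "NodeIdColumn": "Id",
+        "Labels": ["Person"]
+      },
+      {
+        "Kind": "AddEdges",
+        "Query": "Connections | project PersonA, PersonB, ConnectionDate",
+        "SourceColumn": "PersonA",
+        "TargetColumn": "PersonB",
+        "Labels": ["KNOWS"]
+      }
+    ]
+  }
+}
+```
+````
+
+The `Schema` section is the schema definition, each step's `Query` is a data source mapping, and `Labels` assigns static labels. Running the command stores only this blueprint; no graph data is materialized until a snapshot is created.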
+ +### Managing graph models + +The following commands are available for managing graph models: + +| Command | Description | +|---------|-------------| +| [.create-or-alter graph_model](graph-model-create-or-alter.md) | Creates a new graph model or alters an existing one | +| [.show graph_model](graph-model-show.md) | Displays details of a specific graph model | +| [.show graph_models](graph-models-show.md) | Lists all graph models in the database | +| [.drop graph_model](graph-model-drop.md) | Removes a graph model | + +## Graph snapshots + +A graph snapshot is the actual graph instance materialized from a graph model. It represents: + +- A specific point-in-time view of the data as defined by the model +- The nodes, edges, and their properties in a queryable format +- A self-contained entity that persists until explicitly removed + +Snapshots are the entities you query when working with persistent graphs. + +### Managing graph snapshots + +The following commands are available for managing graph snapshots: + +| Command | Description | +|---------|-------------| +| [.make graph_snapshot](graph-snapshot-make.md) | Creates a new graph snapshot from a graph model | +| [.show graph_snapshot](graph-snapshot-show.md) | Displays details of a specific graph snapshot | +| [.show graph_snapshots](graph-snapshots-show.md) | Lists all graph snapshots in the database | +| [.drop graph_snapshot](graph-snapshot-drop.md) | Removes a single graph snapshot | +| [.drop graph_snapshots](graph-snapshots-drop.md) | Removes multiple graph snapshots based on criteria | + +## Workflow + +The typical workflow for creating and using persistent graphs follows these steps: + +1. **Create a graph model** - Define the structure and data sources for your graph +2. **Create a graph snapshot** - Materialize the graph model into a queryable snapshot +3. **Query the graph snapshot** - Use KQL graph operators to analyze the graph data +4. 
**Manage lifecycle** - Create new snapshots as needed and drop old ones + +## Querying persistent graphs + +Once a graph snapshot is created, it can be queried using the [`graph`](../../query/graph-function.md) function followed by other KQL graph operators: + +```kusto +graph("MyGraphModel") +| graph-match (n)-[e]->(m) +| project n, e, m +``` + +To query a specific snapshot, provide the snapshot name: + +```kusto +graph("MyGraphModel", "MyGraphSnapshot") +| graph-match (n)-[e]->(m) +| project n, e, m +``` + +The [`graph-match`](../../query/graph-match-operator.md) operator enables pattern matching and traversal operations, while [`graph-shortest-paths`](../../query/graph-shortest-paths-operator.md) helps find optimal connections between entities. The [`graph-to-table`](../../query/graph-to-table-operator.md) operator converts graph results back to tabular format. + +## Key considerations + +This section describes key considerations and current limitations of graph models and snapshots in Kusto. 
+ +### Snapshot limitations + +Persistent graphs in Kusto have the following limitations: + +- **Regular database limit**: Maximum of 5,000 graph snapshots per database +- **Free virtual cluster limit**: Maximum of 500 graph snapshots per database +- **Snapshot creation time**: Limited to 1 hour + +## Next steps + +* [Graph model overview](graph-model-overview.md) +* [Graph snapshot overview](graph-snapshot-overview.md) +* [Graph operators in Kusto](../../query/graph-operators.md) +* [Graph best practices](../../query/graph-best-practices.md) \ No newline at end of file diff --git a/data-explorer/kusto/management/graph/graph-snapshot-drop.md b/data-explorer/kusto/management/graph/graph-snapshot-drop.md new file mode 100644 index 0000000000..dcfea21f22 --- /dev/null +++ b/data-explorer/kusto/management/graph/graph-snapshot-drop.md @@ -0,0 +1,72 @@ +--- +title: .drop graph_snapshot command +description: Learn how to delete a specific graph snapshot using the .drop graph_snapshot command with syntax, parameters, and examples. +ms.reviewer: herauch +ms.topic: reference +ms.date: 05/24/2025 +--- + +# .drop graph_snapshot (preview) + +> [!INCLUDE [applies](../../includes/applies-to-version/applies.md)] [!INCLUDE [fabric](../../includes/applies-to-version/fabric.md)] [!INCLUDE [azure-data-explorer](../../includes/applies-to-version/azure-data-explorer.md)] + +> [!NOTE] +> This feature is currently in public preview. Functionality and syntax are subject to change before General Availability. + +Deletes a specific graph snapshot from a graph model. + +## Permissions + +To run this command, the user needs [Database Admin permissions](../../access-control/role-based-access-control.md). 
+ +## Syntax + +`.drop` `graph_snapshot` *GraphModelName*`.`*SnapshotName* + +## Parameters + +|Name|Type|Required|Description| +|--|--|--|--| +|*GraphModelName*|String|✅|The name of the graph model that the snapshot belongs to.| +|*SnapshotName*|String|✅|The name of the graph snapshot to drop.| + +## Returns + +This command returns a table with the following columns: + +|Column|Type|Description| +|--|--|--| +|Name|String|The name of the dropped snapshot.| +|SnapshotTime|DateTime|The time when the snapshot was created.| +|ModelName|String|The name of the graph model.| +|ModelId|String|The unique identifier of the graph model.| +|ModelCreationTime|DateTime|The time when the graph model was created.| + +## Examples + +### Drop a specific graph snapshot + +```kusto +.drop graph_snapshot SocialNetwork.OldSnapshot +``` + +**Output** + +|Name|SnapshotTime|ModelName|ModelId|ModelCreationTime| +|---|---|---|---|---| +|OldSnapshot|2025-05-21 10:47:05.9122575|SocialNetwork|eeeeeeee-4f4f-5a5a-6b6b-777777777777|2025-05-21 10:47:05.8611670| + +## Notes + +- The `.drop graph_snapshot` command permanently deletes a specific graph snapshot. This operation can't be undone. +- Before dropping a snapshot, ensure that no queries or processes are currently using it. +- Dropping a snapshot doesn't affect the graph model from which it was created. +- To drop all snapshots for a specific graph model, use the [.drop graph_snapshots](graph-snapshots-drop.md) command.
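+
+One way to apply these notes is to create the replacement snapshot before dropping the old one, so that queries always have a snapshot available. The following is a minimal sketch, not a prescribed procedure; `NewSnapshot` is an assumed name, and each command is run separately.
+
+```kusto
+// 1. Materialize a replacement snapshot from the same graph model (assumed snapshot name).
+.make graph_snapshot NewSnapshot from SocialNetwork
+
+// 2. After confirming nothing still depends on the old snapshot, delete it permanently.
+.drop graph_snapshot SocialNetwork.OldSnapshot
+```
+
+Creating first and dropping second keeps the model queryable throughout, at the cost of temporarily holding two snapshots.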
+ +## Related content + +* [Graph model overview](graph-model-overview.md) +* [.make graph_snapshot](graph-snapshot-make.md) +* [.show graph_snapshot](graph-snapshot-show.md) +* [.show graph_snapshots](graph-snapshots-show.md) +* [.drop graph_snapshots](graph-snapshots-drop.md) diff --git a/data-explorer/kusto/management/graph/graph-snapshot-make.md b/data-explorer/kusto/management/graph/graph-snapshot-make.md new file mode 100644 index 0000000000..19238ec7a7 --- /dev/null +++ b/data-explorer/kusto/management/graph/graph-snapshot-make.md @@ -0,0 +1,89 @@ +--- +title: .make graph_snapshot command +description: Learn how to create a graph snapshot from a graph model using the .make graph_snapshot command with syntax, parameters, and examples. +ms.reviewer: herauch +ms.topic: reference +ms.date: 05/24/2025 +--- + +# .make graph_snapshot (preview) + +> [!INCLUDE [applies](../../includes/applies-to-version/applies.md)] [!INCLUDE [fabric](../../includes/applies-to-version/fabric.md)] [!INCLUDE [azure-data-explorer](../../includes/applies-to-version/azure-data-explorer.md)] + +> [!NOTE] +> This feature is currently in public preview. Functionality and syntax are subject to change before General Availability. + +Creates a new graph snapshot from a specified graph model. A graph snapshot is a materialized instance of a graph model that can be efficiently queried. + +## Permissions + +To run this command, the user needs [Database admin permissions](../../access-control/role-based-access-control.md). + +## Syntax + +`.make` [`async`] `graph_snapshot` *SnapshotName* `from` *GraphModelName* + +## Parameters + +|Name|Type|Required|Description| +|--|--|--|--| +|`async`|Keyword|❌|If specified, the command runs asynchronously and returns immediately.| +|*SnapshotName*|String|✅|The name of the snapshot to create. 
The name must be unique within the scope of the graph model.| +|*GraphModelName*|String|✅|The name of the graph model from which to create the snapshot.| + +## Returns + +If run synchronously, this command returns a table with the following columns: + +|Column|Type|Description| +|--|--|--| +|*Name*|String|The name of the created snapshot.| +|*SnapshotTime*|DateTime|The timestamp when the snapshot was created.| +|*ModelName*|String|The name of the graph model.| +|*ModelId*|String|The unique identifier of the graph model.| +|*ModelCreationTime*|DateTime|The timestamp when the graph model was created.| +|*NodesCount*|Long|The number of nodes in the snapshot.| +|*EdgesCount*|Long|The number of edges in the snapshot.| +|*RetentionPolicy*|String|The retention policy applied to the snapshot in JSON format.| + +If run asynchronously, the command returns an operation ID that can be used to check the status of the operation. + +## Examples + +### Create a graph snapshot synchronously + +```kusto +.make graph_snapshot WeeklySnapshot from SocialNetwork +``` + +**Output** + +|Name|SnapshotTime|ModelName|ModelId|ModelCreationTime|NodesCount|EdgesCount|RetentionPolicy| +|---|---|---|---|---|---|---|---| +|WeeklySnapshot|2025-05-24 05:26:35.1495944|SocialNetwork|aaaaaaaa-0b0b-1c1c-2d2d-333333333333|2025-05-21 10:47:05.8611670|2|1|{
"SoftDeletePeriod": "365000.00:00:00"}| + +### Create a graph snapshot asynchronously + +```kusto +.make async graph_snapshot DailySnapshot from ProductRecommendations +``` + +**Output** + +|OperationId|Status| +|---|---| +|bbbbbbbb-1c1c-2d2d-3e3e-444444444444|InProgress| + +## Notes + +- Creating a graph snapshot materializes the graph model definition into a queryable format. This process can be time-consuming for large graphs. +- For large graphs, it's recommended to use the `async` option to run the operation in the background. +- A graph model can have multiple snapshots, each representing the state of the graph at different points in time. +- Snapshots are immutable. To update a snapshot with fresh data, you need to create a new snapshot. + +## Related content + +* [Graph model overview](graph-model-overview.md) +* [.show graph_snapshot](graph-snapshot-show.md) +* [.show graph_snapshots](graph-snapshots-show.md) +* [.drop graph_snapshot](graph-snapshot-drop.md) diff --git a/data-explorer/kusto/management/graph/graph-snapshot-overview.md b/data-explorer/kusto/management/graph/graph-snapshot-overview.md new file mode 100644 index 0000000000..1b0b6918b3 --- /dev/null +++ b/data-explorer/kusto/management/graph/graph-snapshot-overview.md @@ -0,0 +1,114 @@ +--- +title: Graph snapshots overview +description: Learn about graph snapshots in Kusto, including their structure, benefits, and how to create and query them for efficient graph data analysis. +ms.reviewer: herauch +ms.topic: reference +ms.date: 05/24/2025 +--- + +# Graph snapshots overview (preview) + +> [!INCLUDE [applies](../../includes/applies-to-version/applies.md)] [!INCLUDE [fabric](../../includes/applies-to-version/fabric.md)] [!INCLUDE [azure-data-explorer](../../includes/applies-to-version/azure-data-explorer.md)] + +> [!NOTE] +> This feature is currently in public preview. Functionality and syntax are subject to change before General Availability. 
+ +A graph snapshot is a database entity that represents a materialized instance of a graph model at a specific point in time. While a [graph model](graph-model-overview.md) defines the structure and data sources, a snapshot is the queryable graph implementation. + +## Overview + +Graph snapshots provide: + +- **Model linkage**: Connected to a specific graph model +- **Point-in-time materialization**: Represents the graph state at creation time +- **Persistence**: Stored in the database until explicitly dropped +- **Direct querying**: Enables queries without rebuilding the graph +- **Metadata storage**: Contains creation time and model information + +Multiple snapshots from the same graph model enable historical analysis and temporal comparison of graph data. + +## Graph snapshot structure + +Each graph snapshot contains two primary components: + +### Metadata + +- **Name**: Unique snapshot identifier +- **SnapshotTime**: Creation timestamp +- **Model information**: + - **ModelName**: Source graph model name + - **ModelVersion**: Model version at snapshot creation + - **ModelCreationTime**: Source model creation timestamp + +### Graph data + +- **Nodes**: Materialized nodes from the model's `AddNodes` operations +- **Edges**: Materialized relationships from the model's `AddEdges` operations +- **Properties**: Node and edge properties as defined in the model + +## Example snapshot configuration + +```json +{ + "Metadata": { + "Name": "UserInteractionsSnapshot", + "SnapshotTime": "2025-04-28T10:15:30Z" + }, + "ModelInformation": { + "ModelName": "SocialNetworkGraph", + "ModelVersion": "v1.2", + "ModelCreationTime": "2025-04-15T08:20:10Z" + } +} +``` + +## Management commands + +Use these commands to manage graph snapshots: + +| Command | Purpose | +|---------|---------| +| [.make graph_snapshot](graph-snapshot-make.md) | Create a snapshot from an existing graph model | +| [.drop graph_snapshot](graph-snapshot-drop.md) | Remove a snapshot from the database | +| [.show 
graph_snapshots](graph-snapshots-show.md) | List available snapshots in the database | + +## Querying snapshots + +Query graph snapshots using the `graph()` function: + +### Query the latest snapshot + +```kusto +graph("SocialNetworkGraph") +| graph-match (person)-[knows]->(friend) + where person.age > 30 + project person.name, friend.name +``` + +### Query a specific snapshot + +```kusto +graph("SocialNetworkGraph", "UserInteractionsSnapshot") +| graph-match (person)-[knows]->(friend) + where person.age > 30 + project person.name, friend.name +``` + +For advanced pattern matching and traversals, see [Graph operators](../../query/graph-operators.md). + +## Key benefits + +Graph snapshots provide: + +* **Enhanced performance**: Eliminates graph rebuilding for each query +* **Data consistency**: Ensures all queries operate on identical graph state +* **Temporal analysis**: Enables historical comparison across time periods +* **Resource optimization**: Reduces CPU and memory consumption for repeated operations + +## Related content + +* [Graph model overview](graph-model-overview.md) +* [.make graph_snapshot](graph-snapshot-make.md) +* [.drop graph_snapshot](graph-snapshot-drop.md) +* [.show graph_snapshots](graph-snapshots-show.md) +* [Graph operators](../../query/graph-operators.md) diff --git a/data-explorer/kusto/management/graph/graph-snapshot-show.md b/data-explorer/kusto/management/graph/graph-snapshot-show.md new file mode 100644 index 0000000000..57f76b9edf --- /dev/null +++ b/data-explorer/kusto/management/graph/graph-snapshot-show.md @@ -0,0 +1,99 @@ +--- +title: .show graph_snapshot command +description: Learn how to display information about a specific graph snapshot using the .show graph_snapshot command. 
+ms.reviewer: herauch +ms.topic: reference +ms.date: 05/24/2025 +--- + +# .show graph_snapshot (preview) + +> [!INCLUDE [applies](../../includes/applies-to-version/applies.md)] [!INCLUDE [fabric](../../includes/applies-to-version/fabric.md)] [!INCLUDE [azure-data-explorer](../../includes/applies-to-version/azure-data-explorer.md)] + +> [!NOTE] +> This feature is currently in public preview. Functionality and syntax are subject to change before General Availability. + +Shows detailed information about a specific graph snapshot. + +## Permissions + +To run this command, the user needs [Database admin permissions](../../access-control/role-based-access-control.md). + +## Syntax + +`.show` `graph_snapshot` *GraphModelName*`.`*SnapshotName* [`details`] + +## Parameters + +|Name|Type|Required|Description| +|--|--|--|--| +|*GraphModelName*|String|✅|The name of the graph model that the snapshot belongs to.| +|*SnapshotName*|String|✅|The name of the graph snapshot to show.| +|`details`|String|❌|Optional parameter to show additional detailed information about the snapshot, including node count, edge count, and retention policy.| + +## Returns + +This command returns a table with different columns depending on whether the `details` parameter is specified. 
+ +### Basic output (without `details`) + +|Column|Type|Description| +|--|--|--| +|Name|String|The name of the graph snapshot.| +|SnapshotTime|DateTime|The date and time when the snapshot was created.| +|ModelName|String|The name of the graph model that the snapshot belongs to.| +|ModelId|String|The unique identifier of the graph model.| +|ModelCreationTime|DateTime|The date and time when the graph model was created.| + +### Detailed output (with `details`) + +|Column|Type|Description| +|--|--|--| +|Name|String|The name of the graph snapshot.| +|SnapshotTime|DateTime|The date and time when the snapshot was created.| +|ModelName|String|The name of the graph model that the snapshot belongs to.| +|ModelId|String|The unique identifier of the graph model.| +|ModelCreationTime|DateTime|The date and time when the graph model was created.| +|NodesCount|Long|The number of nodes in the graph snapshot.| +|EdgesCount|Long|The number of edges in the graph snapshot.| +|RetentionPolicy|Dynamic|A JSON object containing the retention policy settings for the snapshot.| + +## Examples + +### Show basic graph snapshot information + +```kusto +.show graph_snapshot SomeGraph.Latest2 +``` + +**Output** + +|Name|SnapshotTime|ModelName|ModelId|ModelCreationTime| +|---|---|---|---|---| +|Latest2|2025-05-24 06:34:51.6518833|SomeGraph|eeeeeeee-4f4f-5a5a-6b6b-777777777777|2025-05-21 10:47:05.8611670| + +### Show detailed graph snapshot information + +```kusto +.show graph_snapshot SomeGraph.Latest2 details +``` + +**Output** + +|Name|SnapshotTime|ModelName|ModelId|ModelCreationTime|NodesCount|EdgesCount|RetentionPolicy| +|---|---|---|---|---|---|---|---| +|Latest2|2025-05-24 06:34:51.6518833|SomeGraph|eeeeeeee-4f4f-5a5a-6b6b-777777777777|2025-05-21 10:47:05.8611670|2|1|{
"SoftDeletePeriod": "365000.00:00:00"
}| + +## Notes + +- The `.show graph_snapshot` command provides information about a specific graph snapshot. +- Use the basic format to get essential snapshot information including creation time and model details. +- Use the `details` parameter to get additional information including node count, edge count, and retention policy. +- The retention policy shows the soft delete period, which determines how long the snapshot is retained before being permanently deleted. + +## Related content + +* [Graph model overview](graph-model-overview.md) +* [.make graph_snapshot](graph-snapshot-make.md) +* [.show graph_snapshots](graph-snapshots-show.md) +* [.drop graph_snapshot](graph-snapshot-drop.md) diff --git a/data-explorer/kusto/management/graph/graph-snapshots-drop.md b/data-explorer/kusto/management/graph/graph-snapshots-drop.md new file mode 100644 index 0000000000..e5330ab619 --- /dev/null +++ b/data-explorer/kusto/management/graph/graph-snapshots-drop.md @@ -0,0 +1,58 @@ +--- +title: .drop graph_snapshots command +description: Learn how to delete all graph snapshots for a specific graph model using the .drop graph_snapshots command. +ms.reviewer: herauch +ms.topic: reference +ms.date: 05/24/2025 +--- + +# .drop graph_snapshots (preview) + +> [!INCLUDE [applies](../../includes/applies-to-version/applies.md)] [!INCLUDE [fabric](../../includes/applies-to-version/fabric.md)] [!INCLUDE [azure-data-explorer](../../includes/applies-to-version/azure-data-explorer.md)] + +> [!NOTE] +> This feature is currently in public preview. Functionality and syntax are subject to change before General Availability. + +Deletes all graph snapshots associated with a specific graph model. + +## Permissions + +To run this command, you need [Database admin permissions](../../access-control/role-based-access-control.md). 
+ +## Syntax + +`.drop` `graph_snapshots` *GraphModelName* + +## Parameters + +|Name|Type|Required|Description| +|--|--|--|--| +|*GraphModelName*|String|✅|The name of the graph model for which to drop all snapshots.| + +## Returns + +This command doesn't return any output upon successful completion. + +## Examples + +### Drop all snapshots for a graph model + +```kusto +.drop graph_snapshots SocialNetwork +``` + +The command completes successfully without returning any output. + +## Important notes + +- The `.drop graph_snapshots` command permanently deletes all snapshots associated with a graph model. This operation can't be undone. +- Dropping snapshots doesn't affect the graph model itself. +- To drop a specific snapshot instead of all snapshots, use the [.drop graph_snapshot](graph-snapshot-drop.md) command. + +## Related content + +* [Graph model overview](graph-model-overview.md) +* [.make graph_snapshot](graph-snapshot-make.md) +* [.show graph_snapshot](graph-snapshot-show.md) +* [.show graph_snapshots](graph-snapshots-show.md) +* [.drop graph_snapshot](graph-snapshot-drop.md) diff --git a/data-explorer/kusto/management/graph/graph-snapshots-show.md b/data-explorer/kusto/management/graph/graph-snapshots-show.md new file mode 100644 index 0000000000..be61b18a17 --- /dev/null +++ b/data-explorer/kusto/management/graph/graph-snapshots-show.md @@ -0,0 +1,93 @@ +--- +title: .show graph_snapshots command +description: Learn how to list all graph snapshots for a graph model or all graph models using the .show graph_snapshots command. +ms.reviewer: herauch +ms.topic: reference +ms.date: 05/24/2025 +--- + +# .show graph_snapshots (preview) + +> [!INCLUDE [applies](../../includes/applies-to-version/applies.md)] [!INCLUDE [fabric](../../includes/applies-to-version/fabric.md)] [!INCLUDE [azure-data-explorer](../../includes/applies-to-version/azure-data-explorer.md)] + +> [!NOTE] +> This feature is currently in public preview. 
Functionality and syntax are subject to change before General Availability. + +Lists all graph snapshots for a specific graph model or for all graph models. + +## Permissions + +To run this command, the user needs [Database admin permissions](../../access-control/role-based-access-control.md). + +## Syntax + +`.show` `graph_snapshots` *GraphModelName* + +`.show` `graph_snapshots` `*` + +## Parameters + +|Name|Type|Required|Description| +|--|--|--|--| +|*GraphModelName*|String|❌|The name of the graph model for which to show snapshots. If specified, only snapshots for this model are returned.| +|`*`|Symbol|❌|If specified instead of a graph model name, snapshots for all graph models are returned.| + +## Returns + +This command returns a table with the following columns: + +|Column|Type|Description| +|--|--|--| +|Name|String|The name of the graph snapshot.| +|SnapshotTime|DateTime|The date and time when the snapshot was created.| +|ModelName|String|The name of the graph model that the snapshot belongs to.| +|ModelId|GUID|The unique identifier of the graph model.| +|ModelCreationTime|DateTime|The date and time when the graph model was created.| + +## Examples + +### Show all snapshots for a specific graph model + +```kusto +.show graph_snapshots SocialNetwork +``` + +**Output** + +|Name|SnapshotTime|ModelName|ModelId|ModelCreationTime| +|---|---|---|---|---| +|DailySnapshot|2025-04-25T08:15:30Z|SocialNetwork|aaaaaaaa-0b0b-1c1c-2d2d-333333333333|2025-03-01T10:00:00Z| +|WeeklySnapshot|2025-04-18T09:20:45Z|SocialNetwork|aaaaaaaa-0b0b-1c1c-2d2d-333333333333|2025-03-01T10:00:00Z| +|MonthlySnapshot|2025-03-28T14:10:22Z|SocialNetwork|aaaaaaaa-0b0b-1c1c-2d2d-333333333333|2025-03-01T10:00:00Z| + +### Show snapshots for all graph models + +```kusto +.show graph_snapshots * +``` + +**Output** + +|Name|SnapshotTime|ModelName|ModelId|ModelCreationTime| +|---|---|---|---|---| +|DailySnapshot|2025-04-25T08:15:30Z|SocialNetwork|aaaaaaaa-0b0b-1c1c-2d2d-333333333333|2025-03-01T10:00:00Z| 
+|WeeklySnapshot|2025-04-18T09:20:45Z|SocialNetwork|aaaaaaaa-0b0b-1c1c-2d2d-333333333333|2025-03-01T10:00:00Z| +|MonthlySnapshot|2025-03-28T14:10:22Z|SocialNetwork|aaaaaaaa-0b0b-1c1c-2d2d-333333333333|2025-03-01T10:00:00Z| +|DailySnapshot|2025-04-26T07:05:18Z|ProductRecommendations|bbbbbbbb-1c1c-2d2d-3e3e-444444444444|2025-02-15T14:30:00Z| +|WeeklySnapshot|2025-04-19T06:30:42Z|ProductRecommendations|bbbbbbbb-1c1c-2d2d-3e3e-444444444444|2025-02-15T14:30:00Z| +|HourlySnapshot|2025-04-26T14:00:05Z|NetworkTraffic|cccccccc-2d2d-3e3e-4f4f-555555555555|2025-01-20T09:15:00Z| +|DailySnapshot|2025-04-25T08:00:15Z|NetworkTraffic|cccccccc-2d2d-3e3e-4f4f-555555555555|2025-01-20T09:15:00Z| + +## Notes + +- The `.show graph_snapshots` command is useful for listing all available snapshots, which can be queried or managed. +- The results are ordered alphabetically by snapshot name, and then by creation time within each snapshot name. +- To get more detailed information about a specific snapshot, use the [.show graph_snapshot](graph-snapshot-show.md) command. 
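+
+As a minimal sketch of that follow-up step, assuming the `SocialNetwork` model and its `DailySnapshot` snapshot from the sample output above exist in your database, the following command uses the `.show graph_snapshot` *GraphModelName*`.`*SnapshotName* [`details`] syntax to add the node count, edge count, and retention policy to the basic columns:
+
+```kusto
+.show graph_snapshot SocialNetwork.DailySnapshot details
+```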
+ +## Related content + +* [Graph model overview](graph-model-overview.md) +* [.make graph_snapshot](graph-snapshot-make.md) +* [.show graph_snapshot](graph-snapshot-show.md) +* [.drop graph_snapshot](graph-snapshot-drop.md) +* [.drop graph_snapshots](graph-snapshots-drop.md) diff --git a/data-explorer/kusto/management/toc.yml b/data-explorer/kusto/management/toc.yml index 03ba2e8a9a..9c72e60d2b 100644 --- a/data-explorer/kusto/management/toc.yml +++ b/data-explorer/kusto/management/toc.yml @@ -222,6 +222,32 @@ items: - name: .show materialized-view(s) details displayName: show materialized view details href: materialized-views/materialized-view-show-details-command.md + - name: Graphs + items: + - name: Persistent graph overview + href: graph/graph-persistent-overview.md + - name: Graph models overview + href: graph/graph-model-overview.md + - name: .create-or-alter graph_model + href: graph/graph-model-create-or-alter.md + - name: .drop graph_model + href: graph/graph-model-drop.md + - name: .show graph_model + href: graph/graph-model-show.md + - name: .show graph_models + href: graph/graph-models-show.md + - name: Graph snapshot overview + href: graph/graph-snapshot-overview.md + - name: .make graph_snapshot + href: graph/graph-snapshot-make.md + - name: .drop graph_snapshot + href: graph/graph-snapshot-drop.md + - name: .drop graph_snapshots + href: graph/graph-snapshots-drop.md + - name: .show graph_snapshot + href: graph/graph-snapshot-show.md + - name: .show graph_snapshots + href: graph/graph-snapshots-show.md - name: Stored query results items: - name: Stored query results diff --git a/data-explorer/kusto/query/graph-best-practices.md b/data-explorer/kusto/query/graph-best-practices.md index d1f79ff474..261eda4ca8 100644 --- a/data-explorer/kusto/query/graph-best-practices.md +++ b/data-explorer/kusto/query/graph-best-practices.md @@ -3,21 +3,55 @@ title: Best practices for Kusto Query Language (KQL) graph semantics description: Learn about the best 
practices for Kusto Query Language (KQL) graph semantics. ms.reviewer: herauch ms.topic: conceptual -ms.date: 02/17/2025 +ms.date: 05/26/2025 # Customer intent: As a data analyst, I want to learn about best practices for KQL graph semantics. --- -# Best practices for Kusto Query Language (KQL) graph semantics +# Best practices for graph semantics -> [!INCLUDE [applies](../includes/applies-to-version/applies.md)] [!INCLUDE [fabric](../includes/applies-to-version/fabric.md)] [!INCLUDE [azure-data-explorer](../includes/applies-to-version/azure-data-explorer.md)] [!INCLUDE [monitor](../includes/applies-to-version/monitor.md)] [!INCLUDE [sentinel](../includes/applies-to-version/sentinel.md)] +>[!INCLUDE [applies](../includes/applies-to-version/applies.md)] [!INCLUDE [fabric](../includes/applies-to-version/fabric.md)] [!INCLUDE [azure-data-explorer](../includes/applies-to-version/azure-data-explorer.md)] [!INCLUDE [monitor](../includes/applies-to-version/monitor.md)] [!INCLUDE [sentinel](../includes/applies-to-version/sentinel.md)] -This article explains how to use the graph semantics feature in KQL effectively and efficiently for different use cases and scenarios. It shows how to create and query graphs with the syntax and operators, and how to integrate them with other KQL features and functions. It also helps users avoid common pitfalls or errors. For instance, creating graphs that exceed memory or performance limits, or applying unsuitable or incompatible filters, projections, or aggregations. +Kusto supports two primary approaches for working with graphs: transient graphs created in-memory for each query, and persistent graphs defined as graph models and snapshots within the database. This article provides best practices for both methods, enabling you to select the optimal approach and use KQL graph semantics efficiently. -## Size of graph +This guidance covers: -The [make-graph operator](make-graph-operator.md) creates an in-memory representation of a graph. 
It consists of the graph structure itself and its properties. When making a graph, use appropriate filters, projections, and aggregations to select only the relevant nodes and edges and their properties. +- Graph creation and optimization strategies +- Querying techniques and performance considerations +- Schema design for persistent graphs +- Integration with other KQL features +- Common pitfalls to avoid -The following example shows how to reduce the number of nodes and edges and their properties. In this scenario, Bob changed manager from Alice to Eve and the user only wants to see the latest state of the graph for their organization. To reduce the size of the graph, the nodes are first filtered by the organization property and then the property is removed from the graph using the [project-away operator](project-away-operator.md). The same happens for edges. Then [summarize operator](summarize-operator.md) together with [arg_max](arg-max-aggregation-function.md) is used to get the last known state of the graph. +:::moniker range="azure-data-explorer || microsoft-fabric" + +## Graph modeling approaches in Kusto + +Kusto provides two approaches for working with graphs: transient and persistent. + +### Transient graphs + +Created dynamically using the [`make-graph`](make-graph-operator.md) operator. These graphs exist only during query execution and are optimal for ad hoc or exploratory analysis on small to medium datasets. + +### Persistent graphs + +Defined using [graph models](../management/graph/graph-model-overview.md) and [graph snapshots](../management/graph/graph-snapshot-overview.md). These graphs are stored in the database, support schema and versioning, and are optimized for repeated, large-scale, or collaborative analysis. 
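+
+As a rough side-by-side sketch of the two approaches: the table names `reports` and `employees` and the graph model name `OrgChart` are hypothetical, and the persistent form assumes that a graph model with that name and at least one snapshot already exist.
+
+```kusto
+// Transient: build the graph in memory for this query only.
+// reports and employees are hypothetical tables.
+reports
+| make-graph employee --> manager with employees on name
+| graph-match (employee)-[reportsTo]->(manager)
+    project employee = employee.name, manager = manager.name
+
+// Persistent: query the latest snapshot of an existing graph model.
+// OrgChart is a hypothetical graph model name.
+graph("OrgChart")
+| graph-match (employee)-[reportsTo]->(manager)
+    project employee = employee.name, manager = manager.name
+```
+
+The pattern-matching part is the same in both queries. Only the graph source changes, so a query prototyped on a transient graph might later be pointed at a persistent model with little rework.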
+ +:::moniker-end + +## Best practices for transient graphs + +Transient graphs, created in memory using the `make-graph` operator, are ideal for ad hoc analysis, prototyping, and scenarios where graph structure changes frequently or requires only a subset of available data. + +### Optimize graph size for performance + +The [`make-graph`](make-graph-operator.md) operator creates an in-memory representation of the graph that includes both its structure and its properties. To optimize performance: + +- **Apply filters early** - Select only relevant nodes, edges, and properties before graph creation +- **Use projections** - Remove unnecessary columns to minimize memory consumption +- **Apply aggregations** - Summarize data where appropriate to reduce graph complexity + +**Example: Reducing graph size through filtering and projection** + +In this scenario, Bob changed managers from Alice to Eve. The following query returns only the latest organizational state while minimizing graph size: ```kusto let allEmployees = datatable(organization: string, name:string, age:long) @@ -50,57 +84,61 @@ filteredReports project employee = employee.name, topManager = manager.name ``` -**Output** +**Output:** | employee | topManager | | -------- | ---------- | | Bob | Mallory | -## Last known state of the graph +### Maintain current state with materialized views -The [Size of graph](#size-of-graph) example demonstrated how to get the last known state of the edges of a graph by using `summarize` operator and the `arg_max` aggregation function. Obtaining the last known state is a compute-intensive operation. +The previous example showed how to obtain the last known state using `summarize` and `arg_max`. This operation can be compute-intensive, so consider using materialized views for improved performance. -Consider creating a materialized view to improve the query performance, as follows: +**Step 1: Create tables with versioning** -1. Create tables that have some notion of version as part of their model. 
We recommend using a `datetime` column that you can later use to create a graph time series. +Create tables with a versioning mechanism for graph time series: - ```kusto - .create table employees (organization: string, name:string, stateOfEmployment:string, properties:dynamic, modificationDate:datetime) +```kusto +.create table employees (organization: string, name:string, stateOfEmployment:string, properties:dynamic, modificationDate:datetime) - .create table reportsTo (employee:string, manager:string, modificationDate: datetime) - ``` +.create table reportsTo (employee:string, manager:string, modificationDate: datetime) +``` -1. Create a materialized view for each table and use the [arg_max aggregation](arg-max-aggregation-function.md) function to determine the *last known state* of employees and the *reportsTo* relation. +**Step 2: Create materialized views** - ```kusto - .create materialized-view employees_MV on table employees - { - employees - | summarize arg_max(modificationDate, *) by name - } +Use the [arg_max aggregation](arg-max-aggregation-function.md) function to determine the latest state: - .create materialized-view reportsTo_MV on table reportsTo - { - reportsTo - | summarize arg_max(modificationDate, *) by employee - } - ``` +```kusto +.create materialized-view employees_MV on table employees +{ + employees + | summarize arg_max(modificationDate, *) by name +} + +.create materialized-view reportsTo_MV on table reportsTo +{ + reportsTo + | summarize arg_max(modificationDate, *) by employee +} +``` -1. Create two functions that ensure that only the materialized component of the materialized view is used and other filters and projections are applied. 
+**Step 3: Create helper functions** - ```kusto - .create function currentEmployees () { - materialized_view('employees_MV') - | where stateOfEmployment == "employed" - } +Ensure only the materialized component is used and apply additional filters: - .create function reportsTo_lastKnownState () { - materialized_view('reportsTo_MV') - | project-away modificationDate - } - ``` +```kusto +.create function currentEmployees () { + materialized_view('employees_MV') + | where stateOfEmployment == "employed" +} -The resulting query using materialized makes the query faster and more efficient for larger graphs. It also enables higher concurrency and lower latency queries for the latest state of the graph. The user can still query the graph history based on the employees and *reportsTo* tables, if needed +.create function reportsTo_lastKnownState () { + materialized_view('reportsTo_MV') + | project-away modificationDate +} +``` + +This approach provides faster queries, higher concurrency, and lower latency for current state analysis while preserving access to historical data. ```kusto let filteredEmployees = @@ -111,14 +149,12 @@ reportsTo_lastKnownState | make-graph employee --> manager with filteredEmployees on name | graph-match (employee)-[hasManager*2..5]-(manager) where employee.name == "Bob" - project employee = employee.name, reportingPath = map(hasManager, manager) + project employee = employee.name, reportingPath = hasManager.manager ``` -## Graph time travel +### Implement graph time travel -Some scenarios require you to analyze data based on the state of a graph at a specific point in time. Graph time travel uses a combination of time filters and summarizes using the arg_max aggregation function. - -The following KQL statement creates a function with a parameter that defines the interesting point in time for the graph. It returns a ready-made graph. +Analyzing data based on historical graph states provides valuable temporal context. 
Implement this "time travel" capability by combining time filters with `summarize` and `arg_max`: ```kusto .create function graph_time_travel (interestingPointInTime:datetime ) { @@ -136,49 +172,37 @@ The following KQL statement creates a function with a parameter that defines the } ``` -With the function in place, the user can craft a query to get the top manager of Bob based on the graph in June 2022. +**Usage example:** + +Query Bob's top manager based on June 2022 graph state: ```kusto graph_time_travel(datetime(2022-06-01)) | graph-match (employee)-[hasManager*2..5]-(manager) where employee.name == "Bob" - project employee = employee.name, reportingPath = map(hasManager, manager) + project employee = employee.name, reportingPath = hasManager.manager ``` -**Output** +**Output:** | employee | topManager | | -------- | ---------- | | Bob | Dave | -## Dealing with multiple node and edge types +### Handle multiple node and edge types -Sometimes it's required to contextualize time series data with a graph that consists of multiple node types. One way of handling this scenario is creating a general-purpose property graph that is represented by a canonical model. +When working with complex graphs containing multiple node types, use a canonical property graph model. Define nodes with attributes like `nodeId` (string), `label` (string), and `properties` (dynamic), while edges include `source` (string), `destination` (string), `label` (string), and `properties` (dynamic) fields. -Occasionally, you might need to contextualize time series data with a graph that has multiple node types. You could approach the problem by creating a general-purpose property graph that is based on a canonical model, such as the following. 
+**Example: Factory maintenance analysis** -- nodes - - nodeId (string) - - label (string) - - properties (dynamic) -- edges - - source (string) - - destination (string) - - label (string) - - properties (dynamic) +Consider a factory manager investigating equipment issues and the personnel responsible for them. The scenario combines the asset graph of the production equipment with the maintenance staff hierarchy: -The following example shows how to transform the data into a canonical model and how to query it. The base tables for the nodes and edges of the graph have different schemas. +:::image type="content" source="media/graphs/factory-maintenance-analysis.png" alt-text="A graph of factory people, equipment, and measurements"::: -This scenario involves a factory manager who wants to find out why equipment isn't working well and who is responsible for fixing it. The manager decides to use a graph that combines the asset graph of the production floor and the maintenance staff hierarchy which changes every day. - -The following graph shows the relations between assets and their time series, such as speed, temperature, and pressure. The operators and the assets, such as *pump*, are connected via the *operates* edge. The operators themselves report up to management. - -:::image type="content" source="media/graph/graph-property-graph.png" alt-text="Infographic on the property graph scenario." lightbox="media/graph/graph-property-graph.png"::: - -The data for those entities can be stored directly in your cluster or acquired using query federation to a different service, such as Azure Cosmos DB, Azure SQL, or Azure Digital Twin. To illustrate the example, the following tabular data is created as part of the query: +The data for those entities can be stored directly in your cluster or acquired using query federation to a different service. 
To illustrate the example, the following tabular data is created as part of the query: ```kusto -let sensors = datatable(sensorId:string, tagName:string, unitOfMeasuree:string) +let sensors = datatable(sensorId:string, tagName:string, unitOfMeasure:string) [ "1", "temperature", "°C", "2", "pressure", "Pa", @@ -222,9 +246,9 @@ let assetHierarchy = datatable(source:string, destination:string) ]; ``` -The *employees*, *sensors*, and other entities and relationships don't share a canonical data model. You can use the [union operator](union-operator.md) to combine and canonize the data. +The employees, sensors, and other entities and relationships don't share a canonical data model. You can use the [union operator](union-operator.md) to combine and standardize the data. -The following query joins the sensor data with the time series data to find the sensors that have abnormal readings. Then, it uses a projection to create a common model for the graph nodes. +The following query joins the sensor data with the time series data to identify sensors with abnormal readings, then uses a projection to create a common model for the graph nodes. ```kusto let nodes = @@ -241,7 +265,7 @@ let nodes = ( employees | project nodeId = name, label = "employee", properties = pack_all(true)); ``` -The edges are transformed in a similar way. +The edges are transformed in a similar manner. ```kusto let edges = @@ -251,14 +275,15 @@ let edges = ( operates | project source = employee, destination = machine, properties = pack_all(true), label = "operates" ); ``` -With the canonized nodes and edges data, you can create a graph using the [make-graph operator](make-graph-operator.md), as follows: +With the standardized nodes and edges data, you can create a graph using the [make-graph operator](make-graph-operator.md): + ```kusto let graph = edges | make-graph source --> destination with nodes on nodeId; ``` -Once created, define the path pattern and project the information required. 
The pattern starts at a tag node followed by a variable length edge to an asset. That asset is operated by an operator that reports to a top manager via a variable length edge, called *reportsTo*. The constraints section of the [graph-match operator](graph-match-operator.md), in this instance **where**, reduces the tags to the ones that have an anomaly and were operated on a specific day. +Once the graph is created, define the path pattern and project the required information. The pattern begins at a tag node, followed by a variable-length edge to an asset. That asset is operated by an operator who reports to a top manager via a variable-length edge called *reportsTo*. The constraints section of the [graph-match operator](graph-match-operator.md), in this case the **where** clause, filters the tags to those with an anomaly that were operated on a specific day. ```kusto graph @@ -279,9 +304,228 @@ graph | -------------- | ------------- | ------------ | ------------------ | | temperature | Pump | Eve | Mallory | -The projection in graph-match outputs the information that the temperature sensor showed an anomaly on the specified day. It was operated by Eve who ultimately reports to Mallory. With this information, the factory manager can reach out to Eve and potentially Mallory to get a better understanding of the anomaly. +The projection in `graph-match` shows that the temperature sensor exhibited an anomaly on the specified day. The sensor was operated by Eve, who ultimately reports to Mallory. With this information, the factory manager can contact Eve and, if necessary, Mallory to better understand the anomaly. + +:::moniker range="azure-data-explorer || microsoft-fabric" + +## Best practices for persistent graphs + +Persistent graphs, defined using [graph models](../management/graph/graph-model-overview.md) and [graph snapshots](../management/graph/graph-snapshot-overview.md), provide robust solutions for advanced graph analytics needs. 
These graphs excel in scenarios requiring repeated analysis of large, complex, or evolving data relationships, and facilitate collaboration by enabling teams to share standardized graph definitions and consistent analytical results. By persisting graph structures in the database, this approach significantly enhances performance for recurring queries and supports sophisticated versioning capabilities. + +### Use schema and definition for consistency and performance + +A clear schema for your graph model is essential, as it specifies node and edge types along with their properties. This approach ensures data consistency and enables efficient querying. Utilize the `Definition` section to specify how nodes and edges are constructed from your tabular data through `AddNodes` and `AddEdges` steps. + +### Leverage static and dynamic labels for flexible modeling + +When modeling your graph, you can utilize both static and dynamic labeling approaches for optimal flexibility. Static labels are ideal for well-defined node and edge types that rarely change—define these in the `Schema` section and reference them in the `Labels` array of your steps. For cases where node or edge types are determined by data values (for example, when the type is stored in a column), use dynamic labels by specifying a `LabelsColumn` in your step to assign labels at runtime. This approach is especially useful for graphs with heterogeneous or evolving schemas. Both mechanisms can be effectively combined—you can define a `Labels` array for static labels and also specify a `LabelsColumn` to incorporate additional labels from your data, providing maximum flexibility when modeling complex graphs with both fixed and data-driven categorization. + +#### Example: Using dynamic labels for multiple node and edge types + +The following example demonstrates an effective implementation of dynamic labels in a graph representing professional relationships. 
In this scenario, the graph contains people and companies as nodes, with employment relationships forming the edges between them. The flexibility of this model comes from determining node and edge types directly from columns in the source data, allowing the graph structure to adapt organically to the underlying information. + +```` +.create-or-alter graph_model ProfessionalNetwork ``` +{ + "Schema": { + "Nodes": { + "Person": {"Name": "string", "Age": "long"}, + "Company": {"Name": "string", "Industry": "string"} + }, + "Edges": { + "WORKS_AT": {"StartDate": "datetime", "Position": "string"} + } + }, + "Definition": { + "Steps": [ + { + "Kind": "AddNodes", + "Query": "Employees | project Id, Name, Age, NodeType", + "NodeIdColumn": "Id", + "Labels": ["Person"], + "LabelsColumn": "NodeType" + }, + { + "Kind": "AddEdges", + "Query": "EmploymentRecords | project EmployeeId, CompanyId, StartDate, Position, RelationType", + "SourceColumn": "EmployeeId", + "TargetColumn": "CompanyId", + "Labels": ["WORKS_AT"], + "LabelsColumn": "RelationType" + } + ] + } +} +``` +```` + +This dynamic labeling approach provides exceptional flexibility when modeling graphs with numerous node and edge types, eliminating the need to modify your schema each time a new entity type appears in your data. By decoupling the logical model from the physical implementation, your graph can continuously evolve to represent new relationships without requiring structural changes to the underlying schema. + +## Multi-tenant partitioning strategies for large-scale ISV scenarios + +In large organizations, particularly ISV scenarios, graphs can consist of multiple billions of nodes and edges. This scale presents unique challenges that require strategic partitioning approaches to maintain performance while managing costs and complexity. 
+ +### Understanding the challenge + +Large-scale multi-tenant environments often exhibit the following characteristics: + +- **Billions of nodes and edges** - Enterprise-scale graphs that exceed traditional graph database capabilities +- **Tenant size distribution** - Typically follows a power law where 99.9% of tenants have small to medium graphs, while 0.1% have massive graphs +- **Performance requirements** - Need for both real-time analysis (current data) and historical analysis capabilities +- **Cost considerations** - Balance between infrastructure costs and analytical capabilities + +### Partitioning by natural boundaries + +The most effective approach for managing large-scale graphs is partitioning by natural boundaries, typically tenant identifiers or organizational units: + +**Key partitioning strategies:** + +- **Tenant-based partitioning** - Separate graphs by customer, organization, or business unit +- **Geographic partitioning** - Divide by region, country, or datacenter location +- **Temporal partitioning** - Separate by time periods for historical analysis +- **Functional partitioning** - Split by business domain or application area + +**Example: Multi-tenant organizational structure** + +```kusto +// Partition employees and reports by tenant +let tenantEmployees = + allEmployees + | where tenantId == "tenant_123" + | project-away tenantId; + +let tenantReports = + allReports + | where tenantId == "tenant_123" + | summarize arg_max(modificationDate, *) by employee + | project-away modificationDate, tenantId; + +tenantReports +| make-graph employee --> manager with tenantEmployees on name +| graph-match (employee)-[hasManager*1..5]-(manager) + where employee.name == "Bob" + project employee = employee.name, reportingChain = hasManager.manager +``` + +### Hybrid approach: Transient vs. 
persistent graphs by tenant size + +The most cost-effective strategy combines both transient and persistent graphs based on tenant characteristics: + +#### Small to medium tenants (99.9% of tenants) + +Use **transient graphs** for the majority of tenants: + +**Advantages:** + +- **Always up-to-date data** - No snapshot maintenance required +- **Lower operational overhead** - No graph model or snapshot management +- **Cost-effective** - No additional storage costs for graph structures +- **Immediate availability** - No pre-processing delays + +**Implementation pattern:** + +```kusto +.create function getTenantGraph(tenantId: string) { + let tenantEmployees = + employees + | where tenant == tenantId and stateOfEmployment == "employed" + | project-away tenant, stateOfEmployment; + let tenantReports = + reportsTo + | where tenant == tenantId + | summarize arg_max(modificationDate, *) by employee + | project-away modificationDate, tenant; + tenantReports + | make-graph employee --> manager with tenantEmployees on name +} + +// Usage for small tenant +getTenantGraph("small_tenant_456") +| graph-match (employee)-[reports*1..3]-(manager) + where employee.name == "Alice" + project employee = employee.name, managerChain = reports.manager +``` + +#### Large tenants (0.1% of tenants) + +Use **persistent graphs** for the largest tenants: + +**Advantages:** + +- **Scalability** - Handle graphs exceeding memory limitations +- **Performance optimization** - Eliminate construction latency for complex queries +- **Advanced analytics** - Support sophisticated graph algorithms and analysis +- **Historical analysis** - Multiple snapshots for temporal comparison + +**Implementation pattern:** + +````kusto +// Create graph model for large tenant (example: Contoso) +.create-or-alter graph_model ContosoOrgChart ``` +{ + "Schema": { + "Nodes": { + "Employee": { + "Name": "string", + "Department": "string", + "Level": "int", + "JoinDate": "datetime" + } + }, + "Edges": { + "ReportsTo": { + 
"Since": "datetime", + "Relationship": "string" + } + } + }, + "Definition": { + "Steps": [ + { + "Kind": "AddNodes", + "Query": "employees | where tenant == 'Contoso' and stateOfEmployment == 'employed' | project Name, Department, Level, JoinDate", + "NodeIdColumn": "Name", + "Labels": ["Employee"] + }, + { + "Kind": "AddEdges", + "Query": "reportsTo | where tenant == 'Contoso' | summarize arg_max(modificationDate, *) by employee | project employee, manager, Since = modificationDate | extend Relationship = 'DirectReport'", + "SourceColumn": "employee", + "TargetColumn": "manager", + "Labels": ["ReportsTo"] + } + ] + } +} +``` + +// Create snapshot for Contoso +.create graph snapshot ContosoSnapshot from ContosoOrgChart + +// Query Contoso's organizational graph +graph("ContosoOrgChart") +| graph-match (employee)-[reports*1..10]-(executive) + where employee.Department == "Engineering" + project employee = employee.Name, executive = executive.Name, pathLength = array_length(reports) +```` + +### Best practices for ISV scenarios + +1. **Start with transient graphs** - Begin all new tenants with transient graphs for simplicity +2. **Monitor growth patterns** - Implement automatic detection of tenants requiring persistent graphs +3. **Batch snapshot creation** - Schedule snapshot updates during low-usage periods +4. **Tenant isolation** - Ensure graph models and snapshots are properly isolated between tenants +5. **Resource management** - Use workload groups to prevent large tenant queries from affecting smaller tenants +6. **Cost optimization** - Regularly review and optimize the persistent/transient threshold based on actual usage patterns + +This hybrid approach enables organizations to provide always-current data analysis for the majority of tenants while delivering enterprise-scale analytics capabilities for the largest tenants, optimizing both cost and performance across the entire customer base. 
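+
+The "monitor growth patterns" practice above could be sketched as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: it assumes the `employees` and `reportsTo` tables from the earlier examples both carry a `tenant` column, and `persistentThreshold` is a hypothetical cutoff that you would tune to your own memory limits and query patterns.
+
+```kusto
+// Hypothetical cutoff for switching a tenant to a persistent graph; tune as needed.
+let persistentThreshold = 10000000;
+// Count current nodes (employed staff) per tenant.
+let nodeCounts =
+    employees
+    | where stateOfEmployment == "employed"
+    | summarize nodeCount = count() by tenant;
+// Count current edges (latest reporting record per employee) per tenant.
+let edgeCounts =
+    reportsTo
+    | summarize arg_max(modificationDate, *) by tenant, employee
+    | summarize edgeCount = count() by tenant;
+// Flag tenants whose combined node and edge count exceeds the cutoff.
+nodeCounts
+| join kind=leftouter edgeCounts on tenant
+| extend elementCount = nodeCount + coalesce(edgeCount, 0)
+| where elementCount > persistentThreshold
+| project tenant, nodeCount, edgeCount, elementCount
+| order by elementCount desc
+```
+
+Tenants returned by this query are candidates for a persistent graph model; all other tenants can stay on transient graphs.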
+ +:::moniker-end ## Related content -* [Kusto Query Language (KQL) graph semantics overview](graph-overview.md) -* [Graph operators](graph-operators.md) +- [Graph semantics overview](graph-semantics-overview.md) +- [Common scenarios for using graph semantics](graph-scenarios.md) +- [Graph function](graph-function.md) +- [make-graph operator](make-graph-operator.md) +- [Graph models overview](../management/graph/graph-model-overview.md) diff --git a/data-explorer/kusto/query/graph-function.md b/data-explorer/kusto/query/graph-function.md new file mode 100644 index 0000000000..719a767798 --- /dev/null +++ b/data-explorer/kusto/query/graph-function.md @@ -0,0 +1,77 @@ +--- +title: graph function +description: Learn how to use the graph function to reference a persisted graph entity for querying. +ms.reviewer: royo +ms.topic: reference +ms.date: 05/23/2025 +--- +# graph function (preview) + +>[!INCLUDE [applies](../includes/applies-to-version/applies.md)] [!INCLUDE [fabric](../includes/applies-to-version/fabric.md)] [!INCLUDE [azure-data-explorer](../includes/applies-to-version/azure-data-explorer.md)] [!INCLUDE [monitor](../includes/applies-to-version/monitor.md)] [!INCLUDE [sentinel](../includes/applies-to-version/sentinel.md)] + +> [!NOTE] +> This feature is currently in public preview. Functionality and syntax are subject to change before General Availability. + +The `graph` function is an intrinsic function that enables querying of a persisted graph entity, similar to the `cluster()`, `database()`, `external_table()`, and `table()` functions. It supports retrieving either the most recent snapshot of the graph or a specific snapshot. 
+ +## Syntax + +`graph(` *GraphName* `)` + +`graph(` *GraphName* `,` *SnapshotName* `)` + +`graph(` *GraphName* `,` `snapshot=` *SnapshotName* `)` + +## Parameters + +| Name | Type | Required | Description | +|----------------|----------|--------------------|-----------------------------------------------------------------------------| +| *GraphName* | `string` | :heavy_check_mark: | The name of the [graph model](../management/graph/graph-model-overview.md) to query. | +| *SnapshotName* | `string` | | The name of a specific snapshot to retrieve. If not specified, the most recent snapshot is used. | + +## Returns + +The `graph` function returns a graph and must be followed by a [graph operator](graph-operators.md#supported-graph-operators). The function retrieves the specified graph model name, either as the latest snapshot or a specific named snapshot. + +## Examples + +### Query the latest snapshot + +The following example queries the most recent snapshot of a persisted graph named "SecurityGraph": + +```kusto +graph("SecurityGraph") +| graph-match (user)-[permission]->(resource) + where user.type == "User" and resource.type == "Database" + project UserName = user.name, ResourceName = resource.name, Permission = permission.type +``` + +### Query a specific snapshot + +The following example queries a specific snapshot of the graph: + +```kusto +graph("SecurityGraph", "Snapshot_2025_05_01") +| graph-match (attacker)-[attacks]->(target)-[connects]->(system) + where attacker.name == "MaliciousActor" + project Attacker = attacker.name, Target = target.name, System = system.name +``` + +### Query with named parameter syntax + +The following example uses the named parameter syntax to specify a snapshot: + +```kusto +graph("SecurityGraph", snapshot="Snapshot_2025_05_01") +| graph-shortest-paths (start)-[*]->(end) + where start.name == "Alice" and end.name == "Database" + project PathLength = path_length, Path = path_nodes +``` + +## Related content + +* [Graph semantics 
overview](graph-semantics-overview.md) +* [Persistent graphs overview](../management/graph/graph-persistent-overview.md) +* [Graph model overview](../management/graph/graph-model-overview.md) +* [Graph snapshots overview](../management/graph/graph-snapshot-overview.md) +* [Graph operators](graph-operators.md) diff --git a/data-explorer/kusto/query/graph-mark-components-operator.md b/data-explorer/kusto/query/graph-mark-components-operator.md index 3465f78bd6..66c0ab2a71 100644 --- a/data-explorer/kusto/query/graph-mark-components-operator.md +++ b/data-explorer/kusto/query/graph-mark-components-operator.md @@ -3,7 +3,7 @@ title: graph-mark-components operator (preview) description: Learn how to use the graph-mark-components operator to find and mark all connected components of a graph. ms.reviewer: royo ms.topic: reference -ms.date: 02/17/2025 +ms.date: 05/25/2025 --- # graph-mark-components operator (preview) @@ -112,7 +112,7 @@ ChildOf ## Related content -* [Best practices for Kusto Query Language (KQL) graph semantics](graph-best-practices.md) +* [Graph best practices](graph-best-practices.md) * [Graph operators](graph-operators.md) * [make-graph operator](make-graph-operator.md) * [graph-match operator](graph-match-operator.md) diff --git a/data-explorer/kusto/query/graph-operators.md b/data-explorer/kusto/query/graph-operators.md index 5dad470135..f19f3acb9b 100644 --- a/data-explorer/kusto/query/graph-operators.md +++ b/data-explorer/kusto/query/graph-operators.md @@ -9,15 +9,9 @@ ms.date: 11/05/2024 > [!INCLUDE [applies](../includes/applies-to-version/applies.md)] [!INCLUDE [fabric](../includes/applies-to-version/fabric.md)] [!INCLUDE [azure-data-explorer](../includes/applies-to-version/azure-data-explorer.md)] [!INCLUDE [monitor](../includes/applies-to-version/monitor.md)] [!INCLUDE [sentinel](../includes/applies-to-version/sentinel.md)] -Kusto Query Language (KQL) graph operators enable graph analysis of data by representing tabular data as a graph with 
nodes and edges. This setup lets us use graph operations to study the connections and relationships between different data points. +Kusto Query Language (KQL) graph operators enable graph analysis of data by representing tabular data as a graph with nodes and edges, or by referencing persistent graph entities. This setup lets you use graph operations to study the connections and relationships between different data points. -Graph analysis is typically comprised of the following steps: - -1. Prepare and preprocess the data using tabular operators -1. Build a graph from the prepared tabular data using [make-graph](make-graph-operator.md) -1. Perform graph analysis using [graph-match](graph-match-operator.md) -1. Transform the results of the graph analysis back into tabular form using [graph-to-table](graph-to-table-operator.md) -1. Continue the query with tabular operators +Graph analysis can be performed using either transient graphs (created dynamically from tabular data using [make-graph](make-graph-operator.md)) or persistent graphs (referenced using the [graph](graph-function.md) function). Once a graph is established, you can use graph operators such as [graph-match](graph-match-operator.md), [graph-shortest-paths](graph-shortest-paths-operator.md), and [graph-to-table](graph-to-table-operator.md) to analyze relationships, find patterns, and transform results back into tabular form for further processing. ## Supported graph operators @@ -26,26 +20,13 @@ The following table describes the supported graph operators. | Operator | Description | |--|--| | [make-graph](make-graph-operator.md) | Builds a graph from tabular data. | +| [graph](graph-function.md) | References a persisted graph entity and retrieves the latest or a specific snapshot. | | [graph-match](graph-match-operator.md) | Searches for patterns in a graph. | | [graph-to-table](graph-to-table-operator.md) | Builds nodes or edges tables from a graph. 
| | [graph-shortest-paths](graph-shortest-paths-operator.md) | Finds the shortest paths from a given set of source nodes to a set of target nodes. | | [graph-mark-components](graph-mark-components-operator.md) | Finds and marks all connected components. | -## Graph model - -A graph is modeled as a *directed property graph* that represents the data as a network of vertices, or *nodes*, connected by *edges*. Both nodes and edges can have properties that store more information about them, and a node in the graph must have a unique identifier. A pair of nodes can have multiple edges between them that have different properties or direction. There's no special distinction of *labels* in the graph, and any property can act as a label. - -## Graph lifetime - -A graph is a transient object. It's built in each query that contains graph operators and ceases to exist once the query is completed. To persist a graph, it has to first be transformed back into tabular form and then stored as edges or nodes tables. - -## Limitations and recommendations - -The graph object is built in memory on the fly for each graph query. As such, there's a performance cost for building the graph and a limit to the size of the graph that can be built. - -Although it isn't strictly enforced, we recommend building graphs with at most 10 million elements (nodes and edges). The actual memory limit for the graph is determined by [query operators memory limit](../concepts/query-limits.md#limit-on-memory-consumed-by-query-operators-e_runaway_query). 
- ## Related content -* [Graph overview](graph-overview.md) +* [Graph semantics overview](graph-semantics-overview.md) * [Graph best practices](graph-best-practices.md) diff --git a/data-explorer/kusto/query/graph-overview.md b/data-explorer/kusto/query/graph-overview.md deleted file mode 100644 index 06aad16470..0000000000 --- a/data-explorer/kusto/query/graph-overview.md +++ /dev/null @@ -1,60 +0,0 @@ ---- -title: Kusto Query Language (KQL) graph semantics overview -description: Learn about how to contextualize data in queries using KQL graph semantics -ms.reviewer: herauch -ms.topic: conceptual -ms.date: 08/11/2024 -# Customer intent: As a data analyst, I want to learn about how to contextualize data in queries using KQL graph semantics ---- - -# Kusto Query Language (KQL) graph semantics overview - -> [!INCLUDE [applies](../includes/applies-to-version/applies.md)] [!INCLUDE [fabric](../includes/applies-to-version/fabric.md)] [!INCLUDE [azure-data-explorer](../includes/applies-to-version/azure-data-explorer.md)] [!INCLUDE [monitor](../includes/applies-to-version/monitor.md)] [!INCLUDE [sentinel](../includes/applies-to-version/sentinel.md)] - -Graph semantics in Kusto Query Language (KQL) allows you to model and query data as graphs. The structure of a graph comprises nodes and edges that connect them. Both nodes and edges can have properties that describe them. - -Graphs are useful for representing complex and dynamic data that involve many-to-many, hierarchical, or networked relationships, such as social networks, recommendation systems, connected assets, or knowledge graphs. -For example, the following graph illustrates a social network that consists of four nodes and three edges. Each node has a property for its name, such as *Bob*, and each edge has a property for its type, such as *reportsTo*. 
- -:::image type="content" source="media/graph/graph-social-network.png" alt-text="Diagram that shows a social network as a graph."::: - -Graphs store data differently from relational databases, which use tables and need indexes and joins to connect related data. In graphs, each node has a direct pointer to its neighbors (adjacency), so there's no need to index or join anything, making it easy and fast to traverse the graph. Graph queries can use the graph structure and meaning to do complex and powerful operations, such as finding paths, patterns, shortest distances, communities, or centrality measures. - -You can create and query graphs using KQL graph semantics, which has a simple and intuitive syntax that works well with the existing KQL features. You can also mix graph queries with other KQL features, such as time-based, location-based, and machine-learning queries, to do more advanced and powerful data analysis. By using KQL with graph semantics, you get the speed and scale of KQL queries with the flexibility and expressiveness of graphs. - -For example, you can use: - -- Time-based queries to analyze the evolution of a graph over time, such as how the network structure or the node properties change -- Geospatial queries to analyze the spatial distribution or proximity of nodes and edges, such as how the location or distance affects the relationship -- Machine learning queries to apply various algorithms or models to graph data, such as clustering, classification, or anomaly detection - -## How does it work? - -Every query of the graph semantics in Kusto requires creating a new graph representation. You use a graph operator that converts tabular expressions for edges and optionally nodes into a graph representation of the data. Once the graph is created, you can apply different operations to further enhance or examine the graph data. 
- -The graph semantics extension uses an in-memory graph engine that works on the data in the memory of your cluster, making graph analysis interactive and fast. The memory consumption of a graph representation is affected by the number of nodes and edges and their respective properties. The graph engine uses a property graph model that supports arbitrary properties for nodes and edges. It also integrates with all the existing scalar operators of KQL, which gives users the ability to write expressive and complex graph queries that can use the full power and functionality of KQL. - -## Why use graph semantics in KQL? - -There are several reasons to use graph semantics in KQL, such as the following examples: - -- KQL doesn't support recursive joins, so you have to explicitly define the traversals you want to run (see [Scenario: Friends of a friend](graph-scenarios.md#friends-of-a-friend)). You can use the [make-graph operator](make-graph-operator.md) to define hops of variable length, which is useful when the relationship distance or depth isn't fixed. For example, you can use this operator to find all the resources that are connected in a graph or all the places you can reach from a source in a transportation network. - -- Time-aware graphs are a unique feature of graph semantics in KQL that allow users to model graph data as a series of graph manipulation events over time. Users can examine how the graph evolves over time, such as how the graph's network structure or the node properties change, or how the graph events or anomalies happen. For example, users can use time series queries to discover trends, patterns, or outliers in the graph data, such as how the network density, centrality, or modularity change over time - -- The intellisense feature of the KQL query editor assists users in writing and executing queries in the query language. It provides syntax highlighting, autocompletion, error checking, and suggestions. 
It also helps users with the graph semantics extension by offering graph-specific keywords, operators, functions, and examples to guide users through the graph creation and querying process. - -## Limits - -The following are some of the main limits of the graph semantics feature in KQL: - -- You can only create or query graphs that fit into the memory of one cluster node. -- Graph data isn't persisted or distributed across cluster nodes, and is discarded after the query execution. - -Therefore, When using the graph semantics feature in KQL, you should consider the memory consumption and performance implications of creating and querying large or dense graphs. Where possible, you should use filters, projections, and aggregations to reduce the graph size and complexity. - -## Related content - -- [Graph operators](graph-operators.md) -- [Scenarios](graph-scenarios.md) -- [Best practices](graph-best-practices.md) diff --git a/data-explorer/kusto/query/graph-scenarios.md b/data-explorer/kusto/query/graph-scenarios.md index ce632bbf43..a64d2b6ba8 100644 --- a/data-explorer/kusto/query/graph-scenarios.md +++ b/data-explorer/kusto/query/graph-scenarios.md @@ -3,28 +3,34 @@ title: Scenarios for using Kusto Query Language (KQL) graph semantics description: Learn about common scenarios for using Kusto Query Language (KQL) graph semantics. ms.reviewer: herauch ms.topic: conceptual -ms.date: 08/11/2024 +ms.date: 05/25/2025 # Customer intent: As a data analyst, I want to learn about common scenarios for using Kusto Query Language (KQL) graph semantics. --- -# What are common scenarios for using Kusto Query Language (KQL) graph semantics? 
+# Common scenarios for using graph semantics -> [!INCLUDE [applies](../includes/applies-to-version/applies.md)] [!INCLUDE [fabric](../includes/applies-to-version/fabric.md)] [!INCLUDE [azure-data-explorer](../includes/applies-to-version/azure-data-explorer.md)] [!INCLUDE [monitor](../includes/applies-to-version/monitor.md)] [!INCLUDE [sentinel](../includes/applies-to-version/sentinel.md)] +>[!INCLUDE [applies](../includes/applies-to-version/applies.md)] [!INCLUDE [fabric](../includes/applies-to-version/fabric.md)] [!INCLUDE [azure-data-explorer](../includes/applies-to-version/azure-data-explorer.md)] [!INCLUDE [monitor](../includes/applies-to-version/monitor.md)] [!INCLUDE [sentinel](../includes/applies-to-version/sentinel.md)] -Graph semantics in Kusto Query Language (KQL) allows you to model and query data as graphs. There are many scenarios where graphs are useful for representing complex and dynamic data that involve many-to-many, hierarchical, or networked relationships, such as social networks, recommendation systems, connected assets, or knowledge graphs. +Graph semantics in Kusto Query Language (KQL) enables modeling and querying data as interconnected networks. This approach excels at representing complex data with many-to-many relationships, hierarchical structures, and networked systems—including social networks, recommendation engines, connected assets, and knowledge graphs. 
-In this article, you learn about the following common scenarios for using KQL graph semantics: +This article explores the following common scenarios for using KQL graph semantics: -- [Friends of a friend](#friends-of-a-friend) -- [Insights from log data](#insights-from-log-data) +- [Social network analysis](#social-network-analysis) +- [Log data insights](#log-data-insights) +- [Resource graph exploration](#resource-graph-exploration) +- [Multi-domain security analysis](#multi-domain-security-analysis) +- [Time series and graph analytics](#time-series-and-graph-analytics) +- [Digital twins and graph historization](#digital-twins-and-graph-historization) -## Friends of a friend +## Social network analysis -One common use case for graphs is to model and query social networks, where nodes are users and edges are friendships or interactions. For example, imagine we have a table called *Users* that has data about users, such as their name and organization, and a table called *Knows* that has data about the friendships between users as shown in the following diagram: +Social network analysis represents a fundamental graph use case where nodes are users and edges represent relationships or interactions. 
Consider a data model with a *Users* table containing user attributes (name, organization) and a *Knows* table documenting relationships between users: -:::image type="content" source="media/graph/graph-friends-of-a-friend.png" alt-text="Diagram that shows a graph of friends of a friend."::: +:::image type="content" source="media/graphs/social-network-analysis.png" alt-text="Example diagram using social network analysis."::: -Without using graph semantics in KQL, you could create a graph to find friends of a friend by using multiple joins, as follows: +### Traditional approach challenges + +Without graph semantics, finding "friends-of-friends" requires multiple complex joins: ```kusto let Users = datatable (UserId: string, name: string, org: string)[]; // nodes @@ -39,7 +45,9 @@ Users | project name, name1, name2 ``` -You can use graph semantics in KQL to perform the same query in a more intuitive and efficient way. The following query uses the [make-graph operator](make-graph-operator.md) to create a directed graph from *FirstUser* to *SecondUser* and enriches the properties on the nodes with the columns provided by the *Users* table. Once the graph is instantiated, the [graph-match operator](graph-match-operator.md) provides the friend-of-a-friend pattern including filters and a projection that results in a tabular output. +### Graph semantics solution + +KQL graph semantics simplifies this significantly. 
The [make-graph operator](/kusto/query/make-graph-operator?view=azure-data-explorer&preserve-view=true) creates a directed graph, while the [graph-match operator](/kusto/query/graph-match-operator?view=azure-data-explorer&preserve-view=true) expresses the pattern concisely: ```kusto let Users = datatable (UserId:string , name:string , org:string)[]; // nodes @@ -51,11 +59,9 @@ Knows project contoso_person = user.name, middle_man = middle_man.name, kusto_friend_of_friend = friendOfAFriend.name ``` -## Insights from log data +## Log data insights -In some use cases, you want to gain insights from a simple flat table containing time series information, such as log data. The data in each row is a string that contains raw data. To create a graph from this data, you must first identify the entities and relationships that are relevant to the graph analysis. For example, suppose you have a table called *rawLogs* from a web server that contains information about requests, such as the timestamp, the source IP address, the destination resource, and much more. - -The following table shows an example of the raw data: +Log data analysis often requires extracting insights from flat tables containing time-series information. Converting this data to a graph structure requires identifying relevant entities and relationships. Consider a table called *rawLogs* containing web server request information: ```kusto let rawLogs = datatable (rawLog: string) [ @@ -65,9 +71,9 @@ let rawLogs = datatable (rawLog: string) [ ]; ``` -One possible way to model a graph from this table is to treat the source IP addresses as nodes and the web requests to resources as edges. You can use the [parse operator](parse-operator.md) to extract the columns you need for the graph and then you can create a graph that represents the network traffic and interactions between different sources and destinations. 
To create the graph, you can use the [make-graph operator](make-graph-operator.md) specifying the source and destination columns as the edge endpoints, and optionally providing additional columns as edge or node properties. +### Creating a graph from log data -The following query creates a graph from the raw logs: +Model the graph by treating source IP addresses as nodes and web requests to resources as edges. Use the [parse operator](/kusto/query/parse-operator?view=azure-data-explorer&preserve-view=true) to extract required columns: ```kusto let parsedLogs = rawLogs @@ -84,11 +90,13 @@ let graph = edges | make-graph ipAddress --> resource with nodes on nodeId; ``` -This query parses the raw logs and creates a directed graph where the nodes are either IP addresses or resources and each edge is a request from the source to the destination, with the timestamp and HTTP verb as edge properties. +This creates a directed graph where nodes are IP addresses or resources, and edges represent requests with timestamp and HTTP verb properties: + +:::image type="content" source="media/graphs/create-graph-from-log-data.png" alt-text="Example graph created from log data."::: -:::image type="content" source="media/graph/graph-recommendation.png" alt-text="Diagram that shows a graph of the parsed log data."::: +### Query patterns for recommendations -Once the graph is created, you can use the [graph-match operator](graph-match-operator.md) to query the graph data using patterns, filters, and projections. 
For example, you can create a pattern that makes a simple recommendation based on the resources that other IP addresses requested within the last five minutes, as follows: +Use [graph-match](graph-match-operator.md) to create simple recommendations based on resources requested by other IP addresses: ```kusto graph @@ -100,15 +108,139 @@ graph project Recommendation=otherResource.nodeId ``` -**Output** +**Output:** | Recommendation | | -------------- | | /product/42 | -The query returns "/product/42" as a recommendation based on a raw text-based log. +This demonstrates how graph semantics can extract meaningful insights from raw log data. + +## Resource graph exploration + +Resource graphs enable efficient exploration and querying of resources at scale, supporting governance, management, and security requirements. These graphs continuously update as resources change, providing dynamic tracking of your resource inventory. + +:::image type="content" source="media/graphs/resource-graph-exploration.png" alt-text="Example graph created using resource exploration."::: + +### Enterprise resource management challenges + +Consider an enterprise with complex cloud infrastructure containing: + +- Virtual machines, databases, storage accounts, and networking components +- User identities with varying permissions across multiple environments +- Complex resource hierarchies spanning different organizational units + +The key challenge lies in efficiently managing and querying this extensive resource inventory for security compliance and access control. + +### Graph-based solutions + +KQL graph semantics enables security administrators to model complex resource hierarchies and permission structures as graphs. 
This approach supports powerful queries that can: + +- Trace access paths from users through groups to resources +- Identify overprivileged accounts and potential security vulnerabilities +- Detect configuration issues in resource permissions +- Validate compliance with organizational policies + +For enterprise-scale resource graphs, materialized views can represent the current state of nodes and edges, enabling both real-time analysis and historical queries of how resources and permissions have evolved over time. + +For detailed examples and sample code, see the [Resource Graph samples on GitHub](https://github.com/Azure/azure-kusto-graph-samples/tree/main/resource%20graph). + +## Multi-domain security analysis + +Security operations often require analyzing relationships across multiple domains simultaneously. The "Graph of Graph" approach enables modeling and analyzing interconnected data structures by combining separate graph domains: identity, network, and asset graphs. + +:::image type="content" source="media/graphs/multi-domain-security-analysis.png" alt-text="Example of a multi-domain security analysis graph."::: + +### Multi-domain analysis methodology + +This methodology maintains separation between domain-specific graphs while enabling sophisticated cross-domain analysis through query composition. Consider a scenario where an organization needs to detect sophisticated attacks using: + +- **Identity graph** - Modeling users, groups, and permissions to understand access rights +- **Network graph** - Representing devices and connections to detect unusual network patterns +- **Asset graph** - Cataloging resources and sensitivity levels to assess potential impact + +### Advanced security insights + +By traversing relationships across these domains, security analysts can identify attack paths invisible when examining each domain separately. 
This approach excels at: + +- **Detecting lateral movement** across network segments +- **Identifying privilege escalation** attempts via group membership changes +- **Discovering data exfiltration** from high-sensitivity resources +- **Correlating authentication patterns** with resource access + +For detailed examples and implementation guidance, see the [Graph of Graph samples on GitHub](https://github.com/Azure/azure-kusto-graph-samples/tree/main/graph%20of%20graph). + +## Time series and graph analytics + +Combining graph analysis with time-series analytics creates a powerful framework for detecting temporal anomalies while understanding their impact across interconnected systems. This integration delivers significant value for security analytics, IoT monitoring, and operational intelligence. + +:::image type="content" source="media/graphs/Time-series-graph-analytics.png" alt-text="Example image of a workflow diagram using time series and graph analytics."::: + +### Temporal anomaly detection with context + +Time-series data often contains temporal patterns indicating normal or anomalous behavior. When combined with graph structures, these patterns gain meaningful context through relationship and access path analysis. + +### Security applications + +In security contexts, this integration identifies potentially malicious activities through: + +1. **Authentication anomaly detection** - Flagging logins deviating from usual patterns (time, location, frequency) +2. **Access path analysis** - Determining what sensitive resources anomalous users can reach through permission chains +3. 
**Impact assessment** - Evaluating the potential blast radius of unusual activity + +### Broader applications + +Beyond security, this approach applies to: + +- **IoT systems** - Correlating device anomalies with connected infrastructure +- **Business operations** - Linking transaction anomalies with organizational structures +- **IT infrastructure** - Connecting performance anomalies with service dependencies + +By combining time-series and graph analytics, KQL enables analysts to understand both the nature of anomalies and their contextual impact across interconnected systems. + +For implementation examples and detailed code samples, see the [Time Series and Graph samples on GitHub](https://github.com/Azure/azure-kusto-graph-samples/blob/main/graph%20of%20graph/timeseriesAndGraph.kql). + +## Digital twins and graph historization + +Digital twins provide virtual representations of physical objects or systems, enabling precise modeling and simulation of real-world entities. Graph semantics in KQL excels in digital twin scenarios because relationships between entities—facilities, equipment, sensors, and people—naturally form graph structures. + +:::image type="content" source="media/graphs/digital-twins-graph-historization.png" alt-text="Example image of a workflow diagram using digital twins and graph historization."::: + +### Digital twin capabilities with KQL + +Graph semantics enables comprehensive digital twin modeling through: + +- **Hierarchical modeling** - Representing complex facility and equipment hierarchies +- **Multi-entity relationships** - Connecting physical assets, virtual representations, and human operators +- **Real-time state tracking** - Monitoring occupancy, equipment status, and environmental conditions +- **Cross-domain analysis** - Correlating physical space utilization with operational metrics + +### Graph historization for temporal analysis + +A critical aspect of digital twin management is capturing and analyzing temporal changes. 
By historizing graph changes in Kusto, organizations can: + +1. **Track evolution over time** - Monitor how physical spaces and systems change +2. **Conduct historical analysis** - Identify patterns and trends in utilization and performance +3. **Compare historical states** - Detect anomalies or measure improvements across time periods +4. **Develop predictive models** - Use historical utilization patterns for future planning and optimization + +### Implementation benefits + +This approach enables organizations to: + +- Monitor space utilization patterns and optimize facility management +- Track equipment performance and predict maintenance needs +- Analyze environmental conditions and their impact on operations +- Correlate human behavior patterns with physical infrastructure usage + +For detailed implementation examples and code samples, see the [Digital Twins samples on GitHub](https://github.com/Azure/azure-kusto-graph-samples/tree/main/digital%20twins). ## Related content -- [Best practices](graph-best-practices.md) -- [Graph operators](graph-operators.md) +- [Graph semantics overview](graph-semantics-overview.md) +- [Best practices for KQL graph semantics](graph-best-practices.md) +- [Graph function](graph-function.md) +- [make-graph operator](make-graph-operator.md) +- [Azure Kusto Graph Samples on GitHub](https://github.com/Azure/azure-kusto-graph-samples) +- [Advanced KQL graph capabilities for security analysis](https://github.com/Azure/azure-kusto-graph-samples/blob/main/graph%20of%20graph/advanced-kql-capabilities.md) +- [Digital twins with KQL graph semantics](https://github.com/Azure/azure-kusto-graph-samples/tree/main/digital%20twins) diff --git a/data-explorer/kusto/query/graph-semantics-overview.md b/data-explorer/kusto/query/graph-semantics-overview.md new file mode 100644 index 0000000000..24f0678abd --- /dev/null +++ b/data-explorer/kusto/query/graph-semantics-overview.md @@ -0,0 +1,202 @@ +--- +title: Graph semantics overview +description: 
Learn about graph semantics in Kusto and the different approaches to create and query graphs +ms.reviewer: herauch +ms.topic: conceptual +ms.date: 05/23/2025 +--- + +# Graph semantics overview + +>[!INCLUDE [applies](../includes/applies-to-version/applies.md)] [!INCLUDE [fabric](../includes/applies-to-version/fabric.md)] [!INCLUDE [azure-data-explorer](../includes/applies-to-version/azure-data-explorer.md)] [!INCLUDE [monitor](../includes/applies-to-version/monitor.md)] [!INCLUDE [sentinel](../includes/applies-to-version/sentinel.md)] + +Graph semantics in Kusto enables you to model and query data as interconnected networks. A graph consists of nodes (entities) and edges (relationships) that connect them. Both nodes and edges can contain properties, creating a rich data model for complex relationships. + +Graphs excel at representing complex data with many-to-many relationships, hierarchical structures, or networked connections—such as social networks, recommendation systems, connected assets, and knowledge graphs. Unlike relational databases that require indexes and joins to connect data across tables, graphs use direct adjacency between nodes, enabling fast and intuitive traversal of relationships. + +The following graph illustrates a cybersecurity attack path scenario. Nodes represent entities such as external sources, users, and critical assets, while edges represent actions or relationships that form a potential attack sequence. + +:::image type="content" source="media/graphs/graph-scenario-cybersecurity.png" alt-text="Graph showing the cybersecurity scenario including phishing email and path to accessing a sensitive database."::: + +Graph queries leverage graph structure to perform sophisticated operations such as finding paths, patterns, shortest distances, communities, and centrality measures. 
These capabilities make graphs powerful for modeling relationships, interactions, dependencies, and flows across domains—including social networks, supply chains, IoT device networks, digital twins, recommendation systems, and organizational structures. + +The following graph shows a supply chain scenario where nodes represent suppliers, manufacturers, and distributors, and edges represent supply relationships. This example demonstrates how graphs model flows and dependencies across different business contexts. + +:::image type="content" source="media/graphs/graph-supply-chain.png" alt-text="Graph of two suppliers, manufacturer, and distributor, and the supply relationship."::: + +## Why use graph semantics in Kusto? + +Kusto's graph capabilities offer significant advantages by **leveraging existing data investments** while adding sophisticated relationship modeling: + +- **No data migration required** - Build graph models directly from current data without duplication +- **Cost-effective solution** - Eliminates the complexity and expense of dedicated graph databases +- **Temporal analysis support** - As a time-series database, Kusto naturally enables analysis of graph evolution over time +- **Event-based modeling** - Treats graphs as sequences of relationship events, aligning with Kusto's strength in event processing +- **Seamless KQL integration** - Graph operators work alongside all existing KQL capabilities with full IntelliSense support + +This approach delivers **enterprise-grade relationship modeling** while maintaining Kusto's performance, scale, and familiar interface. Organizations can analyze complex interconnected data across domains—from supply chains and organizational hierarchies to IoT device networks and social relationships—without extra infrastructure investments. 
+ +## Transient graph creation approach + +Transient graphs are created dynamically using the [`make-graph`](/kusto/query/make-graph-operator?view=azure-data-explorer&preserve-view=true) operator. These graphs exist in memory during query execution and are automatically discarded when the query completes. + +### Key characteristics + +- **Dynamic creation** - Built from tabular data using KQL queries with the entire structure residing in memory +- **Immediate availability** - No preprocessing or setup requirements +- **Memory constraints** - Graph size is limited by available memory on cluster nodes +- **Performance factors** - Graph topology and property sizes determine memory requirements + +This approach is optimal for smaller to medium-sized datasets where immediate analysis is needed. + +### Use cases for transient graphs + +Transient graphs excel in several scenarios: + +- **Ad hoc analysis** - One-time investigations requiring quick pattern examination +- **Exploratory data analysis** - Testing hypotheses and validating analytical approaches +- **Small to medium datasets** - Real-time analysis of recent events or focused data subsets +- **Rapid prototyping** - Testing graph patterns before implementing persistent models +- **Dynamic data analysis** - Frequently changing data that doesn't justify persistent storage + +Common applications include real-time IoT monitoring, supply chain relationship analysis, customer journey mapping, and any scenario requiring immediate visualization of entity relationships. + +:::moniker range="azure-data-explorer || microsoft-fabric" + +## Persistent graph creation approach + +Persistent graphs use [graph models](../management/graph/graph-model-overview.md) and [graph snapshots](../management/graph/graph-snapshot-overview.md) to provide robust solutions for large-scale, complex graphs representing organizational networks, supply chains, IoT ecosystems, digital twins, and other interconnected data domains. 
+ +### Key characteristics for persistent graphs + +- **Persistent storage** - Graph models and snapshots are stored in database metadata for durability and consistency +- **Scalability** - Handle graphs exceeding memory limitations with enterprise-scale analysis capabilities +- **Reusability** - Multiple users can query the same structure without rebuilding, enabling collaborative analysis +- **Performance optimization** - Eliminate graph construction latency for repeated queries +- **Version control** - Multiple snapshots represent graphs at different time points for historical analysis +- **Schema support** - Structured definitions for different entity types and their properties + +The schema capability supports both static labels (predefined in the graph model) and dynamic labels (generated at runtime from data), providing flexibility for complex environments with diverse entity types. + +### Use cases for persistent graphs + +Persistent graphs are essential for: + +- **Enterprise analytics** - Continuous monitoring workflows across complex networks +- **Large-scale data analysis** - Enterprise-scale graphs with millions of nodes and relationships +- **Collaborative analysis** - Multiple teams working with shared graph structures +- **Production workflows** - Automated systems requiring consistent graph access +- **Historical comparison** - Time-based analysis of graph evolution and changes + +##### Example: Digital twin persistent graph + +:::image type="content" source="media/graphs/digital-twin-persistent-graph.png" alt-text="A graph showing the digital twin factory example with device relationships and equipment dependencies."::: + +In digital twin and IoT scenarios, persistent graphs support regular analysis of device relationships, equipment dependencies, and system evolution over time. Historical analysis lets you compare system states across different periods, track the evolution of assets, and conduct long-term trend analysis.
+ +##### Example: IoT and digital twin persistent graph + +IoT and digital twin applications benefit significantly from persistent graphs when modeling complex relationships between physical devices and their virtual representations across distributed systems. These graphs enable organizations to: + +- Create comprehensive models of IoT deployments and connected assets +- Support real-time monitoring, predictive maintenance, and performance optimization +- Analyze equipment dependencies and identify potential failure points +- Optimize sensor placements through physical and logical topology understanding +- Track device configurations, communications, and performance characteristics over time +- Detect communication pattern anomalies and visualize smart environment evolution +- Simulate operating conditions before implementing physical infrastructure changes + +This persistent approach proves invaluable for managing complex IoT ecosystems at scale. +:::moniker-end + +## Graph querying capabilities + +Once a graph is established (through `make-graph` or from a snapshot), you can leverage the full suite of KQL graph operators for comprehensive analysis: + +**Core operators:** + +- [`graph-match`](graph-match-operator.md) - Enables sophisticated pattern matching and traversal operations for identifying complex relationship sequences +- [`graph-shortest-paths`](graph-shortest-paths-operator.md) - Finds optimal paths between entities, helping prioritize connections and identify critical relationships +- [`graph-to-table`](graph-to-table-operator.md) - Converts graph analysis results to tabular format for integration with existing systems + +**Advanced analysis capabilities:** + +- **Time-based analysis** - Examine how relationships and patterns evolve over time +- **Geospatial integration** - Combine graph data with location-based intelligence for geographic pattern analysis +- **Machine learning integration** - Apply algorithms for entity clustering, pattern 
classification, and anomaly detection + +These capabilities support diverse use cases including customer journey analysis, product recommendation systems, IoT networks, digital twins, and knowledge graphs. + +:::moniker range="azure-data-explorer || microsoft-fabric" +## Choosing the right approach + +The following decision tree helps you select the most appropriate graph creation approach based on your specific requirements and constraints. + +### Decision Tree: Transient vs Persistent Graphs + +:::image type="content" source="media/graphs/decision-matrix-persistent-or-transient.png" alt-text="Flowchart showing a decision tree for when to use persistent or transient graphs."::: + +### When to use transient graphs + +Choose transient graphs for: + +- **Graph size under 10 million nodes and edges** (for optimal performance) +- **Single user or small team analysis** with minimal collaboration requirements +- **One-time or exploratory investigations** where immediate results are needed +- **Real-time data analysis** requiring current state information +- **Rapid prototyping and testing** of graph patterns and query logic + +While transient graphs can handle larger datasets, query execution time increases as the graph must be reconstructed for every query. Consider this performance trade-off when working with larger datasets. 
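
As a minimal sketch of the transient approach, the following query builds a small graph in memory with `make-graph` and matches a two-hop pattern with `graph-match`. The `Entities` and `Relationships` tables, their column names, and the sample values are hypothetical and serve only to illustrate the shape of such a query.

```kusto
// Hypothetical sample data: entities become nodes, relationships become edges.
let Entities = datatable(name:string, kind:string)
[
    "Alice", "user",
    "Bob", "user",
    "db01", "database"
];
let Relationships = datatable(source:string, target:string, action:string)
[
    "Alice", "Bob", "delegates_to",
    "Bob", "db01", "can_read"
];
Relationships
// Build the transient graph; it exists only for the duration of this query.
| make-graph source --> target with Entities on name
// Match a two-hop path that starts at a user and ends at a database.
| graph-match (u)-[firstHop]->(middle)-[secondHop]->(asset)
    where u.kind == "user" and asset.kind == "database"
    project startUser = u.name, via = middle.name, reachable = asset.name
```

With this sample data, the only two-hop path runs from Alice through Bob to db01, so the query returns a single row. Because the graph is rebuilt on every execution, the same query over a much larger edge table pays the construction cost each time, which is the trade-off described above.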
+ +### When to use persistent graphs + +Choose persistent graphs for: + +- **Graph size exceeding 10 million nodes and edges** where distributed storage is beneficial +- **Multiple teams requiring shared access** for collaborative analysis +- **Repeated analysis on stable datasets** where construction latency impacts productivity +- **Production workflow integration** requiring consistent, reliable graph access +- **Historical comparison requirements** for tracking changes over time +- **Memory capacity limitations** affecting query performance +- **Collaborative investigation workflows** across teams and time zones + +Persistent graphs are essential when working with enterprise-scale data or when memory limitations affect performance. + +## Performance considerations + +### Memory usage + +- **Transient graphs** - Limited by single cluster node memory, constraining use to datasets within available RAM +- **Persistent graphs** - Leverage distributed storage and optimized access patterns for enterprise-scale data + +### Query latency + +- **Transient graphs** - Include construction time in each query, with delays increasing for large datasets or external data sources +- **Persistent graphs** - Eliminate construction latency through prebuilt snapshots, enabling rapid analysis + +External data source dependencies (Kusto, SQL, Cosmos DB) can significantly affect transient graph construction time, as each query must wait for external responses. 
+ +### Data freshness + +- **Transient graphs** - Always reflect current data state, ideal for real-time analysis +- **Persistent graphs** - Reflect data at snapshot creation time, providing consistency for collaborative analysis but requiring periodic refreshes +:::moniker-end + +## Integration with KQL ecosystem + +Graph semantics integrate seamlessly with KQL's broader capabilities: + +- **Time-series analysis** - Track relationship evolution over time +- **Geospatial functions** - Analyze location-based patterns and geographic anomalies +- **Machine learning operators** - Detect patterns, classify behaviors, and identify anomalies +- **Scalar and tabular operators** - Enable complex transformations, aggregations, and data enrichment + +This integration enables sophisticated workflows including supply chain evolution tracking, geographical asset distribution analysis, community detection through clustering algorithms, and correlation of graph insights with traditional log analysis and external intelligence. 
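
To illustrate this integration, the following sketch pipes the tabular output of `graph-match` into ordinary KQL operators. The `Nodes` and `Requests` tables and all of their values are hypothetical sample data, not part of any real dataset.

```kusto
// Hypothetical sample data: IP addresses and resources as nodes, requests as edges.
let Nodes = datatable(nodeId:string, kind:string)
[
    "10.0.0.1", "ip",
    "10.0.0.2", "ip",
    "/product/1", "resource",
    "/product/2", "resource"
];
let Requests = datatable(ipAddress:string, resource:string, timestamp:datetime)
[
    "10.0.0.1", "/product/1", datetime(2025-01-01 10:00:00),
    "10.0.0.1", "/product/2", datetime(2025-01-01 10:01:00),
    "10.0.0.2", "/product/1", datetime(2025-01-01 10:02:00)
];
Requests
| make-graph ipAddress --> resource with Nodes on nodeId
// graph-match returns a table, so any tabular operator can follow it.
| graph-match (client)-[request]->(page)
    where client.kind == "ip"
    project clientIp = client.nodeId, requestedPage = page.nodeId, requestedAt = request.timestamp
// Aggregate the matched edges with standard KQL operators.
| summarize requestCount = count(), lastSeen = max(requestedAt) by requestedPage
| order by requestCount desc
```

The graph operators handle the relationship traversal, while `summarize` and `order by` handle aggregation and ranking, so neither side needs special syntax to work with the other.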
+ +## Related content + +- [Common scenarios for using KQL graph semantics](graph-scenarios.md) +- [Best practices for KQL graph semantics](graph-best-practices.md) +- [make-graph operator](make-graph-operator.md) +- [Graph model overview](../management/graph/graph-model-overview.md) +- [Graph snapshots overview](../management/graph/graph-snapshot-overview.md) diff --git a/data-explorer/kusto/query/graph-shortest-paths-operator.md b/data-explorer/kusto/query/graph-shortest-paths-operator.md index 40ca852b05..afcf2e1f75 100644 --- a/data-explorer/kusto/query/graph-shortest-paths-operator.md +++ b/data-explorer/kusto/query/graph-shortest-paths-operator.md @@ -126,6 +126,6 @@ connections ## Related content -* [Best practices for Kusto Query Language (KQL) graph semantics](graph-best-practices.md) -* [Graph operators](graph-operators.md) -* [make-graph operator](make-graph-operator.md) +* [Best practices for graph semantics](graph-best-practices.md) +* [Graph function](graph-function.md) +* [make-graph operator](../query/make-graph-operator.md) diff --git a/data-explorer/kusto/query/graph-to-table-operator.md b/data-explorer/kusto/query/graph-to-table-operator.md index 023d331a10..9148292f93 100644 --- a/data-explorer/kusto/query/graph-to-table-operator.md +++ b/data-explorer/kusto/query/graph-to-table-operator.md @@ -11,20 +11,17 @@ ms.date: 08/11/2024 The `graph-to-table` operator exports nodes or edges from a graph to tables. -> [!NOTE] -> This operator is used in conjunction with the [make-graph operator](make-graph-operator.md).
- ## Syntax -#### Nodes +### Nodes *G* `|` `graph-to-table` `nodes` [ `with_node_id=`*ColumnName* ] -#### Edges +### Edges *G* `|` `graph-to-table` `edges` [ `with_source_id=`*ColumnName* ] [ `with_target_id=`*ColumnName* ] [ `as` *TableName* ] -#### Nodes and edges +### Nodes and edges *G* `|` `graph-to-table` `nodes` `as` *NodesTableName* [ `with_node_id=`*ColumnName* ]`,` `edges` `as` *EdgesTableName* [ `with_source_id=`*ColumnName* ] [ `with_target_id=`*ColumnName* ] @@ -39,15 +36,15 @@ The `graph-to-table` operator exports nodes or edges from a graph to tables. ## Returns -#### Nodes +### Nodes The `graph-to-table` operator returns a tabular result, in which each row corresponds to a node in the source graph. The returned columns are the node's properties. When `with_node_id` is provided, the node hash column is of `long` type. -#### Edges +### Edges The `graph-to-table` operator returns a tabular result, in which each row corresponds to an edge in the source graph. The returned columns are the edge's properties. When `with_source_id` or `with_target_id` are provided, the node hash column is of `long` type. -#### Nodes and edges +### Nodes and edges The `graph-to-table` operator returns two tabular results, matching the previous descriptions. diff --git a/data-explorer/kusto/query/labels-graph-function.md b/data-explorer/kusto/query/labels-graph-function.md new file mode 100644 index 0000000000..ceb6a54c88 --- /dev/null +++ b/data-explorer/kusto/query/labels-graph-function.md @@ -0,0 +1,307 @@ +--- +title: labels() (graph function in Preview) +description: Learn how to use the labels() function to filter nodes and edges based on their labels or project label information in graph queries.
+ms.reviewer: michalfaktor +ms.topic: reference +ms.date: 05/26/2025 +--- +# labels() (graph function in Preview) + +> [!INCLUDE [applies](../includes/applies-to-version/applies.md)] [!INCLUDE [fabric](../includes/applies-to-version/fabric.md)] [!INCLUDE [azure-data-explorer](../includes/applies-to-version/azure-data-explorer.md)] [!INCLUDE [monitor](../includes/applies-to-version/monitor.md)] [!INCLUDE [sentinel](../includes/applies-to-version/sentinel.md)] + +> [!NOTE] +> This feature is currently in public preview. Functionality and syntax are subject to change before General Availability. + +The `labels()` graph function retrieves the labels associated with nodes or edges in a graph. It can be used both for filtering elements based on their labels and for projecting label information in query results. + +Labels are defined within [Graph models](../management/graph/graph-model-overview.md) and can be either static (fixed labels assigned to node or edge types) or dynamic (labels derived from data properties during graph construction). The `labels()` function accesses these predefined labels to enable efficient filtering and analysis of graph elements. + +> [!NOTE] +> This function is used with the [graph-match](graph-match-operator.md) and [graph-shortest-paths](graph-shortest-paths-operator.md) operators. + +> [!IMPORTANT] +> When the `labels()` function is used on a graph created with the `make-graph` operator (that is, a transient graph rather than a persistent graph model), it always returns an empty array (of dynamic data type) for all nodes and edges, because transient graphs do not have label metadata. + +## Syntax + +`labels([element])` + +## Parameters + +| Name | Type | Required | Description | +|---|---|---|---| +| *element* | `string` | | The reference to a graph node or edge variable in a graph pattern. 
Don't pass any parameters when used inside [all()](all-graph-function.md), [any()](any-graph-function.md), and [map()](map-graph-function.md) graph functions, with [inner_nodes()](inner-nodes-graph-function.md). For more information, see [Graph pattern notation](graph-match-operator.md#graph-pattern-notation). | + +## Returns + +Returns a dynamic array containing the labels associated with the specified node or edge. For nodes and edges without labels, returns an empty array. + +When used inside [all()](all-graph-function.md), [any()](any-graph-function.md), or [map()](map-graph-function.md) with [inner_nodes()](inner-nodes-graph-function.md), call `labels()` without parameters to return the labels for all inner nodes or edges, respectively. + +## Label sources + +Labels are defined in [Graph models](../management/graph/graph-model-overview.md) and can originate from two sources: + +- **Static labels**: Fixed labels assigned to specific node or edge types during graph model definition. These labels remain constant for all instances of a particular type. +- **Dynamic labels**: Labels derived from data properties during graph construction. These labels can vary based on the actual data values and computed expressions. + +The `labels()` function retrieves both static and dynamic labels that have been associated with graph elements through the graph model schema and definition. + +## Examples + +### Filter nodes by labels + +The following example shows how to use the `labels()` function to filter nodes based on their assigned labels. The example includes the full graph model definition to clarify how static and dynamic labels are assigned. 
+ +#### Graph model definition + +````kusto +.create-or-alter graph_model AppProcessGraph ``` +{ + "Schema": { + "Nodes": { + "Application": {"AppName": "string", "Type": "string"}, + "Process": {"ProcessName": "string"} + }, + "Edges": { + "CONNECTS_TO": {} + } + }, + "Definition": { + "Steps": [ + { + "Kind": "AddNodes", + "Query": "Applications | project AppId, AppName, Type, NodeLabels", + "NodeIdColumn": "AppId", + "Labels": ["Application"], + "LabelsColumn": "NodeLabels" + }, + { + "Kind": "AddNodes", + "Query": "Processes | project ProcId, ProcessName", + "NodeIdColumn": "ProcId", + "Labels": ["Process"] + }, + { + "Kind": "AddEdges", + "Query": "AppConnections | project SourceAppId, TargetProcId", + "SourceColumn": "SourceAppId", + "TargetColumn": "TargetProcId", + "Labels": ["CONNECTS_TO"] + } + ] + } +} +``` +```` + +#### Query example + +```kusto +graph('AppProcessGraph') +| graph-match cycles=none (app)-[e*1..3]->(process) + where process.ProcessName contains "nginx" and labels(app) has "Application" + project app=app.AppName +| summarize c=count() by app +| top 10 by c desc +``` + +**Output** + +| app | c | +|---|---| +| WebApp1 | 15 | +| WebApp2 | 12 | +| APIService | 8 | + +### Project labels in results + +The following example demonstrates how to use the `labels()` function in the project clause to include label information in the query results. This query finds connections between different types of network components and includes their labels for analysis. 
+ +#### Graph model definition + +````kusto +.create-or-alter graph_model NetworkGraph ``` +{ + "Schema": { + "Nodes": { + "NetworkComponent": {"ComponentName": "string", "ComponentType": "string"} + }, + "Edges": { + "CONNECTED_TO": {"ConnectionType": "string"} + } + }, + "Definition": { + "Steps": [ + { + "Kind": "AddNodes", + "Query": "NetworkComponentsTable | project Id, ComponentName, ComponentType, NodeLabels", + "NodeIdColumn": "Id", + "Labels": ["NetworkComponent"], + "LabelsColumn": "NodeLabels" + }, + { + "Kind": "AddEdges", + "Query": "ConnectionsTable | project SourceId, TargetId, ConnectionType, EdgeLabels", + "SourceColumn": "SourceId", + "TargetColumn": "TargetId", + "Labels": ["CONNECTED_TO"], + "LabelsColumn": "EdgeLabels" + } + ] + } +} +``` +```` + +#### Query example + +```kusto +graph('NetworkGraph') +| graph-match (source)-[conn]->(target) + where labels(source) has "Network" and labels(target) has "Compute" + project + SourceComponent = source.ComponentName, + TargetComponent = target.ComponentName, + SourceLabels = labels(source), + TargetLabels = labels(target), + ConnectionType = conn.ConnectionType +``` + +**Output** + +| SourceComponent | TargetComponent | SourceLabels | TargetLabels | ConnectionType | +|---|---|---|---|---| +| Switch1 | Server1 | ["Network", "Access"] | ["Compute", "Production"] | Ethernet | + +### Filter by multiple label conditions + +The following example shows how to combine multiple label conditions to find complex patterns in a network topology. This query identifies paths from frontend components to backend components through middleware layers. 
+ +#### Graph model definition + +````kusto +.create-or-alter graph_model AppComponentGraph ``` +{ + "Schema": { + "Nodes": { + "Frontend": {"ComponentName": "string"}, + "Middleware": {"ComponentName": "string"}, + "Backend": {"ComponentName": "string"} + }, + "Edges": { + "DEPENDS_ON": {"DependencyType": "string"} + } + }, + "Definition": { + "Steps": [ + { + "Kind": "AddNodes", + "Query": "ComponentsTable | project Id, ComponentName, NodeLabels", + "NodeIdColumn": "Id", + "LabelsColumn": "NodeLabels" + }, + { + "Kind": "AddEdges", + "Query": "DependenciesTable | project SourceId, TargetId, DependencyType, EdgeLabels", + "SourceColumn": "SourceId", + "TargetColumn": "TargetId", + "Labels": ["DEPENDS_ON"], + "LabelsColumn": "EdgeLabels" + } + ] + } +} +``` +```` + +#### Query example + +```kusto +graph('AppComponentGraph') +| graph-match (frontend)-[dep1]->(middleware)-[dep2]->(backend) + where labels(frontend) has "Frontend" + and labels(middleware) has "Middleware" + and labels(backend) has "Backend" + project + Path = strcat(frontend.ComponentName, " -> ", middleware.ComponentName, " -> ", backend.ComponentName), + FrontendLabels = labels(frontend), + MiddlewareLabels = labels(middleware), + BackendLabels = labels(backend) +``` + +**Output** + +| Path | FrontendLabels | MiddlewareLabels | BackendLabels | +|---|---|---|---| +| WebUI -> APIGateway -> Database | ["Frontend", "UserInterface"] | ["Middleware", "API"] | ["Backend", "Storage"] | +| WebUI -> APIGateway -> Cache | ["Frontend", "UserInterface"] | ["Middleware", "API"] | ["Backend", "Cache"] | + +### Use labels() with all() and any() functions + +The following example demonstrates how to use the `labels()` function without parameters inside `all()` and `any()` functions with variable-length paths. This query finds paths in a service mesh where all intermediate services have "Production" labels and at least one intermediate service has "Critical" labels. 
+ +#### Graph model definition + +````kusto +.create-or-alter graph_model ServiceMeshGraph ``` +{ + "Schema": { + "Nodes": { + "Service": {"ServiceName": "string", "Environment": "string"} + }, + "Edges": { + "CALLS": {"Protocol": "string"} + } + }, + "Definition": { + "Steps": [ + { + "Kind": "AddNodes", + "Query": "ServicesTable | project Id, ServiceName, Environment, NodeLabels", + "NodeIdColumn": "Id", + "Labels": ["Service"], + "LabelsColumn": "NodeLabels" + }, + { + "Kind": "AddEdges", + "Query": "ServiceCallsTable | project SourceId, TargetId, Protocol, EdgeLabels", + "SourceColumn": "SourceId", + "TargetColumn": "TargetId", + "Labels": ["CALLS"], + "LabelsColumn": "EdgeLabels" + } + ] + } +} +``` +```` + +#### Query example + +```kusto +graph('ServiceMeshGraph') +| graph-match (source)-[calls*2..4]->(destination) + where source.ServiceName == "UserService" and + destination.ServiceName == "DatabaseService" and + all(inner_nodes(calls), labels() has "Production") and + any(inner_nodes(calls), labels() has "Critical") + project + Path = strcat_array(map(inner_nodes(calls), ServiceName), " -> "), + IntermediateLabels = map(inner_nodes(calls), labels()), + CallProtocols = map(calls, Protocol) +``` + +**Output** + +| Path | IntermediateLabels | CallProtocols | +|---|---|---| +| AuthService -> PaymentService | [["Production", "Auth"], ["Production", "Critical", "Payment"]] | ["HTTPS", "gRPC"] | +| CacheService -> PaymentService | [["Production", "Cache"], ["Production", "Critical", "Payment"]] | ["Redis", "gRPC"] | + +## Related content + +* [Graph operators](graph-operators.md) +* [graph-match operator](graph-match-operator.md) +* [graph-shortest-paths operator](graph-shortest-paths-operator.md) +* [Graph models overview](../management/graph/graph-model-overview.md) diff --git a/data-explorer/kusto/query/map-graph-function.md b/data-explorer/kusto/query/map-graph-function.md index b514373ae9..de735ec56e 100644 --- a/data-explorer/kusto/query/map-graph-function.md +++ 
b/data-explorer/kusto/query/map-graph-function.md @@ -16,7 +16,7 @@ The `map()` graph function calculates an expression for each *edge* or *inner no ## Syntax -`map``(`*edge*`,` *expression*`)` +`map(`*edge*`,` *expression*`)` `map(inner_nodes(`*edge*`),` *expression*`)` diff --git a/data-explorer/kusto/query/media/graph/graph-friends-of-a-friend.png b/data-explorer/kusto/query/media/graph/graph-friends-of-a-friend.png deleted file mode 100644 index 1a789a1452..0000000000 Binary files a/data-explorer/kusto/query/media/graph/graph-friends-of-a-friend.png and /dev/null differ diff --git a/data-explorer/kusto/query/media/graph/graph-property-graph.png b/data-explorer/kusto/query/media/graph/graph-property-graph.png deleted file mode 100644 index 2319b8599f..0000000000 Binary files a/data-explorer/kusto/query/media/graph/graph-property-graph.png and /dev/null differ diff --git a/data-explorer/kusto/query/media/graph/graph-recommendation.png b/data-explorer/kusto/query/media/graph/graph-recommendation.png deleted file mode 100644 index 5f33b1f64e..0000000000 Binary files a/data-explorer/kusto/query/media/graph/graph-recommendation.png and /dev/null differ diff --git a/data-explorer/kusto/query/media/graph/graph-social-network.png b/data-explorer/kusto/query/media/graph/graph-social-network.png deleted file mode 100644 index 9b372e21ca..0000000000 Binary files a/data-explorer/kusto/query/media/graph/graph-social-network.png and /dev/null differ diff --git a/data-explorer/kusto/query/media/graphs/Time-series-graph-analytics.mmd b/data-explorer/kusto/query/media/graphs/Time-series-graph-analytics.mmd new file mode 100644 index 0000000000..7205afe1ef --- /dev/null +++ b/data-explorer/kusto/query/media/graphs/Time-series-graph-analytics.mmd @@ -0,0 +1,32 @@ +--- +config: + look: neo + theme: default +--- +flowchart TD + subgraph "Authentication Timeseries" + User001T["User001 Logins
9am-5pm Seattle
Regular Pattern"] + User002T["User002 Logins
⚠️ 3am Bangkok
Anomalous Pattern"] + end + + subgraph "Access Graph" + User001["User001
Finance"] -- "memberOf" --> Group001["Group001
Finance Security Group"] + User002["User002
IT"] -- "memberOf" --> Group002["Group002
IT Security Group"] + Group001 -- "canAccess" --> Resource001["Resource001
Finance Database"] + Group002 -- "canAdminister" --> Resource001 + end + + User001T -.- User001 + User002T -.- User002 + + classDef anomaly fill:#f66,stroke:#333,stroke-width:2px + classDef normal fill:#6f6,stroke:#333,stroke-width:2px + classDef user fill:#f9f,stroke:#333,stroke-width:2px + classDef group fill:#bbf,stroke:#333,stroke-width:2px + classDef resource fill:#dfd,stroke:#333,stroke-width:2px + + class User001T normal + class User002T anomaly + class User001,User002 user + class Group001,Group002 group + class Resource001 resource diff --git a/data-explorer/kusto/query/media/graphs/Time-series-graph-analytics.png b/data-explorer/kusto/query/media/graphs/Time-series-graph-analytics.png new file mode 100644 index 0000000000..5145d03ba5 Binary files /dev/null and b/data-explorer/kusto/query/media/graphs/Time-series-graph-analytics.png differ diff --git a/data-explorer/kusto/query/media/graphs/create-graph-from-log-data.mmd b/data-explorer/kusto/query/media/graphs/create-graph-from-log-data.mmd new file mode 100644 index 0000000000..7cc8d0fb2c --- /dev/null +++ b/data-explorer/kusto/query/media/graphs/create-graph-from-log-data.mmd @@ -0,0 +1,15 @@ +--- +config: + look: neo + theme: default +--- +flowchart LR + IP1["31.56.96.51"] -- "Requests (GET)
timestamp: 2019-01-22 00:26:16" --> Prod1["/product/27"] + IP1 -- "Requests (GET)
timestamp: 2019-01-22 00:26:17" --> Prod2["/product/42"] + IP2["54.36.149.41"] -- "Requests (GET)
timestamp: 2019-01-22 00:26:14" --> Prod1 + + classDef ip fill:#4a86e8,stroke:#333,color:white + classDef product fill:#4a86e8,stroke:#333,color:white + + class IP1,IP2 ip + class Prod1,Prod2 product \ No newline at end of file diff --git a/data-explorer/kusto/query/media/graphs/create-graph-from-log-data.png b/data-explorer/kusto/query/media/graphs/create-graph-from-log-data.png new file mode 100644 index 0000000000..7d130cadbd Binary files /dev/null and b/data-explorer/kusto/query/media/graphs/create-graph-from-log-data.png differ diff --git a/data-explorer/kusto/query/media/graphs/decision-matrix-persistent-or-transient.mmd b/data-explorer/kusto/query/media/graphs/decision-matrix-persistent-or-transient.mmd new file mode 100644 index 0000000000..b19a2fcdd0 --- /dev/null +++ b/data-explorer/kusto/query/media/graphs/decision-matrix-persistent-or-transient.mmd @@ -0,0 +1,19 @@ +--- +config: + look: neo + theme: default +--- +flowchart TD + Start([Which graph approach should I use?]) --> Size{Graph size > 10M nodes/edges?} + + Size -->|Yes| Persistent[🏛️ Use Persistent Graphs] + Size -->|No| Teams{Multiple teams or
repeated analysis?} + + Teams -->|Yes| Persistent + Teams -->|No| Transient[⚡ Use Transient Graphs] + + style Start fill:#e1f5fe,stroke:#0277bd,color:#000 + style Persistent fill:#c8e6c9,stroke:#2e7d32,color:#000 + style Transient fill:#fff3e0,stroke:#ef6c00,color:#000 + style Size fill:#f3e5f5,stroke:#7b1fa2,color:#000 + style Teams fill:#f3e5f5,stroke:#7b1fa2,color:#000 \ No newline at end of file diff --git a/data-explorer/kusto/query/media/graphs/decision-matrix-persistent-or-transient.png b/data-explorer/kusto/query/media/graphs/decision-matrix-persistent-or-transient.png new file mode 100644 index 0000000000..1eb7cd4199 Binary files /dev/null and b/data-explorer/kusto/query/media/graphs/decision-matrix-persistent-or-transient.png differ diff --git a/data-explorer/kusto/query/media/graphs/digital-twin-persistent-graph.mmd b/data-explorer/kusto/query/media/graphs/digital-twin-persistent-graph.mmd new file mode 100644 index 0000000000..6888bb132f --- /dev/null +++ b/data-explorer/kusto/query/media/graphs/digital-twin-persistent-graph.mmd @@ -0,0 +1,18 @@ +--- +config: + look: neo + theme: default +--- +graph TD + DT1[🏭 Digital Twin: Factory] -->|monitors| EQ1[⚙️ Equipment 1] + DT1 -->|monitors| EQ2[⚙️ Equipment 2] + EQ1 -->|connected to| S1[🔌 Sensor 1] + EQ2 -->|connected to| S2[🔌 Sensor 2] + DT1 -->|reports to| CL1[☁️ Cloud Analytics] + + style DT1 fill:#e1f5fe,stroke:#0277bd,color:#000 + style EQ1 fill:#f3e5f5,stroke:#7b1fa2,color:#000 + style EQ2 fill:#e8f5e8,stroke:#2e7d32,color:#000 + style S1 fill:#fff3e0,stroke:#ef6c00,color:#000 + style S2 fill:#ffd93d,stroke:#fdcb6e,color:#000 + style CL1 fill:#a3cfbb,stroke:#1b6e3c,color:#000 \ No newline at end of file diff --git a/data-explorer/kusto/query/media/graphs/digital-twin-persistent-graph.png b/data-explorer/kusto/query/media/graphs/digital-twin-persistent-graph.png new file mode 100644 index 0000000000..98752ca186 Binary files /dev/null and 
b/data-explorer/kusto/query/media/graphs/digital-twin-persistent-graph.png differ diff --git a/data-explorer/kusto/query/media/graphs/digital-twins-graph-historization.mmd b/data-explorer/kusto/query/media/graphs/digital-twins-graph-historization.mmd new file mode 100644 index 0000000000..d7eb33bf54 --- /dev/null +++ b/data-explorer/kusto/query/media/graphs/digital-twins-graph-historization.mmd @@ -0,0 +1,26 @@ +--- +config: + look: neo + theme: default +--- +flowchart TD + Site["site-1
Main Campus"] --> Building["building-1
Building A"] + Building --> Floor["floor-1
First Floor"] + Floor --> Room1["room-101
Conference Room"] + Floor --> Room2["room-102
Office Space"] + Room1 --> Desk1["desk-1
Window Location"] + Room2 --> Desk2["desk-2
Interior Location"] + Desk1 --> Sensor1["occupancy-1
Status: Occupied"] + Desk2 --> Sensor2["occupancy-2
Status: Vacant"] + Alice["Alice
Engineer"] -- "isLocatedIn" --> Room1 + Bob["Bob
Manager"] -- "isLocatedIn" --> Room2 + + classDef facility fill:#dfd,stroke:#333,stroke-width:2px + classDef furniture fill:#ffd,stroke:#333,stroke-width:2px + classDef sensor fill:#bbf,stroke:#333,stroke-width:2px + classDef person fill:#f9f,stroke:#333,stroke-width:2px + + class Site,Building,Floor,Room1,Room2 facility + class Desk1,Desk2 furniture + class Sensor1,Sensor2 sensor + class Alice,Bob person \ No newline at end of file diff --git a/data-explorer/kusto/query/media/graphs/digital-twins-graph-historization.png b/data-explorer/kusto/query/media/graphs/digital-twins-graph-historization.png new file mode 100644 index 0000000000..4470702525 Binary files /dev/null and b/data-explorer/kusto/query/media/graphs/digital-twins-graph-historization.png differ diff --git a/data-explorer/kusto/query/media/graphs/factory-maintenance-analysis.mmd b/data-explorer/kusto/query/media/graphs/factory-maintenance-analysis.mmd new file mode 100644 index 0000000000..2c6b358759 --- /dev/null +++ b/data-explorer/kusto/query/media/graphs/factory-maintenance-analysis.mmd @@ -0,0 +1,44 @@ +--- +config: + look: neo + theme: default +--- +graph TD + %% People nodes in blue + Dave((Dave)):::person + Mallory((Mallory)):::person + Alice((Alice)):::person + Bob((Bob)):::person + Eve((Eve)):::person + Alex((Alex)):::person + + %% Equipment nodes in green + Conveyorbelt((Conveyor belt)):::equipment + Pump((Pump)):::equipment + Press((Press)):::equipment + + %% Measurement nodes in orange + Speed((Speed)):::measurement + temperature((temperature)):::measurement + pressure((pressure)):::measurement + + Bob -->|reportsTo| Alice + Alice -->|reportsTo| Dave + Alex -->|reportsTo| Dave + Eve -->|reportsTo| Mallory + + Bob -->|operates| Pump + Eve -->|operates| Pump + Mallory -->|operates| Press + Alex -->|operates| Conveyorbelt + + Conveyorbelt -->|hasParent| Speed + Pump -->|hasParent| temperature + Pump -->|hasParent| pressure + Pump -->|hasParent| Conveyorbelt + Press -->|hasParent| Pump + + %% 
Define node styles + classDef person fill:#9699F3,stroke:#333,stroke-width:1px; + classDef equipment fill:#79EC87,stroke:#333,stroke-width:1px; + classDef measurement fill:#E62828,stroke:#333,stroke-width:1px; \ No newline at end of file diff --git a/data-explorer/kusto/query/media/graphs/factory-maintenance-analysis.png b/data-explorer/kusto/query/media/graphs/factory-maintenance-analysis.png new file mode 100644 index 0000000000..ea2693d6f1 Binary files /dev/null and b/data-explorer/kusto/query/media/graphs/factory-maintenance-analysis.png differ diff --git a/data-explorer/kusto/query/media/graphs/graph-scenario-cybersecurity.mmd b/data-explorer/kusto/query/media/graphs/graph-scenario-cybersecurity.mmd new file mode 100644 index 0000000000..cae0d42648 --- /dev/null +++ b/data-explorer/kusto/query/media/graphs/graph-scenario-cybersecurity.mmd @@ -0,0 +1,16 @@ +--- +config: + look: neo + theme: default +--- +graph TD + EXT[🌐 External IP] -->|phishing email| U1[👤 User: Alice] + U1 -->|compromised credentials| S1[🖥️ Workstation] + S1 -->|lateral movement| FS[🗄️ File Server] + FS -->|access| DB[🗄️ Sensitive Database] + + style EXT fill:#ff6b6b,stroke:#d63031,color:#fff + style U1 fill:#f3e5f5,stroke:#7b1fa2,color:#000 + style S1 fill:#e8f5e8,stroke:#2e7d32,color:#000 + style FS fill:#fff3e0,stroke:#ef6c00,color:#000 + style DB fill:#ffd93d,stroke:#fdcb6e,color:#000 \ No newline at end of file diff --git a/data-explorer/kusto/query/media/graphs/graph-scenario-cybersecurity.png b/data-explorer/kusto/query/media/graphs/graph-scenario-cybersecurity.png new file mode 100644 index 0000000000..6cc1757029 Binary files /dev/null and b/data-explorer/kusto/query/media/graphs/graph-scenario-cybersecurity.png differ diff --git a/data-explorer/kusto/query/media/graphs/graph-supply-chain.mmd b/data-explorer/kusto/query/media/graphs/graph-supply-chain.mmd new file mode 100644 index 0000000000..bc2f507422 --- /dev/null +++ 
b/data-explorer/kusto/query/media/graphs/graph-supply-chain.mmd @@ -0,0 +1,14 @@ +--- +config: + look: neo + theme: default +--- +graph TD + S1[🏭 Supplier A] -->|supplies| M1[🏢 Manufacturer X] + S2[🏭 Supplier B] -->|supplies| M1 + M1 -->|ships to| D1[🚚 Distributor Y] + + style S1 fill:#e1f5fe,stroke:#0277bd,color:#000 + style S2 fill:#f3e5f5,stroke:#7b1fa2,color:#000 + style M1 fill:#e8f5e8,stroke:#2e7d32,color:#000 + style D1 fill:#fff3e0,stroke:#ef6c00,color:#000 \ No newline at end of file diff --git a/data-explorer/kusto/query/media/graphs/graph-supply-chain.png b/data-explorer/kusto/query/media/graphs/graph-supply-chain.png new file mode 100644 index 0000000000..3c28604ca4 Binary files /dev/null and b/data-explorer/kusto/query/media/graphs/graph-supply-chain.png differ diff --git a/data-explorer/kusto/query/media/graphs/multi-domain-security-analysis.mmd b/data-explorer/kusto/query/media/graphs/multi-domain-security-analysis.mmd new file mode 100644 index 0000000000..8c86401490 --- /dev/null +++ b/data-explorer/kusto/query/media/graphs/multi-domain-security-analysis.mmd @@ -0,0 +1,33 @@ +--- +config: + look: neo + theme: default +--- +flowchart TD + subgraph Asset Graph + Resource1["Resource1
Database
High Sensitivity"] + Resource2["Resource2
FileShare
Medium Sensitivity"] + end + + subgraph Identity Graph + User1["User1
Finance"] -- "MemberOf" --> Group1["Group1
Finance-Users"] + User2["User2
IT"] -- "MemberOf" --> Group2["Group2
IT-Admins"] + Group1 -- "HasAccess" --> Resource1 + Group2 -- "HasAccess" --> Resource2 + end + + subgraph Network Graph + Device1["Device1
Workstation"] -- "RDP" --> Device2["Device2
Server"] + Device2 -- "SSH" --> Device3["Device3
Database"] + Device1 -- "HTTPS" --> Resource1 + end + + User1 -.- Device1 + + classDef identity fill:#f9f,stroke:#333,stroke-width:2px + classDef network fill:#bbf,stroke:#333,stroke-width:2px + classDef asset fill:#dfd,stroke:#333,stroke-width:2px + + class User1,User2,Group1,Group2 identity + class Device1,Device2,Device3 network + class Resource1,Resource2 asset \ No newline at end of file diff --git a/data-explorer/kusto/query/media/graphs/multi-domain-security-analysis.png b/data-explorer/kusto/query/media/graphs/multi-domain-security-analysis.png new file mode 100644 index 0000000000..a42339378c Binary files /dev/null and b/data-explorer/kusto/query/media/graphs/multi-domain-security-analysis.png differ diff --git a/data-explorer/kusto/query/media/graphs/resource-graph-exploration.mmd b/data-explorer/kusto/query/media/graphs/resource-graph-exploration.mmd new file mode 100644 index 0000000000..8fd99a340f --- /dev/null +++ b/data-explorer/kusto/query/media/graphs/resource-graph-exploration.mmd @@ -0,0 +1,19 @@ +--- +config: + look: neo + theme: default +--- +flowchart TD + MG["Management Group
MG001"] --> RG["Resource Group
RG001"] + RG --> VM["Virtual Machine
VM001"] + RG --> DB["Database
DB001"] + ITAdmins["Group
IT_Admins"] -- "authorized_on" --> RG + Alice["User
Alice"] -- "has_member" --> ITAdmins + + classDef user fill:#f9f,stroke:#333,stroke-width:2px + classDef group fill:#bbf,stroke:#333,stroke-width:2px + classDef resource fill:#dfd,stroke:#333,stroke-width:2px + + class Alice user + class ITAdmins group + class MG,RG,VM,DB resource \ No newline at end of file diff --git a/data-explorer/kusto/query/media/graphs/resource-graph-exploration.png b/data-explorer/kusto/query/media/graphs/resource-graph-exploration.png new file mode 100644 index 0000000000..e6c978f5b9 Binary files /dev/null and b/data-explorer/kusto/query/media/graphs/resource-graph-exploration.png differ diff --git a/data-explorer/kusto/query/media/graphs/social-network-analysis.mmd b/data-explorer/kusto/query/media/graphs/social-network-analysis.mmd new file mode 100644 index 0000000000..38c8eb3ff6 --- /dev/null +++ b/data-explorer/kusto/query/media/graphs/social-network-analysis.mmd @@ -0,0 +1,20 @@ +--- +config: + look: neo + theme: default +--- +flowchart LR + You((You)) -- "knows" --> Friend((Friend)) + Friend -- "knows" --> FriendOfFriend((Friend of a friend)) + + subgraph "Contoso organization" + FriendOfFriend + end + + classDef you fill:#4a86e8,stroke:#333,color:white + classDef friend fill:white,stroke:#333 + classDef fof fill:#e67c37,stroke:#333,color:white + + class You you + class Friend friend + class FriendOfFriend fof \ No newline at end of file diff --git a/data-explorer/kusto/query/media/graphs/social-network-analysis.png b/data-explorer/kusto/query/media/graphs/social-network-analysis.png new file mode 100644 index 0000000000..663c02d8d4 Binary files /dev/null and b/data-explorer/kusto/query/media/graphs/social-network-analysis.png differ diff --git a/data-explorer/kusto/query/media/graphs/tutorial-first-graph.mmd b/data-explorer/kusto/query/media/graphs/tutorial-first-graph.mmd new file mode 100644 index 0000000000..243c9c9adb --- /dev/null +++ b/data-explorer/kusto/query/media/graphs/tutorial-first-graph.mmd @@ -0,0 +1,25 @@ +--- 
+config: + look: neo + theme: default +--- +graph TD + Alice[👤 Alice
CEO, Age 45] + Bob[👤 Bob
Engineering Manager, Age 35] + Carol[👤 Carol
Marketing Manager, Age 38] + Dave[👤 Dave
Developer, Age 28] + Eve[👤 Eve
Developer, Age 26] + Frank[👤 Frank
Marketing Specialist, Age 30] + + Alice -->|manages| Bob + Alice -->|manages| Carol + Bob -->|manages| Dave + Bob -->|manages| Eve + Carol -->|manages| Frank + + style Alice fill:#ff6b6b,stroke:#d63031,color:#fff + style Bob fill:#4ecdc4,stroke:#00b894,color:#fff + style Carol fill:#4ecdc4,stroke:#00b894,color:#fff + style Dave fill:#a8e6cf,stroke:#00b894,color:#000 + style Eve fill:#a8e6cf,stroke:#00b894,color:#000 + style Frank fill:#a8e6cf,stroke:#00b894,color:#000 \ No newline at end of file diff --git a/data-explorer/kusto/query/media/graphs/tutorial-first-graph.png b/data-explorer/kusto/query/media/graphs/tutorial-first-graph.png new file mode 100644 index 0000000000..cfc5e37e47 Binary files /dev/null and b/data-explorer/kusto/query/media/graphs/tutorial-first-graph.png differ diff --git a/data-explorer/kusto/query/node-degree-in.md b/data-explorer/kusto/query/node-degree-in.md index 9f690b0c32..4fb881d1ed 100644 --- a/data-explorer/kusto/query/node-degree-in.md +++ b/data-explorer/kusto/query/node-degree-in.md @@ -84,7 +84,7 @@ reports ## Related content -* [Graph overview](graph-overview.md) +* [Graph semantics overview](graph-semantics-overview.md) * [Graph operators](graph-operators.md) * [graph-match operator](graph-match-operator.md) * [node-degree-out](node-degree-out.md) diff --git a/data-explorer/kusto/query/node-degree-out.md b/data-explorer/kusto/query/node-degree-out.md index 9fe37aa663..42ed919539 100644 --- a/data-explorer/kusto/query/node-degree-out.md +++ b/data-explorer/kusto/query/node-degree-out.md @@ -136,7 +136,7 @@ project manager.name, employee.name, di_m=node_degree_in(manager), do_m=node_deg ## Related content -* [Graph overview](graph-overview.md) +* [Graph semantics overview](graph-semantics-overview.md) * [Graph operators](graph-operators.md) * [graph-match operator](graph-match-operator.md) * [node-degree-in](node-degree-in.md) diff --git a/data-explorer/kusto/query/project-away-operator.md 
b/data-explorer/kusto/query/project-away-operator.md index c2d356223a..e26f887dd0 100644 --- a/data-explorer/kusto/query/project-away-operator.md +++ b/data-explorer/kusto/query/project-away-operator.md @@ -96,10 +96,10 @@ The table shows only the first 10 results. | Ignite 2019 | Jie Feng | | `https://myignite.techcommunity.microsoft.com/sessions/83940` | 100 | 2019-11-06T14:35:00Z | 20 | Wed, Nov 6, 9:35 AM - 9:55 AM | Mention | | Ignite 2019 | Bernhard Rode | Le Hai Dang, Ricardo Niepel | `https://myignite.techcommunity.microsoft.com/sessions/81596` | 200 | 2019-11-06T16:45:00Z | 45 | Wed, Nov 6, 11:45 AM-12:30 PM | Mention | | Ignite 2019 | Tzvia Gitlin | Troyna | `https://myignite.techcommunity.microsoft.com/sessions/83933` | 400 | 2019-11-06T17:30:00Z | 75 | Wed, Nov 6, 12:30 PM-1:30 PM | Focused | -| Ignite 2019 | Jie Feng | `https://myignite.techcommunity.microsoft.com/sessions/81057` | 300 | 2019-11-06T20:30:00Z | 45 | Wed, Nov 6, 3:30 PM-4:15 PM | Mention | +| Ignite 2019 | Jie Feng | Troyna | `https://myignite.techcommunity.microsoft.com/sessions/81057` | 300 | 2019-11-06T20:30:00Z | 45 | Wed, Nov 6, 3:30 PM-4:15 PM | Mention | | Ignite 2019 | Manoj Raheja | | `https://myignite.techcommunity.microsoft.com/sessions/83939` | 300 | 2019-11-07T18:15:00Z | 20 | Thu, Nov 7, 1:15 PM-1:35 PM | Focused | | Ignite 2019 | Uri Barash | | `https://myignite.techcommunity.microsoft.com/sessions/81060` | 300 | 2019-11-08T17:30:00Z | 45 | Fri, Nov8, 10:30 AM-11:15 AM | Focused | -| Ignite 2018 | Manoj Raheja | | | 200 | | 20 | | Focused | +| Ignite 2018 | Manoj Raheja | | `https://learn.microsoft.com/shows/ignite-2018/` | 200 | | 20 | | Focused | | ... | ... | ... | ... | ... | ... | ... | ... | ... 
| ## Related content diff --git a/data-explorer/kusto/query/sql-cheat-sheet.md b/data-explorer/kusto/query/sql-cheat-sheet.md index 077e7ae723..4eaacd9b59 100644 --- a/data-explorer/kusto/query/sql-cheat-sheet.md +++ b/data-explorer/kusto/query/sql-cheat-sheet.md @@ -3,7 +3,7 @@ title: SQL to Kusto query translation description: Learn about the Kusto Query Language equivalent of SQL queries. ms.reviewer: alexans ms.topic: reference -ms.date: 08/11/2024 +ms.date: 05/28/2025 --- # SQL to Kusto Query Language cheat sheet @@ -35,27 +35,27 @@ The following table shows sample queries in SQL and their KQL equivalents. | Category | SQL Query | Kusto Query | Learn more | |--|--|--| | Select data from table | `SELECT * FROM dependencies` | `dependencies` | [Tabular expression statements](tabular-expression-statements.md) | -| -- | `SELECT name, resultCode FROM dependencies` | `dependencies | project name, resultCode` | [project](project-operator.md) | -| -- | `SELECT TOP 100 * FROM dependencies` | `dependencies | take 100` | [take](take-operator.md) | -| Null evaluation | `SELECT * FROM dependencies`
`WHERE resultCode IS NOT NULL` | `dependencies`
`| where isnotnull(resultCode)` | [isnotnull()](isnotnull-function.md) | -| Comparison operators (date) | `SELECT * FROM dependencies`
`WHERE timestamp > getdate()-1` | `dependencies`
`| where timestamp > ago(1d)` | [ago()](ago-function.md) | -| -- | `SELECT * FROM dependencies`
`WHERE timestamp BETWEEN ... AND ...` | `dependencies`
`| where timestamp between (datetime(2016-10-01) .. datetime(2016-11-01))` | [between](between-operator.md) | -| Comparison operators (string) | `SELECT * FROM dependencies`
`WHERE type = "Azure blob"` | `dependencies`
`| where type == "Azure blob"` | [Logical operators](logical-operators.md) | -| -- | `-- substring`
`SELECT * FROM dependencies`
`WHERE type like "%blob%"` | `// substring`
`dependencies`
`| where type has "blob"` | [has](has-operator.md) | -| -- | `-- wildcard`
`SELECT * FROM dependencies`
`WHERE type like "Azure%"` | `// wildcard`
`dependencies`
`| where type startswith "Azure"`
`// or`
`dependencies`
`| where type matches regex "^Azure.*"` | [`startswith`](startswith-operator.md)
[matches regex](matches-regex-operator.md) | -| Comparison (boolean) | `SELECT * FROM dependencies`
`WHERE !(success)` | `dependencies`
`| where success == False` | [Logical operators](logical-operators.md) | -| Grouping, Aggregation | `SELECT name, AVG(duration) FROM dependencies`
`GROUP BY name` | `dependencies`
`| summarize avg(duration) by name` | [summarize](summarize-operator.md)
[avg()](avg-aggregation-function.md) | -| Distinct | `SELECT DISTINCT name, type FROM dependencies` | `dependencies`
`| summarize by name, type` | [summarize](summarize-operator.md)
[distinct](distinct-operator.md) | -| -- | `SELECT name, COUNT(DISTINCT type) `
` FROM dependencies `
` GROUP BY name` | ` dependencies `
`| summarize by name, type | summarize count() by name `
`// or approximate for large sets `
` dependencies `
` | summarize dcount(type) by name ` | [count()](count-aggregation-function.md)
[dcount()](dcount-aggregation-function.md) | -| Column aliases, Extending | `SELECT operationName as Name, AVG(duration) as AvgD FROM dependencies`
`GROUP BY name` | `dependencies`
`| summarize AvgD = avg(duration) by Name=operationName` | [Alias statement](alias-statement.md) | -| -- | `SELECT conference, CONCAT(sessionid, ' ' , session_title) AS session FROM ConferenceSessions` | `ConferenceSessions`
`| extend session=strcat(sessionid, " ", session_title)`
`| project conference, session` | [strcat()](strcat-function.md)
[project](project-operator.md) | -| Ordering | `SELECT name, timestamp FROM dependencies`
`ORDER BY timestamp ASC` | `dependencies`
`| project name, timestamp`
`| sort by timestamp asc nulls last` | [sort](sort-operator.md) | -| Top n by measure | `SELECT TOP 100 name, COUNT(*) as Count FROM dependencies`
`GROUP BY name`
`ORDER BY Count DESC` | `dependencies`
`| summarize Count = count() by name`
`| top 100 by Count desc` | [top](top-operator.md) | +|--| `SELECT name, resultCode FROM dependencies` | `dependencies | project name, resultCode` | [project](project-operator.md) | +|--| `SELECT TOP 100 * FROM dependencies` | `dependencies | take 100` | [take](take-operator.md) | +| Null evaluation | `SELECT * FROM dependencies`
`WHERE resultCode IS NOT NULL` | `dependencies`
` | where isnotnull(resultCode)` | [isnotnull()](isnotnull-function.md) | +| Comparison operators (date) | `SELECT * FROM dependencies`
`WHERE timestamp > getdate()-1` | `dependencies`
` | where timestamp > ago(1d)` | [ago()](ago-function.md) | +|--| `SELECT * FROM dependencies`
`WHERE timestamp BETWEEN ... AND ...` | `dependencies`
` | where timestamp between (datetime(2016-10-01) .. datetime(2016-11-01))` | [between](between-operator.md) | +| Comparison operators (string) | `SELECT * FROM dependencies`
`WHERE type = "Azure blob"` | `dependencies`
` | where type == "Azure blob"` | [Logical operators](logical-operators.md) | +|--| `-- substring`
`SELECT * FROM dependencies`
`WHERE type like "%blob%"` | `// substring`
`dependencies`
` | where type has "blob"` | [has](has-operator.md) | +|--| `-- wildcard`
`SELECT * FROM dependencies`
`WHERE type like "Azure%"` | `// wildcard`
`dependencies`
` | where type startswith "Azure"`
`// or`
`dependencies`
` | where type matches regex "^Azure.*"` | [`startswith`](startswith-operator.md)
[matches regex](matches-regex-operator.md) | +| Comparison (boolean) | `SELECT * FROM dependencies`
`WHERE !(success)` | `dependencies`
` | where success == False` | [Logical operators](logical-operators.md) | +| Grouping, Aggregation | `SELECT name, AVG(duration) FROM dependencies`
`GROUP BY name` | `dependencies`
` | summarize avg(duration) by name` | [summarize](summarize-operator.md)
[avg()](avg-aggregation-function.md) | +| Distinct | `SELECT DISTINCT name, type FROM dependencies` | `dependencies`
` | distinct name, type` | [summarize](summarize-operator.md)
[distinct](distinct-operator.md) | +|--| `SELECT name, COUNT(DISTINCT type) `
` FROM dependencies `
` GROUP BY name` | ` dependencies `
` | summarize by name, type | summarize count() by name `
`// or approximate for large sets `
` dependencies `
` | summarize dcount(type) by name ` | [count()](count-aggregation-function.md)
[dcount()](dcount-aggregation-function.md) | +| Column aliases, Extending | `SELECT operationName as Name, AVG(duration) as AvgD FROM dependencies`
`GROUP BY operationName` | `dependencies`
` | summarize AvgD = avg(duration) by Name=operationName` | [Alias statement](alias-statement.md) | +|--| `SELECT conference, CONCAT(sessionid, ' ' , session_title) AS session FROM ConferenceSessions` | `ConferenceSessions`
` | extend session=strcat(sessionid, " ", session_title)`
` | project conference, session` | [strcat()](strcat-function.md)
[project](project-operator.md) | +| Ordering | `SELECT name, timestamp FROM dependencies`
`ORDER BY timestamp ASC` | `dependencies`
` | project name, timestamp`
` | sort by timestamp asc nulls last` | [sort](sort-operator.md) | +| Top n by measure | `SELECT TOP 100 name, COUNT(*) as Count FROM dependencies`
`GROUP BY name`
`ORDER BY Count DESC` | `dependencies`
` | summarize Count = count() by name`
` | top 100 by Count desc` | [top](top-operator.md) | | Union | `SELECT * FROM dependencies`
`UNION`
`SELECT * FROM exceptions` | `union dependencies, exceptions` | [union](union-operator.md) | -| -- | `SELECT * FROM dependencies`
`WHERE timestamp > ...`
`UNION`
`SELECT * FROM exceptions`
`WHERE timestamp > ...` | `dependencies`
`| where timestamp > ago(1d)`
`| union`
` (exceptions`
` | where timestamp > ago(1d))` | | -| Join | `SELECT * FROM dependencies `
`LEFT OUTER JOIN exceptions`
`ON dependencies.operation_Id = exceptions.operation_Id` | `dependencies`
`| join kind = leftouter`
` (exceptions)`
`on $left.operation_Id == $right.operation_Id` | [join](join-operator.md) | -| Nested queries | `SELECT * FROM dependencies`
`WHERE resultCode == `
`(SELECT TOP 1 resultCode FROM dependencies`
`WHERE resultId = 7`
`ORDER BY timestamp DESC)` | `dependencies`
`| where resultCode == toscalar(`
` dependencies`
` | where resultId == 7`
` | top 1 by timestamp desc`
` | project resultCode)` | [toscalar](toscalar-function.md) | -| Having | `SELECT COUNT(\*) FROM dependencies`
`GROUP BY name`
`HAVING COUNT(\*) > 3` | `dependencies`
`| summarize Count = count() by name`
`| where Count > 3` | [summarize](summarize-operator.md)
[where](where-operator.md) | +|--| `SELECT * FROM dependencies`
`WHERE timestamp > ...`
`UNION`
`SELECT * FROM exceptions`
`WHERE timestamp > ...` | `dependencies`
` | where timestamp > ago(1d)`
` | union`
` (exceptions`
` | where timestamp > ago(1d))` | | +| Join | `SELECT * FROM dependencies `
`LEFT OUTER JOIN exceptions`
`ON dependencies.operation_Id = exceptions.operation_Id` | `dependencies`
` | join kind = leftouter`
` (exceptions)`
`on $left.operation_Id == $right.operation_Id` | [join](join-operator.md) | +| Nested queries | `SELECT * FROM dependencies`
`WHERE resultCode = `
`(SELECT TOP 1 resultCode FROM dependencies`
`WHERE resultId = 7`
`ORDER BY timestamp DESC)` | `dependencies`
` | where resultCode == toscalar(`
` dependencies`
` | where resultId == 7`
` | top 1 by timestamp desc`
` | project resultCode)` | [toscalar](toscalar-function.md) | +| Having | `SELECT COUNT(*) FROM dependencies`
`GROUP BY name`
`HAVING COUNT(*) > 3` | `dependencies`
` | summarize Count = count() by name`
` | where Count > 3` | [summarize](summarize-operator.md)
[where](where-operator.md) | ## Related content diff --git a/data-explorer/kusto/query/toc.yml b/data-explorer/kusto/query/toc.yml index 8d83fd7f63..3868a92438 100644 --- a/data-explorer/kusto/query/toc.yml +++ b/data-explorer/kusto/query/toc.yml @@ -304,6 +304,8 @@ items: - name: external_table() href: external-table-function.md displayName: external table external-table + - name: graph() + href: graph-function.md - name: materialize() href: materialize-function.md - name: materialized_view() @@ -1271,7 +1273,7 @@ items: - name: Graph items: - name: Graph overview - href: graph-overview.md + href: graph-semantics-overview.md - name: Graph best practices href: graph-best-practices.md - name: Graph scenarios @@ -1304,6 +1306,9 @@ items: - name: inner_nodes() displayName: inner nodes() graph function, inner nodes href: inner-nodes-graph-function.md + - name: labels() + displayName: labels() graph function, labels + href: labels-graph-function.md - name: node_degree_in() displayName: graph, node, Node degree in node_degree_in, indegree node_degree_in href: node-degree-in.md diff --git a/data-explorer/kusto/query/tutorials/create-geospatial-visualizations.md b/data-explorer/kusto/query/tutorials/create-geospatial-visualizations.md index 745d13842d..f0bc415155 100644 --- a/data-explorer/kusto/query/tutorials/create-geospatial-visualizations.md +++ b/data-explorer/kusto/query/tutorials/create-geospatial-visualizations.md @@ -319,4 +319,4 @@ StormEvents * See a use case for geospatial clustering: [Data analytics for automotive test fleets](/azure/architecture/industries/automotive/automotive-telemetry-analytics) * Learn about Azure architecture for [geospatial data processing and analytics](/azure/architecture/example-scenario/data/geospatial-data-processing-analytics-azure) -* Get a comprehensive understanding of Azure Data Explorer by reading the [white 
paper](https://azure.microsoft.com/mediahandler/files/resourcefiles/azure-data-explorer/Azure_Data_Explorer_white_paper.pdf) +* Get a comprehensive understanding of Azure Data Explorer by reading the [white paper](https://aka.ms/adx.techwhitepaper) diff --git a/data-explorer/kusto/query/tutorials/your-first-graph.md b/data-explorer/kusto/query/tutorials/your-first-graph.md new file mode 100644 index 0000000000..e5e9561bcb --- /dev/null +++ b/data-explorer/kusto/query/tutorials/your-first-graph.md @@ -0,0 +1,347 @@ +--- +title: 'Tutorial: Create your first graphs in Kusto Query Language' +description: Learn how to model and query interconnected data using graph semantics in Kusto Query Language (KQL). Build transient and persistent graphs to analyze organizational hierarchies. +author: cosh +ms.author: herauch +ms.service: azure-data-explorer +ms.topic: tutorial +ms.custom: mvc +ms.date: 05/26/2025 +--- + +# Tutorial: Create your first graphs in Kusto Query Language + +> [!INCLUDE [applies](../../includes/applies-to-version/applies.md)] [!INCLUDE [fabric](../../includes/applies-to-version/fabric.md)] [!INCLUDE [azure-data-explorer](../../includes/applies-to-version/azure-data-explorer.md)] [!INCLUDE [monitor](../../includes/applies-to-version/monitor.md)] [!INCLUDE [sentinel](../../includes/applies-to-version/sentinel.md)] + +Graph semantics in Kusto enables you to model and query data as interconnected networks, making it intuitive to analyze complex relationships like organizational hierarchies, social networks, and attack paths. Unlike traditional relational queries that rely on joins, graphs use direct relationships between entities to traverse connections efficiently. 
+ +In this tutorial, you learn how to: + +> [!div class="checklist"] +> +> * Create a transient graph using the make-graph operator +> * Query graphs to find relationships using graph-match +> * Build persistent graph models for reusable analysis +> * Compare transient versus persistent graph approaches + +If you don't have an Azure Data Explorer cluster, [create a free cluster](/azure/data-explorer/start-for-free-web-ui) before you begin the tutorial. + +## Prerequisites + +* A Microsoft account or Microsoft Entra user identity to sign in to the [help cluster](https://dataexplorer.azure.com/clusters/help) + +::: moniker range="microsoft-fabric" + +* A [Fabric workspace](/fabric/get-started/create-workspaces) with a Microsoft Fabric-enabled [capacity](/fabric/enterprise/licenses#capacity) + +::: moniker-end + +## Access your query environment + +::: moniker range="azure-data-explorer" +Open the [Azure Data Explorer Web UI](https://dataexplorer.azure.com/clusters/help) to access the help cluster for this tutorial. +::: moniker-end + +::: moniker range="microsoft-fabric" +Navigate to your Microsoft Fabric workspace and open a KQL database to run the queries. +::: moniker-end + +## Create a transient graph with organizational data + +In this section, you'll create your first graph using sample organizational data. Transient graphs are created dynamically during query execution using the `make-graph` operator, making them perfect for ad-hoc analysis and exploration. + +You'll work with a simple company structure where employees report to managers. 
This organizational hierarchy provides an intuitive example for understanding graph relationships: + +:::image type="content" source="../media/graphs/tutorial-first-graph.png" alt-text="A diagram showing the organization hierarchy."::: + +Create the organizational graph structure using employee and reporting relationship data: + +:::moniker range="azure-data-explorer" +> [!div class="nextstepaction"] +> Run the query +::: moniker-end + +```kusto +// Create sample employee data +let employees = datatable(name:string, role:string, age:long) +[ + "Alice", "CEO", 45, + "Bob", "Engineering Manager", 35, + "Carol", "Marketing Manager", 38, + "Dave", "Developer", 28, + "Eve", "Developer", 26, + "Frank", "Marketing Specialist", 30 +]; +// Create reporting relationships +let reports = datatable(employee:string, manager:string) +[ + "Bob", "Alice", + "Carol", "Alice", + "Dave", "Bob", + "Eve", "Bob", + "Frank", "Carol" +]; +// Build the graph and explore it +reports +| make-graph employee --> manager with employees on name +| graph-to-table nodes +``` + +### Output + +|name|role|age| +|---|---|---| +|Alice|CEO|45| +|Bob|Engineering Manager|35| +|Carol|Marketing Manager|38| +|Dave|Developer|28| +|Eve|Developer|26| +|Frank|Marketing Specialist|30| + +## Query relationships with graph-match patterns + +Now you'll learn to use the `graph-match` operator to find specific patterns in your organizational graph. The `graph-match` operator searches for relationships and connections within the graph structure. 
+ +First, find all employees who directly report to Alice by matching the immediate reporting relationship pattern: + +:::moniker range="azure-data-explorer" +> [!div class="nextstepaction"] +> Run the query +::: moniker-end + +```kusto +let employees = datatable(name:string, role:string, age:long) +[ + "Alice", "CEO", 45, + "Bob", "Engineering Manager", 35, + "Carol", "Marketing Manager", 38, + "Dave", "Developer", 28, + "Eve", "Developer", 26, + "Frank", "Marketing Specialist", 30 +]; +let reports = datatable(employee:string, manager:string) +[ + "Bob", "Alice", + "Carol", "Alice", + "Dave", "Bob", + "Eve", "Bob", + "Frank", "Carol" +]; +reports +| make-graph employee --> manager with employees on name +| graph-match (alice)<-[reports]-(employee) + where alice.name == "Alice" + project employee = employee.name, role = employee.role, age = employee.age +``` + +### Direct reports output + +|employee|role|age| +|---|---|---| +|Bob|Engineering Manager|35| +|Carol|Marketing Manager|38| + +Next, find all employees in Alice's entire organization, including indirect reports, using variable length edges with `*1..3` to traverse multiple levels of the hierarchy: + +:::moniker range="azure-data-explorer" +> [!div class="nextstepaction"] +> Run the query +::: moniker-end + +```kusto +let employees = datatable(name:string, role:string, age:long) +[ + "Alice", "CEO", 45, + "Bob", "Engineering Manager", 35, + "Carol", "Marketing Manager", 38, + "Dave", "Developer", 28, + "Eve", "Developer", 26, + "Frank", "Marketing Specialist", 30 +]; +let reports = datatable(employee:string, manager:string) +[ + "Bob", "Alice", + "Carol", "Alice", + "Dave", "Bob", + "Eve", "Bob", + "Frank", "Carol" +]; +reports +| make-graph employee --> manager with employees on name +| graph-match (alice)<-[reports*1..3]-(employee) + where alice.name == "Alice" + project employee = employee.name, role = employee.role, reportingLevels = array_length(reports) +``` + +### All organization members output + 
+|employee|role|reportingLevels| +|---|---|---| +|Bob|Engineering Manager|1| +|Carol|Marketing Manager|1| +|Dave|Developer|2| +|Eve|Developer|2| +|Frank|Marketing Specialist|2| + +::: moniker range="azure-data-explorer || microsoft-fabric" + +## Create a persistent graph model + +> [!NOTE] +> This feature is currently in public preview. Functionality and syntax are subject to change before General Availability. + +Persistent graphs are stored in the database and can be queried repeatedly without rebuilding the graph structure. You'll now create the same organizational structure as a persistent graph for better performance and reusability. + +Create functions that return your sample data, then define a graph model structure: + +```kusto +// Create a function that returns employee data +.create function Employees() { + datatable(name: string, role: string, age: long) + [ + "Alice", "CEO", 45, + "Bob", "Engineering Manager", 35, + "Carol", "Marketing Manager", 38, + "Dave", "Developer", 28, + "Eve", "Developer", 26, + "Frank", "Marketing Specialist", 30 + ] +} + +// Create a function that returns reporting relationships +.create function Reports() { + datatable(employee: string, manager: string) + [ + "Bob", "Alice", + "Carol", "Alice", + "Dave", "Bob", + "Eve", "Bob", + "Frank", "Carol" + ] +} +``` + +Define the graph model with node and edge schemas: + +````kusto +.create-or-alter graph_model OrganizationGraph ``` +{ + "Schema": { + "Nodes": { + "Employee": { + "name": "string", + "role": "string", + "age": "long" + } + }, + "Edges": { + "ReportsTo": { + } + } + }, + "Definition": { + "Steps": [ + { + "Kind": "AddNodes", + "Query": "Employees()", + "NodeIdColumn": "name", + "Labels": ["Employee"] + }, + { + "Kind": "AddEdges", + "Query": "Reports()", + "SourceColumn": "employee", + "TargetColumn": "manager", + "Labels": ["ReportsTo"] + } + ] + } +} +``` +```` + +Create a graph snapshot to materialize the model into a queryable structure: + +```kusto +.make 
graph_snapshot OrganizationGraph_v1 from OrganizationGraph +``` + +## Query your persistent graph + +Query the persistent graph using the same patterns as transient graphs. Find all employees who report to Alice: + +```kusto +graph("OrganizationGraph") +| graph-match (alice)<-[reports]-(employee) + where alice.name == "Alice" + project employee = employee.name, role = employee.role, age = employee.age +``` + +Find all employees in Alice's organization including indirect reports: + +```kusto +graph("OrganizationGraph") +| graph-match (alice)<-[reports*1..3]-(employee) + where alice.name == "Alice" + project employee = employee.name, role = employee.role, reportingLevels = array_length(reports) +``` + +Query a specific snapshot version if needed: + +```kusto +graph("OrganizationGraph", "OrganizationGraph_v1") +| graph-match (alice)<-[reports*1..3]-(employee) + where alice.name == "Alice" + project employee = employee.name, role = employee.role +``` + +::: moniker-end + +## Compare transient and persistent graphs + +Understanding when to use each approach helps you choose the right method for your analysis needs: + +| Aspect | Transient Graphs | Persistent Graphs | +|--------|------------------|-------------------| +| **Creation** | `make-graph` operator during query | `.create-or-alter graph_model` + `.make graph_snapshot` | +| **Storage** | In-memory during query execution | Stored in database | +| **Reusability** | Must rebuild for each query | Query repeatedly without rebuilding | +| **Performance** | Good for smaller datasets | Optimized for large, complex graphs | +| **Use cases** | Ad-hoc analysis, exploration | Production analytics, repeated queries | +| **Memory limits** | Limited by node memory | Can handle larger datasets | + +## Clean up resources + +::: moniker range="azure-data-explorer || microsoft-fabric" +If you're not going to continue using the persistent graph models, delete them with the following commands: + +1. 
Drop the graph model: + + ```kusto + .drop graph_model OrganizationGraph + ``` + +2. Drop the helper functions: + + ```kusto + .drop function Employees + .drop function Reports + ``` + +::: moniker-end + +The transient graphs are automatically cleaned up when the query completes, so no additional cleanup is needed for those examples. + +## Next steps + +Now that you understand the basics of graph semantics in Kusto, advance to more complex scenarios and optimizations: + +> [!div class="nextstepaction"] +> [Graph best practices](../graph-best-practices.md) + +You can also explore these related topics: + +* [Graph operators reference](../graph-operators.md) - Complete guide to all available graph operators +* [Graph model management](../../management/graph/graph-model-overview.md) - Deep dive into persistent graph models +* [Graph shortest paths](../graph-shortest-paths-operator.md) - Find optimal paths between entities +* [Advanced graph queries](../graph-scenarios.md) - Complex analysis patterns and use cases diff --git a/data-explorer/kusto/toc.yml b/data-explorer/kusto/toc.yml index 57b99e8a56..f0480de946 100644 --- a/data-explorer/kusto/toc.yml +++ b/data-explorer/kusto/toc.yml @@ -18,7 +18,9 @@ items: href: query/tutorials/join-data-from-multiple-tables.md - name: 4 - Create geospatial visualizations href: query/tutorials/create-geospatial-visualizations.md - - name: 5 - Learn about common tasks for Microsoft Sentinel + - name: 5 - Create your first graph + href: query/tutorials/your-first-graph.md + - name: 6 - Learn about common tasks for Microsoft Sentinel href: query/tutorials/common-tasks-microsoft-sentinel.md - name: Train me items: diff --git a/data-explorer/media/graph/graph-friends-of-a-friend.png b/data-explorer/media/graph/graph-friends-of-a-friend.png deleted file mode 100644 index 1a789a1452..0000000000 Binary files a/data-explorer/media/graph/graph-friends-of-a-friend.png and /dev/null differ diff --git 
a/data-explorer/media/graph/graph-property-graph.png b/data-explorer/media/graph/graph-property-graph.png deleted file mode 100644 index 2319b8599f..0000000000 Binary files a/data-explorer/media/graph/graph-property-graph.png and /dev/null differ diff --git a/data-explorer/media/graph/graph-recommendation.png b/data-explorer/media/graph/graph-recommendation.png deleted file mode 100644 index 5f33b1f64e..0000000000 Binary files a/data-explorer/media/graph/graph-recommendation.png and /dev/null differ diff --git a/data-explorer/media/graph/graph-social-network.png b/data-explorer/media/graph/graph-social-network.png deleted file mode 100644 index 9b372e21ca..0000000000 Binary files a/data-explorer/media/graph/graph-social-network.png and /dev/null differ diff --git a/data-explorer/query-monitor-data.md b/data-explorer/query-monitor-data.md index d95da57dcc..75f11117c4 100644 --- a/data-explorer/query-monitor-data.md +++ b/data-explorer/query-monitor-data.md @@ -3,7 +3,7 @@ title: 'Query data in Azure Monitor with Azure Data Explorer' description: 'In this article, query data in Azure Monitor (Application Insights resource and Log Analytics workspace) by creating Azure Data Explorer cross product queries.' ms.reviewer: guywi-ms ms.topic: how-to -ms.date: 07/25/2024 +ms.date: 05/28/2025 #Customer intent: I want to query data in Azure Monitor using Azure Data Explorer. --- @@ -116,6 +116,10 @@ If the Azure Data Explorer resource is in *tenant-name-a* and Log Analytics work Kusto Explorer automatically signs you into the tenant to which the user account originally belongs. 
To access resources in other tenants with the same user account, the `tenantId` has to be explicitly specified in the connection string: `Data Source=https://ade.applicationinsights.io/subscriptions/SubscriptionId/resourcegroups/ResourceGroupName;Initial Catalog=NetDefaultDB;AAD Federated Security=True;Authority ID=` +>[!NOTE] +> +> The `tenantId` parameter isn't directly configurable in the Azure Data Explorer web UI. For the `tenantId`, use the Microsoft Entra identity. + ## Function supportability The Azure Data Explorer cross-service queries support functions for both Application Insights resource and Log Analytics workspace. diff --git a/data-explorer/toc.yml b/data-explorer/toc.yml index c54be51c4c..de47ac75c9 100644 --- a/data-explorer/toc.yml +++ b/data-explorer/toc.yml @@ -369,14 +369,6 @@ items: href: python-query-data.md - name: Debug Kusto Query Language inline Python href: debug-inline-python.md - - name: Graph analysis - items: - - name: Graph analysis overview - href: graph-overview.md - - name: Graph analysis scenarios - href: graph-scenarios.md - - name: Best practices for graph analysis - href: graph-best-practices.md - name: Visualize data items: - name: Visualization integrations overview diff --git a/data-explorer/whats-new-archive.md b/data-explorer/whats-new-archive.md index fb1ea7e120..f38dd91308 100644 --- a/data-explorer/whats-new-archive.md +++ b/data-explorer/whats-new-archive.md @@ -3,7 +3,7 @@ title: What's new in Azure Data Explorer documentation archive description: In this article, you'll find an archive of new and significant changes in the Azure Data Explorer documentation ms.reviewer: orspodek ms.topic: reference -ms.date: 05/11/2025 +ms.date: 05/27/2025 --- # What's new in Azure Data Explorer documentation archive @@ -122,7 +122,7 @@ Welcome to what's new in Azure Data Explorer archive. This article is an archive | Article title | Description | |--|--| -|- [KQL graph semantics overview (preview)](graph-overview.md)
- [KQL graph semantics best practices (preview)](graph-best-practices.md)
- [Common scenarios for using KQL graph semantics (preview)?](graph-scenarios.md) | New articles. Describes how to use Kusto Query Language (KQL) graph semantics.| +|- [KQL graph semantics overview (Preview)](graph-overview.md)
- [KQL graph semantics best practices (Preview)](graph-best-practices.md)
- [Common scenarios for using KQL graph semantics (Preview)](graph-scenarios.md) | New articles. Describes how to use Kusto Query Language (KQL) graph semantics.| | [How to ingest historical data](ingest-data-historical.md)| New article. Describes how to use LightIngest to ingest historical or ad hoc data into Azure Data Explorer.| |- [Ingest data from Splunk to Azure Data Explorer](ingest-data-splunk.md)
- [Data connectors overview](integrate-data-overview.md)| New article that describes how to ingest data into Azure Data Explorer from Splunk, and updated data connector overview with additional capabilities.| | [KQL learning resources](kql-learning-resources.md)| New article. Describes the different learning resources for ramping up on KQL.|