diff --git a/data-explorer/kusto/functions-library/functions-library.md b/data-explorer/kusto/functions-library/functions-library.md index 60026b0dae..f3b0287ae8 100644 --- a/data-explorer/kusto/functions-library/functions-library.md +++ b/data-explorer/kusto/functions-library/functions-library.md @@ -3,7 +3,7 @@ title: Functions library description: This article describes user-defined functions that extend query environment capabilities. ms.reviewer: adieldar ms.topic: reference -ms.date: 11/17/2024 +ms.date: 11/27/2024 monikerRange: "microsoft-fabric || azure-data-explorer || azure-monitor || microsoft-sentinel" --- # Functions library @@ -19,6 +19,7 @@ The user-defined functions code is given in the articles. It can be used within | Function Name | Description | |--|--| | [detect_anomalous_new_entity_fl()](detect-anomalous-new-entity-fl.md) | Detect the appearance of anomalous new entities in timestamped data. | +| [graph_path_discovery_fl()](graph-path-discovery-fl.md) | Discover valid paths between relevant endpoints (sources and targets) over graph data (edges and nodes). | ## General functions diff --git a/data-explorer/kusto/functions-library/graph-path-discovery-fl.md b/data-explorer/kusto/functions-library/graph-path-discovery-fl.md new file mode 100644 index 0000000000..43c330d49d --- /dev/null +++ b/data-explorer/kusto/functions-library/graph-path-discovery-fl.md @@ -0,0 +1,392 @@ +--- +title: graph_path_discovery_fl() +description: Learn how to use the graph_path_discovery_fl() function to detect paths over graph data. 
+ms.reviewer: andkar +ms.topic: reference +ms.date: 11/27/2024 +monikerRange: "microsoft-fabric || azure-data-explorer || azure-monitor || microsoft-sentinel" +--- +# graph_path_discovery_fl() + +>[!INCLUDE [applies](../includes/applies-to-version/applies.md)] [!INCLUDE [fabric](../includes/applies-to-version/fabric.md)] [!INCLUDE [azure-data-explorer](../includes/applies-to-version/azure-data-explorer.md)] [!INCLUDE [monitor](../includes/applies-to-version/monitor.md)] [!INCLUDE [sentinel](../includes/applies-to-version/sentinel.md)] + +Discover valid paths between relevant endpoints (sources and targets) over graph data (edges and nodes). + +The function `graph_path_discovery_fl()` is a [UDF (user-defined function)](../query/functions/user-defined-functions.md) that lets you discover valid paths between relevant endpoints over graph data. Graph data consists of nodes (for example, resources, applications, or users) and edges (for example, existing access permissions). In a cybersecurity context, such paths might represent possible lateral movement paths that a potential attacker can use. We're interested in paths connecting endpoints defined as relevant by some criteria, for example, exposed sources connected to critical targets. Based on the function's configuration, other types of paths, suitable for other security scenarios, can be discovered. + +The input for this function is a table of edges in the format 'SourceId, EdgeId, TargetId', and a list of nodes with optional node properties that can be used to define valid paths. Alternatively, graph input can be extracted from other types of data. For example, traffic logs with entries of type 'User A logged in to resource B' can be modeled as edges of type '(User A)-[logged in to]->(resource B)', while the list of distinct users and resources can be modeled as nodes. + +We make several assumptions: +* All edges are valid for path discovery. 
Edges that are irrelevant should be filtered out before running path discovery. +* Edges are unweighted, independent, and unconditional, meaning that all edges have the same probability and moving from B to C doesn't depend on a previous move from A to B. +* Paths we want to discover are simple directional paths without cycles, of type A->B->C. More complex definitions can be made by changing the internal syntax of the graph-match operator in the function. + +These assumptions can be adapted as needed by changing the internal logic of the function. + +The function discovers all possible paths from valid sources to valid targets, under optional constraints such as path length limits and maximum output size. The output is a list of discovered paths with source and target Ids, as well as a list of connecting edges and nodes. The function uses only the required fields, such as node Ids and edge Ids. If other relevant fields, such as types, property lists, security-related scores, or external signals, are available in the input data, they can be added to the logic and output by changing the function definition. + +## Syntax + +`graph_path_discovery_fl(`*edgesTableName*, *nodesTableName*, *scopeColumnName*, *isValidPathStartColumnName*, *isValidPathEndColumnName*, *nodeIdColumnName*, *edgeIdColumnName*, *sourceIdColumnName*, *targetIdColumnName*, [*minPathLength*], [*maxPathLength*], [*resultCountLimit*]`)` + +[!INCLUDE [syntax-conventions-note](../includes/syntax-conventions-note.md)] + +## Parameters + +| Name | Type | Required | Description | +|--|--|--|--| +| *edgesTableName* | `string` | :heavy_check_mark: | The name of the input table containing the edges of the graph. | +| *nodesTableName* | `string` | :heavy_check_mark: | The name of the input table containing the nodes of the graph. 
| +| *scopeColumnName* | `string` | :heavy_check_mark: | The name of the column in nodes and edges tables containing the partition or scope (for example, subscription or account), so that paths are discovered separately within each scope and never connect nodes across scopes. | +| *isValidPathStartColumnName* | `string` | :heavy_check_mark: | The name of the column in nodes table containing a Boolean flag for a node. *True* means that the node is a valid start point for a path; *False* means that it isn't. | +| *isValidPathEndColumnName* | `string` | :heavy_check_mark: | The name of the column in nodes table containing a Boolean flag for a node. *True* means that the node is a valid end point for a path; *False* means that it isn't. | +| *nodeIdColumnName* | `string` | :heavy_check_mark: | The name of the column in nodes table containing the node Id. | +| *edgeIdColumnName* | `string` | :heavy_check_mark: | The name of the column in edges table containing the edge Id. | +| *sourceIdColumnName* | `string` | :heavy_check_mark: | The name of the column in edges table containing edge's source node Id. | +| *targetIdColumnName* | `string` | :heavy_check_mark: | The name of the column in edges table containing edge's target node Id. | +| *minPathLength* | `long` | | The minimum number of steps (edges) in the path. The default value is 1. | +| *maxPathLength* | `long` | | The maximum number of steps (edges) in the path. The default value is 8. | +| *resultCountLimit* | `long` | | The maximum number of paths returned for output. The default value is 100000. | + + +## Function definition + +You can define the function by either embedding its code as a query-defined function, or creating it as a stored function in your database, as follows: + +### [Query-defined](#tab/query-defined) + +Define the function using the following [let statement](../query/let-statement.md). No permissions are required. + +> [!IMPORTANT] +> A [let statement](../query/let-statement.md) can't run on its own. 
It must be followed by a [tabular expression statement](../query/tabular-expression-statements.md). To run a working example of `graph_path_discovery_fl()`, see [Example](#example). + +```kusto +let graph_path_discovery_fl = ( edgesTableName:string, nodesTableName:string, scopeColumnName:string + , isValidPathStartColumnName:string, isValidPathEndColumnName:string + , nodeIdColumnName:string, edgeIdColumnName:string, sourceIdColumnName:string, targetIdColumnName:string + , minPathLength:long = 1, maxPathLength:long = 8, resultCountLimit:long = 100000) +{ +let edges = ( + table(edgesTableName) + | extend sourceId = column_ifexists(sourceIdColumnName, '') + | extend targetId = column_ifexists(targetIdColumnName, '') + | extend edgeId = column_ifexists(edgeIdColumnName, '') + | extend scope = column_ifexists(scopeColumnName, '') + ); +let nodes = ( + table(nodesTableName) + | extend nodeId = column_ifexists(nodeIdColumnName, '') + | extend isValidPathStart = column_ifexists(isValidPathStartColumnName, '') + | extend isValidPathEnd = column_ifexists(isValidPathEndColumnName, '') + | extend scope = column_ifexists(scopeColumnName, '') +); +let paths = ( + edges + // Build graph object partitioned by scope, so that no connections are allowed between scopes. + // In case no scopes are relevant, partitioning should be removed for better performance. 
+ | make-graph sourceId --> targetId with nodes on nodeId partitioned-by scope ( + // Look for existing paths between source nodes and target nodes whose number of hops is between minPathLength and maxPathLength + // The current configuration looks for directed paths without any cycles; this can be changed if needed + graph-match cycles = none (s)-[e*minPathLength..maxPathLength]->(t) + // Keep only paths that connect valid endpoints + where ((s.isValidPathStart) and (t.isValidPathEnd)) + project sourceId = s.nodeId + , isSourceValidPathStart = s.isValidPathStart + , targetId = t.nodeId + , isTargetValidPathEnd = t.isValidPathEnd + , scope = s.scope + , edgeIds = e.edgeId + , edgeAllTargetIds = e.targetId + | limit resultCountLimit + ) + | extend pathLength = array_length(edgeIds) + , pathId = hash_md5(strcat(sourceId, strcat(edgeIds), targetId)) + , pathAllNodeIds = array_concat(pack_array(sourceId), edgeAllTargetIds) + | project-away edgeAllTargetIds + | mv-apply with_itemindex = SortIndex nodesInPath = pathAllNodeIds to typeof(string), edgesInPath = edgeIds to typeof(string) on ( + extend step = strcat( + iff(isnotempty(nodesInPath), strcat('(', nodesInPath, ')'), '') + , iff(isnotempty(edgesInPath), strcat('-[', edgesInPath, ']->'), '')) + | summarize fullPath = array_strcat(make_list(step), '') + ) +); +paths +}; +// Write your query to use the function here. +``` + +### [Stored](#tab/stored) + +Define the stored function once using the following [`.create function`](../management/create-function.md). [Database User permissions](../access-control/role-based-access-control.md) are required. + +> [!IMPORTANT] +> You must run this code to create the function before you can use the function as shown in the [Example](#example). 
+ +```kusto +.create-or-alter function with (docstring = "Build paths on graph data (edges and nodes) between valid endpoints (such as exposed and critical assets) per scope (such as subscription or device)", skipvalidation = "true", folder = 'Cybersecurity') +graph_path_discovery_fl ( edgesTableName:string, nodesTableName:string, scopeColumnName:string + , isValidPathStartColumnName:string, isValidPathEndColumnName:string + , nodeIdColumnName:string, edgeIdColumnName:string, sourceIdColumnName:string, targetIdColumnName:string + , minPathLength:long = 1, maxPathLength:long = 8, resultCountLimit:long = 100000) +{ +let edges = ( + table(edgesTableName) + | extend sourceId = column_ifexists(sourceIdColumnName, '') + | extend targetId = column_ifexists(targetIdColumnName, '') + | extend edgeId = column_ifexists(edgeIdColumnName, '') + | extend scope = column_ifexists(scopeColumnName, '') + ); +let nodes = ( + table(nodesTableName) + | extend nodeId = column_ifexists(nodeIdColumnName, '') + | extend isValidPathStart = column_ifexists(isValidPathStartColumnName, '') + | extend isValidPathEnd = column_ifexists(isValidPathEndColumnName, '') + | extend scope = column_ifexists(scopeColumnName, '') +); +let paths = ( + edges + // Build graph object partitioned by scope, so that no connections are allowed between scopes. + // In case no scopes are relevant, partitioning should be removed for better performance. 
+ | make-graph sourceId --> targetId with nodes on nodeId partitioned-by scope ( + // Look for existing paths between source nodes and target nodes whose number of hops is between minPathLength and maxPathLength + // The current configuration looks for directed paths without any cycles; this can be changed if needed + graph-match cycles = none (s)-[e*minPathLength..maxPathLength]->(t) + // Keep only paths that connect valid endpoints + where ((s.isValidPathStart) and (t.isValidPathEnd)) + project sourceId = s.nodeId + , isSourceValidPathStart = s.isValidPathStart + , targetId = t.nodeId + , isTargetValidPathEnd = t.isValidPathEnd + , scope = s.scope + , edgeIds = e.edgeId + , edgeAllTargetIds = e.targetId + | limit resultCountLimit + ) + | extend pathLength = array_length(edgeIds) + , pathId = hash_md5(strcat(sourceId, strcat(edgeIds), targetId)) + , pathAllNodeIds = array_concat(pack_array(sourceId), edgeAllTargetIds) + | project-away edgeAllTargetIds + | mv-apply with_itemindex = SortIndex nodesInPath = pathAllNodeIds to typeof(string), edgesInPath = edgeIds to typeof(string) on ( + extend step = strcat( + iff(isnotempty(nodesInPath), strcat('(', nodesInPath, ')'), '') + , iff(isnotempty(edgesInPath), strcat('-[', edgesInPath, ']->'), '')) + | summarize fullPath = array_strcat(make_list(step), '') + ) +); +paths +} +``` + +--- + +## Example + +The following example calls the function directly, passing the names of the edges and nodes tables and of their relevant columns. + +### [Query-defined](#tab/query-defined) + +To use a query-defined function, invoke it after the embedded function definition. 
+ +:::moniker range="azure-data-explorer" +> [!div class="nextstepaction"] +> Run the query +::: moniker-end + +```kusto +let graph_path_discovery_fl = ( edgesTableName:string, nodesTableName:string, scopeColumnName:string + , isValidPathStartColumnName:string, isValidPathEndColumnName:string + , nodeIdColumnName:string, edgeIdColumnName:string, sourceIdColumnName:string, targetIdColumnName:string + , minPathLength:long = 1, maxPathLength:long = 8, resultCountLimit:long = 100000) +{ +let edges = ( + table(edgesTableName) + | extend sourceId = column_ifexists(sourceIdColumnName, '') + | extend targetId = column_ifexists(targetIdColumnName, '') + | extend edgeId = column_ifexists(edgeIdColumnName, '') + | extend scope = column_ifexists(scopeColumnName, '') + ); +let nodes = ( + table(nodesTableName) + | extend nodeId = column_ifexists(nodeIdColumnName, '') + | extend isValidPathStart = column_ifexists(isValidPathStartColumnName, '') + | extend isValidPathEnd = column_ifexists(isValidPathEndColumnName, '') + | extend scope = column_ifexists(scopeColumnName, '') +); +let paths = ( + edges + // Build graph object partitioned by scope, so that no connections are allowed between scopes. + // In case no scopes are relevant, partitioning should be removed for better performance. 
+ | make-graph sourceId --> targetId with nodes on nodeId partitioned-by scope ( + // Look for existing paths between source nodes and target nodes whose number of hops is between minPathLength and maxPathLength + // The current configuration looks for directed paths without any cycles; this can be changed if needed + graph-match cycles = none (s)-[e*minPathLength..maxPathLength]->(t) + // Keep only paths that connect valid endpoints + where ((s.isValidPathStart) and (t.isValidPathEnd)) + project sourceId = s.nodeId + , isSourceValidPathStart = s.isValidPathStart + , targetId = t.nodeId + , isTargetValidPathEnd = t.isValidPathEnd + , scope = s.scope + , edgeIds = e.edgeId + , edgeAllTargetIds = e.targetId + | limit resultCountLimit + ) + | extend pathLength = array_length(edgeIds) + , pathId = hash_md5(strcat(sourceId, strcat(edgeIds), targetId)) + , pathAllNodeIds = array_concat(pack_array(sourceId), edgeAllTargetIds) + | project-away edgeAllTargetIds + | mv-apply with_itemindex = SortIndex nodesInPath = pathAllNodeIds to typeof(string), edgesInPath = edgeIds to typeof(string) on ( + extend step = strcat( + iff(isnotempty(nodesInPath), strcat('(', nodesInPath, ')'), '') + , iff(isnotempty(edgesInPath), strcat('-[', edgesInPath, ']->'), '')) + | summarize fullPath = array_strcat(make_list(step), '') + ) +); +paths +}; +let edges = datatable (SourceNodeName:string, EdgeName:string, EdgeType:string, TargetNodeName:string, Region:string)[ + 'vm-work-1', 'e1', 'can use', 'webapp-prd', 'US', + 'vm-custom', 'e2', 'can use', 'webapp-prd', 'US', + 'webapp-prd', 'e3', 'can access', 'vm-custom', 'US', + 'webapp-prd', 'e4', 'can access', 'test-machine', 'US', + 'vm-custom', 'e5', 'can access', 'server-0126', 'US', + 'vm-custom', 'e6', 'can access', 'hub_router', 'US', + 'webapp-prd', 'e7', 'can access', 'hub_router', 'US', + 'test-machine', 'e8', 'can access', 'vm-custom', 'US', + 'test-machine', 'e9', 'can access', 'hub_router', 'US', + 'hub_router', 'e10', 'routes traffic to', 'remote_DT', 
'US', + 'vm-work-1', 'e11', 'can access', 'storage_main_backup', 'US', + 'hub_router', 'e12', 'routes traffic to', 'vm-work-2', 'US', + 'vm-work-2', 'e13', 'can access', 'backup_prc', 'US', + 'remote_DT', 'e14', 'can access', 'backup_prc', 'US', + 'backup_prc', 'e15', 'moves data to', 'storage_main_backup', 'US', + 'backup_prc', 'e16', 'moves data to', 'storage_DevBox', 'US', + 'device_A1', 'e17', 'is connected to', 'device_B2', 'EU', + 'device_B2', 'e18', 'is connected to', 'device_A1', 'EU' +]; +let nodes = datatable (NodeName:string, NodeType:string, NodeEnvironment:string, Region:string) [ + 'vm-work-1', 'Virtual Machine', 'Production', 'US', + 'vm-custom', 'Virtual Machine', 'Production', 'US', + 'webapp-prd', 'Application', 'None', 'US', + 'test-machine', 'Virtual Machine', 'Test', 'US', + 'hub_router', 'Traffic Router', 'None', 'US', + 'vm-work-2', 'Virtual Machine', 'Production', 'US', + 'remote_DT', 'Virtual Machine', 'Production', 'US', + 'backup_prc', 'Service', 'Production', 'US', + 'server-0126', 'Server', 'Production', 'US', + 'storage_main_backup', 'Cloud Storage', 'Production', 'US', + 'storage_DevBox', 'Cloud Storage', 'Test', 'US', + 'device_A1', 'Device', 'Backend', 'EU', + 'device_B2', 'Device', 'Backend', 'EU' +]; +let nodesEnriched = ( + nodes + | extend IsValidStart = (NodeType == 'Virtual Machine'), IsValidEnd = (NodeType == 'Cloud Storage') // option 1 + //| extend IsValidStart = (NodeName in('vm-work-1', 'vm-work-2')), IsValidEnd = (NodeName in('storage_main_backup')) // option 2 + //| extend IsValidStart = (NodeEnvironment == 'Test'), IsValidEnd = (NodeEnvironment == 'Production') // option 3 +); +graph_path_discovery_fl(edgesTableName = 'edges' + , nodesTableName = 'nodesEnriched' + , scopeColumnName = 'Region' + , nodeIdColumnName = 'NodeName' + , edgeIdColumnName = 'EdgeName' + , sourceIdColumnName = 'SourceNodeName' + , targetIdColumnName = 'TargetNodeName' + , isValidPathStartColumnName = 'IsValidStart' + , isValidPathEndColumnName = 
'IsValidEnd' +) +``` + +### [Stored](#tab/stored) + +> [!IMPORTANT] +> For this example to run successfully, you must first run the [Function definition](#function-definition) code to store the function. + +```kusto +let edges = datatable (SourceNodeName:string, EdgeName:string, EdgeType:string, TargetNodeName:string, Region:string)[ + 'vm-work-1', 'e1', 'can use', 'webapp-prd', 'US', + 'vm-custom', 'e2', 'can use', 'webapp-prd', 'US', + 'webapp-prd', 'e3', 'can access', 'vm-custom', 'US', + 'webapp-prd', 'e4', 'can access', 'test-machine', 'US', + 'vm-custom', 'e5', 'can access', 'server-0126', 'US', + 'vm-custom', 'e6', 'can access', 'hub_router', 'US', + 'webapp-prd', 'e7', 'can access', 'hub_router', 'US', + 'test-machine', 'e8', 'can access', 'vm-custom', 'US', + 'test-machine', 'e9', 'can access', 'hub_router', 'US', + 'hub_router', 'e10', 'routes traffic to', 'remote_DT', 'US', + 'vm-work-1', 'e11', 'can access', 'storage_main_backup', 'US', + 'hub_router', 'e12', 'routes traffic to', 'vm-work-2', 'US', + 'vm-work-2', 'e13', 'can access', 'backup_prc', 'US', + 'remote_DT', 'e14', 'can access', 'backup_prc', 'US', + 'backup_prc', 'e15', 'moves data to', 'storage_main_backup', 'US', + 'backup_prc', 'e16', 'moves data to', 'storage_DevBox', 'US', + 'device_A1', 'e17', 'is connected to', 'device_B2', 'EU', + 'device_B2', 'e18', 'is connected to', 'device_A1', 'EU' +]; +let nodes = datatable (NodeName:string, NodeType:string, NodeEnvironment:string, Region:string) [ + 'vm-work-1', 'Virtual Machine', 'Production', 'US', + 'vm-custom', 'Virtual Machine', 'Production', 'US', + 'webapp-prd', 'Application', 'None', 'US', + 'test-machine', 'Virtual Machine', 'Test', 'US', + 'hub_router', 'Traffic Router', 'None', 'US', + 'vm-work-2', 'Virtual Machine', 'Production', 'US', + 'remote_DT', 'Virtual Machine', 'Production', 'US', + 'backup_prc', 'Service', 'Production', 'US', + 'server-0126', 'Server', 'Production', 'US', + 'storage_main_backup', 'Cloud Storage', 
'Production', 'US', + 'storage_DevBox', 'Cloud Storage', 'Test', 'US', + 'device_A1', 'Device', 'Backend', 'EU', + 'device_B2', 'Device', 'Backend', 'EU' +]; +let nodesEnriched = ( + nodes + | extend IsValidStart = (NodeType == 'Virtual Machine'), IsValidEnd = (NodeType == 'Cloud Storage') // option 1 + //| extend IsValidStart = (NodeName in('vm-work-1', 'vm-work-2')), IsValidEnd = (NodeName in('storage_main_backup')) // option 2 + //| extend IsValidStart = (NodeEnvironment == 'Test'), IsValidEnd = (NodeEnvironment == 'Production') // option 3 +); +graph_path_discovery_fl(edgesTableName = 'edges' + , nodesTableName = 'nodesEnriched' + , scopeColumnName = 'Region' + , nodeIdColumnName = 'NodeName' + , edgeIdColumnName = 'EdgeName' + , sourceIdColumnName = 'SourceNodeName' + , targetIdColumnName = 'TargetNodeName' + , isValidPathStartColumnName = 'IsValidStart' + , isValidPathEndColumnName = 'IsValidEnd' +) +``` + +--- + +**Output** + + +| sourceId | isSourceValidPathStart | targetId | isTargetValidPathEnd | scope | edgeIds | pathLength | pathId | pathAllNodeIds | fullPath | +| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | +| test-machine | True | storage_DevBox | True | US | ["e9","e10","e14","e16"] | 4 | 00605d35b6e1d28024fd846f217b43ac | ["test-machine","hub_router","remote_DT","backup_prc","storage_DevBox"] | (test-machine)-[e9]->(hub_router)-[e10]->(remote_DT)-[e14]->(backup_prc)-[e16]->(storage_DevBox) | + + + +Running the function finds all paths over the input edges that connect source nodes flagged as valid start points (isSourceValidPathStart == True) to target nodes flagged as valid end points (isTargetValidPathEnd == True). The output is a table where each row describes a single path; the number of rows is limited by the `resultCountLimit` parameter. Each row contains the following fields: + +* `sourceId`: nodeId of the source - first node in the path. 
+* `isSourceValidPathStart`: Boolean flag for the source node being a valid path start; should always be equal to True. +* `targetId`: nodeId of the target - last node in the path. +* `isTargetValidPathEnd`: Boolean flag for the target node being a valid path end; should always be equal to True. +* `scope`: the scope containing the path. +* `edgeIds`: an ordered list of edges in the path. +* `pathLength`: the number of edges (hops) in the path. +* `pathId`: a hash of the path's endpoints and steps that can be used as a unique identifier for the path. +* `pathAllNodeIds`: an ordered list of nodes in the path. +* `fullPath`: a string representing the full path, in the format (source node)-[edge 1]->(node2)-.....->(target node). + +In the example above, we preprocess the nodes table and add several options of possible endpoint definitions. By commenting/uncommenting different options, several scenarios can be discovered: + +* Option 1: Find paths from Virtual Machines to Cloud Storage resources. Useful in exploring connection patterns between types of nodes. +* Option 2: Find paths from any of the specific nodes (vm-work-1, vm-work-2) to a specific node (storage_main_backup). Useful in investigating known cases, such as paths from known compromised assets to known critical ones. +* Option 3: Find paths between groups of nodes, such as nodes in different environments. Useful for monitoring insecure paths, such as paths between test and production environments. + +In the example above we use the first option to find all the paths from VMs to cloud storage resources, which might be used by potential attackers who want to access stored data. This scenario can be strengthened by adding more filters to valid endpoints, for example, connecting VMs with known vulnerabilities to storage accounts containing sensitive data. + +The function `graph_path_discovery_fl()` can be used in the cybersecurity domain to discover interesting paths, such as lateral movement paths, over data modeled as a graph. 
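
The output columns described above lend themselves to ordinary tabular post-processing. As a minimal sketch, the following query counts the discovered paths for each source and target pair and keeps the shortest one. The `discoveredPaths` table and its rows are hypothetical stand-ins for the function's output, reduced to four of its columns; they aren't results of the example above. In practice, the function call could be piped into the same `summarize`.

```kusto
// Hypothetical stand-in for the output of graph_path_discovery_fl(); not real results.
let discoveredPaths = datatable(sourceId:string, targetId:string, pathLength:long, fullPath:string) [
    'vm-a', 'storage-x', 3, '(vm-a)-[e1]->(app)-[e2]->(svc)-[e3]->(storage-x)',
    'vm-a', 'storage-x', 1, '(vm-a)-[e4]->(storage-x)',
    'vm-b', 'storage-x', 2, '(vm-b)-[e5]->(svc)-[e3]->(storage-x)'
];
discoveredPaths
// For each endpoint pair, count all paths and keep the row with the fewest hops.
| summarize pathCount = count(), arg_min(pathLength, fullPath) by sourceId, targetId
| sort by pathLength asc
```

Grouping by endpoints is only one possible design choice; grouping by `scope` or by `targetId` alone might suit scenarios where the question is which targets are most reachable.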
+ diff --git a/data-explorer/kusto/functions-library/toc.yml b/data-explorer/kusto/functions-library/toc.yml index a1377c552a..5de11ec3ba 100644 --- a/data-explorer/kusto/functions-library/toc.yml +++ b/data-explorer/kusto/functions-library/toc.yml @@ -29,6 +29,9 @@ items: - name: get_packages_version_fl() displayName: functions library, python, version, package href: get-packages-version-fl.md +- name: graph_path_discovery_fl() + displayName: functions library, graph, path, discovery, cyber, security, cybersecurity + href: graph-path-discovery-fl.md - name: kmeans_fl() displayName: functions library, clustering, K-Means href: kmeans-fl.md diff --git a/data-explorer/kusto/query/parse-version-function.md b/data-explorer/kusto/query/parse-version-function.md index 85806206a6..d6785acdbd 100644 --- a/data-explorer/kusto/query/parse-version-function.md +++ b/data-explorer/kusto/query/parse-version-function.md @@ -3,13 +3,13 @@ title: parse_version() description: Learn how to use the parse_version() function to convert the input string representation of the version to a comparable decimal number, ms.reviewer: alexans ms.topic: reference -ms.date: 08/11/2024 +ms.date: 11/27/2024 --- # parse_version() > [!INCLUDE [applies](../includes/applies-to-version/applies.md)] [!INCLUDE [fabric](../includes/applies-to-version/fabric.md)] [!INCLUDE [azure-data-explorer](../includes/applies-to-version/azure-data-explorer.md)] [!INCLUDE [monitor](../includes/applies-to-version/monitor.md)] [!INCLUDE [sentinel](../includes/applies-to-version/sentinel.md)] -Converts the input string representation of the version to a comparable decimal number. +Converts the input string representation of a version number into a decimal number that can be compared. ## Syntax @@ -31,58 +31,75 @@ Converts the input string representation of the version to a comparable decimal ## Returns -If conversion is successful, the result will be a decimal. -If conversion is unsuccessful, the result will be `null`. 
+If conversion is successful, the result is a decimal; otherwise, the result is `null`. -## Example +## Examples + +### Parse version strings + +The following query shows version strings with their parsed version numbers. :::moniker range="azure-data-explorer" > [!div class="nextstepaction"] -> Run the query +> Run the query ::: moniker-end ```kusto let dt = datatable(v: string) [ - "0.0.0.5", "0.0.7.0", "0.0.3", "0.2", "0.1.2.0", "1.2.3.4", "1", "99999999.0.0.0" + "0.0.0.5", "0.0.7.0", "0.0.3", "0.2", "0.1.2.0", "1.2.3.4", "1" +]; +dt +| extend parsedVersion = parse_version(v) +``` + +**Output** + +| v | parsedVersion | +|---|---| +| 0.0.0.5 | 5 | +| 0.0.7.0 | 700,000,000 | +| 0.0.3 | 300,000,000 | +| 0.2 | 20,000,000,000,000,000 | +| 0.1.2.0 | 10,000,000,200,000,000 | +| 1.2.3.4 | 1,000,000,020,000,000,300,000,004 | +| 1 | 1,000,000,000,000,000,000,000,000 | + +### Compare parsed version strings + +The following query identifies which labs have equipment needing updates by comparing their parsed version strings to the minimum version number "1.0.0.0". 
+ +:::moniker range="azure-data-explorer" +> [!div class="nextstepaction"] +> Run the query +::: moniker-end + +```kusto +let dt = datatable(lab: string, v: string) +[ + "Lab A", "0.0.0.5", + "Lab B", "0.0.7.0", + "Lab D", "0.0.3", + "Lab C", "0.2", + "Lab G", "0.1.2.0", + "Lab F", "1.2.3.4", + "Lab E", "1", ]; dt -| project v1=v, _key=1 -| join kind=inner (dt | project v2=v, _key = 1) on _key -| where v1 != v2 -| summarize v1 = max(v1), v2 = min(v2) by (hash(v1) + hash(v2)) // removing duplications -| project v1, v2, higher_version = iif(parse_version(v1) > parse_version(v2), v1, v2) +| extend parsed_version = parse_version(v) +| extend needs_update = iff(parsed_version < parse_version("1.0.0.0"), "Yes", "No") +| project lab, v, needs_update +| sort by lab asc, v, needs_update ``` **Output** -|v1|v2|higher_version| +| lab | v | needs_update | |---|---|---| -|99999999.0.0.0|0.0.0.5|99999999.0.0.0| -|1|0.0.0.5|1| -|1.2.3.4|0.0.0.5|1.2.3.4| -|0.1.2.0|0.0.0.5|0.1.2.0| -|0.2|0.0.0.5|0.2| -|0.0.3|0.0.0.5|0.0.3| -|0.0.7.0|0.0.0.5|0.0.7.0| -|99999999.0.0.0|0.0.7.0|99999999.0.0.0| -|1|0.0.7.0|1| -|1.2.3.4|0.0.7.0|1.2.3.4| -|0.1.2.0|0.0.7.0|0.1.2.0| -|0.2|0.0.7.0|0.2| -|0.0.7.0|0.0.3|0.0.7.0| -|99999999.0.0.0|0.0.3|99999999.0.0.0| -|1|0.0.3|1| -|1.2.3.4|0.0.3|1.2.3.4| -|0.1.2.0|0.0.3|0.1.2.0| -|0.2|0.0.3|0.2| -|99999999.0.0.0|0.2|99999999.0.0.0| -|1|0.2|1| -|1.2.3.4|0.2|1.2.3.4| -|0.2|0.1.2.0|0.2| -|99999999.0.0.0|0.1.2.0|99999999.0.0.0| -|1|0.1.2.0|1| -|1.2.3.4|0.1.2.0|1.2.3.4| -|99999999.0.0.0|1.2.3.4|99999999.0.0.0| -|1.2.3.4|1|1.2.3.4| -|99999999.0.0.0|1|99999999.0.0.0| +| Lab A | 0.0.0.5 | Yes | +| Lab B | 0.0.7.0 | Yes | +| Lab C | 0.2 | Yes | +| Lab D | 0.0.3 | Yes | +| Lab E | 1 | No | +| Lab F | 1.2.3.4 | No | +| Lab G | 0.1.2.0 | Yes |
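
Because `parse_version()` returns a decimal, the parsed value can also serve as a sort key. The following sketch, which uses made-up version strings rather than data from the examples above, orders versions from newest to oldest. A plain string sort would rank "1.9.3" above "1.10.0", while the parsed values compare each version part numerically.

```kusto
// Illustrative version strings, not taken from a real dataset.
let versions = datatable(v: string)
[
    "1.9.3",
    "1.10.0",
    "1.2.0"
];
versions
// Sort on the parsed decimal instead of the raw string.
| sort by parse_version(v) desc
```

Sorting on the expression directly keeps the output to the original column; adding `| extend parsed = parse_version(v)` first would be an equally reasonable choice when the parsed value is needed later in the query.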