1 change: 1 addition & 0 deletions .wordlist.txt
@@ -74,6 +74,7 @@ LocalDateTime
LocalTime
Lovitz
MEM
MSF
Miesha
MotoGP
NRedisGraph
150 changes: 49 additions & 101 deletions algorithms/msf.md
@@ -7,13 +7,16 @@ nav_order: 9

# Minimum Spanning Forest (MSF)

The Minimum Spanning Forest algorithm computes the minimum spanning forest of a graph. A minimum spanning forest is a collection of minimum spanning trees, one for each connected component in the graph.
The Minimum Spanning Forest algorithm computes the minimum spanning forest of a
graph. A minimum spanning forest is a collection of minimum spanning trees, one
for each connected component in the graph.

## What is a Minimum Spanning Forest?

## What is a Minimum Spanning Forest?
- For a **connected graph**, the MSF is a single minimum spanning tree (MST) that connects all nodes with the minimum total edge weight
- For a **disconnected graph**, the MSF consists of multiple MSTs, one for each connected component
- The forest contains no cycles and has exactly `N - C` edges, where `N` is the number of nodes and `C` is the number of connected components
- The sum of the weights of the edges in the forest is minimized
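
The properties above can be checked with a small, self-contained sketch. It uses Kruskal's algorithm with a union-find structure, which is a textbook way to build a minimum spanning forest and is not necessarily how FalkorDB computes it. The function name `minimum_spanning_forest` and the toy graph are hypothetical and exist only for illustration.

```python
def minimum_spanning_forest(num_nodes, edges):
    """Kruskal's algorithm over undirected edges given as (weight, u, v).

    Nodes are integers 0..num_nodes-1. Returns the chosen edges.
    """
    parent = list(range(num_nodes))

    def find(x):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    forest = []
    for weight, u, v in sorted(edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # the edge joins two trees, so it adds no cycle
            parent[root_u] = root_v
            forest.append((weight, u, v))
    return forest


# Two connected components: {0, 1, 2} and {3, 4}, so N = 5 and C = 2.
edges = [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 3, 4)]
forest = minimum_spanning_forest(5, edges)
print(forest)       # [(1, 0, 1), (2, 1, 2), (4, 3, 4)]
print(len(forest))  # 3, which equals N - C = 5 - 2
```

The edge of weight 3 is dropped because it would close a cycle in the first component, leaving exactly `N - C` edges.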

## Use Cases

@@ -27,122 +30,68 @@ The Minimum Spanning Forest algorithm computes the minimum spanning forest of a
```cypher
CALL algo.MSF(
config: MAP
) YIELD src, dest, weight, relationshipType
) YIELD edges, nodes
Comment on lines 30 to +33
Contributor

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# Search for MSF procedure definition or implementation
find . -type f \( -name "*.md" -o -name "*.cypher" -o -name "*.java" -o -name "*.js" -o -name "*.py" \) | head -20

Repository: FalkorDB/docs

Length of output: 586


🏁 Script executed:

# Search for MSF implementation or references
rg "algo\.MSF|CALL.*MSF" -i --type-list

Repository: FalkorDB/docs

Length of output: 446


🏁 Script executed:

# Look for the MSF algorithm implementation
rg "def.*MSF|class.*MSF|function.*MSF" -i --type-list

Repository: FalkorDB/docs

Length of output: 446


🏁 Script executed:

# Search for any test files or examples showing MSF YIELD output
rg "MSF.*YIELD|YIELD.*src.*dest|YIELD.*edges.*nodes" -i

Repository: FalkorDB/docs

Length of output: 173


🏁 Script executed:

# Check the specific lines in msf.md file
head -100 algorithms/msf.md | tail -80

Repository: FalkorDB/docs

Length of output: 3656


🏁 Script executed:

# Search for any reference to FalkorDB source or implementation
rg "github.com/falkordb|FalkorDB" --type md | head -10

Repository: FalkorDB/docs

Length of output: 1172


🏁 Script executed:

# Look for any references or links to official FalkorDB repository
rg "falkordb" -i --type md | grep -i "source\|repo\|github\|implementation" | head -10

Repository: FalkorDB/docs

Length of output: 1884


🌐 Web query:

FalkorDB algo.MSF YIELD signature official API documentation

💡 Result:

Signature (from official docs):

CALL algo.MSF(config: MAP) YIELD src, dest, weight, relationshipType. [1]

Where:

  • config: MAP — algorithm options (sourceNodes, relationshipTypes, relationshipWeightProperty, defaultValue).
  • Yields: src (Node), dest (Node), weight (Float), relationshipType (String). [1][2]

Sources: FalkorDB algorithm docs (MSF) and Procedures overview. [1] https://docs.falkordb.com/algorithms/msf.html [2] https://docs.falkordb.com/cypher/procedures.html


🌐 Web query:

FalkorDB Minimum Spanning Forest algorithm procedure yields fields

💡 Result:

FalkorDB's MSF procedure (CALL algo.MSF(...)) yields these fields: src, dest, weight, relationshipType. [1][2]

Sources:

  • FalkorDB MSF algorithm docs. [1]
  • FalkorDB Procedures reference (algo.MSF entry). [2]

CRITICAL: Procedure signature and examples do not match the official FalkorDB API.

The official FalkorDB MSF procedure yields src, dest, weight, relationshipType—not edges, nodes. This affects multiple sections:

  • Line 32 (Syntax): Shows YIELD edges, nodes (incorrect)
  • Lines 49–52 (Return Values table): Lists edges and nodes (incorrect)
  • Line 87 (Example): Shows YIELD edge, weight (also incorrect)

Update all three sections to reflect the actual API:

 ```cypher
 CALL algo.MSF(
     config: MAP
-) YIELD edges, nodes
+) YIELD src, dest, weight, relationshipType

Update the Return Values table to:

| Name | Type | Description |
|------|------|-------------|
| `src` | Node | Source node of the edge |
| `dest` | Node | Destination node of the edge |
| `weight` | Float | Weight of the edge |
| `relationshipType` | String | Type of the relationship |

Update the example at line 87 from `YIELD edge, weight` to `YIELD src, dest, weight, relationshipType`.

<details>
<summary>🤖 Prompt for AI Agents</summary>

In algorithms/msf.md around lines 29–32, 49–52 and 87, the MSF procedure
signature and examples use incorrect yield names; replace the incorrect "YIELD
edges, nodes" at lines 29–32 with "YIELD src, dest, weight, relationshipType",
update the Return Values table at lines 49–52 to list src (Node), dest (Node),
weight (Float), relationshipType (String) with the provided descriptions, and
change the example at line 87 from "YIELD edge, weight" to "YIELD src, dest,
weight, relationshipType".


</details>

<!-- This is an auto-generated comment by CodeRabbit -->

```

### Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| `config` | MAP | Configuration map containing algorithm parameters |

#### Configuration Options

| Option | Type | Required | Default | Description |
|--------|------|----------|---------|-------------|
| `sourceNodes` | List of Nodes | No | All nodes | Starting nodes for the algorithm. If not provided, all nodes in the graph are considered |
| `relationshipTypes` | List of Strings | No | All types | Relationship types to traverse. If not provided, all relationship types are considered |
| `relationshipWeightProperty` | String | No | `null` | Property name containing edge weights. If not specified, all edges have weight 1.0 |
| `defaultValue` | Float | No | `1.0` | Default weight for edges that don't have the weight property |

### Returns
The procedure accepts an optional configuration `Map` with the following optional parameters:

| Field | Type | Description |
|-------|------|-------------|
| `src` | Node | Source node of the edge in the spanning forest |
| `dest` | Node | Destination node of the edge in the spanning forest |
| `weight` | Float | Weight of the edge |
| `relationshipType` | String | Type of the relationship |
| Name | Type | Default | Description |
|---------------------|--------|------------------------|----------------------------------------------------------------------------|
| `nodeLabels` | Array | All labels | Array of node labels to filter which nodes are included in the computation |
| `relationshipTypes` | Array | All relationship types | Array of relationship types to define which edges are traversed |
| `objective` | string | 'minimize' | Whether to `'minimize'` or `'maximize'` the total weight of each spanning tree |
| `weightAttribute` | string | Unweighted | The relationship attribute to use as the edge weight. If omitted, the graph is treated as unweighted |
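
As a rough illustration of what the `objective` option means, the sketch below builds either a minimum or a maximum spanning forest by flipping the sort order of the edge weights. This is a generic Kruskal-style sketch under the assumption that edges are undirected. The function `spanning_forest` is hypothetical and says nothing about FalkorDB's internal implementation.

```python
def spanning_forest(num_nodes, edges, objective="minimize"):
    """Return spanning-forest edges from (weight, u, v) tuples.

    objective is 'minimize' or 'maximize', mirroring the option values
    in the table above.
    """
    sign = 1 if objective == "minimize" else -1
    parent = list(range(num_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    chosen = []
    # Negating the sort key turns the ascending scan into a descending one.
    for weight, u, v in sorted(edges, key=lambda e: sign * e[0]):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            chosen.append((weight, u, v))
    return chosen


triangle = [(1.0, 0, 1), (2.0, 1, 2), (3.0, 0, 2)]
print(spanning_forest(3, triangle, "minimize"))  # [(1.0, 0, 1), (2.0, 1, 2)]
print(spanning_forest(3, triangle, "maximize"))  # [(3.0, 0, 2), (2.0, 1, 2)]
```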

## Examples
### Return Values
The procedure returns a stream of records corresponding to each tree in the forest with the following fields:

### Example 1: Basic MSF with Unweighted Graph
| Name | Type | Description |
|---------|------|----------------------------------|
| `edges` | List | The edges that connect each tree |
| `nodes` | List | The nodes in the tree |

Find the minimum spanning forest treating all edges equally:
### Create the Graph

```cypher
CALL algo.MSF({}) YIELD src, dest, weight, relationshipType
RETURN src.name AS source, dest.name AS destination, weight, relationshipType
CREATE
(CityHall:GOV),
(CourtHouse:GOV),
(FireStation:GOV),
(Electricity:UTIL),
(Water:UTIL),
(Building_A:RES),
(Building_B:RES),
(CityHall)-[rA:ROAD {cost: 2.2}]->(CourtHouse),
(CityHall)-[rB:ROAD {cost: 8.0}]->(FireStation),
(CourtHouse)-[rC:ROAD {cost: 3.4}]->(Building_A),
(FireStation)-[rD:ROAD {cost: 3.0}]->(Building_B),
(Building_A)-[rF:ROAD {cost: 5.2}]->(Building_B),
(Electricity)-[rG:ROAD {cost: 0.7}]->(Building_A),
(Water)-[rH:ROAD {cost: 2.3}]->(Building_B),
(CityHall)-[tA:TRAM {cost: 1.5}]->(Building_A),
(CourtHouse)-[tB:TRAM {cost: 7.3}]->(Building_B),
(FireStation)-[tC:TRAM {cost: 1.2}]->(Electricity)
RETURN *
```

### Example 2: MSF with Weighted Edges
## Examples

Consider a graph representing cities connected by roads with distances:

```cypher
// Create a weighted graph
CREATE (a:City {name: 'A'}), (b:City {name: 'B'}), (c:City {name: 'C'}),
(d:City {name: 'D'}), (e:City {name: 'E'})
CREATE (a)-[:ROAD {distance: 2}]->(b),
(a)-[:ROAD {distance: 3}]->(c),
(b)-[:ROAD {distance: 1}]->(c),
(b)-[:ROAD {distance: 4}]->(d),
(c)-[:ROAD {distance: 5}]->(d),
(d)-[:ROAD {distance: 6}]->(e)

// Find minimum spanning forest using distance weights
CALL algo.MSF({
relationshipWeightProperty: 'distance'
}) YIELD src, dest, weight
RETURN src.name AS from, dest.name AS to, weight AS distance
ORDER BY weight
```

**Result:**
```text
from | to | distance
-----|----|---------
B | C | 1.0
A | B | 2.0
B | D | 4.0
D | E | 6.0
```
Suppose you are an urban planner tasked with designing a new transportation network for a town. There are several vital buildings that must be connected by this new network. A cost estimator has already provided you with the estimated cost for some of the potential routes between these buildings.

### Example 3: MSF on Specific Relationship Types
Your goal is to connect every major building with the lowest total cost, even if travel between some buildings requires multiple stops and different modes of transport. The Minimum Spanning Forest algorithm helps you achieve this by identifying the most cost-effective network.

Find the spanning forest considering only specific relationship types:
![City Graph](../images/city_plan.png)

```cypher
CALL algo.MSF({
relationshipTypes: ['ROAD', 'HIGHWAY'],
relationshipWeightProperty: 'distance'
}) YIELD src, dest, weight, relationshipType
RETURN src.name AS from, dest.name AS to, weight, relationshipType
CALL algo.MSF({weightAttribute: 'cost'}) YIELD edges, nodes RETURN edges, nodes
```

### Example 4: MSF Starting from Specific Nodes

Compute the spanning forest starting from a subset of nodes:

```cypher
MATCH (start:City) WHERE start.name IN ['A', 'B']
WITH collect(start) AS startNodes
CALL algo.MSF({
sourceNodes: startNodes,
relationshipWeightProperty: 'distance'
}) YIELD src, dest, weight
RETURN src.name AS from, dest.name AS to, weight
```

### Example 5: Disconnected Graph

For a graph with multiple components, MSF returns multiple trees:

```cypher
// Create two disconnected components
CREATE (a:Node {name: 'A'})-[:CONNECTED {weight: 1}]->(b:Node {name: 'B'}),
(b)-[:CONNECTED {weight: 2}]->(c:Node {name: 'C'}),
(x:Node {name: 'X'})-[:CONNECTED {weight: 3}]->(y:Node {name: 'Y'})

// Find MSF
CALL algo.MSF({
relationshipWeightProperty: 'weight'
}) YIELD src, dest, weight
RETURN src.name AS from, dest.name AS to, weight
```
### Expected Results
The algorithm would yield a single tree containing the following edge and node objects:

**Result:** Two separate trees (A-B-C and X-Y)
![City MSF Graph](../images/city_msf.png)
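
For readers who cannot view the image, the expected tree can be cross-checked outside the database. The sketch below copies the `cost` values from the `CREATE` statement above, treats every `ROAD` and `TRAM` relationship as an undirected edge, and runs a plain Kruskal pass. It is an independent check under those assumptions, not output captured from FalkorDB, and the names `routes` and `kruskal` are hypothetical.

```python
# (source, destination, cost) copied from the CREATE statement; ROAD and TRAM mixed.
routes = [
    ("CityHall", "CourtHouse", 2.2),
    ("CityHall", "FireStation", 8.0),
    ("CourtHouse", "Building_A", 3.4),
    ("FireStation", "Building_B", 3.0),
    ("Building_A", "Building_B", 5.2),
    ("Electricity", "Building_A", 0.7),
    ("Water", "Building_B", 2.3),
    ("CityHall", "Building_A", 1.5),
    ("CourtHouse", "Building_B", 7.3),
    ("FireStation", "Electricity", 1.2),
]


def kruskal(edges):
    """Return the minimum spanning forest edges, cheapest first."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for u, v, cost in sorted(edges, key=lambda e: e[2]):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            tree.append((u, v, cost))
    return tree


tree = kruskal(routes)
total = sum(cost for _, _, cost in tree)
for u, v, cost in tree:
    print(f"{u} - {v}: {cost}")
print(len(tree), round(total, 1))  # 6 10.9
```

Because all seven buildings end up in one connected component, the result is a single tree with six edges, which matches the "single tree" described above.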

## Algorithm Details

@@ -161,9 +110,8 @@ FalkorDB's MSF implementation uses an efficient matrix-based approach optimized
## Best Practices

1. **Weight Properties**: Ensure weight properties are numeric (integers or floats)
2. **Missing Weights**: Use `defaultValue` to handle edges without weight properties
3. **Large Graphs**: For large graphs (100K+ nodes), consider filtering by `sourceNodes` or `relationshipTypes`
4. **Directed vs Undirected**: The algorithm treats relationships as undirected for spanning forest purposes
2. **Missing Weights**: Edges without the specified weight property will only be included in the tree if there are no other edges that could be used to connect the connected component
3. **Directed vs Undirected**: The algorithm treats all relationships as undirected for spanning forest purposes
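
One way to picture the missing-weight rule is to treat an absent weight as positive infinity, so such an edge is only picked when nothing cheaper can join the two sides. The sketch below models that assumption with a Kruskal pass over undirected edges. Whether FalkorDB represents missing weights this way internally is not stated in the source, and `msf_with_missing_weights` is a hypothetical helper.

```python
import math


def msf_with_missing_weights(nodes, edges):
    """edges are (u, v, weight_or_None); a None weight sorts last (as +inf)."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    chosen = []
    ordered = sorted(edges, key=lambda e: math.inf if e[2] is None else e[2])
    for u, v, weight in ordered:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            chosen.append((u, v, weight))
    return chosen


# A-B has both a weighted and an unweighted edge; B-C only has an unweighted one.
edges = [("A", "B", None), ("A", "B", 5.0), ("B", "C", None)]
result = msf_with_missing_weights(["A", "B", "C"], edges)
print(result)  # [('A', 'B', 5.0), ('B', 'C', None)]
```

The weighted A-B edge wins over its unweighted twin, while the unweighted B-C edge is still used because it is the only way to reach C.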

## Related Algorithms

Binary file added images/city_msf.png
Binary file added images/city_plan.png