Possible issue with topological_sort_by_dfs

Looks like the function ```topological_sort_by_dfs``` can have a run-time $O(n\cdot n)$ even for sparse graphs. The simplest example to get this behaviour is

```
function gen_graph_topsort(k)
    g = SimpleDiGraph(k+1)
    for i = 1:k
        add_edge!(g, 1, i+1)
    end
    return g
end

g = gen_graph_topsort(10^5)
@time topological_sort_by_dfs(g)
```

For graphs with about 10^5 vertices this already takes a few seconds. For larger graphs, this quickly becomes infeasible (due to the quadratic run-time). Expected behaviour would be that the implementation has worst-case O(n+m) run-time and takes fractions of a second for graphs of this size.

The problem appears to be that this loop https://github.com/JuliaGraphs/Graphs.jl/blob/e5405d105e034ccb9396d30f86c26641ce94c504/src/traversals/dfs.jl#L97 goes through the neighbors of u over and over everytime starting from the beginning of the list.

I.e., it will find ```n``` with ```vcolor[n] == 0``` and break, then (with some further steps) ```vcolor[n]``` will be set to ```2```. Afterwards, again the search over the neighbors of u starts at the first neighbor and goes through them until it finds one with ```vcolor == 0``` and breaks etc.

Thus, for a vertex with $k$ neighbor it can take $O(k^2)$ steps, leading to potentially O(n^2), resp. O(m^2), run-time for sparse graphs.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Possible issue with topological_sort_by_dfs #263

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Possible issue with topological_sort_by_dfs #263

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions