Skip to content

Possible issue with topological_sort_by_dfs #263

@mwien

Description

@mwien

Looks like the function topological_sort_by_dfs can have a run-time $O(n\cdot n)$ even for sparse graphs. The simplest example to get this behaviour is

function gen_graph_topsort(k)
    g = SimpleDiGraph(k+1)
    for i = 1:k
        add_edge!(g, 1, i+1)
    end
    return g
end

g = gen_graph_topsort(10^5)
@time topological_sort_by_dfs(g)

For graphs with about 10^5 vertices this already takes a few seconds. For larger graphs, this quickly becomes infeasible (due to the quadratic run-time). Expected behaviour would be that the implementation has worst-case O(n+m) run-time and takes fractions of a second for graphs of this size.

The problem appears to be that this loop

for n in outneighbors(g, u)
goes through the neighbors of u over and over everytime starting from the beginning of the list.

I.e., it will find n with vcolor[n] == 0 and break, then (with some further steps) vcolor[n] will be set to 2. Afterwards, again the search over the neighbors of u starts at the first neighbor and goes through them until it finds one with vcolor == 0 and breaks etc.

Thus, for a vertex with $k$ neighbor it can take $O(k^2)$ steps, leading to potentially O(n^2), resp. O(m^2), run-time for sparse graphs.

Metadata

Metadata

Assignees

Labels

bugSomething isn't working

Type

No type

Projects

Relationships

None yet

Development

No branches or pull requests

Issue actions