Tarjan's Algorithm: Finding Strongly Connected Components

Tarjan's Algorithm: Finding Strongly Connected Components

Today, I encountered a problem that I struggled with for a long time. The problem goes like this:


There are n servers numbered from 0 to n-1 connected by undirected server-to-server connections forming a network where connections[i] = [a, b] represents a connection between servers a and b. Any server can reach any other server directly or indirectly through the network.

A critical connection is a connection that, if removed, will make some server unable to reach some other server.

Find all critical connections in the network.


Someone with a bit of knowledge about data structures would recognize this immediately as a graph problem. I took all the connections we have and built an adjacency list to model the graph. I tried removing one connection at a time and running a DFS to see if some of the nodes become unreachable. This approach works but unfortunately is a slow one. The complexity of the algorithm I wrote ended up being O(E*(V+E)).

It comes as no surprise that I got a "Time Limit Exceeded". So, I did what motivational gurus have prepared me for all my life. I powered throu...

Kidding, I checked the hints!

Tarjan's Algorithm

Tarjan's algorithm is used to find "Strongly Connected Components".

An SCC is defined as such: If we pick any pair of vertices u and v in the SCC, we can find a path from u to v and vice versa.
– Competitive Programming 3, p 158

This also means that removing a connection won't affect the nodes' connectivity in an SCC since there will be another path to the first node from the second node's neighbors.

Each cluster represents a group of strongly connected components. (credits)

If we take a look back at our problem, we can see some resemblance to the definition. Basically, any connection between members of the same SCC is not considered critical.

Tarjan's algorithm is a modified version of DFS, where we kept track of the order when the node was visited for the first time and the smallest order reachable from DFS of the subtree that starts from that node. The second one can only be smaller if we find a cycle.

The final solution with looks like this:

class Solution {
public:
    void find_bridges(unordered_map<int, vector<int>>& adj_list, vector<int>& rank, int cur, int prev, int order, vector<vector<int>>& ans) {
        rank[cur] = order;
        for(const auto& neighbor: adj_list[cur]) {
            if(neighbor == prev) 
                continue;
            
            // a rank of 0 means the node wasn't visited, we kick off a DFS for it's subtree
            if(rank[neighbor] == 0) {
                find_bridges(adj_list, rank, neighbor, cur, order + 1, ans);
            }
            
            // if we find a smaller order in the subtree, aka a cycle we change the current order
            rank[cur] = min(rank[neighbor], rank[cur]);
            
            // if the order of the neighbor is bigger it means we didn't find a cycle and thus removing this connection 
            // would disconnect the subtree of the neighbor from cur
            if(rank[neighbor] > order) {
                ans.push_back({cur, neighbor});
            }
        }
    }
    
    // start here
    vector<vector<int>> criticalConnections(int n, vector<vector<int>>& connections) {
        vector<int> rank(n, 0); // holds the order in DFS traversal
        vector<vector<int>> ans; // holds the list of critical connections
        unordered_map<int, vector<int>> adj_list; // defines the network graph
        
        for(const auto& conn: connections) {
            int a = conn[0], b = conn[1];
            adj_list[a].push_back(b);
            adj_list[b].push_back(a);
        }
        
        // start a DFS from node 0
        find_bridges(adj_list, rank, 0, -1, 1, ans);
        
        return ans;
    }
};