The Longest Induced Path Problem: How Far Can Information Travel?

Jessica Leung

by Jessica Leung
THIRD PRIZE in OR/MS Tomorrow Student Writing Competition 2019
Jessica Leung is a Ph.D. candidate in the Discipline of Business Analytics at the University of Sydney Business School.

Imagine you posted something on social media, it may be a food review for that gorgeous restaurant you went to the other day, some brilliant ideas worth sharing or even just a cool meme that you wish everyone can see. Some of your friends shared your post and their friends shared it further...have you ever wondered how far your post could travel and who ended up seeing it? To answer this question, we wish to find the longest induced path in the social network.

The longest induced path problem is to find the largest subset of nodes in a graph that gives a simple path without cycles (Johnson and Garey, 1979; Di Giacomo et al., 2016; Esperet et al., 2017). To do so, we have to consider all possible acyclic paths formed by all possible subsets of nodes in the graph. Considering its combinatorial nature, the longest induced path problem is known to be NP-hard (Johnson and Garey, 1979). Given the inherent computational complexity of the problem, the literature offers limited insight on exact solution approaches in general graphs but instead focuses on subclasses of the problem that are polynomially solvable and on tightening the bounds of the induced path length (Gavril, 2002; Courcelle et al., 2000; Esperet et al., 2017). Fortunately, leveraging the technological and algorithmic advancement in recent decades, we can now formulate the longest induced path problem as an integer programming problem and solve it with an exact solution for any general graph using standard off-the-shelf solvers (e.g. Gurobi).

With the abundance of data available to us, we can now study the longest induced path problem and its implications in many different contexts. For instance, in our motivating example, finding the longest possible path of information transmission in a social network allows us to identify the seed of the information cascade and the final entity which sees this piece of information in the network. Other applications include studying the communication property of complex systems to enhance our understanding of the interacting elements within the system, identifying gene regulation networks in embryonic development, and diagnosing fault in multiprocessor networks. Thus, in light of the numerous practical applications, the longest induced path problem is an interesting class of network optimisation problems to study.

How exactly do we find the longest induced path? Matsypura et al. (2019) provided three conceptually different approaches that yield an exact solution to the longest induced path problem in a general graph setting; developed using interesting facts about the longest induced path.

The first approach is underpinned by the simple observation that the longest induced path in a graph is the subgraph with the maximum diameter. We consider every possible connected subset of nodes in the original graph as a subgraph. In each subgraph, we compute the shortest paths between each pair of nodes where the greatest length of these paths is the diameter of the subgraph. This can be formulated as an integer programming problem that searches for a connected subgraph with the largest diameter which simultaneously consists of the smallest number of nodes among all possible subgraphs. Naturally, the subgraph that gives the maximum diameter is essentially the longest induced path.

The second approach stems from the feature that the graph average distance is simply the average length of the shortest path between any two nodes in a graph. For any connected graph with n nodes, the graph average distance is known to be at most (n + 1)/3 and is equal to (n + 1)/3 when the graph is a path (Doyle and Graver, 1977). Therefore, we can search for the largest connected subgraph where its average distance is at least (n + 1)/3.

The third approach leverages the intuition that an induced path can also be viewed as a walk in a graph with no short-cuts. Imagine we take a walk in the graph, starting with one of the nodes. Then we choose the next node to visit from one of the neighbours of the current node. We traverse through other nodes one at a time until we run out of time or when no further moves are possible. During our journey, we ensure that our route is an induced path by never visiting a single node twice and never taking shortcuts. In such a way, we can formulate an integer programming problem looking for the longest walk.

What’s next? Equipped with the tools to find the longest induced path in any general graph, we are offered an alternative perspective to view problems in real world complex systems. We saw that the longest induced path in a social media network let us know how far information travels. From a marketing point of view, this not only indicates the reach of an advertisement but also identifies a group of potential customers that quite literally share the same interest. With the additional information on customer segmentation and market reach, businesses can now tailor their digital marketing strategies to different interest groups. A promising direction for future research is to consider different types of induced paths and to develop more computational efficient methods. This opens up new opportunities to gain a better understanding of the communication properties of underlying networks in many other real-life applications.

References
 1. Courcelle, B., J. A. Makowsky, and U. Rotics (2000). Linear time solvable optimization problems on graphs of bounded clique-width. Theory of Computing Systems 33(2), 125–150.
 2. Di Giacomo, E., G. Liotta, and T. Mchedlidze (2016). Lower and upper bounds for long induced paths in 3- connected planar graphs. Theoretical Computer Science 636, 47–55.
 3. Doyle, J. and J. E. Graver (1977). Mean distance in a graph. Discrete Mathematics 17 (2), 147–154.
 4. Esperet, L., L. Lemoine, and F. Maffray (2017). Long induced paths in graphs. European Journal of Combinatorics 62, 1–14.
 5. Gavril, F. (2002). Algorithms for maximum weight induced paths. Information Processing Letters 81(4), 203–208.
 6. Johnson, D. S. and M. R. Garey (1979). Comput- ers and intractability: A guide to the theory of NP-completeness. WH Freeman.
 7. Matsypura, D., A. Veremyev, O. A. Prokopyev, and E. L. Pasiliao (2019). On exact solution approaches for the longest induced path problem. European Journal of Operational Research.