Bi-level programming with applications in engineering and economics

Sepehr Ramyar

by Sepehr Ramyar
Ph.D. Candidate in Technology and Information Management, University of California San Cruz

H

Have you ever played Texas hold’em? If you have, you probably realize that the dealer has an advantage: she gets to act last. This allows the dealer to see what everyone else at the table is betting and then decide on what she wants to do. Having seen how everyone else has acted, the dealer can make a more informed decision. So, based on the rules of the game, or the sequence in which information unfolds, a player can gain an advantage over other participants. In fact, it is the information structure of the game (in this case, Texas hold’em) that provides an advantage for the dealer. And this is precisely why the role of the dealer rotates from one player to another, one hand at a time, so that no one player remains in a permanently advantageous position.

A player does not necessarily have to be the last to act in order to be at an advantage. In fact, in many cases, being the first player to act is considered an advantage. In a duel, for example, it definitely pays to be the first player to move. Thinking of the rules of a game in terms of information structure gives us insights into how a player can be at an advantage. In the duel example, if the information (this could be how accurate they aim) is revealed simultaneously, a priori, no player has an advantage. However, if the information is unfolded sequentially, the first player to move is clearly at an advantage. In economics jargon, this is called a first-mover advantage.
The key notion is that the first-mover knows what the other player is going to do once he moves, and knowing that, he chooses an action that would optimize his objective. In other words, the advantage of the first-mover is not only in that he moves first, but also because he knows or anticipates the reaction, or the best response, of the other player. Although the first-mover is usually able to reduce the payoff to the other player(s) and increase his own, this is not necessarily the case and the first-mover’s advantage does not always mean less payoff to the others. Again, the key determinant in such situations is the information structure of the game.

But what does the information structure mean? Intuitively, this is equivalent to the first-mover anticipating the best response of the other player(s). In mathematical terms, one can represent a player using an optimization problem since they usually seek to maximize their payoff (or equivalently, minimize their loss). So, if a player’s behavior is illustrated using a mathematical problem, the best action of a player would be the optimality condition of his optimization problem. Therefore, given what others are doing, a player’s optimality condition would specify his best response to the other players. Now, if a player is including the other players’ optimality conditions into his own optimization problem, he is essentially internalizing (or anticipating) the best response of other players. In other words, he is expecting how other players would respond to any of his moves and consequently impact his payoff, thus choosing the action that would maximize his payoff given the other players’ anticipated reactions to his move.

This problem is called a Stackelberg game, or a leader-follower game. And the framework can be applied to many real world applications in which there is a leader and one (or several) followers. For example, a firm that can anticipate how its competitors would respond to changes in its quantity output can choose the optimal level of output knowing that the followers would react in a manner consistent with the optimality conditions of the leader. The Stackelberg formulation of a game is an instance of a more general class of optimization problems that are called bi-level programs. These are mathematical programs where the constraints (or the feasibility set) involve another optimization problem. This is because the leader, in a Stackelberg game, includes the optimality conditions of other players in the constraints of his optimization problem.

Bi-level programming has many applications. Resource allocation in adversarial environments is one prominent category. This situation arises when the game is played between a defender and an attacker (or a group of attackers). The defender aims to maximize security (or minimize loss) by utilizing scarce resources. For example, defending a wildlife resort against environmental crimes. This is an example of green security games where bilevel programming helps anticipate the adversary’s strategy and accordingly allocates resources efficiently. There are other applications of Stackelberg games in urban crime prevention, traffic monitoring, toll pricing, supply chain management etc.
Despite wide applicability, bi-level programs have practical limitations. One major drawback is that bi-level programs are non-convex and hence difficult to solve. Even if the higher-level and lower-level programs are convex, the resulting bi-level program would still be non-convex because of the interdependency between the decision variables in the higher and lower level problems. Restricting the optimization problems at the two layers to linear programs would still yield an NP-hard bilevel program. There are theorems on very general conditions for existence of globally optimal solutions to bilevel programs, but none specifying an algorithm for achieving the global optimum. This is a major problem as the Stackelberg game involves more and more players (or more decision variables), the computation of a solution becomes increasingly difficult (Luo, Pang, & Ralph, 1996; Ben-Aved & Blair, 1990).

Another problem with Stackelberg games (or the equivalent bilevel programs) is the underlying assumption that the leader anticipates how other players would respond. In other words, there is an implicit assumption that the other players act (or respond) rationally. This, however, may not be the case in reality. One way to deal with this is using bounded rationality to model players’ behavior. This includes using behavioral models that allow for mistakes (or errors) in a player choosing his best response and account for the players not always acting rationally and consistently. Introducing behavioral models that involve extra parameters exacerbates the computational difficulties of bilevel programs. Another implicit (but maybe not so innocuous) assumption behind Stackelberg games is that of a non-cooperative setting. In other words, players are assumed to be utility-maximizing individuals and there is no cooperation or side-payments of any kind. However, especially in the case of a leader and follower(s), it may not be too unrealistic to suspect formation of coalitions among players (Sinha, Fang, An, Keikintveld & Tambe, 2018).

Despite theoretical and practical limitations, Stackelberg games, and bilevel programming in a broader sense, have provided valuable solutions to many challenges ranging from wildlife protection to prevention of price manipulations in markets. More recently, bilevel programming has been used for weighing software usefulness vs. privacy concerns and enables modeling the privacy-compromising adversaries. Advances in solving bilevel programs more efficiently would consequently have a significant impact on making our decisions more reliable and informed and our lives better and safer.

References:

 Luo, Z. Q., Pang, J. S., & Ralph, D. (1996). Mathematical programs with equilibrium constraints. Cambridge University Press.

 Ben-Ayed, O., & Blair, C. E. (1990). Computational difficulties of bilevel linear programming. Operations Research, 38(3), 556-560.

 Sinha, A., Fang, F., An, B., Kiekintveld, C., & Tambe, M. (2018). Stackelberg Security Games: Looking Beyond a Decade of Success. In IJCAI (pp. 5494-5501).