Plenary speakers
Plenary speakers
Paul Dupuis (Brown University)
Title: Accelerating Monte Carlo--What does the Donsker-Varadhan theory have to say?
Abstract: The theory of large deviations has played an important role in the development of Monte Carlo methods for estimating quantities defined in terms of a specific rare event, such as ruin probabilities or buffer overflow probabilities. In this setting the associated large deviation theory is that of small random perturbations of deterministic systems, also referred to as the Freidlin—Wentsel theory. However, rare events also play an important role when estimating functionals of an invariant distribution, where straightforward simulation will converge very slowly when parts of the state space do not communicate well. Problems of this sort are common in statistical inference, engineering and the physical sciences, and much effort has gone into the development of methods to accelerate the convergence of Monte Carlo. We will argue that the Donsker—Varadhan theory, which describes the large deviation properties of the empirical measure of a Markov chain, gives insight into the analysis of existing schemes and the design of new methods for this class of problems.
Svante Janson (Uppsala University)
Title: Bootstrap percolation on random graphs
Abstract: Bootstrap percolation on a graph G is a process of spreading an "infection" (or "activation") according to the following rule, with a given threshold r (for example r=2): given a set A of initially infected vertices, new vertices are subsequently infected, at each time step, if they have at least r previously infected neighbors. (This is not a good model for usual infectious diseases. It might be a better, although still simplified, model for neuronal networks, or for the spread of rumors or other ideas or beliefs.)
The main problem is to find the size of the set of vertices that become infected and, in particular, whether eventually every vertex is infected.
We study this problem with a random initial set on some random graphs, for example the Erdös-Renyi random graph G(n,p). In this case, the model exhibits a sharp phase transition as n tends to infinity: the size of the final infected set is with high probability is either n-o(n) or o(n). We characterize the two cases. Furthermore, we give asymptotics for the number of steps until the process stops. We also study the case of random regular graphs.
Philippe Robert (INRIA)
Title: Scaling methods for the analysis of stochastic networks
Abstract: It is usually quite difficult to model the behavior of a multi-dimensional Markov processes describing the evolution of a stochastic network. A general problem concerns the inference of qualitative properties of the network from local specifications of the dynamic. The local characteristics are determined by the algorithms that control the interaction between nodes (for resource allocation or admission control, for example). When it exists, the equilibrium distribution of stochastic networks does not in general have a product form (Gibbsian representation). In fact, equilibrium distributions are not known at all except for some special classes of network. Moreover, even when a Gibbsian representation does exist, the dynamics of the networks are not Gibbsian so that some classical tools of statistical physics cannot be used.
Some insight into the behavior of these networks can nevertheless be obtained through limit procedures which can be roughly described as follows: for some system parameter p and some critical value p0, with an appropriate scaling, when p tends to p0 the evolution of the state of the network converges to a much simpler process. This process might, for instance, be specified by the solution of some ordinary/stochastic differential equation. Examples of such scalings are:
- Heavy traffic scaling for Loss Networks. These networks are subject to streams of call requests and their nodes have finite capacity. The scaling considers the case where the arrival rates of requests and the capacities of the nodes grow to infinity at a fixed proportional speed.
- Fluid Limits. The scaling parameter is N, the norm of the initial state of the Markov process. The scaling consists in speeding up the time scale by a factor N, normalizing the Markov process itself by 1/N and letting N tend to infinity.
- Mean Field Limit Theorems. These can be used when a network of N nodes has some symmetry properties so that when N tends to infinity, the state evolution of a given node can be described by a much simpler, lower dimensional state space Markov process.
While these methods have been developed extensively in the last 20 years for the analysis of stochastic networks, there are still very few ``general'' results. The investigation of special classes of networks is still the main source of progress in the domain. The talk will discuss through examples some interesting aspects of these methods.
Tutorial speakers
Avi Mandelbaum (Technion)
Title: Empirical Adventures in Call-Centers and Hospitals
Abstract: I shall describe examples of complex service operations for which data-based simple models have been found useful. Most attention will be given to telephone call centers and hospitals. I view these service systems through the mathematical lenses of Queueing Science, mainly asymptotically, with a bias towards Statistics.
More specifically, I focus on empirical findings that motivate or are motivated by (or both) interesting research questions. Such findings also enable validation analysis of existing models and protocols, either supporting or refuting their relevance and robustness.
My main data-source is a unique data repository, which is maintained at the Technion's SEE Laboratory (SEE = Service Enterprise Engineering:http://ie.technion.ac.il/Labs/Serveng/).It is unique in that it is transaction-based: it details the individual operational history of all service transactions (e.g. calls in a call center or patients in an emergency department). For example, one source of data, publicly available, is a network of 4 call centers of a U.S. bank, spanning 2.5 years and covering about 1000 agents; there are 218,047,488 telephone calls overall, out of which 41,646,142 where served by agents, while the rest were handled by answering machines. An environment for online Exploratory Data Analysis (EDA) of the data, SEEStat, has been developed at the SEELab. It is accessible at http://seeserver.iem.technion.ac.il/see-terminal after registration.
Sid Resnick (Cornell University)
Title: Modeling Data Network Sessions
Abstract: A session is a higher order entity in data network modeling resulting from amalgamating packets, connections, or groups of connections according to specified but not unique rules. For example, using various rules, the flow of packets past a sensor can be amalgamated into higher level entities called sessions using a threshold rule based on gaps between packet arrivals. We review some heavy tailed probability modeling based on sessions and turn to statistical analysis.
Statistical analysis of these sessions based on packets is complex: session duration (D) and size (S) are jointly heavy tailed but average transmission rate (R=S/D) is sometimes not heavy tailed and arrival times of sessions is not Poisson. By segmenting sessions using a peak rate covariate, we find conditional on a peak rate decile, within this decile segment session initiations can be modeled as Poisson. Some reasons why Gaussian traffic seems ubiquitous are discussed.