LIDS Student Conference 2014

Thursday, Jan 30.

Session on Communication Systems.

Matthew Johnston. Opportunistic Scheduling with Limited Channel State Information: An Application of Causal Rate Distortion

In this talk, we consider an opportunistic communication system in which a transmitter selects one of multiple channels over which to schedule a transmission, based on partial knowledge of the network state. We characterize a fundamental limit on the rate that channel state information must be conveyed to the transmitter in order to meet a constraint on expected throughput. This problem is modeled as a causal rate distortion optimization of a Markov source. We introduce a novel distortion metric capturing the impact of imperfect channel state information on throughput, and compute a closed-form expression for the causal information rate distortion function for the case of two channels. Additionally, we analyze the gap between the causal information rate distortion and the causal entropic rate distortion functions.

Tianheng Wang. On the Stability of Scheduling Algorithms for Network Navigation

Wireless navigation networks enable location-awareness in GPS-challenged environments. For such networks, scheduling algorithms are needed to improve the navigation accuracy through measurement pair selections under limited communication resource. In this paper, we develop an analytical framework to determine the location error evolution for different scheduling algorithms and network settings. Under this framework, we provide sufficient conditions for the stability of the location error evolution, and we quantify the time-averaged network location errors (NLEs) for scheduling algorithms with and without exploiting the network states. Furthermore, we show the optimality of the proposed scheduling algorithms in terms of the error scaling with respect to the agent density. These results provide fundamental insights into the effects of scheduling algorithms and network settings on the location error evolution, leading to efficient scheduling algorithms for navigation networks.

Austin Collins. What To Transmit From Your Many Antennas

We analyze a model for wireless communication in the "delay constrained" setting, for example streaming video or a phone call. In this scenario, the transmitter has multiple antennas (MISO - multiple input single output), and the channel is modeled as an isotropic block fading channel with channel state information available at the receiver. We analyze the information theoretically best transmission scheme in this delay constrained setting, and discuss its relation to orthogonal designs used in space-time block coding.

Abhishek Sinha. Throughput Optimal Algorithms for Multicast in a Wireless Network

In this talk we concentrate on the classical problem of throughput-optimal broadcast in both wired and wireless networks setting. A known throughput optimal solution from the literature is to pre-identify a set of disjoint spanning trees and route packets on them. However this approach is not scalable and not particularly convenient in the scenario of constantly changing link qualities and/or topologies in a dynamic wireless mobile environment. In this talk we give a provably optimal Max-Weight type distributed broadcast algorithm for a network whose underlying topology is a Directed Acyclic Graph (DAG). This algorithm does not maintain any global topology information like spanning trees yet achieves throughput optimality. In the subsequent part of the talk we present another broadcast algorithm which works for any network with arbitrary underlying topology. This algorithm has some interesting connections with classical link scheduling algorithms under adversarial arrival models.

Session on Inference and Learning I.

Matthew Johnson. Analyzing Hogwild Parallel Gaussian Gibbs Sampling

Sampling inference methods often don't scale well because even weak global dependencies reduce opportunities for parallel computation. Gibbs sampling theory, for example, requires sample updates to be performed sequentially in the absence of strict conditional independence structure among variables. Yet empirical work has shown that some widely-used models with global dependencies can be sampled effectively by going ``Hogwild'' and simply running Gibbs updates in parallel with only periodic global communication. Little is understood about the successes and limitations of this strategy. As a step towards such an understanding, in this talk we consider such Hogwild Gibbs sampling approaches for Gaussian distributions. We discuss a framework that provides sufficient conditions for convergence and error bounds along with simple proofs and connections to methods in numerical linear algebra. Based on joint work with James Saunderson and Alan Willsky.

Ying Liu. Learning Gaussian Graphical Models with Small Feedback Vertex Sets: Observed or Latent

Gaussian Graphical Models (GGMs) or Gauss Markov random fields are widely used in many applications, and the trade-off between the modeling capacity and the efficiency of learning and inference has been an important research problem. In this paper, we study the family of GGMs with small feedback vertex sets (FVSs), where an FVS is a set of nodes whose removal breaks all the cycles. Exact inference such as computing the marginal distributions and the partition function has complexity $O(k^{2}n)$ using message-passing algorithms, where $k$ is the size of the FVS, and $n$ is the total number of nodes. We propose efficient structure learning algorithms for two cases: 1) All nodes are observed, which is useful in modeling social or flight networks where the FVS nodes often correspond to a small number of high-degree nodes, or hubs, while the rest of the network is modeled by a tree. Regardless of the maximum degree, without knowing the full graph structure, we can exactly compute the maximum likelihood estimate in $O(kn^2+n^2\log n)$ if the FVS is known or in polynomial time if the FVS is unknown but has bounded size. 2) The FVS nodes are latent variables, where structure learning is equivalent to decomposing a inverse covariance matrix (exactly or approximately) into the sum of a tree-structured matrix and a low-rank matrix. By incorporating efficient inference into the learning steps, we can obtain a learning algorithm using alternating low-rank correction with complexity $O(kn^{2}+n^{2}\log n)$ per iteration. We also perform experiments using both synthetic data as well as real data of flight delays to demonstrate the modeling capacity with FVSs of various sizes.

Dan Levine. Focused Active Inference

In many inferential settings, data are expensive to acquire or process and are heterogeneously informative about underlying processes of interest. We therefore consider observation selection (e.g., adaptive sampling, active inference, experimental design, etc.) as a way of simultaneously reducing sensing costs while improving inferential accuracy. Prior work has considered the problem of inferring all latent states by selecting locally maximally-informative observations. We consider the more general problem where only a subset of latent variables is of inferential interest, and interactions between observable and latent random variables are described by a high-dimensional, yet factorable, joint distribution containing nuisance variables. We specifically consider the class of multivariate Gaussian distributions Markov to tree-shaped undirected graphs. For pairs of vertices connected by a unique path in the graph, we show that there exist decompositions of nonlocal mutual information into local information measures that can be computed efficiently from the output of message passing algorithms. We integrate these decompositions into a computationally efficient greedy selector where the computational expense of quantification can be distributed across nodes in the network. Experimental results demonstrate the comparative efficiency of our algorithms for sensor selection in high-dimensional distributions. We additionally derive an online-computable performance bound based on augmentations of the relevant latent variable set that, when such a valid augmentation exists, is applicable for \emph{any} distribution with nuisances.

Session on Inference and Learning II.

Mark Cutler. Reinforcement Learning with Multi-Fidelity Simulators

We present a framework for reinforcement learning (RL) in a scenario where multiple simulators are available with decreasing amounts of fidelity to a real-world learning scenario. Our framework is designed to limit the number of samples used in each successively higher-fidelity/cost simulator by allowing the agent to choose to run trajectories at the lowest level that will still provide it with informative information. The approach transfers state-action Q-values from lower-fidelity models as heuristics for the ``Knows What It Knows'' Rmax family of RL algorithms, which is applicable over a wide range of possible dynamics and reward representations. Theoretical proofs of the framework's sample complexity are given and empirical results are demonstrated on a remote controlled car with multiple simulators. The approach allows RL algorithms to find near-optimal policies for the real world with fewer expensive real-world samples than previous transfer approaches or learning without simulators.

Ammar Ammar. Learning the Mixed Multinomial Logit Model

Computing a ranking over choices using consumer data gathered from a heterogenous population has become an indispensable module for any modern consumer information system, e.g. Yelp, Netflix, Amazon and app-stores like Google play. The corresponding ranking or recommendation algorithm needs to extract meaningful information from noisy data {\em accurately} and in a {\em scalable} manner. A principled approach to resolve this challenge requires a {\em model} that connects observations to recommendation decisions and subsequently a tractable inference algorithm utilizing this model. In this paper, we abstract all the preference data generated by consumers as noisy, partial realizations of their innate preferences, i.e. orderings or permutations over choices. Inspired by the seminal works of Samuelson (cf. {\em axiom of revealed preferences}) and that of McFadden (cf. discrete choice models for transportation), we model population's innate preferences as a Mixture of the so called Multinomial Logit model (MMNL). Now given a consumer's limited preferences, the recommendation problem boils down to (a) learning the MMNL model corresponding to the population data, (b) finding the mixture component of MMNL that closely represents the revealed preferences of the consumer at hand, and (c) recommending other choices to her/him that are ranked high according to thus found component. Inspired by this, we address the problem of learning MMNL model from partial preferences. We identify fundamental limitations of {\em any} algorithm to learn such a model as well as provide conditions under which, a simple, data-driven (non-parametric) algorithm learns the model effectively. The proposed algorithm has a pleasant similarity to the standard {\em collaborative filtering} for scalar (or star) ratings, but in the domain of permutations. This work advances the state-of-art in the domain of learning distribution over permutations.

Trevor Campbell. Batch-Sequential Clustering via Low-Variance Bayesian Nonparametrics

Bayesian nonparametrics (BNPs) are a class of probabilistic models in machine learning in which the number of parameters is mutable and is learned alongside the values of those parameters during inference. This yields a richness and flexibility in describing uncertainty that cannot be achieved with traditional Bayesian models. However, the enhanced flexibility comes at a cost; inference procedures for BNPs (e.g. Gibbs sampling, variational Bayes, particle learning) are typically computationally expensive, limiting the applicability of BNPs to autonomous systems in which timely decision-making is crucial. This talk will discuss BNP low-variance asymptotics, a recently developed methodology for finding hard clustering algorithms that share the flexibility of BNPs, but have computationally inexpensive learning algorithms. In particular, a low-variance analysis of the dependent Dirichlet process (DDP) Gaussian mixture model will be presented, along with two hard clustering algorithms that can be extracted from the analysis. Finally, results from experiments will be covered, demonstrating that these hard clustering algorithms require orders of magnitude less computational time than the DDP while still providing comparable or higher accuracy on both synthetic and real datasets.

Hajir Roozbehani. Classification of conditional independence models: a topological perspective

Conditional independence (CI) relations have a natural structure as a real (or complex) manifold. As such, they look like a Euclidean space and have no interesting topology. In this talk we construct CI models as projective varieties and show that interesting topology emerges in this way. To do this we work with CI models of regular Gaussian random variables. It is known that such models are the intersection of the zero set of polynomial equations with the positive definite cone. We show that relaxing the positive definite constraint adds only components with singular principal minors to the feasible set. This gives a convenient method to construct varieties from CI models and resolves a conjecture in the algebraic statistics literature. We then show how this construction gives rise to some simple invariants that can distinguish CI models. We conclude the talk by showing that topologists can tell independence from conditional independence provided that they study projective geometry.

Friday, Jan 31.

Session on Biological Systems + Optimization.

Noele Norris. Exploring the Optimality of Various Bacterial Motility Strategies

While many species of bacteria are motile, they use various random strategies to determine where to swim. This chemotaxis allows bacterial populations to distribute themselves in accordance to distributions of nutrients found within an environment. We extend past work describing a chemotactic E. coli cell as an ergodic, stochastic hybrid system and use experimental data on bacterial motion in microfluidic environments to model other species of bacteria. Our focus is on understanding the differences between the run-and-tumble strategy of E. coli and the more complicated run-reverse-flick strategy of the marine bacterium Vibrio alginolyticus. We use stochastic stability theory to analyze the chemotaxis models in terms of their stationary distributions and also derive a diffusion approximation of the system that provides further insight into the performance of various strategies. By comparing general chemotactic strategies, we hypothesize why various strategies may be evolutionarily advantageous for particular environments. These results also provide intuition for designing minimalistic multi-agent robotic systems that can be used for various environmental monitoring and source-seeking tasks. This presentation summarizes joint work with Filippo Menolascina, Roman Stocker, and Emilio Frazzoli.

Narmada Herath. Trade-offs Between Retroactivity and Noise in Connected Transcriptional Components

At the interconnection of two gene transcriptional components in a biomolecular network, the noise in the downstream component can be reduced by increasing its gene copy number. However, this method of reducing noise increases the load applied to the upstream system, called "retroactivity", thereby causing a perturbation in the upstream system. In this work, we quantify the error in the system trajectories caused by perturbations due to retroactivity and noise, and analyze the trade-off between these two perturbations. We model the system as a set of nonlinear chemical Langevin equations and employ contraction theory for stochastic systems to find an upper bound for the error due to noise. We show that this upper bound can be minimized by increasing the gene copy number, but this leads to an increase in the magnitude of the error due to retroactivity.

Andras Gyorgy. From Single Modules to Multi-Module Systems in Gene Transcription Networks

Predicting the dynamic behavior of a complex network from that of the composing modules is a central problem in systems and synthetic biology. Unfortunately, modules display context-dependent behavior. As a result, our current ability of predicting the emergent behavior of a network from that of the composing modules remains limited. One cause of context-dependence is retroactivity. This phenomenon is similar to loading in electrical networks and influences the dynamic performance of a module upon connection to other modules. Here, we establish an analysis framework for gene transcription networks that explicitly accounts for retroactivity to reliably predict how a module's behavior will change once it is connected to other systems. This framework carries substantial conceptual analogy with the electrical circuit theory based on equivalent impedances established by Thevenin. Specifically, relying on model reduction techniques for nonlinear systems, we demonstrate that a module's key interconnection properties are encoded by three retroactivity matrices: internal, scaling, and mixing retroactivity. All of them have a physical interpretation and can be computed from measurable biochemical parameters and from the modules' topology, similar to how one would compute the equivalent impedance of a network of interconnected electrical components. The internal retroactivity quantifies the effect of intramodular connections on an isolated module's dynamics. The scaling and mixing retroactivity establish how intermodular connections change the dynamics of connected modules. Based on these matrices and on the dynamics of modules in isolation, we can accurately predict the dynamic behavior of an arbitrary interconnection of modules. Further, using contraction theory we provide a quantitative metric that determines how robust the dynamic behavior of a module is to interconnection with other modules. Our metric can be employed both to evaluate the extent of modularity in natural networks and to establish concrete design guidelines to minimize retroactivity between modules in synthetic systems. Finally, we illustrate how our framework predicts and explains surprising and counter-intuitive dynamic properties of naturally occurring network structures, which could not be captured by existing models of the same dimension.

Hamza Fawzi. Convex Lower Bounds for Atomic Cone Ranks with Applications to Nonnegative rank and cp-rank

We study a certain class of atomic rank functions defined on a convex cone, which generalize several notions of ``positive'' ranks such as nonnegative rank or cp-rank (for completely positive matrices). We propose a new method to obtain lower bounds for such ranks using semidefinite programming and sum-of-squares techniques. For the case of nonnegative rank and cp-rank our lower bound has interesting connections with existing combinatorial bounds and satisfy important structural properties such as invariance under diagonal scaling or subadditivity.

Andrew Mastin. Randomized Minmax Regret for Combinatorial Optimization Under Uncertainty

The minmax regret problem for combinatorial optimization under uncertainty can be viewed as a zero-sum game played between an optimizing player and an adversary, where the optimizing player selects a solution and the adversary selects costs with the intention of maximizing the regret of the player. Existing minmax regret models consider only deterministic solutions/strategies, and minmax regret versions of most polynomial solvable problems are NP-hard. In this paper, we consider a randomized model where the optimizing player selects a probability distribution (corresponding to a mixed strategy) over solutions and the adversary selects costs with knowledge of the player's distribution, but not its realization. We show that under this randomized model, the minmax regret version of any polynomial solvable combinatorial problem becomes polynomial solvable. This holds true for both the interval and discrete scenario representations of uncertainty. We also show that the maximum expected regret value under the randomized model is upper bounded by the regret under the deterministic model.

Frank Permenter. Partial Facial Reduction: Simplified, Equivalent SDPs via Approximations of the PSD Cone

We develop practical semidefinite programming (SDP) facial reduction procedures that utilize computationally efficient approximations of the positive semidefinite cone. The proposed methods simplify SDPs with no positive definite solution by solving a sequence of easier optimization problems. We demonstrate effectiveness of our techniques on SDPs arising in practice and describe our publicly-available software implementation.

Session on Systemic Efficiency and Resilience

Alex Teytelboym. Stability and competitive equilibrium in networks with bilateral contracts

We consider general networks of bilateral contracts that include supply chains. We define a new stability concept called walk stability and show that any network of bilateral contracts has a walk-stable contract allocation whenever agents' preferences satisfy full substitutability. Walk-stable contract allocations may not be immune to group deviations, efficient, or in the core. We also show that competitive equilibrium exists in networked markets even in the absence of transferrable utility. The competitive equilibrium contract allocation is walk-stable.

Diego Feijer. Credit markets, systemic risk and financial stability

We develop a model that speaks of banking crises, financial regulation and the unstable nature of credit. Feedback between financial constraints and credit market prices result in an externality that induces banks to borrow in amounts that are not Pareto optimal leaving the system fragile and susceptible to systemic runs. Regulatory policies that affect the cost of debt can restore constrained efficiency and reduce the probability and severity of financial crises.

Kimon Drakopoulos. Network Resilience Against Epidemic Spread

We study the problem of minimizing the expected extinction time of an epidemic that evolves on a graph according to an SIS model. We consider curing policies for the nodes which exploit the knowledge of the current state of the epidemic. We characterize the family of graphs for which it is possible to achieve subpolynomial extinction time and provide a policy that achieves it. Moreover we prove that for all graphs outside this family the expected extinction time is polynomial.

Kuang Xu. Queueing with Future Information

Can forecasts significantly improve the performance of queueing systems? We study an admissions control problem: to redirect away incoming jobs up to a fixed rate, in order to minimize the resulting average queue length, Q. We show that Q diverges to infinity in the heavy-traffic regime under any online policy, but can be bounded given (even limited) future forecasts. Based on joint work with Joel Spencer and Madhu Sudan.