Schedule at a Glance
Thursday, January 29
Session I: Optimization & Networks

| Start | Speaker | Talk Title |
|---|---|---|
| 10:30 AM | Yubo Cai | Adaptive Levenberg-Marquardt Third-Order Newton's Method |
| 10:46 AM | Jia Wan | Experimentally Guided Recommendations for Assortment Optimization |
| 11:02 AM | Austin Saragih | Analytical Facility Location and Resilience |
| 11:18 AM | Ayoub Belhadji | Weighted quantization using MMD: From mean field to mean shift via gradient flows |
| 11:34 AM | Vishwak Srinivasan | Designing Algorithms for Entropic Optimal Transport from an Optimisation Perspective |
| 11:50 AM | Ashkan Soleymani | A Unified Framework for Statistical Test of Invariances |
| 12:06 PM | R Vallabh Ramakanth | Optimal Oblivious Load-Balancing for Sparse Traffic in Large-Scale Satellite Networks |
Session II: Machine Learning

| Start | Speaker | Talk Title |
|---|---|---|
| 2:45 PM | Shomik Jain | Interaction Context Often Increases Sycophancy in LLMs |
| 3:00 PM | Yuexing Hao | Personalizing Prostate Cancer Education for Patients using an EHR-Integrated LLM Agent |
| 3:15 PM | Kaveh Alimohammadi | Activation-Informed Merging of LLMs |
| 3:30 PM | Ruizhe Huang | Generative AI for Weather Data Assimilation |
| 3:45 PM | Youngjae Min | HardNet: Hard-Constrained Neural Networks with Universal Approximation Guarantees |
| 4:00 PM | Parmida Davarmanesh | Efficient and accurate steering of Large Language Models through attention-guided feature learning |
| 4:15 PM | Cameron Hickert | Probability-Aware Parking Selection |
Friday, January 30
Session III: Control & Decision Making

| Start | Speaker | Talk Title |
|---|---|---|
| 8:45 AM | Taylor Baum | Adaptive Closed-loop Control of Arterial Blood Pressure |
| 9:00 AM | Will Sharpless | Bellman Value Decomposition |
| 9:15 AM | Miroslav Kosanic | Composite Control of Grid-Following Inverters for Stabilizing AI-Induced Fast Power Disturbances |
| 9:30 AM | Ashkan Soleymani | Cautious Optimism: A Meta-Algorithm for Near-Constant Regret in General Games |
| 9:45 AM | Valia Efthymiou | Strategic Classification: sometimes strategic adaptation is desirable |
| 10:00 AM | Jung-Hoon Cho | Formalizing and Estimating Task-Space Complexity for Zero-shot Generalization |
| 10:15 AM | Yingke Li | Beyond Explore–Exploit: Pragmatic Curiosity with Self-Consistent Learning and No-Regret Optimization |
| 10:30 AM | Mingyang Liu | Computing Equilibrium beyond Unilateral Deviation |
Session IV: Statistics & Information

| Start | Speaker | Talk Title |
|---|---|---|
| 1:30 PM | Renfei Tan | Multi-agent Adaptive Mechanism Design |
| 1:45 PM | Flora Shi | Instance-Adaptive Hypothesis Tests with Heterogeneous Agents |
| 2:00 PM | Anzo Teh | Solving Empirical Bayes via Transformers |
| 2:15 PM | Lelia Marie Hampton | Targeted urban afforestation can substantially reduce income-based heat disparities in US cities |
| 2:30 PM | Josefina Correa | A Weakly Informative Prior for Bayesian Autoregressive Models |
Session I: Optimization & Networks (10:30 AM - 12:06 PM)
Yubo Cai
Adaptive Levenberg-Marquardt Third-Order Newton's Method
In this talk, we introduce the Adaptive Levenberg-Marquardt Third-Order Newton's Method (ALMTON) for unconstrained nonconvex optimization. Our method is the first globally convergent variant of the unregularized third-order Newton's method. Distinct from standard Adaptive Regularization using 3rd-order models (AR3), which employs a quartic regularization term, ALMTON utilizes an adaptive Levenberg-Marquardt (quadratic) regularization. This key modification preserves the cubic structure of the local model, enabling a unified subproblem solution via Semidefinite Programming (SDP). Algorithmically, ALMTON strategically operates in a "mixed-mode" paradigm designed to reconcile the competing demands of local efficiency and global stability. By prioritizing unregularized third-order steps, the framework maximally exploits high-order curvature information to accelerate convergence, engaging regularization mechanisms only when strictly necessary to ensure model well-posedness. This design is substantiated by a rigorous theoretical analysis, which establishes an $O(\epsilon^{-2})$ worst-case evaluation complexity for finding $\epsilon$-approximate stationary points. In practice, this theoretical robustness translates into superior geometric intelligence. Our numerical experiments demonstrate that ALMTON significantly expands the basin of attraction relative to classical baselines (e.g., Gradient Descent and Damped Newton) and successfully navigates pathological landscapes where second-order methods typically stagnate. Moreover, when benchmarked against state-of-the-art third-order implementations (specifically AR3-interp [13]), ALMTON consistently achieves convergence with greater stability and fewer iterations. Finally, we explicitly characterize the operational boundaries of the method, providing a rigorous analysis of the scalability bottlenecks imposed by current SDP solvers in high-dimensional regimes.
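For intuition, here is a minimal sketch of the local model the abstract describes (illustrative only; the function name and the small third-order tensor `T` are placeholders, and the paper's SDP-based subproblem solver is not reproduced):

```python
import numpy as np

def almton_local_model(f, g, H, T, s, sigma):
    """Third-order Taylor model at the current iterate, evaluated at step s,
    plus a Levenberg-Marquardt (quadratic) penalty that keeps the model cubic.
    AR3 would instead add a quartic term proportional to ||s||^4; sigma = 0
    recovers the unregularized third-order Newton model the method prefers."""
    cubic = np.einsum("ijk,i,j,k->", T, s, s, s) / 6.0
    return f + g @ s + 0.5 * s @ H @ s + cubic + 0.5 * sigma * (s @ s)
```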
Jia Wan
Experimentally Guided Recommendations for Assortment Optimization
A key challenge for online platforms is measuring consumer demand in environments with limited price variation. In this paper we study what platforms can learn about demand via experiments that vary the set of recommendations presented to consumers and subsequently apply it to counterfactuals of interest. We consider a canonical model of choice -- the mixed multinomial logit model -- and develop a semiparametric framework for sharp Manski-style bounds on linear functionals of counterfactual choice probabilities with no restrictions on the distribution of preference heterogeneity. This provides identification bounds of counterfactual shares for overall engagement (probability of not taking the outside option) and particular goods under a range of counterfactuals, including procurement of new goods and alternative recommendation algorithms. We show that the first-order sensitivity of these bounds admits a Riesz representer on the constraint range which yields an efficient influence function. We then study the problem of optimal experimental design that selects a finite set of recommendation slates across consumers to minimize the worst-case bound width for a target counterfactual and support our findings with simulation results.
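For background, the single-class building block of the mixed multinomial logit model assigns choice probabilities as below; the mixed model averages these over an unrestricted heterogeneity distribution, which is what the paper's bounds must contend with. The utility values here are illustrative.

```python
import numpy as np

def mnl_choice_probs(utilities, assortment):
    """Single-class MNL: item j in the assortment is chosen with probability
    exp(u_j) / (1 + sum_k exp(u_k)); the outside option has utility 0."""
    expu = np.exp(np.array([utilities[j] for j in assortment]))
    denom = 1.0 + expu.sum()
    probs = {j: float(e) / denom for j, e in zip(assortment, expu)}
    probs["outside"] = 1.0 / denom
    return probs

# Example: two recommended goods with utilities 1.0 and 0.2.
print(mnl_choice_probs({"a": 1.0, "b": 0.2}, ["a", "b"]))
```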
Austin Saragih
Analytical Facility Location and Resilience
One of the most studied models in operations research and optimization is the uncapacitated facility location (UFL) problem. This paper introduces a corrected closed-form analytical approximation formulation and solution framework that solves multi-million-point instances near-optimally in milliseconds. Our method also provides the first formal explanation for empirical patterns long noted in UFL practice. Motivated by recent large-scale disruptions, we extend our approach to a new analytical reliable UFL variant under disruption uncertainty, yielding resilient facility deployment policies and practical insights for designing reliable supply chain networks for future disruptions.
Ayoub Belhadji
Weighted quantization using MMD: From mean field to mean shift via gradient flows
Approximating a probability distribution using a set of particles is a fundamental problem in machine learning and statistics, with applications including clustering and quantization. Formally, we seek a weighted mixture of Dirac measures that best approximates the target distribution. While much existing work relies on the Wasserstein distance to quantify approximation errors, maximum mean discrepancy (MMD) has received comparatively less attention, especially when allowing for variable particle weights. We argue that a Wasserstein–Fisher–Rao gradient flow is well-suited for designing quantizations optimal under MMD. We show that a system of interacting particles satisfying a set of ODEs discretizes this flow. We further derive a new fixed-point algorithm called mean shift interacting particles (MSIP). We show that MSIP extends the classical mean shift algorithm, widely used for identifying modes in kernel density estimators. Moreover, we show that MSIP can be interpreted as preconditioned gradient descent and that it acts as a relaxation of Lloyd's algorithm for clustering. Our unification of gradient flows, mean shift, and MMD-optimal quantization yields algorithms that are more robust than state-of-the-art methods, as demonstrated via high-dimensional and multi-modal numerical experiments.
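As a reference point, the classical mean shift fixed-point iteration that MSIP generalizes is sketched below (MSIP additionally evolves weighted, interacting particles under the MMD objective, which is not reproduced here).

```python
import numpy as np

def mean_shift(x, data, bandwidth, iters=100):
    """Classical mean shift: repeatedly move x to the kernel-weighted average
    of the data, converging to a mode of the kernel density estimator."""
    for _ in range(iters):
        w = np.exp(-np.sum((data - x) ** 2, axis=1) / (2 * bandwidth ** 2))
        x = w @ data / w.sum()
    return x

# Example: a point at the origin drifts to the mode of a Gaussian blob at (3, 3).
rng = np.random.default_rng(0)
data = rng.normal(loc=3.0, size=(500, 2))
print(mean_shift(np.zeros(2), data, bandwidth=1.0))
```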
Vishwak Srinivasan
Designing Algorithms for Entropic Optimal Transport from an Optimisation Perspective
In this work, we develop a collection of novel methods for the entropic-regularised optimal transport problem, which are inspired by existing mirror descent interpretations of the Sinkhorn algorithm used for solving this problem. These are fundamentally proposed from an optimisation perspective: either based on the associated semi-dual problem, or based on solving a non-convex constrained problem over a subset of joint distributions. This optimisation viewpoint results in non-asymptotic rates of convergence for the proposed methods under minimal assumptions on the problem structure.
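For context, a minimal implementation of the Sinkhorn algorithm whose mirror descent interpretation motivates the talk; the proposed methods replace these alternating scalings with other optimisation schemes.

```python
import numpy as np

def sinkhorn(a, b, C, eps, iters=500):
    """Sinkhorn iterations for entropic OT: alternate matrix scalings so the
    coupling P = diag(u) exp(-C/eps) diag(v) has marginals a and b."""
    K = np.exp(-C / eps)
    u = np.ones_like(a)
    for _ in range(iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]

# Example: uniform marginals on 3 points with a random cost matrix.
rng = np.random.default_rng(0)
P = sinkhorn(np.full(3, 1 / 3), np.full(3, 1 / 3), rng.random((3, 3)), eps=0.1)
print(P.sum(axis=1))  # row marginals ~ [1/3, 1/3, 1/3]
```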
Ashkan Soleymani
A Unified Framework for Statistical Test of Invariances
While invariances naturally arise in almost any type of real-world data, no efficient and robust test exists for detecting them in observational data under arbitrarily given group actions. We tackle this problem by studying measures of invariance that can capture even negligible underlying patterns. Our first contribution is to show that, while detecting subtle asymmetries is *computationally intractable*, a randomized method can be used to robustly estimate closeness measures to invariance within constant factors. This provides a general framework for robust statistical tests of invariance. Despite the extensive and well-established literature, our methodology, to the best of our knowledge, is the *first* to provide statistical tests for general group invariances with *finite-sample guarantees on Type II errors*. In addition, we focus on kernel methods and propose deterministic algorithms for robust testing with respect to both finite and infinite groups, accompanied by a rigorous analysis of their convergence rates and sample complexity. Finally, we revisit the general framework in the specific case of kernel methods, showing that recent closeness measures to invariance, defined via group averaging, are provably robust, leading to powerful randomized algorithms.
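A schematic Monte Carlo sketch of the randomized-testing idea (the statistic and helper names are illustrative, not the paper's estimator): compare a statistic of the data before and after random group actions, and treat a large average gap as evidence against invariance.

```python
import numpy as np

def invariance_gap(samples, random_action, stat, n_draws=200, seed=0):
    """Average |stat(X) - stat(g . X)| over random group elements g.
    Under invariance the gap concentrates near zero."""
    rng = np.random.default_rng(seed)
    base = stat(samples)
    gaps = [abs(base - stat(random_action(samples, rng))) for _ in range(n_draws)]
    return float(np.mean(gaps))

# Example: permutation invariance of the sample mean (holds trivially, gap ~ 0).
x = np.random.default_rng(1).normal(size=100)
print(invariance_gap(x, lambda s, rng: rng.permutation(s), np.mean))
```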
R Vallabh Ramakanth
Optimal Oblivious Load-Balancing for Sparse Traffic in Large-Scale Satellite Networks
Oblivious load-balancing in networks involves routing traffic from sources to destinations using predetermined routes independent of the traffic, so that the maximum load on any link in the network is minimized. We investigate oblivious load-balancing schemes for an $N \times N$ torus network under sparse traffic, where there are at most $k$ active source-destination pairs. We are motivated by the problem of load-balancing in large-scale LEO satellite networks, which can be modeled as a torus, where the traffic is known to be sparse and localized to certain hotspot areas. We formulate the problem as a linear program and show that no oblivious routing scheme can achieve a worst-case load lower than approximately $\frac{\sqrt{2k}}{4}$ when $1 < k \leq N^2/2$ and $\frac{N}{4}$ when $N^2/2 \leq k \leq N^2$. Moreover, we demonstrate that the celebrated Valiant Load Balancing scheme is suboptimal under sparse traffic and construct an optimal oblivious load-balancing scheme that achieves the lower bound. Further, we discover a $\sqrt{2}$ multiplicative gap between the worst-case load of a non-oblivious routing and the worst-case load of any oblivious routing. The results can also be extended to general $N \times M$ tori with unequal link capacities along the vertical and horizontal directions.
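The abstract's lower bound is easy to restate in code; the helper below simply evaluates the two regimes and is not part of the routing construction.

```python
import math

def oblivious_load_lower_bound(N, k):
    """Worst-case link load that no oblivious routing on an N x N torus can
    beat, per the abstract: ~sqrt(2k)/4 for 1 < k <= N^2/2, else N/4."""
    if 1 < k <= N * N / 2:
        return math.sqrt(2 * k) / 4
    return N / 4

print(oblivious_load_lower_bound(N=32, k=100))  # sparse regime
```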
Session II: Machine Learning (2:45 - 4:15 PM)
Shomik Jain
Interaction Context Often Increases Sycophancy in LLMs
We investigate how the presence and type of interaction context shapes sycophancy in LLMs. We evaluate two forms of sycophancy: (1) agreement sycophancy -- the tendency of models to produce overly affirmative responses, and (2) perspective sycophancy -- the extent to which models reflect a user's viewpoint. Using two weeks of interaction context from 38 users, we compare LLM responses generated with and without context. Overall, context shapes sycophancy in heterogeneous ways, underscoring the need for evaluations grounded in real-world interactions and raising questions for system design around extended conversations.
Yuexing Hao
Personalizing Prostate Cancer Education for Patients using an EHR-Integrated LLM Agent
Cancer patients often lack timely education and personalized support due to clinician workload. This quality improvement study develops and evaluates a Large Language Model (LLM) agent, MedEduChat, which is integrated with the clinic's electronic health records (EHR) and designed to enhance prostate cancer patient education. Fifteen non-metastatic prostate cancer patients and three clinicians recruited from the Mayo Clinic interacted with the agent between May 2024 and April 2025. Findings showed that MedEduChat achieved a high usability score (UMUX = 83.7/100) and improved patients' health confidence (Health Confidence Score rose from 9.9 to 13.9). Clinicians evaluated the patient-chat interaction history and rated MedEduChat as highly correct (2.9/3), complete (2.7/3), and safe (2.7/3), with moderate personalization (2.3/3). This study highlights the potential of LLM agents to improve patient engagement and health education.
Kaveh Alimohammadi
Activation-Informed Merging of LLMs
Model merging, a method that combines the parameters and embeddings of multiple fine-tuned large language models (LLMs), offers a promising approach to enhance model performance across various tasks while maintaining computational efficiency. In this talk, I introduce Activation-Informed Merging (AIM), a technique that integrates the information from the activation space of LLMs into the merging process to improve performance and robustness. AIM is designed as a flexible, complementary solution that is applicable to any existing merging method. It aims to preserve critical weights from the base model, drawing on principles from continual learning (CL) and model compression. Utilizing a task-agnostic calibration set, AIM selectively prioritizes essential weights during merging. Empirically, AIM significantly enhances the performance of merged models across multiple benchmarks. Experiments suggest that considering the activation-space information can provide substantial advancements in the model merging strategies for LLMs, with up to a 40% increase in benchmark performance.
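A hypothetical sketch of the preservation idea (the actual AIM importance scores, derived from activation statistics on the calibration set, are not specified in this abstract): weights flagged as critical to the base model receive smaller merged updates.

```python
import numpy as np

def merge_preserving_critical_weights(base, finetuned, importance):
    """Hypothetical sketch: average the fine-tuned models' deltas, then shrink
    updates on weights flagged as important (importance entries in [0, 1])."""
    delta = np.mean([w - base for w in finetuned], axis=0)
    return base + (1.0 - importance) * delta
```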
Ruizhe Huang
Generative AI for Weather Data Assimilation
To anchor weather products in reality, data assimilation integrates observational data into physical simulations of the atmosphere. Traditional approaches do this by using numerical model forecasts as a prior, which is expensive. Today, researchers are exploring the use of deep generative models, such as diffusion models, as emulators to reconstruct full weather fields directly from sparse observations, but existing guidance-based approaches can be unstable or have not been evaluated under real-world conditions. We introduce GLaD-Flow (Guided Latent D-Flow), which combines guidance and D-Flow within the latent space. It uses Latent D-Flow to optimize the initial latent noise against an observation loss, then generates full fields with observation guidance starting from the optimized noise. We conduct a comprehensive benchmark over the Continental United States (CONUS) by training the flow model on 2017-2022 data and testing on 2023. We generate full ERA5-like fields for four surface variables (the two 10-meter wind components, 2-meter temperature, and 2-meter dewpoint) from sparse ground station observations. We test the generalizability of our method by evaluating performance on held-out test weather stations. Our results show that GLaD-Flow reduces the Root Mean Square Error (RMSE) compared to ERA5 by over 29% on average across 1,778 test stations, while retaining ERA5 physics. We estimate that GLaD-Flow reduces ERA5 error by 22.9% at median-distance locations across the CONUS, demonstrating meaningful generalization beyond the immediate vicinity of observation stations. Our work demonstrates that unconditional generative models, particularly the GLaD-Flow framework, provide a promising tool for reducing the cost and improving the accuracy of weather products.
Youngjae Min
HardNet: Hard-Constrained Neural Networks with Universal Approximation Guarantees
Incorporating prior knowledge or specifications of input-output relationships into machine learning models has attracted significant attention, as it enhances generalization from limited data and yields conforming outputs. However, most existing approaches use soft constraints by penalizing violations through regularization, which offers no guarantee of constraint satisfaction, especially on inputs far from the training distribution, an essential requirement in safety-critical applications. On the other hand, imposing hard constraints on neural networks may hinder their representational power, adversely affecting performance. To address this, we propose HardNet, a practical framework for constructing neural networks that inherently satisfy hard constraints without sacrificing model capacity. Unlike approaches that modify outputs only at inference time, HardNet enables end-to-end training with hard constraint guarantees, leading to improved performance. To the best of our knowledge, HardNet is the first method that enables efficient and differentiable enforcement of more than one input-dependent inequality constraint. It allows unconstrained optimization of the network parameters using standard algorithms by appending a differentiable closed-form enforcement layer to the network's output. Furthermore, we show that HardNet retains neural networks' universal approximation capabilities. We demonstrate its versatility and effectiveness across various applications: learning with piecewise constraints, learning optimization solvers with guaranteed feasibility, and optimizing control policies in safety-critical systems.
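To illustrate the flavor of a closed-form enforcement layer, here is the textbook projection onto a single halfspace constraint; HardNet's actual layer handles multiple input-dependent inequality constraints, so this is a simplified stand-in.

```python
import numpy as np

def enforce_halfspace(y_raw, a, b):
    """Closed-form projection of a raw network output y_raw onto {y : a @ y <= b}.
    The correction is piecewise-linear in y_raw, hence differentiable almost
    everywhere and usable inside end-to-end training."""
    violation = max(0.0, a @ y_raw - b)
    return y_raw - violation * a / (a @ a)

# Example: raw output [2, 2] is projected onto y1 + y2 <= 3.
print(enforce_halfspace(np.array([2.0, 2.0]), np.array([1.0, 1.0]), 3.0))
# -> [1.5, 1.5]
```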
Parmida Davarmanesh
Efficient and accurate steering of Large Language Models through attention-guided feature learning
Steering, or direct manipulation of internal activations to guide responses toward specific semantic concepts, is emerging as a promising avenue for both understanding how semantic concepts are stored within LLMs and advancing LLM capabilities. Yet, existing steering methods are remarkably brittle, with seemingly non-steerable concepts becoming completely steerable based on subtle algorithmic choices in how concept-related features are extracted. In this work, we introduce an attention-guided steering framework that overcomes three core challenges associated with steering: (1) automatic selection of relevant token embeddings for extracting concept-related features; (2) accounting for heterogeneity of concept-related features across LLM activations; and (3) identification of layers most relevant for steering. Across a steering benchmark of 512 semantic concepts, our framework substantially improved steering over previous state-of-the-art (nearly doubling the number of successfully steered concepts) across model architectures and sizes (up to 70 billion parameter models). Furthermore, we use our framework to shed light on the distribution of concept-specific features across LLM layers. Overall, our framework opens further avenues for developing efficient, highly-scalable fine-tuning algorithms for industry-scale LLMs.
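For orientation, the common difference-of-means baseline for extracting and applying a steering direction is sketched below; the talk's framework differs precisely in how it selects tokens, handles feature heterogeneity, and chooses layers.

```python
import numpy as np

def concept_direction(acts_with, acts_without):
    """Difference-of-means direction between activations on prompts that do
    and do not express the concept (a standard baseline extractor)."""
    return acts_with.mean(axis=0) - acts_without.mean(axis=0)

def steer(hidden, direction, alpha=4.0):
    """Shift a hidden state along the unit concept direction, strength alpha."""
    return hidden + alpha * direction / np.linalg.norm(direction)
```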
Cameron Hickert
Probability-Aware Parking Selection
Current navigation systems conflate time-to-drive with the true time-to-arrive by ignoring parking search duration and the final walking leg. Such underestimation can significantly affect user experience, mode choice, congestion, and emissions. To address this issue, this paper introduces the probability-aware parking selection problem, which aims to direct drivers to the best parking location rather than straight to their destination. An adaptable dynamic programming framework is proposed that leverages probabilistic, lot-level availability to minimize the expected time-to-arrive. Closed-form analysis determines when it is optimal to target a specific parking lot or explore alternatives, as well as the expected time cost. Sensitivity analysis and three illustrative cases are examined, demonstrating the model's ability to account for the dynamic nature of parking availability. Given the high cost of permanent sensing infrastructure, we assess the error rates of using stochastic observations to estimate availability. Experiments with real-world data from Seattle indicate this approach's viability, with mean absolute error decreasing from 7% to below 2% as observation frequency grows. In data-based simulations, probability-aware strategies demonstrate time savings up to 66% relative to probability-unaware baselines, yet still take up to 123% longer than time-to-drive estimates.
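The core expected-time recursion is easy to sketch. Assuming a fixed attempt order in which each lot has a known availability probability and the final fallback is guaranteed (e.g., an expensive garage), the expected time-to-arrive unrolls backwards; the paper's dynamic program optimizes over such choices.

```python
def expected_time_to_arrive(lots):
    """lots: list of (drive_minutes, p_available, walk_minutes) in attempt
    order. Assumes the final lot is a guaranteed fallback (p_available = 1)."""
    expected = 0.0
    for drive, p, walk in reversed(lots):
        expected = drive + p * walk + (1 - p) * expected
    return expected

# Example: try a close lot (60% available), else fall back to a garage.
print(expected_time_to_arrive([(10, 0.6, 3), (4, 1.0, 8)]))  # -> 16.6 minutes
```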
Session III: Control & Decision Making (8:45 - 10:30 AM)
Valia Efthymiou
Strategic Classification: sometimes strategic adaptation is desirable
Strategic classification examines how decision rules interact with agents who strategically adapt their features. Most existing models focus on maximizing predictive performance, assuming agents best respond to the learned classifier. However, real decision-making systems are rarely optimized solely for accuracy: ethical, economic, and institutional considerations often make some feature changes more desirable than others. At the same time, principals may wish to incentivize these changes fairly across heterogeneous agents. While prior work has studied causal structure between features, notions of desirability, and information disparities in isolation, this work initiates a unified treatment of these components within a single framework. We frame the problem as a constrained optimization problem that captures the trade-offs between accuracy, desirability, and fairness. We provide theoretical guarantees on the principal's optimality loss constrained to a particular desirability fairness tolerance for multiple broad classes of fairness measures. Finally, through experiments on real datasets, we show the explicit tradeoff between maximizing accuracy and fairness in desirability effort.
Will Sharpless
Bellman Value Decomposition
Hard constraints in reinforcement learning (RL) often degrade policy performance. Lagrangian methods offer a way to blend objectives with constraints, but require intricate reward engineering and parameter tuning. In this work, we extend recent advances that connect Hamilton-Jacobi (HJ) equations with RL to demonstrate how Bellman value functions for multi-objective satisfaction may be decomposed into graphs of value functions. In contrast with temporal logic approaches, which typically involve representing an automaton or using complex combinations of sparse rewards, we derive explicit, tractable Bellman equations, making this perspective more amenable to high-dimensional problems. We leverage our analysis to propose a variation of Proximal Policy Optimization, and demonstrate that it produces distinct behaviors from previous approaches, out-competing a number of baselines in success, safety, and speed.
Miroslav Kosanic
Composite Control of Grid-Following Inverters for Stabilizing AI-Induced Fast Power Disturbances
AI data center loads present a new stability challenge for power systems, as query-driven GPU workloads produce power transients that vary on millisecond timescales. Such rapid, unpredictable fluctuations can violate the timescale-separation assumptions underlying standard cascaded control designs for grid-following inverters. This paper uses a singular perturbation framework to provide rigorous stability guarantees for such systems. We show that the droop control structure emerges from reduced-system stability requirements rather than needing to be imposed a priori, and we prove that AI workloads satisfy a bounded-rate disturbance class due to physical constraints of power delivery hardware. The analysis yields explicit gain bounds linking inverter parameters to achievable disturbance rejection, a feasibility condition identifying the maximum tolerable load ramp rate, and a delay tolerance bound quantifying the tradeoff between fast power response and robustness to feedback communication latency.
Ashkan Soleymani
Cautious Optimism: A Meta-Algorithm for Near-Constant Regret in General Games
We introduce *Cautious Optimism*, a framework for substantially faster regularized learning in general games. Cautious Optimism, as a variant of Optimism, adaptively controls the learning pace in a dynamic, non-monotone manner to accelerate no-regret learning dynamics. It takes as input any instance of Follow-the-Regularized-Leader (FTRL) and outputs an accelerated no-regret learning algorithm (COFTRL) by pacing the underlying FTRL with minimal computational overhead. Importantly, it retains uncoupledness: learners do not need to know other players' utilities. Cautious Optimistic FTRL (COFTRL) achieves near-optimal $O(\log T)$ regret in diverse self-play (mixing and matching regularizers) while preserving the optimal $O(\sqrt{T})$ regret in adversarial scenarios. In contrast to prior works (e.g., Syrgkanis et al. [2015], Daskalakis et al. [2021]), our analysis does not rely on monotonic step sizes, showcasing a novel route for fast learning in general games. Moreover, instances of COFTRL achieve new state-of-the-art regret minimization guarantees in general convex games, exponentially improving the dependence on the dimension of the action space $d$ over previous works [Farina et al., 2022a].
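For reference, the entropic instance of FTRL (multiplicative weights) that Cautious Optimism can pace is sketched below with a fixed step size; the meta-algorithm's adaptive, non-monotone pacing is the paper's contribution and is not shown.

```python
import numpy as np

def ftrl_entropic(loss_sequence, eta=0.1):
    """FTRL over the simplex with entropic regularizer: play
    x_t proportional to exp(-eta * cumulative_loss), i.e., multiplicative
    weights. Returns the sequence of mixed strategies played."""
    d = len(loss_sequence[0])
    cumulative = np.zeros(d)
    plays = []
    for loss in loss_sequence:
        x = np.exp(-eta * cumulative)
        plays.append(x / x.sum())
        cumulative += np.asarray(loss)
    return plays
```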
Taylor Elise Baum
Adaptive Closed-loop Control of Arterial Blood PressureAbstract to be announced.
Jung-Hoon Cho
Formalizing and Estimating Task-Space Complexity for Zero-shot Generalization
Policies must operate across diverse conditions, yet a single controller is often conservative, while fully adaptive schemes can be prohibitively complex. We study contextual dynamical systems, where each task is parameterized by an observable context, and focus on zero-shot generalization of controllers. We formalize task-space complexity as the minimum number of context-independent controllers required to guarantee $\epsilon$-level performance across all contexts. The definition is instantiated for zero-shot generalization via $\epsilon$-tolerance sets that certify where a controller generalizes. We instantiate tolerance sets using a novel performance-based task dissimilarity measure, a signed divergence that upper-bounds the performance loss when transferring policies from a source context to a target context. Computing the exact complexity is NP-hard; we cast source selection as a set cover over contexts and introduce greedy and LP-relaxation algorithms to solve it. We develop a certified underapproximation algorithm to efficiently estimate the tolerance set. The geometric tolerance sets produced by certified underapproximations induce a geometric set cover structure, reducing evaluation cost while preserving guarantees. Numerical experiments on a linear Mass-Spring-Damper system and a nonlinear CartPole system demonstrate that the proposed approach attains the same certified $\epsilon$-coverage with up to 30% fewer policies than standard heuristics, empirically validating the task-space complexity framework.
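The source-selection step maps to greedy set cover; a minimal sketch under the abstract's framing (contexts as the universe, each candidate controller covering its tolerance set) follows, with illustrative names.

```python
def greedy_source_selection(contexts, tolerance_sets):
    """Greedy set cover: repeatedly pick the controller whose tolerance set
    covers the most still-uncovered contexts. tolerance_sets maps a controller
    to the set of contexts where it certifiably meets epsilon-level performance."""
    uncovered = set(contexts)
    chosen = []
    while uncovered:
        best = max(tolerance_sets, key=lambda c: len(uncovered & tolerance_sets[c]))
        if not uncovered & tolerance_sets[best]:
            raise ValueError("some contexts are not coverable by any controller")
        chosen.append(best)
        uncovered -= tolerance_sets[best]
    return chosen

# Example: three candidate controllers, each covering a band of contexts.
print(greedy_source_selection(
    contexts=range(6),
    tolerance_sets={"A": {0, 1, 2}, "B": {2, 3}, "C": {3, 4, 5}},
))  # -> ['A', 'C']
```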
Yingke Li
Beyond Explore–Exploit: Pragmatic Curiosity with Self-Consistent Learning and No-Regret Optimization
Bayesian Optimization (BO) and Bayesian Experimental Design (BED) are often treated as separate solutions for exploitation and exploration, yet many real-world systems face hybrid objectives where learning and optimization must be pursued simultaneously under uncertainty, constraints, and limited data. This talk presents a principled approach to such hybrid learning–optimization problems by synthesizing two complementary results through the lens of active inference. First, I introduce pragmatic curiosity, an Expected Free Energy (EFE) objective that unifies BO and BED by explicitly trading off epistemic value (information gain) and pragmatic value (task utility) within a single decision rule, enabling effective behavior across hybrid tasks such as constrained system identification, targeted active search, and composite optimization with unknown preferences. Second, I provide the theoretical foundation behind this unification: under a sufficient curiosity condition, EFE-minimizing agents achieve both self-consistent learning (Bayesian posterior consistency) and efficient decision-making (bounded cumulative regret). Together, these results elevate curiosity from an ad hoc exploration heuristic to an intrinsic regularizer that couples belief updating and action selection, yielding a unified framework that is both practically effective and provably sound for hybrid learning–optimization.
Mingyang Liu
Computing Equilibrium beyond Unilateral Deviation
Abstract to be announced.
Session IV: Statistics & Information (1:30 - 2:45 PM)
Madelyn Andersen
Variational Inference in Implementation
Abstract to be announced.
Renfei Tan
Multi-agent Adaptive Mechanism Design
We study a sequential mechanism design game in which a principal seeks to elicit truthful data from multiple rational agents while starting with no prior knowledge of agents' skills or beliefs. We introduce the Distributionally Robust Adaptive Mechanism (DRAM), an algorithm combining insights from online learning to jointly address agents' truthfulness and the reward mechanism's cost-optimality. Throughout the sequential game, the mechanism estimates agents' skills, then iteratively updates a distributionally robust linear program with shrinking ambiguity sets to reduce payments while preserving truthfulness. Our mechanism guarantees truthful reporting with high probability while achieving $\tilde{O}(\sqrt{T})$ cumulative regret, and we establish a matching lower bound showing that no truthful adaptive mechanism can asymptotically do better. To our knowledge, this is the first adaptive mechanism in this general setting that maintains truthfulness and achieves optimal regret when incentive constraints are unknown and must be learned.
Flora Shi
Instance-Adaptive Hypothesis Tests with Heterogeneous Agents
We study hypothesis testing over a heterogeneous population of strategic agents with private information. Any single test applied uniformly across the population yields statistical error that is sub-optimal relative to the performance of an oracle given access to the private information. We show how it is possible to design menus of statistical contracts that pair type-optimal tests with payoff structures, inducing agents to self-select according to their private information. This separating menu elicits agent types and enables the principal to match the oracle performance even without a priori knowledge of the agent type. Our main result fully characterizes the collection of all separating menus that are instance-adaptive, matching oracle performance for an arbitrary population of heterogeneous agents. We identify designs where information elicitation is essentially costless, requiring negligible additional expense relative to a single-test benchmark, while improving statistical performance. Our work establishes a connection between proper scoring rules and menu design, showing how the structure of the hypothesis test constrains the elicitable information. Numerical examples illustrate the geometry of separating menus and the improvements they deliver in error trade-offs. Overall, our results connect statistical decision theory with mechanism design, demonstrating how heterogeneity and strategic participation can be harnessed to improve efficiency in hypothesis testing.
Anzo Teh
Solving Empirical Bayes via Transformers
We use transformers to solve one of the oldest statistical problems: Poisson means under the empirical Bayes (Poisson-EB) setting. Here, a transformer model is pre-trained on a set of synthetically generated pairs and learns to do in-context learning (ICL) by adapting to the unknown prior. Practically, we discover that very small models (300k parameters) can outperform the best classical algorithm (non-parametric maximum likelihood, NPMLE) in both runtime and validation loss, which we compute on out-of-distribution synthetic data as well as real-world datasets. Theoretically, we present our findings from two angles: first, we show that transformers can approximate classical EB estimators and therefore achieve vanishing regret; second, we use the theory of posterior contraction to explain how a pre-trained transformer, acting like a hierarchical Bayes estimator, can achieve vanishing regret.
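For concreteness, Robbins' estimator, a classical empirical Bayes estimator for the Poisson means problem (cited here as background, not necessarily one of the talk's baselines), fits in a few lines; NPMLE and the transformer approach are more involved.

```python
from collections import Counter

def robbins_estimates(xs):
    """Robbins' estimator for Poisson-EB: E[theta | X = x] is approximated by
    (x + 1) * N(x + 1) / N(x), where N(k) is the empirical count of value k."""
    counts = Counter(xs)
    return [(x + 1) * counts[x + 1] / counts[x] for x in xs]

# Example: per-observation posterior-mean estimates from raw counts.
print(robbins_estimates([0, 0, 1, 1, 1, 2, 3]))
```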
Lelia Marie Hampton
Targeted urban afforestation can substantially reduce income-based heat disparities in US cities
Previous studies on urban heat mitigation, critical for urban planning and public health, have generally focused on a handful of cities, ignored logistical constraints, or insufficiently resolved urban-scale processes. Here, we fuse satellite-derived estimates of urban heat and multiple physical properties to develop a non-parametric machine learning approach that captures non-linearities in thermal anomalies (ΔAT) across 493 U.S. cities. This enables computationally efficient, data-driven assessments of urban heat mitigation strategies, including strategies targeting low-income communities, since ~90% of these cities show income-based temperature disparities. All strategies lower daytime ΔAT, with targeted afforestation with (without) albedo management reducing daytime ΔAT for low-income groups from 0.56±0.94℃ to 0.22±0.92℃ (0.24±0.93℃) and the income-based ΔAT gap from -0.50±0.94℃ to -0.15±0.91℃ (-0.17±0.93℃). Our results demonstrate the importance of targeted heat mitigation in low-income communities, where residents have fewer options to adapt to extreme heat.
Josefina Correa
A Weakly Informative Prior for Bayesian Autoregressive Models
Estimating the spectral content of wide-sense stationary signals is a common approach to characterizing how an intervention affects measured time-series data. One example is electroencephalogram signals recorded under anesthesia. In clinical studies, electroencephalogram signals are collected over multiple subjects and their spectra are computed using either Fourier-based or parametric approaches. A common analysis entails comparing the spectrum of a wide-sense stationary data window after the stimulus onset to the spectrum before. However, conventional approaches to analyzing these data do not account for between-subject variability and can in turn provide inaccurate inferences. This work develops a Bayesian hierarchical autoregressive modeling framework to estimate subject-level and population-level spectra. Our formulation provides a principled approach for constructing cohort-level estimates, which can be used to assess the extent to which a new subject is consistent with a cohort-level response. By formulating our model in terms of the partial autocorrelation function of the wide-sense stationary process, we obtain a prior that is weakly informative and can be used to check model assumptions. We validate our framework in simulation and apply it to the analysis of electroencephalogram signals from ten healthy volunteers undergoing propofol-mediated anesthesia.
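The appeal of the partial autocorrelation parameterization is that any sequence with entries in (-1, 1) corresponds to a stationary AR process, so a prior can be placed coordinate-wise on the partial autocorrelations. The standard Durbin-Levinson recursion performing that map is sketched below (illustrative, not the authors' code).

```python
def pacf_to_ar(pacf):
    """Durbin-Levinson recursion: map partial autocorrelations (each in
    (-1, 1)) to the coefficients of a stationary autoregressive model."""
    phi = []
    for k, r in enumerate(pacf, start=1):
        new = phi + [r]
        for j in range(k - 1):
            new[j] = phi[j] - r * phi[k - 2 - j]
        phi = new
    return phi

# Example: AR(2) coefficients from partial autocorrelations (0.5, -0.2).
print(pacf_to_ar([0.5, -0.2]))  # -> [0.6, -0.2]
```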