Relaxing the IID Assumption: Adaptively Minimax Optimal Regret via Root-Entropic Regularization

Abstract

We consider sequential prediction with expert advice when data are generated from distributions varying arbitrarily within an unknown constraint set. We quantify relaxations of the classical i.i.d. assumption in terms of these constraint sets, with i.i.d. sequences at one extreme and adversarial mechanisms at the other. The Hedge algorithm, long known to be minimax optimal in the adversarial regime, was recently shown to be minimax optimal for i.i.d. data. We show that Hedge with deterministic learning rates is suboptimal between these extremes, and present a new algorithm that adaptively achieves the minimax optimal rate of regret with respect to our relaxations of the i.i.d. assumption, and does so without knowledge of the underlying constraint set. We analyze our algorithm using the follow-the-regularized-leader framework, and prove it corresponds to Hedge with an adaptive learning rate that implicitly scales as the square root of the entropy of the current predictive distribution, rather than the entropy of the initial predictive distribution.