- Conservative Bandits. In International Conference on Machine Learning (ICML), New York, NY, USA. June 20–22, 2016. (25% acceptance)
We study a novel multi-armed bandit problem that models the challenge faced by a company wishing to explore new strategies to maximize revenue whilst simultaneously maintaining their revenue above a fixed baseline, uniformly over time. While previous work addressed the problem under the weaker requirement of maintaining the revenue constraint only at a given fixed time in the future, the algorithms previously proposed are unsuitable due to their design under the more stringent constraints. We consider both the stochastic and the adversarial settings, where we propose, natural, yet novel strategies and analyze the price for maintaining the constraints. Amongst other things, we prove both high probability and expectation bounds on the regret, while we also consider both the problem of maintaining the constraints with high probability or expectation. For the adversarial setting the price of maintaining the constraint appears to be higher, at least for the algorithm considered. A lower bound is given showing that the algorithm for the stochastic setting is almost optimal. Empirical results obtained in synthetic environments complement our theoretical findings.
- Exploiting Symmetries to Construct Efficient MCMC Algorithms With an Application to SLAM. M.Sc. Thesis. University of Alberta, Edmonton, Canada. April 2015. (Winner of the Department of Computing Science M.Sc. Outstanding Thesis Award.)
Sampling from a given probability distribution is a key problem in many different disciplines. Markov chain Monte Carlo (MCMC) algorithms approach this problem by constructing a random walk governed by a specially constructed transition probability distribution. As the random walk progresses, the distribution of its states converges to the required target distribution. The Metropolis-Hastings (MH) algorithm is a generally applicable MCMC method which, given a proposal distribution, modifies it by adding an accept/reject step: it proposes a new state based on the proposal distribution and the existing state of the random walk, then either accepts or rejects it with a certain probability; if it is rejected, the old state is retained. The MH algorithm is most effective when the proposal distribution closely matches the target distribution: otherwise most proposals will be rejected and convergence to the target distribution will be slow. The proposal distribution should therefore be designed to take advantage of any known structure in the target distribution.
A particular kind of structure that arises in some probabilistic inference problems is that of symmetry: the problem (or its sub-problems) remains unchanged under certain transformations. A simple kind of symmetry is the choice of a coordinate system in a geometric problem; translating and rotating the origin of a plane does not affect the relative positions of any points on it. The field of group theory has a rich and fertile history of being used to characterize such symmetries; in particular, topological group theory has been applied to the study of both continuous and discrete symmetries. Symmetries are described by a group that acts on the state space of a problem, transforming it in such a way that the problem remains unchanged. We consider problems in which the target distribution has factors, each of which has a symmetry group; each factor's value does not change when the space is transformed by an element of its corresponding symmetry group.
This thesis proposes a variation of the MH algorithm where each step first chooses a random transformation of the state space and then applies it to the current state; these transformations are elements of suitable symmetry groups. The main result of this thesis extends the acceptance probability formula of the textbook MH algorithm to this case under mild conditions, adding much-needed flexibility to the MH algorithm. The new algorithm is also demonstrated in the Simultaneous Localization and Mapping (SLAM) problem in robotics, in which a robot traverses an unknown environment, and its trajectory and a map of the environment must be recovered from sensor observations and known control signals. Here the group moves are chosen to exploit the SLAM problem's natural geometric symmetries, obtaining the first fully rigorous justification of a previous MCMC-based SLAM method. New experimental results comparing this method to existing state-of-the-art specialized methods on a standard range-only SLAM benchmark problem validate the strength of the approach.
- Conservative Bandits. In Workshop on Machine Learning for eCommerce, Neural Information Processing Systems (NIPS), Montréal, QC, Canada. December 11, 2015.
We study a novel multi-armed bandit problem that models the challenge faced by a company wishing to explore new strategies whilst simultaneously maintaining their revenue above a fixed baseline. We develop a new algorithm that satisfies the revenue constraint and also enjoys problem dependent regret guarantees with respect to the optimal action. Lower bounds are given showing that the new algorithm is almost optimal.