Adam White

PI of the Reinforcement Learning and Artificial Intelligence Lab (RLAI)
Fellow of the Alberta Machine Intelligence Institute (Amii)
Canada CIFAR AI Chair
Assistant Professor
Department of Computing Science
University of Alberta

amw8@ualberta.ca
(780) 908-5499

Staff Research Scientist
DeepMind Alberta
adamwhite@google.com

I graduated from the University of Alberta with a Ph.D. in Computing Science in 2015. During my doctoral studies I was advised by Richard Sutton, working in the RLAI lab. After that, I worked as a postdoctoral fellow and research scientist in the Department of Computer Science at Indiana University Bloomington.


My Students

If you are interested in joining my group as an MSc student, please apply directly to the MSc program. Do not contact me! I have no control over the admissions process: admission is based on grades, previous research experience, your research statement, and the quality of your reference letters. All students accepted to our MSc program receive guaranteed TA funding. If you would like to work with me, first apply to the MSc program, then contact me once you are admitted.

Eugene Chen (MSc)
Tom Ferguson (PDF)
Andrew Jacobsen (PhD)
Edan Meyer (MSc)
Samuel Neumann (PhD)
Subhojeet Pramanik (MSc)
Banafshe Rafiee (PhD)
Matthew Schlegel (PhD)
Han Wang (PhD)

Alumni

David Tao (MSc, 2022)
Samuel Neumann (MSc, 2022)
Derek Li (MSc, 2022)
Paul Liu (MSc, 2022)
Sina Ghiassian (PhD, 2022)
Raksha Kumaraswamy (PhD, 2021)
Matt McLeod (MSc, 2021)
Archit Sakhadeo (MSc, 2021)
Xutong Zhao (MSc, 2021)
Cam Linke (MSc, 2020)
Han Wang (MSc, 2020)
Niko Yasui (MSc, 2020)
Andrew Jacobsen (MSc, 2019)
Banafsheh Rafiee (MSc, 2018)

Teaching

Reference Document: Empirical Design in Reinforcement Learning


Coursera Specialization on Reinforcement Learning


CMPUT 655: Reinforcement Learning I - Fall 2022
CMPUT 365: Introduction to Reinforcement Learning I - Fall 2021
CMPUT 607: Empirical Reinforcement Learning - Winter 2021
CMPUT 397: Reinforcement Learning I - Fall 2019
CMPUT 366: Intelligent Systems - Fall 2018
CMPUT 366: Intelligent Systems - Fall 2017
CMPUT 609: Reinforcement Learning - Fall 2017
CSCI-B 659: Reinforcement learning for Artificial Intelligence - Spring 2017
CSCI-B 659: Reinforcement learning for Artificial Intelligence - Spring 2016

Research

Keywords: Reinforcement Learning, Robotics, Knowledge Representation and Intrinsic Motivation

Adam's research focuses on understanding the fundamental principles of learning in young humans and animals, and on building artificial systems with the same capabilities. Adam seeks to understand the algorithms and representations that allow people to progress from motor babbling, to open-ended play, to purposeful goal-directed behaviours. He is interested in continual learning problems, where the agent is much smaller than the world and thus must continue to learn, react, and track in order to perform well.

In particular, Adam's lab has investigated intrinsic reward and exploration, more efficient algorithms for off-policy learning, practical strategies for automatic hyperparameter tuning and meta-learning, representations for online continual prediction in the face of partial observability, and new approaches to planning with learned models. We try to connect our algorithmic and simulation work with real applications: we are currently working on using reinforcement learning to control a fresh-water treatment plant. In addition, Adam's group is deeply passionate about good empirical practices and new methodologies to help determine whether our algorithms are ready for deployment in the real world.





Curriculum vitae

My current CV can be found here.

Journal Papers

Wang, H., Sakhadeo, A., White, A., Bell, J. M., Liu, V., Zhao, X., Kozuno, T., Fyshe, A., White, M. (2022). No More Pesky Hyperparameters: Offline Hyperparameter Tuning for RL. Transactions on Machine Learning Research.

Patterson, A., White, A., White, M. (2022). A Generalized Projected Bellman Error for Off-policy Value Estimation in Reinforcement Learning. Journal of Machine Learning Research.

Rafiee, B., Abbas, Z., Ghiassian, S., Kumaraswamy, R., Sutton, R. S., Ludvig, E. & White, A. (2022). From eye-blinks to state construction: diagnostic benchmarks for online representation learning. Adaptive Behavior.

Schlegel, M., Jacobsen, A., Zaheer, M., Patterson, A., White, A., & White, M. (2021). General value function networks. Journal of Artificial Intelligence Research.

Linke, C., Ady, N. M., White, M., Degris, T., & White, A. (2020).
Adapting behaviour via intrinsic reward: A survey and empirical study. Journal of Artificial Intelligence Research.

Modayil, J., White, A., Sutton, R. S. (2014). Multi-timescale Nexting in a Reinforcement Learning Robot. Adaptive Behavior, 22(2):146--160.

Whiteson, S., Tanner, B., & White, A. (2010). The reinforcement learning competitions. AI Magazine, 31(2): 81--94.

Tanner, B., & White, A. (2009). RL-Glue: Language-independent software for reinforcement-learning experiments. The Journal of Machine Learning Research, 10: 2133--2136.


Conference Papers

Jiang, R., Zhang, S., Chelu, V., White, A., & van Hasselt, H. (2022). Learning Expected Emphatic Traces for Deep RL. AAAI Conference on Artificial Intelligence.

McLeod, M., Lo, C., Schlegel, M., Jacobsen, A., Kumaraswamy, R., White, M., & White, A. (2021). Continual auxiliary task learning. Advances in Neural Information Processing Systems, 34.

Jiang, R., Zahavy, T., Xu, Z., White, A., Hessel, M., Blundell, C., van Hasselt, H. (2021). Emphatic Algorithms for Deep Reinforcement Learning. International Conference on Machine Learning (ICML).

Ghiassian S., Patterson A., Garg S., Gupta D., White A., White M. (2020). Gradient Temporal-Difference Learning with Regularized Corrections. International Conference on Machine Learning (ICML).

Ghiassian S., Rafiee B., Long Lo Y., White A. (2020). Improving Performance in Reinforcement Learning by Breaking Generalization in Neural Networks. International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS).

Nath S, Liu V., Chan A., White A., White M. (2020). Training Recurrent Neural Networks Online by Learning Explicit State Variables. International Conference on Learning Representations (ICLR).

Wan Y., Zaheer M., Sutton R., White A., White M. (2019). Planning with Expectation Models. The International Joint Conference on Artificial Intelligence (IJCAI).

Rafiee B., Ghiassian S., White, A., Sutton R. (2019). Prediction in Intelligence: An Empirical Comparison of Off-policy Algorithms on Robots. The 18th International Conference on Autonomous Agents and Multiagent Systems (AAMAS).

Jacobsen A., Schlegel M., Linke C., Degris T., White, A., White M. (2019). Meta-descent for online, continual prediction. AAAI Conference on Artificial Intelligence.

Kumaraswamy R., Schlegel M., White, A., White M. (2018). Context-dependent upper-confidence bounds for directed exploration. Advances in Neural Information Processing Systems (NIPS).

Sherstan C., Bennett B., Young K., Ashley D., White, A., White M., Sutton R. (2018). Directly Estimating the Variance of the λ-Return Using Temporal-Difference Methods. Conference on Uncertainty in Artificial Intelligence (UAI).

Pan Y., Zaheer M., White, A., Patterson A., White M. (2018). Organizing experience: a deeper look at replay mechanisms for sample-based planning in continuous state domains. International Joint Conference on Artificial Intelligence (IJCAI).

Pan Y., White, A., White M. (2017). Accelerated Gradient Temporal Difference Learning. AAAI Conference on Artificial Intelligence (AAAI).

Sherstan, C., Machado, M., White, A., Pilarski, P. M. (2016). Introspective Agents: Confidence Measures for General Value Functions. Artificial General Intelligence (AGI).

White, A., White M. (2016). Investigating practical linear temporal difference learning. In International Conference on Autonomous Agents and MultiAgent Systems (AAMAS). [ CODE ]

White, M., White A. (2016) Adapting the trace parameter in reinforcement learning, In International Conference on Autonomous Agents and MultiAgent Systems (AAMAS).

White, A., Modayil, J., & Sutton, R. S. (2012). Scaling life-long off-policy learning. In the IEEE International Conference on Development and Learning and Epigenetic Robotics, 1--6. [paper of distinction award]

Modayil, J., White, A., Pilarski, P. M., & Sutton, R. S. (2012). Acquiring a broad range of empirical knowledge in real time by temporal-difference learning. In the IEEE International Conference on Systems, Man, and Cybernetics, 1903--1910.

Modayil, J., White, A., Sutton, R. S. (2012). Multi-timescale Nexting in a Reinforcement Learning Robot. Presented at the 2012 International Conference on Adaptive Behaviour, Odense, Denmark. In: SAB 2012, LNAI 7426, pp. 299--309, T. Ziemke, C. Balkenius, and J. Hallam, Eds., Springer Heidelberg.

Sutton, R. S., Modayil, J., Delp, M., Degris, T., Pilarski, P. M., White, A., & Precup, D. (2011). Horde: A scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction. In The 10th International Conference on Autonomous Agents and Multiagent Systems: 2, 761--768.

White, M., & White, A. (2010). Interval estimation for reinforcement-learning algorithms in continuous-state domains. In Advances in Neural Information Processing Systems, 2433--2441.

Sturtevant, N. R., & White, A. M. (2007). Feature construction for reinforcement learning in hearts. In Computers and Games. Springer Berlin Heidelberg, 122--134.


Preprints

Wang, H., Miahi, E., White, M., Machado, M. C., Abbas, Z., Kumaraswamy, R., Liu, V., & White, A. (2022). Investigating the Properties of Neural Network Representations in Reinforcement Learning. arXiv preprint arXiv:2203.15955.

Wang, H., Sakhadeo, A., White, A., Bell, J., Liu, V., Zhao, X., Liu, P., Kozuno, T., Fyshe, A., & White, M. (2022). No More Pesky Hyperparameters: Offline Hyperparameter Tuning for RL. arXiv preprint arXiv:2205.08716.

Sutton, R. S., Machado, M. C., Holland, G. Z., Szepesvari, D., Timbers, F., Tanner, B., & White, A. (2022). Reward-Respecting Subtasks for Model-Based Reinforcement Learning. arXiv preprint arXiv:2202.03466.

Patterson, A., Neumann, S., White, A. M., Kumaraswamy, R., & White, M. (2021). The Cross-environment Hyperparameter Setting Benchmark for Reinforcement Learning.

Ghiassian S., Patterson A., White M., Sutton R. S., White A. (2019). Online Off-policy Prediction.

Other published works

Yasui N., Lim S., Linke C., White A., White M. (2019). An Empirical and Conceptual Categorization of Value-based Exploration Methods. ICML Exploration in Reinforcement Learning Workshop.

Pan Y., White, A., White M. (2017). Accelerated Gradient Temporal Difference Learning. European Workshop on Reinforcement Learning (EWRL).

Schlegel M., White, A., White M. (2017). Stable predictive representations with general value functions for continual learning. Continual Learning and Deep Networks workshop at the Neural Information Processing Systems Conference.

White, A., & Sutton, R. S. (2014). GQ(λ) Quick Reference Guide.

White, A., Modayil, J., & Sutton, R. S. (2014). Surprise and curiosity for big data robotics. In Workshops at the Twenty-Eighth AAAI Conference on Artificial Intelligence.

Modayil, J., White, A., Pilarski, P. M., Sutton, R. S. (2012). Acquiring Diverse Predictive Knowledge in Real Time by Temporal-difference Learning. International Workshop on Evolutionary and Reinforcement Learning for Autonomous Robot Systems, Montpellier, France. [Best paper award]

Modayil, J., Pilarski, P., White, A., Degris, T., & Sutton, R. (2010). Off-policy knowledge maintenance for robots. In Proceedings of Robotics Science and Systems Workshop (Towards Closing the Loop: Active Learning for Robotics) : 55.


Theses

White, A. (2015) Developing a predictive approach to knowledge. Doctoral thesis, University of Alberta.

White, A. (2006) A standard system for benchmarking in reinforcement learning. Master's thesis, University of Alberta.


See my Google Scholar page for a list of my publications that Google knows about.


Contact info

Office: 307 Athabasca Hall

Mail:
Department of Computing Science
University of Alberta
Edmonton, Alberta
Canada