Homepage of Csaba Szepesvári

Department of Computing Science
University of Alberta
Edmonton, Alberta
Canada T6G 2E8
Office: 311 Athabasca Hall
Email: szepesva AT cs DOT ualberta DOT ca
Phone: (780) 492-8581
Fax: (780) 492-6393
[Book: Algorithms for Reinforcement Learning] [en/hu dict]
[CMPUT 412]
[math genealogy]

Who am I?

Faculty at the Department of Computing Science, one of the 10 PIs at AICML (the Alberta Innovates Centre for Machine Learning), and a member of the Reinforcement Learning and Artificial Intelligence group.

More importantly, however, I am part of my loving family: my wife is Beáta, and our kids are Dávid, Réka, Eszter and Csongor. Short bio.

Csaba's family

News

  • (December 2015) NIPS + Workshops, Montreal!
  • (November 2015) Trip to the UK, visiting Yee Whye Teh, then off to Singapore for the Learning and Games workshop.
  • (October 2015) ALT/DS 2015 in Banff; I was a co-organizer, with Sandra.
  • (September 2015) Four papers accepted at NIPS, yay!
  • (August 2015) Tutorial on reinforcement learning at the Machine Learning Summer School in Kyoto, Japan. slides
  • (July 2015) Keynote at EWRL on "Lazy Posterior Sampling for Parametric Nonlinear Control". slides
  • (July 2015) Invited talk on "Online learning and prediction on a budget" at the ICML 2015 Workshop on Resource-Efficient Machine Learning. slides
  • (July 2015) Talk on our IJCAI'15 paper, "Fast Cross-Validation for Incremental Learning", at DeepMind. slides
  • (June-July 2015) Lectures at the Online Learning Summer School in Copenhagen. Topics included linear bandits, Linear UCB, generalized linear bandits, combinatorial bandits, partial monitoring, and Thompson sampling for MDPs.
  • (June 2015) Invited talk (slides) on "How to Explore to Maximize Future Return" at AI 2015. Crazy timing (the conference dates totally coincided with the NIPS deadline)! But I had no choice: I had to go to see Yaoliang receive his award for the best PhD thesis in AI in Canada in 2015! Congrats Yaoliang, well done!!
  • (March 2015) 3-part tutorial on "Online learning" at the Indian Institute of Technology (IIT) Madras, at the Workshop on Advances in Reinforcement Learning.
  • (March 2015) Talk on "Exploiting Symmetries to Construct Efficient MCMC Algorithms" at Waterloo. slides
  • (January 2015) Distinguished lecture at UBC on "Learning to Make Better Decisions: Challenges for the 21st Century". slides

For students

  • Prospective grad students who are interested in joining the Statistical Machine Learning degree specialization program, which is a joint program between our department and the MathStat department, should look here. Why should you apply?
  • Here is some advice for present and future grad students.
  • Responding to an "emergency situation" back in 2008, I spent a few hours searching the IEEE website to collect recent references on applications of RL. The results are now linked from the page on Successes of RL. See also Satinder's similarly titled page here.

Research interests

Online learning research develops learning algorithms that show good online performance, i.e., good performance while learning. Online learning tasks are sequential: in each step of the sequential process, the learning algorithm receives some information from the environment and makes a prediction so as to minimize the prediction loss. My team and I focus on interactive online learning problems: sequential processes where the predictions influence what future information is received. Interactive online learning problems are studied in various disciplines, for example within control theory under the name "dual control", or within machine learning itself in the area of reinforcement learning. While these problems are natural, interactive online learning is perhaps the least developed area within online learning. To make progress, we explore special cases of interactive online learning, which allows us to identify and study the key issues in isolation. Besides, developing better algorithms for these special cases is of independent interest, as they often have interesting uses on their own. We also study more fundamental questions as they arise.

Big picture: I am interested in machine learning. In particular, I like to think about how to make the most efficient use of data in various situations, and also how this can be done algorithmically. I am particularly interested in sequential decision-making problems, which, when learning is put into the picture, lead to reinforcement learning.

Up to 2008, the most frequently occurring keywords associated with my publications were: theory (80), reinforcement learning (49), application (31), neural networks (24), stochastic approximation (17), function approximation (16), nonparametrics (15), control (15), online learning (13), adaptive control (10), performance bounds (10), vision (10), Monte-Carlo methods (8), particle filtering (8). There is a fair amount of noise in these numbers, and the chronology also matters: for example, I focused on neural networks up to around 2001 :)
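To make the interactive protocol above concrete, here is a minimal sketch (my illustration only, not code from any paper of mine) of an interactive online learning loop: a two-armed bandit run with the classic UCB1 rule. The arm reward probabilities, function names and round count are all made up for the example; the point is that the learner's action determines which piece of feedback it observes, which is exactly the coupling that distinguishes interactive from passive online learning.

    import math
    import random

    def ucb1(arms, n_rounds):
        """Interactive online learning loop: only the chosen arm's reward is observed."""
        k = len(arms)
        counts = [0] * k    # number of times each arm was pulled
        means = [0.0] * k   # empirical mean reward of each arm
        total = 0.0
        for t in range(1, n_rounds + 1):
            if t <= k:
                a = t - 1   # pull every arm once to initialize the estimates
            else:
                # Optimism in the face of uncertainty: empirical mean plus a confidence bonus.
                a = max(range(k),
                        key=lambda i: means[i] + math.sqrt(2 * math.log(t) / counts[i]))
            r = arms[a]()   # the environment reveals feedback only for the chosen action
            counts[a] += 1
            means[a] += (r - means[a]) / counts[a]
            total += r
        return total / n_rounds

    # Toy usage: two Bernoulli arms with (hypothetical) success probabilities 0.4 and 0.6.
    # The average reward should approach 0.6 as the learner identifies the better arm.
    arms = [lambda: float(random.random() < 0.4),
            lambda: float(random.random() < 0.6)]
    print(ucb1(arms, n_rounds=10000))

Replacing the confidence bonus with a draw from a posterior over each arm's mean gives Thompson sampling, one of the topics of the Copenhagen lectures mentioned above.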