Homepage of Csaba Szepesvári

Department of Computing Science
University of Alberta
Edmonton, Alberta
Canada T6G 2E8
Office: 311 Athabasca Hall
Email: szepesva AT cs DOT ualberta DOT ca
Phone: (780) 492-8581
Fax: (780) 492-6393
Book cover for my book 'Algorithms for Reinforcement Learning' [en/hu dict]
[CMPUT 412]
[math genealogy]

Who is this guy?

Faculty at the Department of Computing Science, one of the 10 PIs at AICML (the Alberta Innovates Centre for Machine Learning), and a member of the Reinforcement Learning and Artificial Intelligence group.

However, more importantly, I am part of my loving family. My wife is Beáta, our kids are Dávid, Réka, Eszter and Csongor. Short bio.

Csaba's family


  • (October 8, 2016) You should check out Challo! It is a fun phone app that lets you challenge your friends, keep track of who did what challenge and have tons of fun. Why is this here? The app is developed by Deon and Reka!
  • (September 1, 2016) Bandits blog devoted to the bandits course's material and beyond. New contents every week!
  • (August 16, 2016) Bandit Algorithms: My new graduate course co-developed with Tor.


  • Prospective grad students who are interested in joining the Statistical Machine Learning degree specialization program, which is a joint program between our department and the MathStat department, should look here. Why should you apply?
  • Here is some advice for present and future grad students.
  • Responding to an "emergency situation", back in 2008 I spent a few hours searching the IEEE website to collect recent references on applications of RL. Here are the results, which are now linked to the page on Successes of RL. See also Satinder's similarly titled page here.

Research interests

Online learning research develops learning algorithms that show good online performance, i.e., good performance while learning. Online learning tasks are sequential: In each step of the sequential process, the learning algorithm receives some information from the environment and makes a prediction so as to minimize the prediction loss. My team and I focus on interactive online learning problems, sequential processes where the predictions influence what future information is received. Interactive online learning problems are studied in various disciplines, such as within control theory under the name "dual control", or within machine learning itself in the area of reinforcement learning. While these problems are natural, interactive online learning is perhaps the area that is the least developed within online learning. To make progress, we explore special cases of interactive online learning, which allows us to identify and study the key issues in isolation. Besides, developing better algorithms for these special cases is of independent interest, as they often have interesting uses on their own. We also study more fundamental questions as they arise.

Big picture: I am interested in machine learning. In particular, I like to think about how to make the most efficient use of data in various situations and also how this can be done algorithmically. I am particularly interested in sequential decision making problems, which, when learning is put into the picture, lead to reinforcement learning.

Up to 2008, the most frequently occurring keywords associated with my publications were theory (80), reinforcement learning (49), application (31), neural networks (24), stochastic approximation (17), function approximation (16), nonparametrics (15), control (15), online learning (13), adaptive control (10), performance bounds (10), vision (10), Monte-Carlo methods (8), particle filtering (8). There is a fair amount of noise in the numbers here. And the chronology is also somewhat important. For example, I focused on neural networks up to around 2001:)