Pierre-Luc Bacon is a Facebook CIFAR AI Chair at Mila and an assistant professor at the Department of Computer Science and Operations Research (DIRO) at Université de Montréal.
Bacon’s research addresses the curse of horizon: the difficulty of learning and planning over long time spans. He approaches this problem from a representation learning perspective grounded in optimization methods. In reinforcement learning, his work focuses on learning over long time spans, building on the theoretical framework of temporally abstract actions of Sutton et al.
- Outstanding Student Paper Award, Association for the Advancement of AI, 2017
- Best Paper Award, Hierarchical Reinforcement Learning Workshop, Neural Information Processing Systems, 2017
Harb, J., Bacon, P. L., Klissarov, M., & Precup, D. (2018). When waiting is not an option: Learning options with a deliberation cost. In Thirty-Second AAAI Conference on Artificial Intelligence.
Henderson, P., Chang, W. D., Bacon, P. L., Meger, D., Pineau, J., & Precup, D. (2018). OptionGAN: Learning joint reward-policy options using generative adversarial inverse reinforcement learning. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 32, No. 1).
Touati, A., Bacon, P. L., Precup, D., & Vincent, P. (2018). Convergent Tree Backup and Retrace with function approximation. In International Conference on Machine Learning (pp. 4955-4964). PMLR.
Bacon, P. L., Harb, J., & Precup, D. (2017). The option-critic architecture. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 31, No. 1).
Bengio, E., Bacon, P. L., Pineau, J., & Precup, D. (2015). Conditional computation in neural networks for faster models.