Yoshua Bengio



  • Program Co-Director
  • Canada CIFAR AI Chair
  • Learning in Machines & Brains
  • Pan-Canadian AI Strategy




Recognized worldwide as one of the leading experts in artificial intelligence, Yoshua Bengio is best known for his pioneering work in deep learning (DL), which earned him the 2018 A.M. Turing Award, “the Nobel Prize of Computing,” alongside Geoffrey Hinton and Yann LeCun. 

He is a Full Professor at Université de Montréal, and the Founder and Scientific Director of Mila – Quebec AI Institute. He co-directs the CIFAR Learning in Machines & Brains program as Senior Fellow and acts as Scientific Director of IVADO. 

In 2018, he collected the largest number of new citations in the world of any computer scientist, and in 2019 he was awarded the prestigious Killam Prize. He is a Fellow of both the Royal Society of London and the Royal Society of Canada, and an Officer of the Order of Canada. 

Concerned about the social impact of AI, he actively contributed to the Montreal Declaration for the Responsible Development of Artificial Intelligence.


  • A.M. Turing Award 2018
  • Killam Prize in Natural Sciences 2019
  • Fellow of the Royal Society of London 2020
  • Fellow of the Royal Society of Canada 2017
  • Officer of the Order of Canada 2017
  • IEEE CIS Neural Networks Pioneer Award 2019
  • Full Professor, Department of Computer Science and Operations Research, UdeM
  • Founder and Scientific Director of Mila
  • Scientific Director of IVADO
  • Canada CIFAR AI Chair
  • Co-Director of the CIFAR Learning in Machines & Brains program

Relevant Publications

  • Goodfellow, Ian J., Yoshua Bengio, and Aaron Courville (2016). Deep Learning. MIT Press.
  • Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio (2015). “Neural Machine Translation by Jointly Learning to Align and Translate”. In: ICLR’2015, arXiv:1409.0473.
  • LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton (2015). “Deep Learning”. In: Nature 521.7553, pp. 436–444.
  • Dauphin, Yann, Razvan Pascanu, Caglar Gulcehre, Kyunghyun Cho, Surya Ganguli, and Yoshua Bengio (2014). “Identifying and attacking the saddle point problem in high-dimensional non-convex optimization”. In: NIPS’2014, arXiv:1406.2572.
  • Montufar, Guido F., Razvan Pascanu, KyungHyun Cho, and Yoshua Bengio (2014). “On the Number of Linear Regions of Deep Neural Networks”. In: NIPS’2014, arXiv:1402.1869.
  • Goodfellow, Ian J., Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio (2014). “Generative Adversarial Networks”. In: NIPS’2014, arXiv:1406.2661.
  • Pascanu, Razvan, Guido Montufar, and Yoshua Bengio (2014). “On the number of inference regions of deep feed forward networks with piece-wise linear activations”. In: ICLR’2014 (Conference Track), arXiv:1312.6098.
  • Bengio, Yoshua, Li Yao, Guillaume Alain, and Pascal Vincent (2013). “Generalized Denoising Auto-Encoders as Generative Models”. In: NIPS’2013, arXiv:1305.6663.
  • Glorot, Xavier, Antoine Bordes, and Yoshua Bengio (2011). “Deep Sparse Rectifier Neural Networks”. In: AISTATS’2011.
  • Glorot, Xavier and Yoshua Bengio (2010). “Understanding the difficulty of training deep feedforward neural networks”. In: AISTATS’2010.
  • Bengio, Yoshua, Jerome Louradour, Ronan Collobert, and Jason Weston (2009). “Curriculum Learning”. In: ICML’09.
  • Bengio, Yoshua (2009). “Learning deep architectures for AI”. In: Foundations and Trends in Machine Learning 2.1, pp. 1–127.
  • Vincent, Pascal, Hugo Larochelle, Yoshua Bengio, and Pierre-Antoine Manzagol (2008). “Extracting and Composing Robust Features with Denoising Autoencoders”. In: Proceedings of the Twenty-fifth International Conference on Machine Learning (ICML’08). Ed. by William W. Cohen, Andrew McCallum, and Sam T. Roweis. ACM, pp. 1096–1103.
  • Bengio, Yoshua, Pascal Lamblin, Dan Popovici, and Hugo Larochelle (2007). “Greedy Layer-Wise Training of Deep Networks”. In: NIPS’2006. Ed. by Bernhard Schölkopf, John Platt, and Thomas Hoffman. MIT Press, pp. 153–160.
  • Bengio, Yoshua, Olivier Delalleau, and Nicolas Le Roux (2005). “The Curse of Highly Variable Functions for Local Kernel Machines”. In: NIPS’2005.
  • Bengio, Yoshua, Réjean Ducharme, Pascal Vincent, and Christian Jauvin (2003). “A Neural Probabilistic Language Model”. In: Journal of Machine Learning Research 3, pp. 1137–1155.
  • Bengio, Yoshua and Samy Bengio (2000). “Modeling High-Dimensional Discrete Data with Multi-Layer Neural Networks”. In: Advances in Neural Information Processing Systems 12 (NIPS’99). Ed. by S.A. Solla, T.K. Leen, and K-R. Müller. MIT Press, pp. 400–406.
  • LeCun, Yann, Leon Bottou, Yoshua Bengio, and Patrick Haffner (1998). “Gradient-Based Learning Applied to Document Recognition”. In: Proceedings of the IEEE 86.11, pp. 2278–2324.
  • Bengio, Yoshua, Patrice Simard, and Paolo Frasconi (1994). “Learning Long-Term Dependencies with Gradient Descent is Difficult”. In: IEEE Transactions on Neural Networks 5.2, pp. 157–166.
  • Bengio, Yoshua, Samy Bengio, Jocelyn Cloutier, and Jan Gecsei (1991). “Learning a Synaptic Learning Rule”. In: International Joint Conference on Neural Networks (IJCNN). Seattle, WA, II–A969.
