Canada CIFAR AI Chairs

De-risking and de-mystifying large-scale AI

By: Krista Davidson
May 19, 2026
Colin Raffel with an illustrative representation of data nodes in the background

The role of AI in democracy has become something of a Catch-22. While AI has the potential to enhance democracy through interactive elections, civic participation, and streamlined government services, it also poses significant risks, including the perpetuation of bias and misinformation. Perhaps one of the most serious risks is the immense power it provides to a few elite, high-resource companies. 

This is the impetus for the research of Colin Raffel, a Canada CIFAR AI Chair and associate director at the Vector Institute. He is also an associate professor of Computer Science at the University of Toronto and a faculty researcher at Hugging Face, an online platform that enables developers to collaborate on models, datasets and applications.

Raffel’s research is focused on mitigating the risks of AI, decentralizing it and making it easier for others to develop large-scale AI without having to wrangle endless amounts of unnecessary and often unauthorized data for training purposes. 

“I was motivated by a growing concern around concentration of power among resource-rich companies. This motivated our focus on decentralized methods for developing large-scale AI. More recently, as AI has gotten more capable, my worries have broadened, and my group has increased its focus on mitigating risks,” says Raffel.

Raffel’s work is particularly timely amid concerns about a looming data shortage and a growing number of lawsuits over the use of unlicensed, copyrighted data to train large language models (LLMs), advanced AI systems that can process and generate text and carry out complex tasks much as humans can.

While many companies claim that using unauthorized data is the only way to enhance the quality of leading AI models and continue progress in the rapidly growing field, Raffel and his collaborators recently invalidated this claim through a project called The Common Pile.

Using a large-scale dataset of openly licensed and public domain text, the team trained a series of models that performed comparably to those trained on unlicensed data. The Common Pile draws on content from 30 sources across diverse domains, including research papers, code, books, audio transcripts and more. The research was regarded as a first step towards training models ethically and responsibly.

What’s more, the research makes it easier for others to advance and refine AI systems.

“We developed methods that made it possible for individual contributors to share model updates with one another, and combined them continuously to improve models in a decentralized fashion,” Raffel explains.

Despite these technical successes, Raffel remains wary of society’s growing overreliance on these tools. He states that delegating more labor and cognitive tasks to LLMs could lead to existential issues if extrapolated, noting that the technical problems underlying both existential and societal risks are often identical.

From music to machine learning

Raffel’s foray into the field was sparked by his love of music. “I got into research because I was a musician and wanted to develop new software tools for musicians,” he recalls. This passion led him to music information retrieval, an interdisciplinary field that extracts complex information from music — the same technology that powers song-matching apps and recommender algorithms.

The work led to an interest in machine learning, particularly algorithms that learn from limited labeled data.

Raffel credits the Canada CIFAR AI Chairs program, the Vector Institute and the University of Toronto for cultivating a unique, collaborative ecosystem.

The Canada CIFAR AI Chairs program, a cornerstone of the Pan-Canadian AI Strategy, provides the long-term stability and funding necessary to allow elite AI researchers to focus on high-impact research. To date, the program has attracted more than 140 researchers to Canada to pursue their research.

“There is immense value in being outside the ‘gravitational pull’ of Silicon Valley,” he adds. “It allows us to think more critically and independently about the future of AI.”
