By: Krista Davidson
28 Mar, 2026
CIFAR is pleased to announce the second cohort of AI Safety Catalyst Projects as part of the Canadian AI Safety Institute (CAISI) Research Program at CIFAR. These projects are designed to spark original sociotechnical research that addresses the most urgent challenges in the safety, accountability, and ethical deployment of AI.
As AI evolves from a technical curiosity into agentic tools deeply embedded in our professional and democratic lives, the need for robust safety frameworks to safeguard our society has never been more critical. This new cohort of eight projects brings together experts across computer science, sociology, economics, and Indigenous governance to move AI safety beyond narrow technical fixes toward holistic, socially grounded solutions.
By tackling issues ranging from Indigenous data sovereignty and the certification of AI for professional roles to safeguarding democratic processes against persuasive bias, these projects are building the regulatory and ethical blueprints to support a future where AI remains aligned with human welfare.
The AI Safety Catalyst Projects are funded by Innovation, Science and Economic Development Canada (ISED) through the federally led Canadian AI Safety Institute. The CAISI Research Program at CIFAR serves as the institute’s independent scientific research engine.
See the full list of projects below:
Socio-technical solutions to improve information integrity and AI literacy
Jean-François Godbout (Mila & Université de Montréal), Nina Wang (York University), Gordon Pennycook (Cornell University), David Rand (Cornell University), Kellin Pelrine (FAR.AI), Matt Kowal (Goodfire), Taylor Lynn Curtis (Mila)
This project tackles the global threat of misinformation by developing a socio-technical framework to enhance information integrity and AI literacy. Through bilingual experiments involving Canadian participants, the team will evaluate how AI can effectively correct factual misconceptions within Canada’s English and French communities. The research explores the ethics of personalizing AI responses to individual cognitive styles while designing educational modules that foster critical thinking. By integrating these findings into open-source fact-checking tools like Veracity, the project will provide Canadians with proactive “epistemic companions” that build democratic resilience against AI-driven disinformation and societal polarization.
Addressing AI-safety through Indigenous community-based governance
Victoria Lemieux (University of British Columbia), Jacob Taylor (First Nations University), Robin Billy (Secwepemc Nation), Ethan Clark (Nationsfirst Technologies)
This project addresses the critical gap in current AI safety frameworks, which often overlook Indigenous rights and data sovereignty. By treating AI safety as a “common-pool resource,” the project team combines Elinor Ostrom’s commons theory with the Haudenosaunee Great Law of Peace to co-create a community-led governance model. Using decentralized infrastructure and privacy-enhancing technologies, the initiative will test this framework in a “living lab” setting. Ultimately, the project supports economic reconciliation by empowering Indigenous communities to oversee AI implementation, ensuring technology respects cultural heritage and seven-generation stewardship.
Democratic alignment of large language models (LLMs) through economic theory: relative preferences and strategic coordination
Clemens Possnig (University of Waterloo), Elliot Creager (University of Waterloo), Rohit Lamba (Cornell University)
This project addresses the limitations of current AI alignment methods, which often struggle to represent diverse societal values and resist strategic manipulation. By applying economic theory and mechanism design, the team will develop a framework for democratic alignment that moves beyond simple rankings to capture the intensity of human preferences. Using techniques like “quadratic voting,” the researchers aim to create protocols that empower communities to steer model behaviour while protecting against harmful exploitation. Ultimately, this work provides a blueprint for safer, more accountable AI systems that respect pluralistic perspectives and foster democratic resilience.
Repetition, resistance, and reinforcement: longitudinal effects of conversational AI on political attitudes
Semra Sevi (University of Toronto), Can Mekik (University of Toronto)
LLMs are now embedded in the everyday lives of millions of people. This project investigates how repeated interactions with conversational AI shape political persuasion, trust, and openness to attitude change. Although existing research shows that a single conversation with an AI chatbot can influence beliefs, far less is known about how persuasion unfolds when people engage with the same AI system repeatedly over time. Using a multi-wave survey experiment with a nationally representative sample of voters, this project directly compares repeated interactions with a chatbot to one-shot conversations. It examines whether familiarity and relational continuity increase receptiveness to persuasion, and whether personalization based on prior interactions strengthens persuasive impact. By disentangling relationship-building from tailored messaging, the project will identify the mechanisms through which AI influences political attitudes, including potential spillover effects.
Towards socially grounded AI safety: Integrating causal and institutional reasoning in language models
Matt Ratto (University of Toronto), Zhijing Jin (Canada CIFAR AI Chair, Vector Institute, University of Toronto)
This project addresses the risk of “sociologically naïve” AI systems that lack an understanding of cultural norms and institutional contexts. While current models are technically advanced, they often rely on surface-level correlations that can inadvertently reproduce harmful social biases. By bridging sociology and computer science, the team will develop agentic architectures and causal reasoning modules to embed social context into AI behaviour. Ultimately, this research moves AI safety beyond simple behavioural control toward relational accountability, ensuring that AI systems can navigate complex human ecologies appropriately while fostering more trustworthy human-AI coordination.
Economic foundations of AI certification
Jesse Perla (University of British Columbia), Kevin Leyton-Brown (Canada CIFAR AI Chair, Amii, University of British Columbia), Serena Wang (University of British Columbia)
This project addresses the urgent need for robust AI oversight as models demonstrate increasingly superhuman yet unpredictable capabilities. While current regulations focus on technical malfunctions, they lack a framework for certifying AI in specific professional roles. By integrating economic theory and computer science, the project team will develop a “safety-through-certification” framework that evaluates economic rationality and models multiagent interactions using behavioural game theory. Ultimately, the project provides a blueprint for domain-specific certification standards, ensuring agentic AI remains accountable and aligned with human welfare across sectors like healthcare and finance.
Performative empathy and deceptive alignment
Michael Inzlicht (University of Toronto)
This project addresses the safety risks of “performative empathy” in LLMs. While AI-generated empathy can improve clinical interactions, it risks “deceptive alignment,” where artificial care manipulates human trust and undermines objective medical judgment. Through large-scale experiments, the team will isolate features that trigger inappropriate trust and use signal detection theory to identify where AI empathy degrades decision quality. Ultimately, the project will co-design regulatory “circuit breakers” and institutional frameworks to ensure AI remains a safe tool for patient welfare rather than a manipulative force in healthcare.
Testing red-team safeguards against AI persuasion in democratic governance
Seth Wynes (University of Waterloo), Sam Johnson (University of Waterloo)
As AI becomes a primary information source for governance, current safeguards like human peer review are often too slow to be effective. This project addresses the risk of advanced AI systems manipulating democratic deliberation through persuasive bias. Through large-scale survey experiments and deliberative mini-publics, the team will test the efficacy of “red-team” AI safeguards designed to detect and neutralize biased information in real time. By providing empirical evidence on AI’s persuasive power, the research offers actionable guidance for Canadian policy, ensuring democratic decision-making remains resilient against manipulation in high-stakes public consultations.