Researchers from CIFAR’s Gravity & the Extreme Universe (GEU) program came together with Canada CIFAR AI (CCAI) chairs and other AI and biomedical experts in a virtual roundtable in July 2020 to explore how the algorithms developed and used for astronomical / cosmological data analysis can be applied to similarly complex problems in biomedicine, and vice versa. This roundtable helped launch fruitful new research collaborations across disciplines.
Building on these previous efforts, on June 3, 2022, CIFAR convened a one-day workshop that expanded the conversations between astronomers/cosmologists and AI researchers. Through panels and breakout group discussions, GEU members, CCAI chairs, and other astronomy and AI researchers from academia and industry discussed new and optimized ways of using AI and machine learning (ML) in astronomy and cosmology, as well as opportunities to apply concepts and techniques of physics and astronomy to the theoretical underpinnings of AI/ML. The participants also shared their views on common challenges and opportunities across the two fields, paving the way for new collaborations.
Key Insights and Next Steps
- While most data analyses in astronomy and cosmology will likely involve ML within the next decade, astronomers need to carefully consider what is gained by doing so. ML is, essentially, a very powerful fitting function that can arguably be trained to fit any data. Applying ML to a new problem just because it is possible is not necessarily scientifically interesting, and it can be difficult for researchers to assess whether the resulting fit is actually good.
- ML is most useful where it provides specific advantages towards the scientific goal: where it offers a significant speedup over traditional statistical methods; where it improves accuracy by learning complex prior probability distributions and avoiding compounding biases (without ML, researchers have to start with simpler priors and more approximations in their models, which may bias results); and where it yields new physical insights from the data (such as by using symbolic regression methods to obtain new analytical equations).
- One of the most powerful applications of ML in astronomy/cosmology is astrophysical / cosmological parameter inference. ML-based techniques can be useful for producing theoretical predictions that can be mapped to observed data (e.g., using convolutional neural networks to produce more general models than hydrodynamic simulations, at lower computational costs), inferring parameters from data (e.g., using a wavelet scattering transform estimator that more fully utilizes all the information content of the data while retaining interpretability), and determining the likelihood of the observed data given the inferred parameters (e.g., using flow-based generative models that incorporate translational and rotational symmetries, or models that can capture the non-Gaussianity in the data).
- A key consideration for using ML in parameter inference is the accuracy of the uncertainty estimates, which is affected by a combination of noise in the original data and errors generated by the neural network itself. The research community will need to define the level of accuracy needed in these estimates. Methods that involve a combination of simulation and ML (including Bayesian neural networks and mixture density networks) could improve the accuracy of uncertainty estimates.
- Researchers developing ML techniques for astronomy and cosmology should focus on techniques and workflows that are reproducible, explainable and interpretable (such as symbolic regression or the wavelet scattering transform). They should minimize data augmentation in favour of architectures that encode the expected physical symmetries of the problems of interest, so that models can be mapped to an existing body of scientific knowledge and provide some insight into which features of the data enabled learning. The community should also develop benchmarks or standards so that performance can be compared and researchers can evaluate whether the models can be trusted.
- Cosmology and astrophysics, and physics more generally, can contribute to the development of ML in a number of important ways. For example, the generation of synthetic data for simulations is common in (astro)physics research, so physicists’ models for data generation could help in understanding how the distribution and features of data affect how ML models learn. ML researchers could also take inspiration from how (astro)physicists model the dynamics of the environment, and how simulations / models can be improved despite the effects of unknown physics, potentially leading to improvements in how ML models generalize to unseen data. In other words, (astro)physics could open up new opportunities for ML to better “learn how to learn”.
- Specific examples of applying (astro)physical and cosmological techniques to ML include combining statistical mechanical approaches with normalizing flow-based ML methods to develop a general-purpose Bayesian inference algorithm, and adapting energy-conserving relativistic Born-Infeld dynamics, which are used to describe the behaviour of scalar fields in potential energy landscapes in some models of cosmic inflation, as an alternative to the commonly used stochastic gradient descent method for optimization in ML. Physics techniques can also be used to analyze (and perhaps improve) the behaviour of ML models with respect to increasing model size (number of parameters), dataset size and amount of computation, in the form of scaling laws.
- An analogy may be drawn between AI architectures, viewed as multi-scale, hierarchical networks through which information flows, and the formation and evolution of large-scale structures in the Universe. The approaches and methodologies used by cosmologists, e.g., in understanding the emergence of the cosmic web from quantum information, could thus inform the design of more explainable and interpretable AI architectures.
- At a more fundamental level, studying ML models or neural networks as physical phenomena, like a system of interacting particles, could potentially lead to advances in the theoretical understanding of ML.
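As a rough illustration of the scaling laws mentioned above, the sketch below fits a power law of the form loss ≈ coeff × size^(−alpha) by a straight-line fit in log-log space. It is a minimal example under stated assumptions, not a method presented at the workshop: the function name `fit_power_law`, the synthetic data, and the values of `true_alpha` and `true_coeff` are all hypothetical and chosen only to show the mechanics.

```python
import numpy as np


def fit_power_law(sizes, losses):
    """Fit losses ~ coeff * sizes**(-alpha) by least squares in log-log space.

    Returns (alpha, coeff). Assumes all sizes and losses are positive.
    """
    # A power law is a straight line in log-log coordinates:
    # log(loss) = log(coeff) - alpha * log(size)
    slope, intercept = np.polyfit(np.log(sizes), np.log(losses), deg=1)
    return -slope, np.exp(intercept)


# Synthetic, hypothetical data: model sizes from 1e5 to 1e9 parameters.
rng = np.random.default_rng(0)
sizes = np.logspace(5, 9, 9)
true_alpha, true_coeff = 0.08, 6.0  # illustrative values, not measured ones

# Multiplicative noise of about 1% mimics run-to-run scatter in the loss.
noise = np.exp(rng.normal(0.0, 0.01, sizes.size))
losses = true_coeff * sizes ** (-true_alpha) * noise

alpha, coeff = fit_power_law(sizes, losses)
print(f"fitted alpha = {alpha:.3f}, fitted coeff = {coeff:.2f}")
```

With real training runs, `sizes` could equally be dataset size or compute, and the quality of the straight-line fit in log-log space is itself a check on whether a single power law describes the data.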
Roundtable Participants
- J. Richard Bond, Professor, Canadian Institute for Theoretical Astrophysics, University of Toronto / Fellow, Gravity & the Extreme Universe program, CIFAR
- Man Leong (Mervyn) Chan, Postdoctoral Fellow, University of British Columbia
- Audrey Durand, Assistant Professor, Université Laval / Canada CIFAR AI Chair, Mila
- Cora Dvorkin, Associate Professor, Harvard University / Senior Investigator, NSF AI Institute for Artificial Intelligence and Fundamental Interactions
- Sébastien Fabbro, Science Computing Specialist, Canadian Astronomy Data Centre, Herzberg Astronomy and Astrophysics, National Research Council of Canada
- Daryl Haggard, Associate Professor and Canada Research Chair in Multi-messenger Astrophysics, McGill University / Azrieli Global Scholar, Gravity & the Extreme Universe program, CIFAR
- Yashar Hezaveh, Assistant Professor and Canada Research Chair in Astrophysical Data Analysis and Machine Learning, Université de Montréal
- Renée Hložek, Assistant Professor, University of Toronto / Azrieli Global Scholar, Gravity & the Extreme Universe program, CIFAR
- Shirley Ho, Group Leader, Cosmology X Data Science, Flatiron Institute
- Jared Kaplan, Associate Professor, Johns Hopkins University / Co-Founder, Anthropic
- Victoria Kaspi, Professor and Director of McGill Space Institute, McGill University / Director and R. Howard Webster Foundation Fellow, Gravity & the Extreme Universe program, CIFAR
- Juna Kollmeier, Professor and Director of Canadian Institute for Theoretical Astrophysics, University of Toronto / Fellow, Gravity & the Extreme Universe program, CIFAR
- Flavie Lavoie-Cardinal, Assistant Professor and Canada Research Chair in Intelligent Nanoscopy of Cellular Plasticity, Université Laval
- Luis Lehner, Research Faculty, Perimeter Institute / Fellow, Gravity & the Extreme Universe program, CIFAR
- Adrian Liu, Assistant Professor, McGill University / Azrieli Global Scholar, Gravity & the Extreme Universe program, CIFAR
- Ashish Mahabal, Lead Computational and Data Scientist, Center for Data Driven Discovery, California Institute of Technology
- Alexander Maloney, Professor, McGill University
- Jess McIver, Assistant Professor and Canada Research Chair in Gravitational Wave Astrophysics, University of British Columbia
- Laurence Perreault Levasseur, Assistant Professor, Université de Montréal / Associate Member, Mila
- Siamak Ravanbakhsh, Assistant Professor, McGill University / Canada CIFAR AI Chair, Mila
- Nayyer Raza, Graduate Student, McGill University
- Daniel Roberts, Principal Researcher, Salesforce / Research Affiliate, Center for Theoretical Physics, MIT
- Uroš Seljak, Professor and Director of Berkeley Center for Cosmological Physics, University of California, Berkeley
- Eva Silverstein, Professor, Stanford University / Advisor, Gravity & the Extreme Universe program, CIFAR
- Ingrid Stairs, Professor, University of British Columbia / Fellow, Gravity & the Extreme Universe program, CIFAR
- Scott Tremaine, Professor, University of Toronto / Advisory Committee Chair, Gravity & the Extreme Universe program, CIFAR
- Simon White, Emeritus Director, Max Planck Institute for Astrophysics / Advisor, Gravity & the Extreme Universe program, CIFAR
Further Reading
CIFAR resources:
Algorithms in Astronomy and Biomedicine (event brief)
AI for Astronomy and Health (research brief)
AI Research Enables Astronomy Breakthrough (CIFAR news)
Exploring the Great Unknown (CIFAR news)
Other resources:
GWSkyNet-Multi: A Machine-learning Multiclass Classifier for LIGO–Virgo Public Alerts, by Thomas Abbott et al.
Discovering Symbolic Models from Deep Learning with Inductive Biases, by Miles Cranmer et al.
Translation and Rotation Equivariant Normalizing Flow (TRENF) for Optimal Cosmological Analysis, by Biwei Dai and Uroš Seljak
Machine Learning and Cosmology, by Cora Dvorkin et al.
Born-Infeld (BI) for AI: Energy-Conserving Descent (ECD) for Optimization, by G. Bruno De Luca and Eva Silverstein
Recovering the Wedge Modes Lost to 21-cm Foregrounds, by Samuel Gagnon-Hartman et al.
Scaling Laws for Autoregressive Generative Modeling, by Tom Henighan et al.
Simulation-Based Inference of Strong Gravitational Lensing Parameters, by Ronan Legin et al.
Rediscovering Orbital Mechanics with Machine Learning, by Pablo Lemos et al.
The Principles of Deep Learning Theory, by Daniel Roberts et al.
Going Beyond the Galaxy Power Spectrum: an Analysis of BOSS Data with Wavelet Scattering Transforms, by Georgios Valogiannis and Cora Dvorkin
Machine Learning Advantages in Canadian Astrophysics, by Kim Venn et al.
Modeling Assembly Bias with Machine Learning and Symbolic Regression, by Digvijay Wadekar et al.
For more information, contact
Johnny Kung
Senior Officer, Knowledge Mobilization and Publications
CIFAR
johnny.kung@cifar.ca