A key aspect of astronomy and cosmology is the use of instruments such as telescopes and interferometers to observe the Universe through a variety of channels or “messengers” (such as different parts of the electromagnetic spectrum, including visible light and radio waves; neutrinos; and more recently, gravitational waves). The large volumes of often noisy observational data gathered must then be analyzed to pinpoint the astronomical phenomena of interest and to extract information for better modelling and understanding of our Universe. Astronomers and cosmologists, including a number of researchers in CIFAR’s Gravity & the Extreme Universe program, are making use of powerful computer algorithms, including methods based on artificial intelligence (AI) and machine learning (ML), to conduct a variety of analyses on astronomical images and data. Advances in these algorithms can be adapted to tackle similarly complex data in biomedicine, from genomics to medical imaging.
Why this matters
Data analysis in astronomical imaging and in biomedical imaging shares several features and constraints:
Both types of imaging generate large amounts of data, which present similar issues with data curation and management.
The target image or signal can often be a small and/or transient component that needs to be segmented from a noisy background and correctly identified and classified, a task that is well suited to AI/ML.
Unlike the case of applying ML to images of everyday objects, however, in both astronomy and microscopy there often isn’t a large collection of well-annotated preexisting images for training the algorithms.
At the same time, the number and quality of images in astronomy can often be limited by the available window of observation, i.e., there are no opportunities to go back and re-image, which is also a challenge in many instances of medical imaging.
Finally, for certain kinds of non-optical imaging, such as in radio astronomy or magnetic resonance imaging (MRI), the images are obtained indirectly and must similarly be reconstructed from the signal.
As such, AI algorithms developed to tackle these challenges in astronomical imaging can potentially be applied to solve similar problems in biomedical imaging, and vice versa.
Key insights from CIFAR researchers
When gravitational wave (GW) interferometers such as LIGO detect a signal, astronomers turn their telescopes to the region of the sky to which the GW signal has been localized, in an attempt to identify an “electromagnetic (EM) counterpart” (a signal in the EM spectrum, be it in the gamma ray, visible or another wavelength) that corresponds to the GW source. The nature of the EM counterpart, or the lack of one, provides information about the type of astronomical event that created the GW signal. In a recent paper, Daryl Haggard, Maria Drout and their colleagues reported on a search for an EM counterpart for a GW signal proposed to have originated from the merger of a neutron star and a black hole. Reference images (obtained prior to, or sufficiently long after, the GW event) were subtracted from the observed images to obtain “difference” images, and the resulting triplets of images (observed, reference and difference) were vetted by a deep-learning based “real-bogus” classifier. The algorithm was trained on about 10% of the total data (about 2000 triplets), which the researchers had visually inspected and labelled as potentially real EM sources or not, supplemented with a smaller set of known transient astronomical events (or “transients”). The classifier reduced the number of candidates for further analysis by 90%. While, in the end, the researchers did not identify a compelling EM candidate for the GW event, the work establishes an analysis pipeline for similar searches in future observations.
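The image-subtraction step described above can be illustrated with a minimal sketch. It assumes the observed and reference images are already aligned and flux-matched and are stored as small 2-D lists of pixel values; the function names (`difference_image`, `make_triplet`, `peak_residual`) are hypothetical and are not taken from the researchers' pipeline, which relies on a trained deep-learning classifier rather than a simple residual measure.

```python
from typing import List, Tuple

Image = List[List[float]]


def difference_image(science: Image, reference: Image) -> Image:
    """Subtract the reference image from the observed (science) image, pixel by pixel."""
    if len(science) != len(reference) or any(
        len(s_row) != len(r_row) for s_row, r_row in zip(science, reference)
    ):
        raise ValueError("science and reference images must have the same shape")
    return [
        [s - r for s, r in zip(s_row, r_row)]
        for s_row, r_row in zip(science, reference)
    ]


def make_triplet(science: Image, reference: Image) -> Tuple[Image, Image, Image]:
    """Bundle the (observed, reference, difference) images handed to a classifier."""
    return science, reference, difference_image(science, reference)


def peak_residual(diff: Image) -> float:
    """Largest absolute pixel value in a difference image (illustrative only)."""
    return max(abs(pixel) for row in diff for pixel in row)


# Toy example: a flat background with one new bright pixel in the observed image.
reference = [[10.0, 10.0, 10.0], [10.0, 10.0, 10.0], [10.0, 10.0, 10.0]]
science = [[10.0, 10.0, 10.0], [10.0, 25.0, 10.0], [10.0, 10.0, 10.0]]
_, _, diff = make_triplet(science, reference)
print(diff[1][1])  # 15.0: the only pixel that changed between the two epochs
```

In this toy case everything that is constant between the two epochs cancels, so only the new source survives in the difference image. Real difference images also contain subtraction artifacts, which is why a "real-bogus" classifier is needed to vet the candidates.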
Several ongoing and upcoming optical astronomical surveys, such as the Legacy Survey of Space and Time (LSST) to be conducted at the Vera C. Rubin Observatory in Chile, monitor the sky for transients, including supernovae, kilonovae (mergers of a black hole and a neutron star, or of two neutron stars) and tidal disruption events (where a star is pulled apart by a black hole). Such surveys will eventually make millions of detections every night, so astronomers need a way to automatically analyze the incoming signals to identify candidates of interest. Renée Hložek and her colleagues reported on their development of a real-time classifier called RAPID (Real-time Automated Photometric IDentification) using a deep recurrent neural network (RNN) with gated recurrent units (GRUs), which allows for the accurate early classification (within two days of first detection) of 12 different types of transients. At such early timepoints the data available is sparse, but the ability to make an early classification, rather than waiting for the full light curve (time-series of photometric measurements), allows for more detailed follow-up analyses of the transient’s source, using tools such as spectroscopy, while the event is still in progress. Given the lack of large, well-labelled datasets (particularly of rarer transient events), the authors created a simulated set of more than 48,000 transients and used 60% as the training set. The trained RNN classifier was able to accurately classify the remaining simulated data as well as actual observational data from the Zwicky Transient Facility.
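To make the idea of early classification with gated recurrent units more concrete, the following is a minimal sketch of a single GRU layer that updates its class probabilities after every new observation. It is not the RAPID implementation: the parameter names (`Wz`, `Uz`, `bz` and so on), the input features and the output layer `w_out` are assumptions for illustration, and the gate equations follow one common convention for GRUs.

```python
import math
from typing import Dict, List

Vector = List[float]
Matrix = List[List[float]]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def matvec(m: Matrix, v: Vector) -> Vector:
    """Matrix-vector product."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def affine(w: Matrix, x: Vector, u: Matrix, h: Vector, b: Vector) -> Vector:
    """Compute W x + U h + b."""
    return [a + c + d for a, c, d in zip(matvec(w, x), matvec(u, h), b)]


def gru_step(x: Vector, h: Vector, p: Dict[str, list]) -> Vector:
    """One GRU update: combine the new observation x with the previous state h."""
    z = [sigmoid(v) for v in affine(p["Wz"], x, p["Uz"], h, p["bz"])]  # update gate
    r = [sigmoid(v) for v in affine(p["Wr"], x, p["Ur"], h, p["br"])]  # reset gate
    gated_h = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(v) for v in affine(p["Wh"], x, p["Uh"], gated_h, p["bh"])]
    # Blend the old state and the candidate state according to the update gate.
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def softmax(scores: Vector) -> Vector:
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_stream(light_curve: List[Vector], p: Dict[str, list], w_out: Matrix) -> List[Vector]:
    """Return class probabilities after each observation in a partial light curve."""
    h = [0.0] * len(p["bz"])
    probabilities = []
    for x in light_curve:
        h = gru_step(x, h, p)
        probabilities.append(softmax(matvec(w_out, h)))
    return probabilities
```

Because the hidden state is carried forward from one observation to the next, a prediction is available after the very first data point and is refined as the light curve fills in, which is the property that makes this family of models suitable for early classification.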
The Canadian Hydrogen Intensity Mapping Experiment (CHIME) is an interferometric radio telescope that is playing a major role in the detection of fast radio bursts (FRBs): mysterious, transient (on the scale of milliseconds in duration) but bright radio pulses whose origin and nature are still unknown. As with other sky-surveying telescopes, CHIME generates terabytes of radio observation data every second, and 142 Gb/s of data are fed into the telescope’s back-end, so the CHIME/FRB Collaboration (which includes CIFAR fellows Matt Dobbs, Mark Halpern, Gary Hinshaw, Vicky Kaspi, Ue-Li Pen, Scott Ransom, Kendrick Smith, and Ingrid Stairs) has devised an automated, real-time data analysis pipeline to sift through the data and identify candidate FRB events for further analysis. The pipeline implements a supervised machine learning method called a support vector machine (SVM) as a classifier to evaluate the probability that a signal represents an actual astrophysical event rather than radio frequency interference (RFI, e.g., from cellular or TV transmission), based on features including the signal’s pattern, location and signal-to-noise ratio. Signals that are classified as being of astrophysical origin are then passed down the pipeline, which ultimately determines whether the full data associated with the event should be saved or discarded.
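As a rough illustration of how a support vector machine separates astrophysical signals from RFI, the sketch below trains a linear SVM by sub-gradient descent on the hinge loss. It is a toy under stated assumptions rather than the CHIME/FRB classifier: the two standardized features (a signal-to-noise score and a pattern score), the training values and the hyperparameters are all invented for the example, and the output is a signed score rather than a calibrated probability.

```python
from typing import List, Sequence, Tuple

# A training sample: (feature vector, label), with +1 = astrophysical, -1 = RFI.
Sample = Tuple[Sequence[float], int]


def decision_score(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Signed distance-like score; positive values favour the astrophysical class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_linear_svm(
    samples: List[Sample],
    learning_rate: float = 0.1,
    reg: float = 0.01,
    epochs: int = 50,
) -> Tuple[List[float], float]:
    """Fit a linear SVM with sub-gradient descent on the regularized hinge loss."""
    w = [0.0] * len(samples[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, y in samples:
            margin = y * decision_score(w, b, x)
            # The L2 penalty shrinks the weights on every step.
            w = [wi * (1.0 - learning_rate * reg) for wi in w]
            # Samples inside the margin (or misclassified) push the boundary away.
            if margin < 1.0:
                w = [wi + learning_rate * y * xi for wi, xi in zip(w, x)]
                b += learning_rate * y
    return w, b


def predict(w: Sequence[float], b: float, x: Sequence[float]) -> int:
    return 1 if decision_score(w, b, x) >= 0.0 else -1


# Hypothetical standardized features: (signal-to-noise score, pattern score).
training = [
    ([2.0, 2.0], 1), ([3.0, 2.5], 1), ([2.5, 3.0], 1),
    ([-2.0, -2.0], -1), ([-3.0, -2.5], -1), ([-2.5, -3.0], -1),
]
weights, bias = train_linear_svm(training)
print(predict(weights, bias, [4.0, 3.0]))  # 1, i.e. classified as astrophysical
```

Turning a score like this into a probability requires an additional calibration step, and a production classifier would use many more features and far more training data than this example.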
An emerging tool in observational cosmology is the 21-cm “hydrogen line”, the line in the EM spectrum corresponding to the transition between two energy states in neutral hydrogen atoms. While this energy transition occurs with a low probability, the immense amount of hydrogen in our Universe creates an observable signal that can be used to create a 3-D intensity map (with the third dimension provided by the signal’s redshift, z, for which a larger value corresponds to a greater distance from Earth and thus further back in time). 21-cm cosmology is thought to be particularly informative for the Universe’s “Dark Ages”, from about 400,000 years after the Big Bang, when the Universe had cooled down enough for the “recombination” of electrons and atomic nuclei to form neutral atoms, to about 1.5 billion years after the Big Bang, when light from the first stars and galaxies led to “reionization” (the initial stages of which can be observed as “bubbles” in 21-cm maps). To examine how 21-cm cosmology can be used to understand the Universe during this era, Adrian Liu and his colleagues used convolutional neural networks (CNNs) to analyze simulated 21-cm data, either 2-D images of the sky at a specific redshift, or 2-D “light cones” in specific directions (“lines of sight”) backward through time. In one experiment, CNNs were used to develop a classifier to differentiate between images that result from different models of how reionization occurred, driven either by star-forming galaxies or by the non-stellar radiation of active galactic nuclei (where energy is emitted as matter is accreted onto supermassive black holes; e.g., quasars). In a second study, the researchers used CNNs to analyze light cones obtained from models with different astrophysical parameters, related to factors such as the rate of star formation or the emission and absorption of different forms of radiation, which provide information about the process and timing of various cosmological events.
The team is also exploring the use of CNNs to reconstruct the shape of ionized regions in 21-cm images, distorted by contaminant filtering during data processing, using the information that hasn’t been filtered out. As the next generation of radio telescopes for 21-cm intensity mapping begins to come online and generate data, such CNN-based tools will be an important part of astronomers’ analysis toolkit.
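The role of redshift as the third dimension of a 21-cm map can be made concrete with a short calculation. The sketch below uses the standard relation between emitted and observed frequency, observed = rest / (1 + z), together with the rest frequency of the hydrogen line (about 1420.4 MHz); the function names are hypothetical and the example is not drawn from the researchers' analysis code.

```python
# Rest-frame frequency of the 21-cm neutral hydrogen line, in MHz.
HI_REST_FREQUENCY_MHZ = 1420.405751768


def observed_frequency_mhz(z: float) -> float:
    """Frequency at which 21-cm emission from redshift z arrives at a telescope."""
    if z <= -1.0:
        raise ValueError("redshift must be greater than -1")
    return HI_REST_FREQUENCY_MHZ / (1.0 + z)


def redshift_from_frequency(observed_mhz: float) -> float:
    """Redshift of the hydrogen responsible for a signal seen at observed_mhz."""
    if observed_mhz <= 0.0:
        raise ValueError("observed frequency must be positive")
    return HI_REST_FREQUENCY_MHZ / observed_mhz - 1.0


# Hydrogen at z = 9 is observed at one tenth of the rest frequency, about 142 MHz.
print(round(observed_frequency_mhz(9.0), 1))
```

Each observing frequency therefore corresponds to a slice of the Universe at a particular redshift, and stacking the slices across a telescope's frequency band yields the 3-D map and the light cones described above.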
The CHIME/FRB Collaboration et al. 2018. The CHIME Fast Radio Burst project: System overview. ApJ 863:48.
The CHIME/FRB Collaboration et al. 2019. Observations of fast radio bursts at frequencies down to 400 megahertz. Nature 566:230-234.
Gillet N et al. 2019. Deep learning from 21-cm tomography of the cosmic dawn and reionization. Mon. Not. R. Astron. Soc. 484:282-293.
Hassan S et al. 2018. Identifying reionization sources from 21 cm maps using Convolutional Neural Networks. Mon. Not. R. Astron. Soc. 483:2524-2537.
Liu A et al. 2020. Data analysis for precision 21 cm cosmology. Publ. Astron. Soc. Pac. 132:062001.
Muthukrishna D et al. 2019. RAPID: Early classification of explosive transients using deep learning. Publ. Astron. Soc. Pac. 131:118002.
Putzky P et al. 2019. Invert to learn to invert. In Wallach H et al. (eds.), Advances in Neural Information Processing Systems 32 (NeurIPS 2019). San Diego: Neural Information Processing Systems Foundation, Inc.
Vieira N et al. 2020. A deep CFHT optical search for a counterpart to the possible neutron star – black hole merger GW190814. ApJ 895:96.
Untangling the cosmos (symposium brief)
A repeating fast radio burst (research brief)
AI research enables astronomy breakthrough (CIFAR news)
CHIME: Detecting strange flashes from the cosmos (CIFAR news)