Galaxy classification via machine learning

To understand how galaxies grow, by component or as a whole, and whether they evolve in two phases (spheroid formation followed by disc growth), single-component fitting and bulge-disc decomposition (i.e. fitting the light profiles of the inner and outer components simultaneously) need to be performed comprehensively, starting in the local universe and gradually moving to higher redshift. A major problem when performing this structural decomposition on large samples of galaxies is identifying the multi-component systems. There are two ways to deal with this:
  1. Use morphological classification as a prior and fit multiple light profiles only to galaxies identified as multi-component systems.
  2. Fit all galaxies blindly with both single and multi-component light profiles, then use logical filters and other criteria (e.g. the Bayesian or Akaike information criterion, BIC or AIC) after the fact to find the multi-component systems.
The first option is preferred because it is computationally less expensive; however, the morphological classification itself can be difficult. Robust morphological classifications are important not only for fitting the light profiles of galaxies but also for understanding how the relative fractions of the various galaxy types change over cosmic time.
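As a rough illustration of the second option, the sketch below compares a single-component and a multi-component fit using the AIC and BIC. The function names, the log-likelihood values, the parameter counts and the BIC difference threshold of 10 are all assumptions made for illustration; none of them are taken from this project.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_data):
    """Bayesian information criterion: k ln n - 2 ln L (lower is better)."""
    return n_params * math.log(n_data) - 2 * log_likelihood


def prefers_multi_component(lnl_single, k_single, lnl_multi, k_multi,
                            n_pixels, min_delta_bic=10.0):
    """Return True if the multi-component fit lowers the BIC by more than
    min_delta_bic. The default threshold of 10 is an assumed, conventional
    choice, not a value from this project."""
    delta = (bic(lnl_single, k_single, n_pixels)
             - bic(lnl_multi, k_multi, n_pixels))
    return delta > min_delta_bic


# Hypothetical fit results for one galaxy cutout of 10,000 pixels:
# a single profile with 7 free parameters versus two profiles with 11.
clear_case = prefers_multi_component(-5200.0, 7, -5100.0, 11, n_pixels=10_000)
marginal_case = prefers_multi_component(-5200.0, 7, -5195.0, 11, n_pixels=10_000)
print(clear_case, marginal_case)
```

In the first case the gain in likelihood outweighs the penalty for four extra parameters, so the galaxy would be flagged as multi-component; in the second case it does not. Every galaxy still has to be fitted twice, which is why this route is the more expensive one.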
Currently, the morphology of galaxies is typically established by visual classification, either by astronomers or through citizen science projects such as Galaxy Zoo. However, this is mostly restricted to lower redshifts, and we are reaching the limit of the number of galaxies that can easily be classified this way. Establishing a tool to classify galaxies is especially important in the era of upcoming ground- and space-based telescopes, such as the Euclid mission, the Wide Field Infrared Survey Telescope (WFIRST), the James Webb Space Telescope (JWST) and the Giant Magellan Telescope (GMT), to name a few. These will enable us to probe larger areas of sky at better resolution and to look deeper towards the beginning of galaxy assembly, increasing the number of known redshifts into the billions and collecting high-resolution imaging for a sizable fraction of them.
Machine learning is arguably a suitable approach for classifying large datasets of galaxy images. State-of-the-art machine learning algorithms (deep learning) have achieved good results in recent years, particularly on image classification tasks. In addition, machine learning can extract features automatically from the raw pixels of an image, rather than requiring domain knowledge to engineer features, such as a galaxy's light profile fit, for classification. Automatically extracted features could produce better classification results than manually engineered ones. This project therefore explores the feasibility of using deep learning algorithms (e.g. convolutional neural networks) to classify galaxies based on their morphology.
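To make the idea of automatic feature extraction concrete, here is a minimal sketch of the forward pass of a tiny convolutional network written in plain NumPy. It is not the project's model: the image size, the number of kernels, the three output classes and the random, untrained weights are all hypothetical. In practice the kernels would be learned from labelled galaxy images with a deep learning framework.

```python
import numpy as np


def conv2d(image, kernels):
    """Valid cross-correlation of a 2D image with kernels of shape (n_k, kh, kw)."""
    n_k, kh, kw = kernels.shape
    h, w = image.shape
    out = np.zeros((n_k, h - kh + 1, w - kw + 1))
    for i in range(out.shape[1]):
        for j in range(out.shape[2]):
            patch = image[i:i + kh, j:j + kw]
            # Each kernel responds to a different local pixel pattern.
            out[:, i, j] = np.sum(kernels * patch, axis=(1, 2))
    return out


def max_pool(fmap, size=2):
    """Non-overlapping max pooling over each feature map."""
    n_k, h, w = fmap.shape
    h2, w2 = h // size, w // size
    trimmed = fmap[:, :h2 * size, :w2 * size]
    return trimmed.reshape(n_k, h2, size, w2, size).max(axis=(2, 4))


def softmax(z):
    """Convert raw scores into probabilities that sum to one."""
    e = np.exp(z - z.max())
    return e / e.sum()


def forward(image, kernels, weights, bias):
    """Convolution -> ReLU -> max pooling -> dense layer -> class probabilities."""
    features = np.maximum(conv2d(image, kernels), 0.0)
    pooled = max_pool(features).ravel()
    return softmax(weights @ pooled + bias)


rng = np.random.default_rng(0)
image = rng.random((16, 16))              # stand-in for a galaxy cutout
kernels = rng.normal(size=(4, 3, 3))      # 4 untrained 3x3 filters
# Convolution gives 4 maps of 14x14; pooling reduces them to 7x7.
weights = rng.normal(size=(3, 4 * 7 * 7)) * 0.1   # 3 hypothetical classes
bias = np.zeros(3)

probs = forward(image, kernels, weights, bias)
print(probs)
```

Nothing here is tuned by hand to galaxy structure. During training, the kernel values themselves are adjusted so that the resulting feature maps separate the morphological classes, which is what distinguishes this approach from classifying on engineered quantities such as profile-fit parameters.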