TY - JOUR
T1 - An automatic taxonomy of galaxy morphology using unsupervised machine learning
AU - Hocking, Alex
AU - Geach, James E.
AU - Sun, Yi
AU - Davey, Neil
N1 - This article has been accepted for publication in Monthly Notices of the Royal Astronomical Society ©: 2017 the Author (s). Published by Oxford University Press on behalf of the Royal Astronomical Society. All rights reserved
PY - 2017/9/15
Y1 - 2017/9/15
N2 - We present an unsupervised machine learning technique that automatically segments and labels galaxies in astronomical imaging surveys using only pixel data. Distinct from previous unsupervised machine learning approaches used in astronomy we use no pre-selection or pre-filtering of target galaxy type to identify galaxies that are similar. We demonstrate the technique on the Hubble Space Telescope (HST) Frontier Fields. By training the algorithm using galaxies from one field (Abell 2744) and applying the result to another (MACS 0416.1-2403), we show how the algorithm can cleanly separate early and late type galaxies without any form of pre-directed training for what an 'early' or 'late' type galaxy is. We then apply the technique to the HST Cosmic Assembly Near-infrared Deep Extragalactic Legacy Survey (CANDELS) fields, creating a catalogue of approximately 60 000 classifications. We show how the automatic classification groups galaxies of similar morphological (and photometric) type and make the classifications public via a catalogue, a visual catalogue and galaxy similarity search. We compare the CANDELS machine-based classifications to human-classifications from the Galaxy Zoo: CANDELS project. Although there is not a direct mapping between Galaxy Zoo and our hierarchical labelling, we demonstrate a good level of concordance between human and machine classifications. Finally, we show how the technique can be used to identify rarer objects and present lensed galaxy candidates from the CANDELS imaging.
AB - We present an unsupervised machine learning technique that automatically segments and labels galaxies in astronomical imaging surveys using only pixel data. Distinct from previous unsupervised machine learning approaches used in astronomy we use no pre-selection or pre-filtering of target galaxy type to identify galaxies that are similar. We demonstrate the technique on the Hubble Space Telescope (HST) Frontier Fields. By training the algorithm using galaxies from one field (Abell 2744) and applying the result to another (MACS 0416.1-2403), we show how the algorithm can cleanly separate early and late type galaxies without any form of pre-directed training for what an 'early' or 'late' type galaxy is. We then apply the technique to the HST Cosmic Assembly Near-infrared Deep Extragalactic Legacy Survey (CANDELS) fields, creating a catalogue of approximately 60 000 classifications. We show how the automatic classification groups galaxies of similar morphological (and photometric) type and make the classifications public via a catalogue, a visual catalogue and galaxy similarity search. We compare the CANDELS machine-based classifications to human-classifications from the Galaxy Zoo: CANDELS project. Although there is not a direct mapping between Galaxy Zoo and our hierarchical labelling, we demonstrate a good level of concordance between human and machine classifications. Finally, we show how the technique can be used to identify rarer objects and present lensed galaxy candidates from the CANDELS imaging.
KW - Methods: data analysis
KW - Methods: observational
KW - Methods: statistical
UR - http://www.scopus.com/inward/record.url?scp=85032578083&partnerID=8YFLogxK
U2 - 10.1093/mnras/stx2351
DO - 10.1093/mnras/stx2351
M3 - Article
AN - SCOPUS:85032578083
SN - 0035-8711
VL - 473
SP - 1108
EP - 1129
JO - Monthly Notices of the Royal Astronomical Society
JF - Monthly Notices of the Royal Astronomical Society
IS - 1
ER -