Pix2Prof: fast extraction of sequential information from galaxy imagery via a deep natural language ‘captioning’ model

Michael J Smith, Nikhil Arora, Connor Stone, Stéphane Courteau, James E Geach

Research output: Contribution to journal › Article › peer-review


Abstract

We present ‘Pix2Prof’, a deep learning model that can eliminate any manual steps taken when measuring galaxy profiles. We argue that a galaxy profile of any sort is conceptually similar to a natural language image caption. This idea allows us to leverage image captioning methods from the field of natural language processing, and so we design Pix2Prof as a float sequence ‘captioning’ model suitable for galaxy profile inference. We demonstrate the technique by approximating a galaxy surface brightness (SB) profile fitting method that contains several manual steps. Pix2Prof processes ∼1 image per second on an Intel Xeon E5-2650 v3 CPU, improving on the speed of the manual interactive method by more than two orders of magnitude. Crucially, Pix2Prof requires no manual interaction, and since galaxy profile estimation is an embarrassingly parallel problem, we can further increase the throughput by running many Pix2Prof instances simultaneously. To put this in perspective, Pix2Prof would take under an hour to infer profiles for 10^5 galaxies on a single NVIDIA DGX-2 system, whereas a single human expert would take approximately 2 yr to complete the same task. Automated methodology such as this will accelerate the analysis of the next generation of large area sky surveys, which are expected to yield hundreds of millions of targets. In such instances, all manual approaches – even those involving a large number of experts – will be impractical.
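
The throughput figures quoted in the abstract can be cross-checked with a short back-of-the-envelope calculation. The sketch below is illustrative only and is not taken from the paper: the function names are hypothetical, the CPU rate is rounded to exactly 1 image per second, and the human estimate is treated as 2 yr of continuous wall-clock time, which is an assumption rather than something the abstract states.

```python
# Back-of-the-envelope check of the throughput figures quoted in the abstract.
# All names here are hypothetical; the inputs are the abstract's rounded values.

SECONDS_PER_HOUR = 3600
SECONDS_PER_YEAR = 365.25 * 24 * 3600

N_GALAXIES = 10**5     # sample size quoted in the abstract
CPU_RATE = 1.0         # images per second on one CPU (abstract says "~1")
HUMAN_YEARS = 2.0      # quoted time for one expert; ASSUMED continuous wall-clock


def serial_hours(n_images, rate_per_second):
    """Hours needed to process n_images one after another at a fixed rate."""
    return n_images / rate_per_second / SECONDS_PER_HOUR


def required_speedup(n_images, rate_per_second, target_hours):
    """Parallel speed-up over a single instance needed to finish in target_hours."""
    return serial_hours(n_images, rate_per_second) / target_hours


cpu_hours = serial_hours(N_GALAXIES, CPU_RATE)
speedup_for_one_hour = required_speedup(N_GALAXIES, CPU_RATE, 1.0)
human_seconds_per_galaxy = HUMAN_YEARS * SECONDS_PER_YEAR / N_GALAXIES
human_to_cpu_ratio = human_seconds_per_galaxy * CPU_RATE

print(f"Single CPU instance: {cpu_hours:.1f} h for {N_GALAXIES} galaxies")
print(f"Speed-up needed to finish within 1 h: {speedup_for_one_hour:.1f}x")
print(f"Human time per galaxy (assumed continuous): {human_seconds_per_galaxy:.0f} s")
print(f"Human-to-CPU time ratio: {human_to_cpu_ratio:.0f}x")
```

Under these assumptions, a single CPU instance would need a little over a day for the full sample, so finishing within an hour relies on the parallelism the abstract describes; the human-to-CPU ratio comes out at several hundred, in line with the quoted "more than two orders of magnitude".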
Original language: English
Pages (from-to): 96-105
Number of pages: 10
Journal: Monthly Notices of the Royal Astronomical Society
Volume: 503
Issue number: 1
Publication status: Published - 1 Feb 2021
