Abstract
Responses of olfactory receptors (ORs) can be predicted by applying machine learning methods to a multivariate encoding of an odorant's chemical structure. Physicochemical descriptors that encode features of the molecular graph are a popular choice for such an encoding. Here, we explore the EVA descriptor set, which encodes features derived from the vibrational spectrum of a molecule. We assessed the performance of Support Vector Regression (SVR) and Random Forest Regression (RFR) in predicting the gradual response of Drosophila ORs. We compared a 27-dimensional variant of the EVA descriptor against a set of 1467 descriptors provided by the eDragon software package, and against a 32-descriptor subset thereof that has been proposed as the basis for an odor metric (HADDAD). The best prediction performance was reproducibly achieved using SVR on the highest-dimensional feature set. The low-dimensional EVA and HADDAD feature sets predicted odor-OR interactions with similar accuracy. Adding charge and polarizability information to the EVA descriptor did not improve the results but rather decreased predictive power. Post-hoc in vivo measurements confirmed these results. Our findings indicate that EVA provides a meaningful low-dimensional representation of odor space, although EVA hardly outperformed "classical" descriptor sets.
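To illustrate how a vibrational spectrum can become a fixed-length feature vector, here is a minimal sketch of the general EVA construction: a Gaussian is placed on each vibrational frequency, the Gaussians are summed, and the sum is sampled on a fixed grid. This is not the authors' implementation. The function name, the Gaussian width `sigma`, the 1-4000 cm^-1 range, the even sampling into 27 points, and the toy frequencies are all assumptions chosen for illustration.

```python
import math


def eva_descriptor(frequencies, sigma=10.0, start=1.0, stop=4000.0, n_bins=27):
    """Return an EVA-style vector for a list of vibrational frequencies (cm^-1).

    Each frequency contributes a normalized Gaussian of width `sigma`;
    the summed curve is sampled at `n_bins` evenly spaced grid points.
    All default values are illustrative assumptions, not values from the paper.
    """
    step = (stop - start) / (n_bins - 1)
    grid = [start + i * step for i in range(n_bins)]
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return [
        sum(norm * math.exp(-((x - f) ** 2) / (2.0 * sigma ** 2)) for f in frequencies)
        for x in grid
    ]


# Hypothetical normal-mode frequencies; real ones would come from a
# vibrational (normal-mode) calculation for the odorant molecule.
toy_frequencies = [650.0, 1100.0, 1720.0, 2950.0]
vector = eva_descriptor(toy_frequencies, sigma=150.0)
print(len(vector))
```

Because every molecule is mapped onto the same grid, molecules with different numbers of vibrational modes yield vectors of equal length, which is what allows regressors such as SVR or RFR to be trained on them.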
Original language | English |
---|---|
Pages (from-to) | 855-865 |
Number of pages | 11 |
Journal | Molecular informatics |
Volume | 32 |
Issue number | 9-10 |
Publication status | Published - 2 Oct 2013 |