Revista Colombiana de Química
Print version ISSN 0120-2804 On-line version ISSN 2357-3791
Abstract
VALENCIA-COLMAN, Laura S. and DAZA C., Edgar E. Recognition of biosynthetic pathways for semiochemicals using machine learning techniques. Rev.Colomb.Quim. [online]. 2022, vol.51, n.2, pp.35-40. Epub Jan 12, 2024. ISSN 0120-2804. https://doi.org/10.15446/rev.colomb.quim.v51n2.101546.
In this work we consider 148 semiochemicals reported for the family Scarabaeidae, whose chemical structures were characterized using a set of 200 molecular descriptors from five different classes. The most discriminating descriptors were selected with three techniques: Principal Component Analysis, applied to each class of descriptors separately, and Random Forests and Boruta-Shap, applied to the full set of descriptors. Although the three techniques are conceptually different, they select a similar number of descriptors from each class. We propose a combination of machine learning techniques to search for a structural pattern in the set of semiochemicals and then classify them. The pattern was established from the high degree of membership of a subset of these metabolites in the groups obtained by fuzzy C-means clustering; the discovered pattern corresponds to the biosynthetic pathway by which the compounds are produced biologically. This first classification was corroborated with Kohonen's self-organizing maps. To classify the semiochemicals whose membership in a biosynthetic pathway was not clearly defined, we built two Multilayer Perceptron models, which showed acceptable performance.
Keywords: Random forests; C-means; molecular descriptors; family Scarabaeidae; multilayer perceptron; neural networks.
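The fuzzy C-means step described in the abstract, in which compounds with a high degree of membership in one group define the pattern and the rest are left for a later classifier, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the fuzzifier `m=2.0`, the membership threshold of `0.8`, the number of clusters and the synthetic data are all assumptions made for illustration; the abstract does not report the settings actually used.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Plain fuzzy C-means; returns (centers, membership matrix U).

    X is an (n_samples, n_features) array of descriptors.
    m is the fuzzifier (assumed value, m > 1).
    """
    rng = np.random.default_rng(seed)
    # Random initial memberships, normalized so that each row sums to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Cluster centers are membership-weighted means of the samples.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every sample to every center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)  # avoid division by zero
        # Standard membership update: u_ik proportional to d_ik^(-2/(m-1)).
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


def split_by_membership(U, threshold=0.8):
    """Label each sample by its strongest cluster and flag clear cases.

    The threshold is a hypothetical cut-off, not a value from the paper.
    """
    labels = U.argmax(axis=1)
    confident = U.max(axis=1) >= threshold
    return labels, confident


# Synthetic stand-in for a descriptor matrix: two well-separated blobs.
rng = np.random.default_rng(42)
X_demo = np.vstack([
    rng.normal(0.0, 0.3, size=(20, 2)),
    rng.normal(5.0, 0.3, size=(20, 2)),
])
centers_demo, U_demo = fuzzy_c_means(X_demo, n_clusters=2)
labels_demo, confident_demo = split_by_membership(U_demo)
```

In this sketch, samples where `confident_demo` is `False` would be the ones passed on to a supervised model, in the same spirit as the Multilayer Perceptrons mentioned in the abstract.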