SciELO - Scientific Electronic Library Online

 

Revista Colombiana de Química

Print version ISSN 0120-2804
On-line version ISSN 2357-3791

Abstract

VALENCIA-COLMAN, Laura S. and DAZA C., Edgar E. Recognition of biosynthetic pathways for semiochemicals using machine learning techniques. Rev. Colomb. Quim. [online]. 2022, vol. 51, n. 2, pp. 35-40. Epub Jan 12, 2024. ISSN 0120-2804. https://doi.org/10.15446/rev.colomb.quim.v51n2.101546.

In this work we consider 148 semiochemicals reported for the family Scarabaeidae, whose chemical structures were characterized using a set of 200 molecular descriptors from five different classes. The most discriminating descriptors were selected with three different techniques: Principal Component Analysis, applied to each class of descriptors, and Random Forests and Boruta-Shap, applied to the full set of descriptors. Although the three techniques are conceptually different, they select a similar number of descriptors from each class. We propose a combination of machine learning techniques to search for a structural pattern in the set of semiochemicals and then classify them. The pattern was established from the high degree of membership of a subset of these metabolites in the groups obtained by a clustering method based on fuzzy C-means; the discovered pattern corresponds to the biosynthetic pathway by which the compounds are produced biologically. This first classification was corroborated with Kohonen's self-organizing maps. To classify those semiochemicals whose membership in a biosynthetic pathway was not clearly defined, we built two Multilayer Perceptron models, which showed acceptable performance.
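
The membership-based grouping described in the abstract can be illustrated with a minimal sketch of fuzzy C-means in plain Python. This is not the authors' implementation: the two-dimensional toy points stand in for descriptor vectors, and the fuzzifier `m = 2.0`, the membership threshold of `0.8`, and the function names `fuzzy_c_means` and `split_by_membership` are all assumptions made for illustration. The sketch separates items with a clearly dominant membership from those whose group is ambiguous, which is the kind of split that would leave some compounds for a later classifier.

```python
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Return (centers, memberships) for a list of equal-length feature vectors."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships; each row is normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(n_iter):
        # Centers: mean of the points weighted by membership ** m.
        centers = []
        for k in range(n_clusters):
            w = [u[i][k] ** m for i in range(n)]
            wsum = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / wsum for d in range(dim)]
            )
        # Memberships: normalised inverse distances with exponent 2 / (m - 1).
        new_u = []
        for p in points:
            dist = [
                max(sum((a - b) ** 2 for a, b in zip(p, c)) ** 0.5, 1e-12)
                for c in centers
            ]
            inv = [d ** (-2.0 / (m - 1.0)) for d in dist]
            total = sum(inv)
            new_u.append([v / total for v in inv])
        shift = max(
            abs(new_u[i][k] - u[i][k]) for i in range(n) for k in range(n_clusters)
        )
        u = new_u
        if shift < tol:
            break
    return centers, u


def split_by_membership(u, threshold=0.8):
    """Separate items with a dominant membership from ambiguous ones."""
    confident, ambiguous = {}, []
    for i, row in enumerate(u):
        best = max(range(len(row)), key=row.__getitem__)
        if row[best] >= threshold:
            confident[i] = best
        else:
            ambiguous.append(i)
    return confident, ambiguous


# Hypothetical toy data: two tight groups plus one point lying between them.
points = [
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],
    [2.55, 2.55],
]
centers, u = fuzzy_c_means(points, n_clusters=2)
confident, ambiguous = split_by_membership(u)
print("confident:", confident)
print("ambiguous:", ambiguous)
```

In this toy setting the point lying between the two groups receives comparable memberships in both, so it falls below the threshold and is reported as ambiguous. The threshold is a free choice here; the abstract does not state which value, if any, was used.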

Keywords: Random forests; C-means; molecular descriptors; family Scarabaeidae; multilayer perceptron; neural networks.
