SciELO - Scientific Electronic Library Online

 

Revista de investigación e innovación en ciencias de la salud

On-line version ISSN 2665-2056

Abstract

MORIKAWA, Mateus; HERNANE SPATTI, Danilo and DAJER, María Eugenia. Wavelet packet transform and multilayer perceptron to identify voices with a mild degree of vocal deviation. Rev. Investig. Innov. Cienc. Salud [online]. 2022, vol.4, n.1, pp.16-25. Epub June 06, 2022. ISSN 2665-2056. https://doi.org/10.46634/riics.126.

Introduction:

Laryngeal disorders are characterized by a change in the vibratory pattern of the vocal folds. Such a disorder may have an organic origin, involving anatomical modification of the folds, or a functional origin, caused by vocal abuse or misuse. The most common diagnostic methods rely on invasive imaging procedures that cause patient discomfort. In addition, mild voice deviations do not stop individuals from using their voices, which makes the problem difficult to identify and increases the possibility of complications.

Aim:

For those reasons, the goal of the present paper was to develop a noninvasive alternative for identifying voices with a mild degree of vocal deviation by applying the Wavelet Packet Transform (WPT) and a Multilayer Perceptron (MLP), a type of Artificial Neural Network (ANN).

Methods:

A dataset of 74 audio files was used. Shannon energy and entropy measures were extracted using the Daubechies 2 and Symlet 2 wavelet families, and the processing step was then performed with the MLP ANN.
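
As an illustration of the feature-extraction step, the following is a minimal sketch of a wavelet packet decomposition with the Daubechies 2 filter, followed by per-node energy and entropy measures. The abstract does not give the formulas, the decomposition level, or the boundary handling, so several choices here are assumptions: energy is taken as the sum of squared coefficients in each terminal node, entropy as the Shannon entropy of the normalized squared coefficients, the level is set to 3, and the signal is extended periodically. All function names are hypothetical and are not taken from the paper.

```python
import math

# Daubechies 2 scaling (low-pass) filter coefficients, 4 taps.
SQRT3 = math.sqrt(3.0)
NORM = 4.0 * math.sqrt(2.0)
LOW = [(1 + SQRT3) / NORM, (3 + SQRT3) / NORM,
       (3 - SQRT3) / NORM, (1 - SQRT3) / NORM]
# Quadrature mirror (high-pass) filter: g[k] = (-1)^k * h[L-1-k].
HIGH = [((-1) ** k) * LOW[len(LOW) - 1 - k] for k in range(len(LOW))]


def dwt_step(x):
    """One filter-and-downsample step with periodic extension (assumed)."""
    n = len(x)
    if n < 4 or n % 2:
        raise ValueError("node length must be even and at least 4")
    taps = range(len(LOW))
    approx = [sum(LOW[k] * x[(2 * i + k) % n] for k in taps)
              for i in range(n // 2)]
    detail = [sum(HIGH[k] * x[(2 * i + k) % n] for k in taps)
              for i in range(n // 2)]
    return approx, detail


def wavelet_packet_leaves(x, level):
    """Split every node (approximation and detail) at each level.

    Returns the 2**level terminal nodes in natural (filter-bank) order,
    which is not the same as frequency order.
    """
    nodes = [list(x)]
    for _ in range(level):
        next_nodes = []
        for node in nodes:
            approx, detail = dwt_step(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return nodes


def node_energy(coeffs):
    """Assumed energy measure: sum of squared coefficients."""
    return sum(v * v for v in coeffs)


def shannon_entropy(coeffs):
    """Assumed entropy measure: -sum(p * ln p), p = normalized squared coeffs."""
    total = node_energy(coeffs)
    if total == 0.0:
        return 0.0
    entropy = 0.0
    for v in coeffs:
        p = (v * v) / total
        if p > 0.0:
            entropy -= p * math.log(p)
    return entropy


def wpt_features(x, level=3):
    """Return (energies, entropies), one value per terminal node."""
    leaves = wavelet_packet_leaves(x, level)
    return ([node_energy(c) for c in leaves],
            [shannon_entropy(c) for c in leaves])


# Usage on a synthetic tone (a stand-in for a voice recording).
signal = [math.sin(2 * math.pi * 5 * t / 64) for t in range(64)]
energies, entropies = wpt_features(signal, level=3)
```

Each recording would yield one vector of energies and one of entropies, and either vector could serve as the input to an MLP classifier. Because the filter bank in this sketch is orthogonal, the node energies sum to the energy of the input signal, which gives a quick sanity check on the decomposition.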

Results:

The Symlet 2 family generalized better, obtaining 99.75% and 99.56% accuracy with the Shannon energy and entropy measures, respectively. The Daubechies 2 family, however, obtained lower accuracy rates: 91.17% and 70.01%, respectively.

Conclusion:

The combination of WPT and MLP presented high accuracy for the identification of voices with a mild degree of vocal deviation.

Keywords: voice; voice disorder; voice classification; voice deviation; artificial neural network; multilayer perceptron; wavelet packet transform; dysphonia; laryngeal diseases; vocal cords.
