Revista Colombiana de Estadística
Print version ISSN 0120-1751
Rev.Colomb.Estad. vol.43 no.1 Bogotá Jan./June 2020 Epub June 05, 2020
https://doi.org/10.15446/rce.v43n1.77542
ORIGINAL RESEARCH ARTICLES
Generalized Poisson Hidden Markov Model for Overdispersed or Underdispersed Count Data
a Department of Statistics, St. Thomas College, Pala, India. PhD. E-mail: sthottom@gmail.com
b Department of Statistics, St. Thomas College, Pala, India. Research Scholar. E-mail: ambilystat06@gmail.com
Hidden Markov Models (HMMs) are well suited to explaining serial dependence in time series of counts. These models assume that the observations are generated from a finite mixture of distributions, with the active component at each time point selected by a Markov chain (MC). The Poisson Hidden Markov Model (P-HMM) is perhaps the most widely used model for such data. In practice, however, it is often not the best choice, because within each state the Poisson distribution forces the variance to equal the mean. In this paper we therefore model the counts with the Generalized Poisson Distribution (GPD), which can accommodate both overdispersion and underdispersion relative to the Poisson model. Combining the GPD with an HMM gives the Generalized Poisson Hidden Markov Model (GP-HMM). Results on simulated data and on a real data set, monthly cases of leptospirosis in the state of Kerala in South India, show good convergence properties and indicate that the GP-HMM fits such data better than the P-HMM.
Key words: EM algorithm; Generalized Poisson distribution; Hidden Markov Model; Overdispersion
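To make the construction concrete, the following is a minimal sketch of a GP-HMM likelihood, not the authors' implementation. It assumes the Consul and Jain form of the GPD, P(X = x) = θ(θ + λx)^(x−1) e^(−θ−λx) / x!, with mean θ/(1 − λ) and variance θ/(1 − λ)³, so that λ > 0 gives overdispersion, λ < 0 gives underdispersion, and λ = 0 recovers the Poisson distribution. The symbols θ and λ, the function names, and the use of a scaled forward recursion are all assumptions introduced for illustration; the abstract does not fix a notation or an algorithmic detail beyond naming the EM algorithm.

```python
import math
import numpy as np


def gpd_logpmf(x, theta, lam):
    """Log pmf of the generalized Poisson distribution (assumed Consul-Jain form).

    theta > 0 is the rate-like parameter and lam < 1 the dispersion parameter.
    For lam < 0 the support is truncated where theta + lam * x <= 0, so the
    probabilities then sum to slightly less than 1 (ignored in this sketch).
    """
    base = theta + lam * x
    if base <= 0:
        return -math.inf  # outside the truncated support
    return (math.log(theta) + (x - 1) * math.log(base)
            - base - math.lgamma(x + 1))


def gp_hmm_loglik(obs, delta, gamma, theta, lam):
    """Log-likelihood of a GP-HMM computed with the scaled forward recursion.

    obs   : sequence of non-negative integer counts
    delta : initial state distribution, length m
    gamma : m x m transition probability matrix (rows sum to 1)
    theta : per-state GPD rate parameters, length m
    lam   : per-state GPD dispersion parameters, length m
    """
    m = len(delta)
    gamma = np.asarray(gamma, dtype=float)
    phi = np.asarray(delta, dtype=float)  # state probabilities before observing
    loglik = 0.0
    for t, x in enumerate(obs):
        if t > 0:
            phi = phi @ gamma  # one step of the hidden Markov chain
        # State-dependent probabilities of the current count.
        p = np.array([math.exp(gpd_logpmf(x, theta[j], lam[j]))
                      for j in range(m)])
        v = phi * p
        s = v.sum()  # conditional probability of x given the past
        loglik += math.log(s)
        phi = v / s  # rescale to avoid numerical underflow
    return loglik


# Illustrative two-state model with made-up parameter values.
counts = [0, 2, 1, 7, 9, 4, 1, 0]
delta = [0.5, 0.5]
gamma = [[0.9, 0.1],
         [0.2, 0.8]]
print(gp_hmm_loglik(counts, delta, gamma, theta=[1.0, 4.0], lam=[0.1, 0.3]))
```

An EM or direct numerical optimizer would maximize this quantity over the transition matrix and the per-state (θ, λ) pairs. Setting every λ to zero reduces the model to the P-HMM, which gives a simple way to compare the two fits on the same series.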