Revista Colombiana de Estadística
Print version ISSN 0120-1751
Rev.Colomb.Estad. vol.35 no.spe2 Bogotá June 2012
1Universidad Nacional de Colombia, Departamento de Estadística, Bogotá, Colombia. Ph.D. student. Email: jedavilas@unal.edu.co
2Universidad Nacional de Colombia, Departamento de Estadística, Bogotá, Colombia. Professor. Email: lalopezp@unal.edu.co
3Universidad Nacional de Colombia, Departamento de Estadística, Bogotá, Colombia. Professor. Email: lgdiazm@unal.edu.co
We introduce a new approach for modeling multivariate overdispersed binomial data arising from a plant pathogen complex. After recalling some theoretical foundations of generalized linear models (GLMs) and copula functions, we show how the latter can be used to model correlated observations and overdispersed data. We illustrate this approach with fungal incidence data in vegetables, analyzed using a Gaussian copula with beta-binomial margins. Compared with classical and generalized linear models, the Gaussian copula model best controls for overdispersion and is less prone to underestimating standard errors, the major cause of incorrect inference in the statistical analysis of plant pathogen complexes.
Key words: Epidemiological methods, Extra-binomial variation, Multivariate data.
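As a rough illustration of the construction named in the abstract, the following Python sketch simulates two correlated incidence counts by pushing latent Gaussian draws through beta-binomial quantile functions. It is not the authors' implementation: the function name `sample_gaussian_copula_betabinom`, the number of plants per plot, the shape parameters, and the latent correlation of 0.6 are all hypothetical values chosen only for demonstration.

```python
import numpy as np
from scipy import stats


def sample_gaussian_copula_betabinom(n_trials, a, b, corr, size, seed=0):
    """Draw correlated beta-binomial counts through a Gaussian copula.

    n_trials, a, b: per-margin beta-binomial parameters (hypothetical values).
    corr: correlation matrix of the latent Gaussian vector.
    """
    rng = np.random.default_rng(seed)
    dim = len(n_trials)
    # 1. Latent multivariate normal draws carry the dependence structure.
    z = rng.multivariate_normal(np.zeros(dim), corr, size=size)
    # 2. Probability integral transform: each column becomes Uniform(0, 1).
    #    Clipping keeps the quantile function away from its end points.
    u = np.clip(stats.norm.cdf(z), 1e-12, 1 - 1e-12)
    # 3. Invert each beta-binomial CDF to obtain the counts.
    counts = np.column_stack([
        stats.betabinom.ppf(u[:, j], n_trials[j], a[j], b[j])
        for j in range(dim)
    ])
    return counts.astype(int)


# Hypothetical setting: two pathogens, 20 plants inspected per plot.
n_trials = [20, 20]
a, b = [2.0, 3.0], [6.0, 5.0]
corr = np.array([[1.0, 0.6], [0.6, 1.0]])
y = sample_gaussian_copula_betabinom(n_trials, a, b, corr, size=5000)

p = a[0] / (a[0] + b[0])                 # marginal mean incidence
binom_var = n_trials[0] * p * (1 - p)    # variance a plain binomial would imply
empirical_var = y[:, 0].var()            # exceeds binom_var (overdispersion)
empirical_corr = np.corrcoef(y.T)[0, 1]  # positive dependence between margins
print(empirical_var, binom_var, empirical_corr)
```

In this simulated setting the empirical variance of each margin is well above the binomial variance, which is the extra-binomial variation that a plain binomial GLM would ignore and that leads to understated standard errors.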
Full text available in PDF
This article can be cited in LaTeX using the following BibTeX entry:
@ARTICLE{RCEv35n2a05,
AUTHOR = {Dávila, Eduardo and López, Luis Alberto and Díaz, Luis Guillermo},
TITLE = {{A Statistical Model for Analyzing Interdependent Complex of Plant Pathogens}},
JOURNAL = {Revista Colombiana de Estadística},
YEAR = {2012},
volume = {35},
number = {2},
pages = {255-270}
}