1. Introduction
In recent years, mortality rates from malaria have increased across the national territory. Malaria is a disease caused by protozoa of the genus Plasmodium and transmitted by vectors 1. According to the Ministry of Health, four species of this parasite (P. falciparum, P. vivax, P. malariae and P. ovale) affect approximately 12 million people 1.
The National Institute of Health (INS), in its public health security protocol, states that about 85% of the national rural territory lies below 1,600 meters above sea level and has ideal geographic, climatic and epidemiological conditions for the transmission of this disease 2. Statistics for the last decade show an annual average of 130 to 150 deaths, with the Pacific region of the country the most affected 3.
Deficiencies in control programs, the socioeconomic conditions of a highly vulnerable population, and unsatisfied basic needs contribute to the gradual increase in malaria transmission 4. The INS 2014 epidemiological bulletin reports the number and percentage of cases registered throughout the national territory; the department of Chocó registered more than 11,000 cases, or 41.41% of the total, which makes it a focal point for the spread of the disease 5. The data required for this analysis were collected through Sivigila 2.
Several sources address these variables and their spatial relationship, among them the figures provided by DANE, the INS bulletins and, specifically, a biomedical journal article on malaria epidemiology in Colombia 6. Based on these studies and the data collected, it was considered important to perform a spatial statistical analysis to test the hypothesis that malaria has a spatial relationship with physical and health conditions, and that the closer a study area is to an infection zone, the greater the probability of infection.
In this way, the main objective of the research was to implement the spatial statistical model that best explains the dependence between the malaria observations of 2016 and the explanatory variables that may influence the occurrence of the disease, and to generate maps that stratify the zones by number of occurrences and by the risk of spread of the infection in the department of Chocó.
2. Methodology
2.1. Study zone
The department of Chocó is located in western Colombia, in the Pacific region, between latitudes 4°00'50" and 8°42'32" north and longitudes 76°2'57" and 77°53'38" west (Figure 1). The departmental territory lies within the zone of equatorial climates, characterized by high rainfall; the temperature of its valleys and coastal lowlands is higher than 27 °C and is usually accompanied by high relative humidity (90%) 7. The department contains national natural parks (Utría, Katíos and Tatamá). The majority of the population is black, descended from the African slaves brought during colonial times for mining. The department has 82 indigenous reserves (resguardos), 6 of which are shared with the department of Valle del Cauca 8.
The territory is made up of the basins of the Atrato, San Juan and Baudó rivers. Within this geographical framework, the following physiographic units are distinguished: the coastal strip, divided by Cabo Corrientes, which is considered the most important landform of the Pacific coast. The hydrographic system is one of the most abundant and interesting in the country 9; in addition to the Atrato, San Juan and Baudó rivers, the Andágueda, Bebará, Bebaramá, Bojayá, Docampadó, Domingodó, Munguidó and Opogodó rivers are important 10. Chocó has 30 municipalities, 147 corregimientos and 135 police inspections. The most important cities are, in order, Quibdó, Istmina, Condoto, Acandí and Bahía Solano 11.
The department is bounded to the north by Panama and the Caribbean Sea, to the east by the departments of Antioquia, Risaralda and Valle del Cauca, and to the west by the Pacific Ocean. According to DANE, in 2005 it had an area of 46,530 km², a population of 3,657,821 inhabitants and a capital with 500,093 inhabitants 12.
2.2 Data
According to the INS, various physical, economic and social conditions help this disease spread and affect some areas more than others 5. For this reason, variables that can provide significant information for the spatial modeling of the disease were included. First, height above sea level conditions the survival of the transmitting mosquito species 4, since at higher altitudes the probability of the vector's presence is lower. Height usually goes hand in hand with another physical variable, temperature, which may play a fundamental role in the reproduction of mosquito larvae by providing ideal conditions for their development, as has been shown in some African countries 13. Precipitation and relative humidity (the latter as average annual values), as well as the area covered by forest, create an ideal environment for the incubation, proliferation and reproduction of the insects that transmit malaria 14.
There are also social and economic factors which, depending on how they are managed, can slow or accelerate the transmission of the disease by generating suitable habitats for the transmitting mosquito. Sewerage and aqueduct coverage stand out, since they directly affect the quality of the population's water resources, as does the level of Unsatisfied Basic Needs (%), according to which 79.19% of the total population of the department lives in precarious conditions 15.
The database was built from the land use planning scheme (EOT) of each municipality of the department of Chocó, since one of the first components of the formulation study in these documents is the description of the basic variables of the municipality. The most important of these sources are the POT of Quibdó 16 and those of the municipalities of Tadó 17, Lloró 18, Riosucio 19 and Istmina 20. The data for the other municipalities were taken from the EOT of each of them.
Figure 2 shows the boxplot of the distribution of malaria cases in the department. There are very high outliers (values above 1,100) beyond the upper end of the box; these values occur in municipalities in the south of the department, while the lower values occur in the north. A value within the interquartile range corresponds to the number of deaths in the municipality of Riosucio (see arrow), which illustrates the link between the graph and the municipalities of the department.
Table 1 shows the nomenclature used hereafter for each of the study variables considered in the following analyses.
2.3 Preparation of Malaria Maps
The standardized morbidity rate (SMR) was calculated as the quotient between the number of cases observed in each municipality of the department of Chocó and the number of expected cases. The expected cases were obtained by multiplying the population at risk in each municipality by the total incidence rate, which in turn is the quotient between the total number of cases observed and the total population of the 26 selected municipalities of the department. This rate made it possible to detect the municipalities where the number of observed cases was greater or smaller than the number of expected cases.
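The SMR calculation described above can be sketched as follows. This is a minimal illustration only: the function name and the three-area counts are hypothetical and are not data from the study.

```python
def standardized_morbidity_rate(observed, population):
    """Return (expected cases, SMR) per area by indirect standardization."""
    # Total incidence rate: all observed cases over the whole population.
    overall_rate = sum(observed) / sum(population)
    # Expected cases: population at risk times the total incidence rate.
    expected = [overall_rate * p for p in population]
    # SMR: observed cases divided by expected cases.
    smr = [o / e for o, e in zip(observed, expected)]
    return expected, smr


# Hypothetical counts for three areas (NOT data from the study).
observed = [30, 10, 20]
population = [1000, 1000, 2000]
expected, smr = standardized_morbidity_rate(observed, population)
```

An SMR above 1 marks an area with more cases than expected under the overall rate, and an SMR below 1 marks an area with fewer.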
The study of a disease can be approached through disease probability maps, spatial autocorrelation tests, case aggregations and spatial regression models 21. For the probability maps, a function was used that yields, for each municipality, the probability of observing a malaria count as high or as low as the one observed, relative to the expected value. The test used to determine the existence of spatial autocorrelation is the Moran index test (equation 1); to identify the degree of spatial aggregation of each observation, the LISA statistic (Local Indicators of Spatial Association) was evaluated.
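$$I = \frac{n}{S_0}\,\frac{\sum_{i}\sum_{j} w_{ij}\, z_i z_j}{\sum_{i} z_i^{2}} \qquad (1)$$
where n is the number of spatial units, w_ij are the elements of the weight matrix and S_0 is their sum. The observations z are the deviations from the mean (z_i = x_i - x̄), x_i is the value of the variable in a given spatial unit and x_j is the value of the variable in another location, usually one of the neighbors of x_i 22.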
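As an illustration of the global Moran index in its standard form (deviations from the mean, weighted by the neighborhood matrix and scaled by the sum of the weights), the sketch below computes the statistic and a permutation pseudo p-value. It is a sketch under stated assumptions, not the software used in the study: the function names, the four-area chain and its values are hypothetical.

```python
import random


def morans_i(x, w):
    """Global Moran index for values x and a square weight matrix w."""
    n = len(x)
    mean = sum(x) / n
    z = [xi - mean for xi in x]              # deviations from the mean
    s0 = sum(sum(row) for row in w)          # sum of all weights
    cross = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(zi * zi for zi in z)


def permutation_test(x, w, n_sim=999, seed=0):
    """One-sided pseudo p-value for positive spatial autocorrelation."""
    rng = random.Random(seed)
    observed_i = morans_i(x, w)
    values = list(x)
    count = 0
    for _ in range(n_sim):
        rng.shuffle(values)                  # break any spatial structure
        if morans_i(values, w) >= observed_i:
            count += 1
    return observed_i, (count + 1) / (n_sim + 1)


# Hypothetical example: four areas in a chain, neighbors share a border.
w = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
x = [1.0, 2.0, 3.0, 4.0]
moran, p_value = permutation_test(x, w)
```

A positive index with a small pseudo p-value suggests that similar values cluster among neighbors more than random placement would produce.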
Because linear models are not the best approximation for modeling counts of cases of a disease, a Poisson distribution was assumed. It belongs to the family of generalized linear models (GLMs) and is one of the most appropriate for the type of variable used 23. Transformations were then applied to the values so that the variables explain as much of the model as possible and statistically significant variables are obtained at the 0.05 and 0.1 significance levels. Figure 3 shows the generic formulas of the probability distributions that are part of the GLM 24.
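To make the Poisson GLM idea concrete, the following sketch fits a Poisson regression with a log link, one covariate and the expected cases as an offset, using Newton-Raphson on the log-likelihood. It is an assumption-laden illustration rather than the model fitted in the study: the function name, the single covariate and the synthetic data are hypothetical.

```python
import math


def fit_poisson(y, x, expected, n_iter=25):
    """Fit log(mu_i) = log(expected_i) + b0 + b1 * x_i by Newton-Raphson."""
    b0 = math.log(sum(y) / sum(expected))    # start from the overall ratio
    b1 = 0.0
    for _ in range(n_iter):
        mu = [e * math.exp(b0 + b1 * xi) for e, xi in zip(expected, x)]
        # Score vector (gradient of the Poisson log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for yi, mi, xi in zip(y, mu, x))
        # Information matrix X' diag(mu) X for the two parameters.
        h00 = sum(mu)
        h01 = sum(mi * xi for mi, xi in zip(mu, x))
        h11 = sum(mi * xi * xi for mi, xi in zip(mu, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system in closed form.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Synthetic data generated from known coefficients (NOT study data).
x = [0.0, 1.0, 2.0, 3.0]
expected = [10.0, 10.0, 10.0, 10.0]
y = [e * math.exp(0.5 + 0.3 * xi) for e, xi in zip(expected, x)]
b0, b1 = fit_poisson(y, x, expected)
```

With the offset in place, exp(b0 + b1 * x) plays the role of a modeled SMR, so a positive b1 means the covariate raises the relative risk.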
3. Analysis and Discussion of Results
For the selection of the best spatial weights matrix, the autocorrelation tests and the map analysis, the SMR for malaria in 2016 was used together with the explanatory variables that seek to relate the endemic phenomenon to environmental and socioeconomic conditions. The statistical analysis of each variable separately, as well as of their spatial interaction, allows the most suitable spatial criterion to be selected. The Moran index can then be used to assess the possible existence of spatial autocorrelation, which motivates a regression model that identifies the variables that best explain the malaria rate in each municipality of the department of Chocó.
3.1 AIC Criterion: Selection of the spatial weights matrix
To select the best spatial matrix, Akaike's Information Criterion (AIC) was computed for each type of spatial weights neighborhood matrix, which can be binary or standardized. All the criteria evaluated are presented in Table 2, along with their AIC results.
Among the graph-based criteria, the sphere of influence was selected because it had the lowest AIC value and because it takes into account the intersection of the regions in at least two places (Figure 4). The 1-nearest-neighbor criterion and the rook contiguity criterion, in turn, gave the best results for generating the regression models and for the analysis of the Moran scatterplot and of the Moran index itself.
Neighborhood matrices | Criterion | AIC |
---|---|---|
Physical contiguity | Rook | 33.274 |
Physical contiguity | Queen | 26.575 |
Graph-based criteria | Delaunay triangulation | 30.973 |
Graph-based criteria | Sphere of influence | 18.397 |
Graph-based criteria | Gabriel graph | 31.579 |
Graph-based criteria | Relative neighbors | 30.864 |
Distance-based criteria | Nearest neighbor (k = 1) | 32.67 |
Distance-based criteria | Nearest neighbor (k = 2) | 23.31 |
Distance-based criteria | Nearest neighbor (k = 3) | 28.02 |
Distance-based criteria | Nearest neighbor (k = 4) | 32.04 |
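The selection rule can be sketched in a few lines. The AIC values are those reported in Table 2. The English labels and their pairing with the values reflect one reading of the table layout and should be treated as an assumption, as should the helper name `aic`.

```python
def aic(log_likelihood, n_params):
    """Akaike's Information Criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


# AIC per neighborhood criterion, as read from Table 2 (labels assumed).
aic_by_criterion = {
    "rook": 33.274,
    "queen": 26.575,
    "delaunay_triangulation": 30.973,
    "sphere_of_influence": 18.397,
    "gabriel_graph": 31.579,
    "relative_neighbors": 30.864,
    "nearest_neighbor_k1": 32.67,
    "nearest_neighbor_k2": 23.31,
    "nearest_neighbor_k3": 28.02,
    "nearest_neighbor_k4": 32.04,
}

# The preferred matrix is the one with the lowest AIC.
best = min(aic_by_criterion, key=aic_by_criterion.get)
```

Under this reading, the minimum falls on the sphere of influence, which agrees with the selection reported in the text.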
General Histogram
Figure 5a shows the set of municipalities with the lowest standardized morbidity rate, while in Figure 5b the municipality of Bagadó has a high rate, given that the number of cases observed is greater than the number of cases that could be expected for this municipality. At this stage the analysis was applied only to the SMR variable, because the explanatory variables play their role in the modeling. At this point, the exploratory analysis of the malaria data seeks to give a preliminary picture of how the phenomenon behaves by itself in the municipalities of the department of Chocó.
3.2 Spatial Autocorrelation
To verify the initial hypothesis of the presence or absence of spatial dependence, the significant association of similar or dissimilar values among neighboring municipalities of the department of Chocó was examined with Moran's I tests. The global Moran's I test for the SMR yielded a value of 0.4835 from 999 simulations. Since the value is positive, there is positive autocorrelation; it is not very strong, but it reveals the existence of the phenomenon across the municipalities taken as study areas.
Spatial autocorrelation map
Given the statistical results for certain variables, such as those observed in Figure 5, there is evidently some degree of spatial relationship between the incubation of this disease and the fatal cases that occur. With this in mind, the global Moran index was analyzed in order to identify the relationship among the areas in terms of malaria and the dispersion of the points in the analysis of pairs of variables.
The maps generated from the local Moran index can be visualized directly in the units of measurement, which in this case are the municipalities (see Figure 6), and as scatter diagrams, which are explained later with Figure 8. The Moran index yields a graphic representation of the dispersion of the explanatory variables against the response variable, grouped into quadrants (from High to Low according to the quadrant axes, and their combinations). High values of both the phenomenon and the variables that seek to explain it indicate a positive response, that is, a relationship among the municipalities, while very low values indicate that the explanatory variable does not account for malaria in a given municipality.
The municipalities with low-low (LL) values in Figure 6 are Bahía Solano, Unguía, Acandí and Riosucio. Similarly, the municipalities with high-high (HH) values in Figure 6 are Istmina, Unión Panamericana, Tadó, Cantón de San Pablo and Paimadó. This map shows that the high values of malaria are concentrated among neighboring municipalities in the south of the department, while the municipalities to the north, bordering Panama and the ocean, show low values. The local index shows how areas with a low malaria rate are related to other municipalities, while the southern municipalities share environmental variables that create favorable circumstances for the development of the disease.
The Moran index is based on the rook criterion, which best explained the spatial autocorrelation of malaria among the study municipalities. However, the spatial autocorrelation map is not the only result that this statistical index yields. A very valuable map for identifying the spatial distribution of the disease is the probability map presented in Figure 7, in which some municipalities of the department show a significant response in the probability of having high rates of deaths from malaria, evaluated at significance levels below 0.05, 0.01, 0.001 and 0.0001.
The probabilistic measure indicates that municipalities such as Acandí, Riosucio, Tadó and Istmina register spatial autocorrelation at significance levels below 0.01, and the municipality of Unguía below 0.001. Compared with the local Moran spatial autocorrelation map, the municipalities that show a spatial relationship between the variables are quite similar. A high probability is attributed to forest cover, which conditions other terrain variables so that they either favor or exclude the transmitting insects.
Moran Scatterplot
There are several ways to calculate, visualize and interpret the Moran index. Here the scatterplot is presented: this graph shows the position of the point representing each municipality in a Cartesian plane whose quadrants correspond to high-high, high-low, low-low and low-high values (Figure 8).
A high concentration of points is observed in the HH quadrant, consistent with the map in Figure 6, where some municipalities in the south of the department show this behavior. The points are distributed mainly in opposite quadrants and yield a Moran index of 0.48, confirming the spatial relationship of malaria among the municipalities. The municipalities in quadrant one correspond to the area of highest concentration of cases (hot spots), whereas quadrant three contains the northern municipalities, where the cold spots are. A spatial relationship also exists among these northern municipalities, but it reduces the probability of occurrence of malaria cases.
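The quadrant assignment behind a Moran scatterplot can be sketched as follows: each area is labeled by the sign of its own deviation from the mean and the sign of the average deviation of its neighbors (the spatial lag). The function name, the row-standardization of the weights and the four-area example are assumptions made for illustration.

```python
def moran_quadrants(x, w):
    """Label each area HH, HL, LH or LL from its value and its spatial lag."""
    n = len(x)
    mean = sum(x) / n
    z = [xi - mean for xi in x]
    labels = []
    for i in range(n):
        row_sum = sum(w[i])
        # Spatial lag: weighted average of the neighbors' deviations.
        lag = sum(w[i][j] * z[j] for j in range(n)) / row_sum if row_sum else 0.0
        own = "H" if z[i] > 0 else "L"
        neighbors = "H" if lag > 0 else "L"
        labels.append(own + neighbors)
    return labels


# Hypothetical chain of four areas: low values at one end, high at the other.
w = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
quadrants = moran_quadrants([1.0, 2.0, 8.0, 9.0], w)
```

HH areas are candidates for hot spots and LL areas for cold spots; whether a given area is significant still has to be assessed with the local statistic.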
3.3 Poisson Regression Model
Once the existence of spatial autocorrelation had been identified from the scatterplot and the autocorrelation map, a regression model was proposed to identify the variables that explain malaria among the municipalities of the department of Chocó. To identify the candidate variables, each variable was analyzed individually to observe its statistical behavior against the malaria rate.
The individual analysis of the forest cover variable (Figure 9) shows that the data tend to follow a function of the standard deviation with respect to the mean, so that the intercept would fit the data. On this basis, the standard deviation transformation was applied to the forest cover variable before it was entered into the regression model.
Initially, the spatial lag model shows an R-squared of 0.5694 with the unmodified variable; after the standard deviation function is applied, the R-squared increases to 0.6305, which improves the model's explanation of the response variable. The Poisson regression model was defined from the variables considered explanatory for the dependent variable, as previously mentioned. Next, the formula used for the model is presented:
The regression models were selected according to the significance levels of the variables, but above all according to the information explained by the model. Table 3 shows the R-squared of each model, taking into account all the variables proposed to explain malaria in the department of Chocó.
Model | Initial R-squared | Modified R-squared |
---|---|---|
Classic | 0.5509 | 0.63 |
Spatial lag | 0.5305 | 0.53 |
Residuals | 0.5759 | 0.58 |
The statistically significant variables were found to be unsatisfied basic needs and forest cover, the latter after a modification of the original data; with this modification, the R-squared of the model rose from 0.55 to 0.63. Among the variables entered into the model, these two are significant, since their probabilities are below α = 0.05. The other variables, although they affect the model in statistical and spatial terms, need to be transformed before being eliminated from the modeling (Table 4).
Variable | Coefficient | St. Error | Z-Value | Probability |
---|---|---|---|---|
Constant | -1653.91 | 1015.53 | -1.6286 | 0.1033 |
Log_Pob | -135.43 | 239.73 | -0.5649 | 0.5721 |
High | 0.169 | 0.1897 | 0.8908 | 0.373 |
Precipitation | 0.01946 | 0.0226 | 0.8585 | 0.3905 |
Desv_Forest | -0.1257 | 0.05334 | -2.3566 | 0.0184 |
NBI | 30.8457 | 10.0117 | 3.0809 | 0.002 |
CAA | 10.7117 | 7.1895 | 1.4899 | 0.1362 |
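The Z-Value and Probability columns of Table 4 can be checked from the coefficients and standard errors, assuming the usual Wald test with a two-sided normal p-value. The helper name is hypothetical; the inputs are the Desv_Forest and NBI rows of the table.

```python
import math


def wald_z_and_p(coefficient, std_error):
    """Wald z statistic and two-sided p-value under a standard normal."""
    z = coefficient / std_error
    # Two-sided tail probability: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


z_forest, p_forest = wald_z_and_p(-0.1257, 0.05334)   # Desv_Forest row
z_nbi, p_nbi = wald_z_and_p(30.8457, 10.0117)         # NBI row
```

Both rows reproduce the tabulated values to rounding, and both probabilities fall below 0.05.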
Significant variables can account for a large part of the model, but in some cases variables must be transformed to fit the distribution of the data better, so it is not advisable to eliminate variables at the outset only because they are not immediately significant. NBI, for example, can explain much of the model because it is related to CAA, even though CAA itself is not significant. Finalizing a regression model therefore requires an interpretation of the variables, their scope and their limitations within the spatial context addressed.
4. Conclusions
Vector-borne malaria shows a tendency toward spatial autocorrelation, with clusters of high values and, at the opposite extreme, clusters of low values. The dispersion of malaria is an effect of the presence of certain factors that allow the adequate incubation and proliferation of the parasite.
The sanitation factor, expressed as sewerage and water coverage, is significant since it favors the incubation of the mosquito that transmits malaria, as is forest cover, which generates its own conditions for the mosquito's development. Variables such as precipitation and height are linked to the results for forest cover and thus contribute to the model, which follows a Poisson distribution affected by data expressed as percentage rates. Temperature was not significant under any of the regression models, so it was prudent to dispense with it, since the temperature fluctuates considerably throughout the department and could affect the final result of the regression model. The quality of life of the population is a factor that accelerates the proliferation of the infection and of the insects, all the more so in departments where the NBI rate is considerably high.
The study supports the conclusion that the phenomenon is related among municipalities: high values of malaria mortality are related in space, as can be seen in the local Moran map, which is fully comparable with the exploratory statistics chart. Studies like this one highlight current health needs that have not been widely addressed from perspectives such as the geographical one, which can open a new view of the malaria context in Colombia. It is recommended to analyze this type of disease over time and to validate more advanced statistical models with conjugate distributions that can explain the behavior of the data more precisely.