1. Introduction
Droughts are temporary natural events of water insufficiency, manifested in the inability to satisfy the demands of human activities and the environment [1]. Beyond this definition, droughts are also classified according to their impacts on specific sectors [2] and the physical processes involved [3] into meteorological, hydrological, agricultural, and socioeconomic droughts, among others. This paper focuses on meteorological and hydrological droughts, whose analysis requires only hydro-climatic data.
On the one hand, meteorological droughts express the lack of precipitation over a region in a given period of time, where the deficit is evaluated with respect to average values [2]. On the other hand, hydrological droughts are defined as a significant decrease in the availability of water within the hydrological cycle [4] (insufficient volumes of surface and subsurface water) and are usually associated with scarcity, in terms of deficits for human consumption [5].
Drought occurrence has a direct impact on domestic water supply, affects food security and public health, and can alter economic sectors that depend on water, such as irrigation, hydroelectric production, and agro-industry [6]. For this reason, droughts are considered a priority among natural hazards: in the twentieth century, they caused the greatest economic impact of all natural hazards [7].
Some of the deficiencies in drought management include limited in-situ information and the lack of tools for forecasting and monitoring [8]. Therefore, it is urgent to improve regional monitoring, the databases used, and the early warning systems [1].
Accordingly, in this paper we explore two global reanalysis databases from the eartH2Observe project and assess their suitability for studies of meteorological and hydrological droughts in Colombia, through the calculation of drought indicators. From the many existing drought indices, we selected three: the Standardized Precipitation Index (SPI), the Standardized Precipitation Evapotranspiration Index (SPEI) and the Water Crowding Index (WCI). These indices were chosen because they are easy to apply, have no restrictions of scale or topography, and are recommended by the World Meteorological Organization (WMO) [9].
In addition, to assess how the choice of information source influences the results, the uncertainty associated with the different data inputs was quantified.
2. Case study
The Magdalena-Cauca river basin (MCRB) is one of the most important watersheds in Colombia, due to its physical and socioeconomic characteristics. It has a total area of 273,459 km², equivalent to 24% of the Colombian territory, and concentrates 80% of the GDP and 80% of the Colombian population.
The Andes Mountain Range defines the hydrographic system of the MCRB by forming two valleys, corresponding to the Cauca River (1,015 km) and the Magdalena River (1,550 km). These rivers drain from south to north to the mouth in the Caribbean Sea, at Bocas de Ceniza, as shown in Fig. 1.
Climates in the MCRB are very diverse, due partly to its large area, the complex relief, and the considerable altitudinal differences. The average annual precipitation (P) in the MCRB is around 2,050 mm, with maximum monthly values of 260 mm in October (Fig. 2). The southern and middle parts of the basin have a bimodal precipitation regime, with two periods of high and low rainfall and wetter conditions in April-May and October-November. In contrast, the northern part of the basin, near the mouth, has a single wet season in the second half of the year (May-November), characteristic of a unimodal rainfall regime.
The average potential evapotranspiration (PET) in the MCRB is close to 1,570 mm per year (monthly distribution shown in Fig. 2). The lowest values occur in the upland areas, around 880 mm per year, and the highest in the lowlands, close to 1,900 mm per year (Fig. 3).
3. Data and methods
This section describes the three databases considered, the three drought indicators used, and the evaluation approach adopted, in order to evaluate the use of global data from the eartH2Observe project for drought analysis.
3.1. Meteorological data
The use of series of at least 30 years as a reference for the calculation of climatic anomalies, under the assumption of stationary conditions, has been reported in several studies [10, 11, 12]. In this sense, the three databases considered in this study (summarized in Table 1) were accumulated in monthly time steps for the period 1980-2010 (31 years), since it corresponds to the current period of the climatological normal, defined by the WMO [13].
3.1.1. Observed or in-situ time series
In-situ information, used as a benchmark, was provided by the Institute of Hydrology, Meteorology and Environmental Studies of Colombia (IDEAM, for its acronym in Spanish). It included daily time series of precipitation (905 stations), maximum temperature (468 stations), and minimum temperature (259 stations), whose locations are shown in Fig. 4.
These daily time series were pre-processed within the eartH2Observe project, considering the amount of missing and atypical data. Since the in-situ information comes from records for specific stations, we created daily spatialized fields for the distributed analysis. For this purpose, daily in-situ precipitation values (P) were interpolated, using the geo-statistical method of Kriging with External Drift and taking as secondary variable the elevation above sea level from a Digital Elevation Model (DEM) of 30 m resolution [14].
Likewise, daily temperature values (T) were interpolated using CoKriging, also taking altitude as a secondary variable [14]. In this manner, we obtained spatial maps of P and T with a spatial resolution of 0.1°, and a total of 2215 cells or pixels within the MCRB domain. In both cases, we used cross-validation to identify the best interpolation method.
Due to the simplicity and good performance of the Hargreaves equation (in comparison with other evaluated evapotranspiration equations) [15], it was used to estimate distributed daily potential evapotranspiration (PET), with the same spatial resolution, accumulated on monthly basis.
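As an illustration of the PET step, the following is a minimal Python sketch of the Hargreaves equation in its common Hargreaves-Samani form, PET = 0.0023 · Ra · (Tmean + 17.8) · (Tmax − Tmin)^0.5. The text does not state which variant or radiation formulation was used, so the coefficient set and the FAO-56 expressions for extraterrestrial radiation (Ra) are assumptions here, and the function names are hypothetical.

```python
import math

def extraterrestrial_radiation(lat_deg, day_of_year):
    """Daily extraterrestrial radiation Ra in mm/day (evaporation equivalent).

    Uses the FAO-56 expressions; this choice is an assumption of the sketch.
    """
    phi = math.radians(lat_deg)
    # Inverse relative Earth-Sun distance and solar declination
    dr = 1.0 + 0.033 * math.cos(2.0 * math.pi * day_of_year / 365.0)
    delta = 0.409 * math.sin(2.0 * math.pi * day_of_year / 365.0 - 1.39)
    # Sunset hour angle
    ws = math.acos(-math.tan(phi) * math.tan(delta))
    ra_mj = (24.0 * 60.0 / math.pi) * 0.0820 * dr * (
        ws * math.sin(phi) * math.sin(delta)
        + math.cos(phi) * math.cos(delta) * math.sin(ws)
    )  # MJ m-2 day-1
    return 0.408 * ra_mj  # convert MJ m-2 day-1 to mm/day

def hargreaves_pet(tmax, tmin, lat_deg, day_of_year):
    """Daily PET in mm/day from maximum and minimum temperature in deg C."""
    tmean = 0.5 * (tmax + tmin)
    trange = max(tmax - tmin, 0.0)  # guard against inconsistent inputs
    ra = extraterrestrial_radiation(lat_deg, day_of_year)
    return 0.0023 * ra * (tmean + 17.8) * math.sqrt(trange)

# Hypothetical cell at 5 deg N on day 100 of the year
pet_example = hargreaves_pet(tmax=30.0, tmin=20.0, lat_deg=5.0, day_of_year=100)
```

Daily values computed this way would then be summed by month to obtain the monthly PET fields described above.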
3.1.2. WATCH Forcing Data Methodology Applied to ERA-Interim (WFDEI)
The eartH2Observe project had two phases of development, producing two sets of hydrometeorological products. The first one used the meteorological forcing dataset WFDEI with spatial resolution of 0.5°, generated by bilinear interpolation for the period 1979-2012, at various time scales [16]. It includes variables of wind speed at 10 m height, atmospheric pressure, temperature at 2 m height, long and short wave radiation, precipitation and snow. However, in this work snow was not considered.
The WFDEI dataset was constructed by applying the EU WATCH project methodology [17] to the ERA-Interim dataset. ERA-Interim is a global meteorological reanalysis developed by the European Centre for Medium-Range Weather Forecasts (ECMWF), using a sequential data assimilation system. It combines observed data with simulations from a forecasting model to estimate the evolution of the atmosphere globally. Based on this, the ECMWF applied a simulation of the global terrestrial water cycle in the 20th century through a set of hydrological models and their inter-comparison, finally applying a sequential elevation correction and a monthly bias correction [18].
From the meteorological variables of the WFDEI, researchers from Deltares in The Netherlands calculated PET globally using four different equations: Penman-Monteith, Priestley-Taylor, Hargreaves, and Blaney-Criddle [19]. The estimates made with the Hargreaves equation showed the greatest similarity with those of the WorldClim PET [20], so these were selected for the analysis of droughts in this study.
3.1.3. Multi-Source Weighted-Ensemble Precipitation (MSWEP)
During the second phase of the eartH2Observe project, a second set of hydrometeorological data was prepared for the period 1974-2014, with precipitation and snow coming from the MSWEP [21] product. The other variables were derived from the ERA-Interim dataset, with a series of applied corrections [22].
The MSWEP data set is a distributed global precipitation product, constructed from ground station data, ground radar remote sensing, satellite-derived information and atmospheric models. It is intended mainly for hydrological modeling and is currently available at spatial resolutions of 0.25° and 0.1°, for several time scales: monthly, daily and 3-hourly [21].
MSWEP merges these sources of information through a procedure that can be summarized in four steps: 1) a long-term, bias-corrected climatic mean is derived; 2) several satellite and reanalysis precipitation data sets are evaluated in terms of temporal variability; 3) the long-term climatic mean is temporally downscaled in a stepwise manner, first to the monthly and then to the daily time scale; and 4) it is finally downscaled to the 3-hourly scale using weighted averages of precipitation anomalies derived from the different data sets [21].
As with the products of the first phase, Deltares calculated the daily potential evapotranspiration at a global scale using the Hargreaves equation [19].
3.2. Population data
Gridded population data for the MCRB were created by extrapolating the raw census of the target years 1993, 2005, and the projections for 2010, from the public information of the Department of Statistics in Colombia (DANE, for its acronym in Spanish) [23].
First, we created a base map for the 2005 population, by interpolating the census values and incorporating additional geographic data, in order to produce weighting matrices for determining how to apportion population by pixel. Then, assuming that the population is growing with an exponential model, the rates of growth were calculated by municipality using the 1993, 2005 and 2010 data. Finally, maps of annual population were projected throughout the study period, from 1981 to 2010.
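The exponential projection can be sketched in Python as follows. The text does not specify how the three reference years were combined into a single rate per municipality, so this sketch assumes a rate derived from two reference years; the function names and the population figures are hypothetical.

```python
import math

def growth_rate(pop_start, pop_end, year_start, year_end):
    """Exponential growth rate r such that pop_end = pop_start * exp(r * dt)."""
    return math.log(pop_end / pop_start) / (year_end - year_start)

def project_population(pop_ref, rate, year_ref, year):
    """Population in `year`, projected forward or backward from a reference year."""
    return pop_ref * math.exp(rate * (year - year_ref))

# Hypothetical municipality: 10,000 inhabitants in 1993 and 12,000 in 2005
rate = growth_rate(10_000, 12_000, 1993, 2005)
pop_2000 = project_population(12_000, rate, 2005, 2000)
```

Applying the per-municipality rate to each pixel of the 2005 base map would yield one population map per year of the study period.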
3.3. Drought indices
Drought indicators simplify information about meteorological variables, such as precipitation, to describe how they change over a period of time [24]. These indicators assess whether a region is experiencing a drought and quantify its severity. They are also useful for monitoring and mapping regional water supply trends, both in time and in space.
Although no indicator is better than another, some indices are more appropriate, depending on the region, the type of drought to analyze, the available information and the objective of the study. The indicators chosen to describe the drought at the regional level in the MCRB are described below.
3.3.1. Standardized Precipitation Index (SPI)
McKee et al. [25] described the SPI as an indicator that determines the rarity of a drought or an anomalously wet event at a particular time scale (e.g. 1, 3, 6, 12 or 24 months), for any location with a continuous precipitation record. Since its definition, the SPI has been widely used to characterize drought events. The SPI is calculated by normalizing the rainfall series according to a statistical distribution function. The time series used must be long-term, with at least 30 years of data, and may include missing data, as long as the gaps are not statistically significant.
The SPI results are classified using a qualitative scale in which positive and negative values correspond to wet and dry events, respectively. A drought event is identified when the SPI value is equal to or less than -1.0 [25].
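The standardization idea can be sketched in Python as follows. To stay within the standard library, the sketch replaces the fitted statistical distribution (commonly a gamma distribution for the SPI) with empirical plotting positions, so it is a nonparametric stand-in rather than the procedure of [25]. The function names are hypothetical, and ties and missing data are not handled.

```python
from statistics import NormalDist

def rolling_sum(values, window):
    """k-month accumulation; the first window-1 entries are None."""
    out = [None] * (window - 1)
    for i in range(window - 1, len(values)):
        out.append(sum(values[i - window + 1 : i + 1]))
    return out

def standardized_index(sample):
    """Map each value to a standard-normal quantile via rank / (n + 1)."""
    n = len(sample)
    order = sorted(range(n), key=lambda i: sample[i])
    z = [0.0] * n
    normal = NormalDist()
    for rank, idx in enumerate(order, start=1):
        z[idx] = normal.inv_cdf(rank / (n + 1))
    return z

def spi_empirical(monthly_precip, scale=3):
    """Standardize the accumulated series separately for each calendar month."""
    acc = rolling_sum(monthly_precip, scale)
    spi = [None] * len(monthly_precip)
    for month in range(12):
        idx = [i for i in range(month, len(monthly_precip), 12) if acc[i] is not None]
        for i, z in zip(idx, standardized_index([acc[i] for i in idx])):
            spi[i] = z
    return spi

def is_drought(spi_value, threshold=-1.0):
    """Drought condition as defined in the text: index <= -1.0."""
    return spi_value is not None and spi_value <= threshold
```

Standardizing each calendar month separately removes the seasonal cycle, so a value of -1.0 has the same meaning in a wet month as in a dry one.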
3.3.2. Standardized Precipitation Evapotranspiration Index (SPEI)
The main characteristic of the SPI is that it only requires precipitation data to identify different types of droughts (depending on the temporal scale used). This can be an advantage due to its practicality, but without considering the temperature, the results may not correctly represent the water balance in a region. In order to have a more robust indicator, Vicente-Serrano et al. [26] modified the SPI, by switching the precipitation information with the water balance (B = P - PET) time series, thus creating the SPEI.
The SPEI therefore evaluates the water balance rather than precipitation alone. Compared with the SPI, the SPEI requires additional climatic variables for the calculation of potential evapotranspiration, such as temperature, radiation, vapor pressure or wind velocity, depending on the equation chosen. The SPI and SPEI are classified using the ranges shown in Table 2.
3.3.3. Water Crowding Index (WCI)
To define hydrological drought (as an alteration in water availability), some authors have focused on quantifying the scarcity of water for human consumption generated by drought events. Thus, Malin Falkenmark developed a series of analyses of scarcity from a hydrological perspective [27], defining the WCI according to eq. (1).
The WCI is calculated as the ratio between the population in a region (demand) and the annual volume of available water (supply), where the availability of water is understood as the precipitation that does not return to the atmosphere, e.g. surface runoff, available flow in a channel or aquifer recharge, depending on the scale of analysis [28]. In this study, runoff (R) represents the water available, calculated as precipitation (P) minus actual evapotranspiration (AET): R = P - AET. The AET was estimated with the Budyko equation [29].
WCI classification (Table 3) evaluates how many people can benefit from each unit of available water, understanding a unit of water as one million cubic meters per year.
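The WCI chain described above (AET from Budyko, runoff as R = P - AET, then people per unit of water) can be sketched in Python. The text cites the Budyko equation without giving its form, so the classical curve AET/P = [φ · tanh(1/φ) · (1 − exp(−φ))]^0.5 with φ = PET/P is an assumption here, as are the function names and the population figure.

```python
import math

def budyko_aet(p_mm, pet_mm):
    """Annual actual evapotranspiration (mm) from the classical Budyko curve."""
    if p_mm <= 0.0 or pet_mm <= 0.0:
        return 0.0
    phi = pet_mm / p_mm  # aridity index
    return p_mm * math.sqrt(phi * math.tanh(1.0 / phi) * (1.0 - math.exp(-phi)))

def water_crowding_index(population, p_mm, pet_mm, area_km2):
    """People per million cubic meters of annual available water (runoff)."""
    runoff_mm = p_mm - budyko_aet(p_mm, pet_mm)
    # 1 mm over 1 km2 equals 1,000 m3, i.e. 1e-3 million m3
    volume_mcm = runoff_mm * area_km2 * 1e-3
    if volume_mcm <= 0.0:
        return math.inf  # no available water in this cell and year
    return population / volume_mcm

# Basin-average P and PET from the text; the population value is hypothetical
aet_basin = budyko_aet(2050.0, 1570.0)
wci_basin = water_crowding_index(30_000_000, 2050.0, 1570.0, 273_459.0)
```

Because the index is not standardized, cells with large populations and modest runoff yield values several orders of magnitude above those of rural cells.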
3.4. Methodological approach
To evaluate the eartH2Observe reanalysis products, for the drought analysis in the MCRB, the P and PET daily distributed data were re-gridded using bilinear interpolation to a 0.1° spatial resolution and accumulated monthly. Then, the distributed monthly values of AET and B were calculated.
Once the meteorological and demographic inputs were produced on the same spatial scale, the next phase consisted of calculating the three drought indicators with the three input databases (in-situ, WFDEI, and MSWEP). The SPI and SPEI were calculated at 1, 3, 12 and 24 months, while the WCI was estimated annually. With these results, we characterized the dry events identified by the in-situ data set and compared them with the WFDEI and MSWEP results, using the root-mean-square error (RMSE) and the Spearman correlation (ρ).
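The two comparison metrics can be sketched in plain Python. This version computes the Spearman coefficient as the Pearson correlation of average ranks, which handles ties; the function names are hypothetical, and both series are assumed to be complete and of equal length.

```python
import math

def rmse(reference, estimate):
    """Root-mean-square error between two equally long series."""
    n = len(reference)
    return math.sqrt(sum((r - e) ** 2 for r, e in zip(reference, estimate)) / n)

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed series."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0.0 or var_y == 0.0:
        return float("nan")  # undefined for a constant series
    return cov / math.sqrt(var_x * var_y)
```

Applied cell by cell to the in-situ and reanalysis index series, these functions would produce one ρ map and one RMSE map per index and time scale.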
The last methodological phase of the study was to evaluate the uncertainty of the results, due to the use of several different meteorological information sources, applying the methodology proposed by Hu et al. [30]. This methodology quantifies the uncertainty of the results in terms of the bias and confidence intervals (CI), considering the impact of the uncertainty associated with the nature of the sample, on the uncertainty of the estimated values of the indicator.
For this purpose, we conducted a seasonal re-sampling of the meteorological series (P for the SPI, B for the SPEI and R for the WCI). The process starts by creating 1,000 random subsets from the original series (observed, WFDEI or MSWEP), i.e. a random selection of the data of a given month (e.g. Jan 1980, Jan 1981, …, Jan 2010) that forms a new sample of the original size. Then, we calculated the indicators (SPI, SPEI or WCI) for each sample and associated their results with a probability distribution function (DESPI, DESPEI or DEWCI). Based on this distribution, the 90% confidence interval (CI) of the indicator was estimated.
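The re-sampling step can be illustrated with a bootstrap percentile interval in Python. This is a simplified sketch and not the procedure of Hu et al. [30]: a plain z-score stands in for the fitted indicator, the interval is taken directly from the sorted replicates by nearest rank, and the function name, seed and sample values are hypothetical.

```python
import random
from statistics import mean, stdev

def bootstrap_ci(sample, target, n_boot=1000, level=0.90, seed=0):
    """Percentile CI of the standardized anomaly of `target`.

    `sample` holds the values of one calendar month across all years;
    each replicate resamples it with replacement at the original size.
    """
    rng = random.Random(seed)
    n = len(sample)
    replicates = []
    for _ in range(n_boot):
        resampled = [sample[rng.randrange(n)] for _ in range(n)]
        spread = stdev(resampled)
        if spread == 0.0:
            continue  # degenerate replicate, the z-score is undefined
        replicates.append((target - mean(resampled)) / spread)
    if not replicates:
        raise ValueError("all bootstrap replicates were degenerate")
    replicates.sort()
    last = len(replicates) - 1
    lower = replicates[round((1.0 - level) / 2.0 * last)]
    upper = replicates[round((1.0 + level) / 2.0 * last)]
    return lower, upper

# Hypothetical 31-year sample of January precipitation (mm)
january = [float(80 + (i * 37) % 61) for i in range(31)]
ci_low, ci_high = bootstrap_ci(january, target=january[0])
```

Repeating this for every month, cell and data source gives the confidence bands against which the in-situ indicators are compared.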
Finally, we evaluated the CI through the containing ratio (CR) indicator [31], which is the ratio, expressed as a percentage, between the number of observed values within the limits of the 90% CI and the total length of the observed variable, as shown in eq. (2).
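The containing ratio follows directly from the verbal definition above and can be sketched in Python. Treating the interval limits as inclusive is an assumption, and the function name and sample values are hypothetical.

```python
def containing_ratio(observed, lower, upper):
    """Percentage of observed values inside their confidence interval."""
    inside = sum(1 for o, lo, hi in zip(observed, lower, upper) if lo <= o <= hi)
    return 100.0 * inside / len(observed)

# Hypothetical index values with their 90% CI limits
cr = containing_ratio(
    observed=[0.0, -1.2, 0.5, 2.0],
    lower=[-0.5, -1.0, 0.0, 1.0],
    upper=[0.5, 0.0, 1.0, 3.0],
)
# Three of the four values fall inside their limits, so cr is 75.0
```

A high CR alone does not imply good agreement, since very wide intervals contain most observations by construction.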
4. Results and discussion
4.1. Drought events
In this section, we discuss the results for the SPI and the SPEI indices in the MCRB at temporal scales of 1, 3, 12 and 24 months, as well as the annual results for the WCI. All of them were calculated first with the in-situ series, to identify the most important drought events that have affected the MCRB during the analysis period (1980-2010).
Ten events were identified as those with the largest incidence throughout the MCRB and correspond to the years 1980, 1982-1983, 1985-1986, 1988, 1990-1992, 1995, 1997, 2001-2002, 2007 and 2009-2010 (representative months for these events are shown in Fig.5). These events were selected with a spatial affectation threshold of 40%, that is, when 40% of the basin or more had moderate, severe or extreme drought conditions (SPI, SPEI ≤ -1.00), according to the national recommendations for drought evaluations [32].
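The spatial selection criterion can be sketched in Python as follows. The sketch assumes that all grid cells have equal area, a simplification for a 0.1° grid, and the function names and sample values are hypothetical.

```python
def drought_area_fraction(index_values, threshold=-1.0):
    """Fraction of valid cells with an index at or below the drought threshold."""
    valid = [v for v in index_values if v is not None]
    return sum(1 for v in valid if v <= threshold) / len(valid)

def is_basin_wide_event(index_values, area_threshold=0.40):
    """True when at least 40% of the cells are under drought conditions."""
    return drought_area_fraction(index_values) >= area_threshold

# Hypothetical SPI values for five cells in one month
affected = is_basin_wide_event([-1.5, -1.0, 0.2, 0.8, -2.1])  # 3 of 5 cells
```

Evaluating this test month by month over the gridded SPI and SPEI fields would flag the periods that qualify as basin-wide events.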
These results are consistent with the historical records of drought impacts on hydropower and irrigation, which affected reservoir levels and the general availability of water in the country. Hydroelectric production is a clear example. The 1992 energy crisis forced the national government to ration electricity through power cuts, and even to adopt daylight-saving time, a rare measure in equatorial countries [33]. Energy production also fell during 1997-1998, although electricity rationing was not needed.
Fig. 6 shows the SPI and SPEI indices compared with the Oceanic Niño Index (ONI) values [34], where it is clear that seven of the ten drought events identified correspond to a warm “El Niño” phase of the El Niño-Southern Oscillation (ENSO). In Fig. 6, the panels for the short accumulation periods (i.e. 1 and 3 months) show oscillating dry periods (SPI or SPEI below -1) that do not always coincide with an “El Niño” period or with positive values of the ONI, as in the case of the 1985 and 2001 events. This indicates that other climatic phenomena can generate droughts in the MCRB as strong as those associated with the ENSO.
Likewise, the 12- and 24-month panels in Fig. 6 indicate that periods with precipitation deficit are continuous, so that their effects aggregate into an annual trend that could affect river flows, reservoir levels, and even groundwater. At these scales, the periods with dry tendencies are generally consistent with an “El Niño” event. However, despite having no apparent relationship with the ENSO, the 1985 and 2001 events remain relevant at the 12-month scale. These droughts are usually associated with the natural climatic variability in the country and with the effects of a rapid change from the “El Niño” to the “La Niña” phase [35, 36].
Due to the annual scale of the WCI, it was necessary to average the monthly values of the ONI series. The results identify years with warm (“El Niño”) or cold (“La Niña”) trends depending on the average temperature of the Pacific Ocean. This limits the analysis of the WCI results and their comparison with meteorological drought indicators. Nevertheless, Fig. 7 shows a clear correspondence between the warm years (ONI > 0) and the increase in the value of the hydrological drought indicator averaged within the MCRB, mainly for the years 1990-1992, 1997, 2001-2002 and 2009.
4.2. Assessment of the reanalysis performance
We also made the calculations presented in section 4.1 with the two reanalysis datasets: WFDEI and MSWEP. The quantification of the efficiency of the reanalysis for reproducing the results of the indices calculated with the in-situ database is described below.
Through a cross-validation analysis between the distributed in-situ indices and the same indicators calculated with the two reanalysis datasets for the entire basin, we obtained maps of the Spearman correlation coefficient (ρ) and of the root-mean-square error (RMSE). However, only the average values are analyzed below.
The Spearman coefficient measures how well the relationship between two variables can be described by a monotonic function. For the two meteorological drought indicators, SPI and SPEI, the MSWEP shows a stronger correlation than the WFDEI (see Figs. 8 and 9) at all the temporal scales analyzed. None of the scales evaluated shows a correlation smaller than 0.5 for either of the two products. The largest correlations are mainly located in the south of the MCRB, in the mountain areas, and along the three branches of the Colombian Andes, which can affect the flows in the lower part of the basin. Likewise, the performance of both products decreases in the middle and lower parts of the basin, in the flat areas upstream of the mouth of the Magdalena River. This may be due to the spatial scale used.
Although both reanalysis datasets have errors with respect to the in-situ indicators, Figs. 10 and 11 show that the RMSE of the MSWEP (mean 0.61) is much lower than that of the WFDEI (mean 0.83). Moreover, the SPEI differences are slightly larger than the SPI ones, regardless of the reanalysis dataset used. This may be due to the choice of the equations for potential and actual evapotranspiration, and the regional scale of the analysis. Thus, it is recommended to perform further analyses for hydrological droughts with more equations for evapotranspiration and with other indices based on runoff and streamflow.
As noted in section 4.1, the annual scale of the WCI limits the comparison of its results with the meteorological drought indicators. Even so, the WCI identifies the years in which the effects of drought could affect the water supply, according to domestic demand, and Fig. 7 shows a clear correspondence between the warm years (ONI > 0) and the increase in the basin-averaged value of this indicator.
Likewise, in some years in Fig. 7, the occurrence of “El Niño” and the peaks of the WCI do not coincide. This can be associated with the annual scale of analysis and with the magnitude and duration of the event. For example, although the droughts of 1980 and 1988 affected more than 40% of the basin, they lasted less than three months, which is why the annual average does not reflect an extreme condition.
Regardless of the reanalysis dataset considered, the large differences identified in the WCI calculations are not equally found in the meteorological drought indicators. This is due to the order of magnitude of the WCI, since it is not standardized, and it magnifies the differences where the largest population is concentrated.
Thus, the analysis confirms that the WFDEI and MSWEP reanalysis products adequately identify the drought events in the MCRB. For the meteorological drought indicators, the performance of the products improves as the time scale of analysis increases, with an optimum close to the 12-month accumulation scale. In addition, the MSWEP correctly reproduces the temporal and spatial trends of droughts in the MCRB, with consistently higher correlations and lower mean errors than the WFDEI product. Similarly, for hydrological droughts, Fig. 12 depicts a better performance for the MSWEP, with mean values of the Spearman correlation (ρ) around 0.82 and a bias below 40%, in contrast with the values of 0.67 and 45% for the WFDEI.
Once the results were analyzed with the original inputs, we applied the uncertainty analysis procedure proposed by Hu et al. [30] to both the WFDEI and MSWEP products. The distributed time series of precipitation, water balance, and runoff were sampled 1,000 times, and the three drought indices were calculated with each of the bootstrap sets.
Fig. 13 shows the containing ratio (CR) maps for the nine drought indices evaluated in this study. The top panels present the results for the WFDEI and the lower panels, the indicators calculated with the MSWEP. These values represent the percentage of in-situ indicator values that fall within the 90% confidence interval (CI) limits of the reanalysis indices.
In general terms, for the SPI and SPEI, the CI calculated with the MSWEP contains about 15% more in-situ values than the WFDEI, with an average close to 45% of the values derived from observations. In the same way, the CR increases with the temporal accumulation scale for the two reanalysis databases, with an optimal value for the 12-month accumulation period. This occurs due to the typical periodic variability of precipitation and temperature in the MCRB.
Unlike meteorological drought indices, the WCI confidence intervals are highly variable, due to the differences in magnitude of populated and rural areas. With this condition, the WFDEI confidence bands contain about 73% of the in-situ values, having better performance in areas of low population density. Meanwhile, the MSWEP has a CR of 52%, maintaining the trend of better performance in rural areas.
Although the CR values are relatively high, the CI for the WCI are very wide. This hides the real divergence between the observed values and the prediction limits of the reanalysis. In rural areas, the CR for the WFDEI and the MSWEP is up to 45% and 47% higher with respect to the reference values, respectively, and in the main cities it is up to 81% and 91% higher.
5. Conclusions
In this paper, we evaluated two reanalysis datasets (WFDEI and MSWEP) for the study of meteorological and hydrological droughts in the Magdalena-Cauca basin, using three drought indicators: SPI, SPEI, and WCI.
Through the calculation of the SPI and the SPEI indices with the in-situ series, ten meteorological drought events of large incidence at the regional scale were identified in the MCRB. Apart from the events of 1985-1986, 1995 and 2001-2002, which are associated with extremes of seasonal dry spells in the country, all the other drought periods showed a strong Spearman correlation, close to -0.70, with the warm “El Niño” phase of the ENSO at a lag of 3 months.
According to the in-situ WCI results, water scarcity has no effect on the basin in terms of hydrological droughts. However, this indicator clearly shows the effects of high population density, which appear as hot spots in the main cities of the MCRB.
The WFDEI and MSWEP reanalysis databases show consistency in the evaluation of drought events in the MCRB. For the meteorological drought indicators, the performance of the datasets improves as the temporal accumulation increases. The 12-month scale represents the best results overall, due to the accumulation of meteorological conditions.
For the three indicators selected, the MSWEP reproduces the temporal and spatial trends, exhibiting higher correlations and lower mean errors than the WFDEI. For this reason, the MSWEP product is recommended here to further explore droughts not only in the MCRB, but also in other Colombian regions where in-situ data are limited.
Finally, we used the Bootstrap method to evaluate the impact of sampling uncertainty on the estimation of drought indicators. The results show that although the MSWEP and the WFDEI are different products, they follow the same trends for the uncertainty associated with the SPI and the SPEI estimation, which once again indicates the consistency of the results derived from reanalysis data.