Original Article


Spatial structure of precipitation in the Brazilian Amazonia: geostatistics with block kriging


The Geostatistical analysis is an important tool for identifying distances over which rainfall shows spatial correlation. The objectives of the present work are: to model the spatial structure of rainfall patterns in the Brazilian rainforest using three timescales and to assess kriging interpolation for predicting rainfall data at ungauged sites. We used Geostatistical modeling combined with block kriging. The rainfall data were collected from 218 rain gauges corresponding to 64 municipalities located within 9 Amazonia states. Data corresponded to monthly mean rainfall and annual rainfall computed from historical records. The last date for data collection was December 2015. Three timescales were considered: months with minimum and maximum rainfall, January-June and July-December time periods and annual rainfall. The spherical model fitted reasonably well the semivariograms corresponding to March and January-June, the Gaussian model fitted quite well the semivariograms corresponding to September and July-December periods while the exponential model fitted the annual rainfall. The cross-validation analyses based on Mean Absolute Error and goodness-of-prediction statistics showed that kriged values (kriging maps) could be better predictors of rainfall for areas without rain gauges (unsampled zones) than mean rainfall values computed from adjacent sampled areas. It was found a significant negative statistical relationship between forest loss and rainfall occurrence which confirms the influence of deforestation on the hydrology of the Amazon basin.

Key words: 

Amazonian Rainforest; Geostatistics; Spatial Variability; Hydrology.


The Amazonian rainforest is connected to the global hydrological cycle (Yoon & Zeng, 2010). Deforestation so as intense fires are factors affecting rainfall distribution and the Amazonian climate in general (Coe et. al., 2013). Bonini et. al. (2014) found negative correlations between deforestation and rainfall in Mato Grosso. Some studies have shown a decrease in regional evapotranspiration and precipitation due to increase in surface temperature associated to large-scale deforestation (Sellers et. al., 1997). This alters the hydrological cycle and influences the regional climate (Bagley et. al., 2014). Debortoli et. al. (2015) found that the rainy season was shorter at 88 % of the 200 observed rain gauges. Rainfall occurrence at the Northeast and the Southern regions of Brazil is influenced by the seasonal displacement of the Intertropical Convergence Zone and the South Atlantic Convergence Zone (Marengo et. al., 2011).

Rainfall variability in the Amazonia is correlated with the sea surface temperature patterns in the Atlantic and Pacific oceans during December, January and February (Martins et. al., 2015). The reality is complex since past investigations have shown a decadal intensification of rainfall over the whole Amazon basin (Chen et. al., 2002). The vulnerability of the region could be clear by the fact that Amazonia experienced its worst drought events in 2005 and 2010 (Davidson et. al., 2012). However, it has been hypothesized that a physical mechanism of forest-induced atmospheric circulation called biotic pump by Makarieva & Gorshkov (2007) was responsible of the Amazonian forests greened-up during that 2005 drought (Saleska et. al., 2007).

Costa et. al. (2008) used geostatistical methods for analyzing precipitation extremes in Southern Portugal while Sarangui et. al. (2015) mapped the rainfall variability over the island of St. Lucia. In general, geostatistical analysis of climatic variables has been very difficult to perform due, in part, to the limited availability of measurement points which affects the estimation at ungauged sites (Sarangi et. al., 2005). Webster & Oliver (1992) demonstrated that at least 150 spatial points (e.g. rain gauges or meteorological stations) are required for a reliable geostatistical analysis. Once this problem has been solved, an appropriate kriging estimator (Burgess & Webster, 1980) could be a valuable tool for predicting rainfall data at ungauged sites. The spatial variability of precipitation also controls the spatial variation of other variables such as evapotranspiration and water storage. Due to the environmental significance of the Amazonian basin, the analysis of the spatial distribution of rainfall patterns is essential for agricultural, engineering and ecological planning. The objectives of the present work are to model the spatial structure of rainfall patterns in the Brazilian rainforest using three timescales and to search for any statistical relationship between deforestation and rainfall occurrence.

The present study spanned 64 municipalities corresponding to 9 Amazonian states from Brazil. The geographical zone covered approximately from 5.03 ºN to 16.45 ºS and 42.83 to 70.93 ºW. Rainfall data refer to historical monthly mean and annual records collected from 218 rain gauges (Figure 1).

Figure 1. 

Study region and rain gauge locations.

The records ranged from the last 16 years for Rondonopolis (Mato Grosso state) to the last 106 years for Manaus (Amazonas state). The last date for data collection was December 2015. Three timescales were considered: individual months with minimum (September) and maximum (March) rainfall, January-June (rainy season), July-December (dry season) and annual rainfall. Note that January-June and July-December overlap winter and summer periods. However, our main goal was to establish a 6-month timescale separating wet and dry seasons.

We performed an analysis on the potential relationship between the deforestation index and the annual rainfall including all the Amazon states. Most investigations try to search for relationships between deforested areas (km2) and rainfall. From our point of view that approach could be fictitious in this case as different states have different spatial extensions. We used instead the percentage of forest loss which relates forest loss (ha) to the total forest area (ha) for a particular site or state. The forest loss percentages are available at http://www.mongabay.com for 2001-2012 period (Butler, 2014). With that proposal, we averaged the available historical annual rainfall record for each studied Amazonian state and correlated them with the corresponding forest loss index.

Geostatistics incorporates the spatial coordinates of observations x,y in data processing. It is fundamental for the modeling of spatial patterns, prediction at unsampled points and estimation of the uncertainty inherent to those predictions (Goovaerts, 1998). The semivariance function, γh , is one of the key components of geostatistics as it summarizes the spatial variation of the studied variable.

Where γh is the value of the experimental semivariance at each separation distance (lag), h, N(h) is the number of data pairs separated by the lag h , zxi and zxi+h are the rainfall values at location xi and xi+h , respectively. A plot of ((h) versus h defines the experimental semivariogram. Different models are available for fitting the experimental semivariogram (see for instance McBratney & Webster, 1986). We explored three transitive models (spherical, exponential and Gaussian models).

Equation (2) represents the spherical isotropic model. In this case C0 is the nugget variance or nugget effect ( C00 ), C is the structural variance ( CC0 ), C0+C is the upper limit of the semivariogram (sill) and A0 is the distance at which semivariogram reaches a constant semivariance value (correlation range parameter). The Spatial Dependence Degree Index ( SDDI ) can be computed as the ratio between the structural variance, C , which characterizes the variance accounted for by the spatial dependence and the sill, C0+C . For example, SDDI<0.25 suggests weak spatial dependence, 0.25<SDDI<0.75 corresponds to moderate spatial dependence while SDDI>0.75 indicates strong spatial dependence. This is a modification to the ratio nugget/sill variance proposed by Cambardella et. al. (1994). The range is the limit of spatial dependence. The spherical model characterizes a moving average of a randomized process (Kuzyakova et. al. 2001).

Equation (3) is the isotropic exponential model. It differs from the spherical model in the rate at which the sill is reached. It represents autoregressive processes of first order (e.g. Markov or Poisson processes) as the autocorrelation function decays exponentially (Wang et. al. 2010).

Equation (4) corresponds to the Gaussian or hyperbolic isotropic model. It is almost similar to the exponential model but the departure from the nugget variance is smoother.

Block kriging is a robust interpolation method for estimating values of the variable (e.g. rainfall values in this case) at ungauged zones (Burguess & Webster, 1980). The block kriging estimator ( ZS ) of the average rainfall on a given zone ( S ) is built as a linear combination of the available rainfall data:

Where ZS* is the estimated kriged value of Z in the area S and i are weighing factors such that:

In this case n is the number of rainfall values in the vicinity of ZS* . The block kriging estimation is unbiased under the condition:

That is, the mathematical expectation of ZS* is the same as that of the real ZS value.

The accuracy of rainfall maps was assessed through a cross-validation process (Davis, 1987). The agreement between estimated and measured values was quantified using mean absolute error ( MAE hereafter) (Voltz & Webster, 1990) and goodness-of-prediction statistics ( G statistics hereafter) (Agterberg, 1984). The MAE statistics is in fact a residual sum:

Where z^xi is the predicted rainfall value at location i .

The G statistics is a sort of balance between kriging and sample mean as potential rainfall predictors:

Where zmean is the mean value of all the observations.

Equation (9) defines three possibilities:

  1. G=0 indicates that kriged values or sample mean could be used as rainfall predictors,

  2. G<0 suggests that sample mean is a better predictor than kriging,

  3. G>0 indicates that kriging is more appropriate than sample mean for making reliable predictions at ungauged sites.

It is useful to recall that due to earth curvature the use of geographical coordinates for large scale spatial analysis is inappropriate. Thus, in order to perform the spatial analysis, geographical coordinates were transformed into rectangular x,y coordinates.

Classical statistics (e.g. first and second order statistics and test of normality) and linear regression analysis were conducted using the StatisticaTM Software Package (StatSoft. Inc., 2011). All the geostatistical analyses were performed using the GS+ Geostatistical Software Package (Gamma Design Software, 2001). In particular, we set the minimum lag class distance to 247.25 km. Block kriging estimation was carried out using 4 x 4 local grids and 16 neighbors within a radius equal to the range of the semivariogram. We selected a total of six map contour levels.

March and September were the months with maximal and minimum rainfall values, respectively. Those months also showed the maximum and minimum standard deviation values (117 mm and 60 mm, respectively) (Table 1).

Table 1. 

Descriptive statistics of monthly, six month and annual rainfall. KSL is Kolmogorov-Smirnov and Lilliefors test for normality. NS means that Rainfall distribution does not differ statistically from normality at p<0.05 .

MonthMean (mm)Min. (mm)Max. (mm)Std.DevCV (%)KSL

The spatial distribution of rainfall based on upper thresholds is highly variable in the whole region (Figure 2A-E). The maximal historical annual rainfall (4253 mm) as represented by the upper threshold was located in Fonte Boa (Amazonian state, 2.35 ºS, 65.12 ºW) while the minimum upper threshold of annual rainfall (1980 mm) was recorded at Caceres (Mato Grosso state, 16.05 ºS, 57.68 ºW) (Figure 2E).