### INTRODUCTION

The Amazonian rainforest is connected to the global hydrological cycle (Yoon & Zeng, 2010). Deforestation so as intense fires are factors affecting rainfall distribution and the Amazonian climate in general (Coe *et. al.*, 2013). Bonini *et. al.* (2014)
found negative correlations between deforestation and rainfall in Mato
Grosso. Some studies have shown a decrease in regional
evapotranspiration and precipitation due to increase in surface
temperature associated to large-scale deforestation (Sellers *et. al.*, 1997). This alters the hydrological cycle and influences the regional climate (Bagley *et. al.*, 2014). Debortoli *et. al.* (2015)
found that the rainy season was shorter at 88 % of the 200 observed
rain gauges. Rainfall occurrence at the Northeast and the Southern
regions of Brazil is influenced by the seasonal displacement of the
Intertropical Convergence Zone and the South Atlantic Convergence Zone (Marengo *et. al.*, 2011).

Rainfall
variability in the Amazonia is correlated with the sea surface
temperature patterns in the Atlantic and Pacific oceans during December,
January and February (Martins *et. al.*, 2015).
The reality is complex since past investigations have shown a decadal
intensification of rainfall over the whole Amazon basin (Chen *et. al.*, 2002).
The vulnerability of the region could be clear by the fact that
Amazonia experienced its worst drought events in 2005 and 2010 (Davidson *et. al.*, 2012). However, it has been hypothesized that a physical mechanism of forest-induced atmospheric circulation called biotic pump by Makarieva & Gorshkov (2007) was responsible of the Amazonian forests greened-up during that 2005 drought (Saleska *et. al.*, 2007).

Costa *et. al.* (2008) used geostatistical methods for analyzing precipitation extremes in Southern Portugal while Sarangui *et. al.* (2015)
mapped the rainfall variability over the island of St. Lucia. In
general, geostatistical analysis of climatic variables has been very
difficult to perform due, in part, to the limited availability of
measurement points which affects the estimation at ungauged sites (Sarangi *et. al.*, 2005). Webster & Oliver (1992)
demonstrated that at least 150 spatial points (e.g. rain gauges or
meteorological stations) are required for a reliable geostatistical
analysis. Once this problem has been solved, an appropriate kriging
estimator (Burgess & Webster, 1980)
could be a valuable tool for predicting rainfall data at ungauged sites.
The spatial variability of precipitation also controls the spatial
variation of other variables such as evapotranspiration and water
storage. Due to the environmental significance of the Amazonian basin,
the analysis of the spatial distribution of rainfall patterns is
essential for agricultural, engineering and ecological planning. The
objectives of the present work are to model the spatial structure of
rainfall patterns in the Brazilian rainforest using three timescales and
to search for any statistical relationship between deforestation and
rainfall occurrence.

### MATERIALS AND METHODS

#### Study area and rainfall data

The present study spanned 64 municipalities corresponding to 9 Amazonian states from Brazil. The geographical zone covered approximately from 5.03 ºN to 16.45 ºS and 42.83 to 70.93 ºW. Rainfall data refer to historical monthly mean and annual records collected from 218 rain gauges (Figure 1).

Figure 1.

Study region and rain gauge locations.

The records ranged from the last 16 years for Rondonopolis (Mato Grosso state) to the last 106 years for Manaus (Amazonas state). The last date for data collection was December 2015. Three timescales were considered: individual months with minimum (September) and maximum (March) rainfall, January-June (rainy season), July-December (dry season) and annual rainfall. Note that January-June and July-December overlap winter and summer periods. However, our main goal was to establish a 6-month timescale separating wet and dry seasons.

#### Deforestation data

We
performed an analysis on the potential relationship between the
deforestation index and the annual rainfall including all the Amazon
states. Most investigations try to search for relationships between
deforested areas (km^{2}) and rainfall. From our point of view
that approach could be fictitious in this case as different states have
different spatial extensions. We used instead the percentage of forest
loss which relates forest loss (ha) to the total forest area (ha) for a
particular site or state. The forest loss percentages are available at http://www.mongabay.com for 2001-2012 period (Butler, 2014).
With that proposal, we averaged the available historical annual
rainfall record for each studied Amazonian state and correlated them
with the corresponding forest loss index.

#### Theoretical background on Geostatistics

Geostatistics incorporates the spatial coordinates of observations $\left(x,y\right)$ in data processing. It is fundamental for the modeling of spatial patterns, prediction at unsampled points and estimation of the uncertainty inherent to those predictions (Goovaerts, 1998). The semivariance function, $\gamma \left(h\right)$ , is one of the key components of geostatistics as it summarizes the spatial variation of the studied variable.

Where $\gamma \left(h\right)$ is the value of the experimental semivariance at each separation distance (lag), h, N(h) is the number of data pairs separated by the lag $h$ , $z\left({x}_{i}\right)$ and $z\left({x}_{i}+h\right)$ are the rainfall values at location ${x}_{i}$ and ${x}_{i}+h$ , respectively. A plot of ((h) versus $h$ defines the experimental semivariogram. Different models are available for fitting the experimental semivariogram (see for instance McBratney & Webster, 1986). We explored three transitive models (spherical, exponential and Gaussian models).

Equation (2) represents the spherical isotropic model. In this case C_{0} is the nugget variance or nugget effect (
${C}_{0}\ge 0$
),
$C$
is the structural variance (
$C\ge {C}_{0}$
),
${C}_{0}+C$
is the upper limit of the semivariogram (sill) and A_{0}
is the distance at which semivariogram reaches a constant semivariance
value (correlation range parameter). The Spatial Dependence Degree Index
(
$SDDI$
) can be computed as the ratio between the structural variance,
$C$
, which characterizes the variance accounted for by the spatial dependence and the sill,
${C}_{0}+C$
. For example,
$SDDI<0.25$
suggests weak spatial dependence,
$0.25<SDDI<0.75$
corresponds to moderate spatial dependence while
$SDDI>0.75$
indicates strong spatial dependence. This is a modification to the ratio nugget/sill variance proposed by Cambardella *et. al.* (1994). The range is the limit of spatial dependence. The spherical model characterizes a moving average of a randomized process (Kuzyakova *et. al.* 2001).

Equation (3) is the isotropic
exponential model. It differs from the spherical model in the rate at
which the sill is reached. It represents autoregressive processes of
first order (e.g. Markov or Poisson processes) as the autocorrelation
function decays exponentially (Wang *et. al.* 2010).

Equation (4) corresponds to the Gaussian or hyperbolic isotropic model. It is almost similar to the exponential model but the departure from the nugget variance is smoother.

#### Block Kriging

Block kriging is a robust interpolation method for estimating values of the variable (e.g. rainfall values in this case) at ungauged zones (Burguess & Webster, 1980). The block kriging estimator ( ${Z}_{S}$ ) of the average rainfall on a given zone ( $S$ ) is built as a linear combination of the available rainfall data:

Where ${Z}_{S}^{*}$ is the estimated kriged value of $Z$ in the area $S$ and ${}_{i}$ are weighing factors such that:

In this case $n$ is the number of rainfall values in the vicinity of ${Z}_{S}^{*}$ . The block kriging estimation is unbiased under the condition:

That is, the mathematical expectation of ${Z}_{S}^{*}$ is the same as that of the real ${Z}_{S}$ value.

The accuracy of rainfall maps was assessed through a cross-validation process (Davis, 1987). The agreement between estimated and measured values was quantified using mean absolute error ( $MAE$ hereafter) (Voltz & Webster, 1990) and goodness-of-prediction statistics ( $G$ statistics hereafter) (Agterberg, 1984). The MAE statistics is in fact a residual sum:

Where $\widehat{z}\left({x}_{i}\right)$ is the predicted rainfall value at location $i$ .

The $G$ statistics is a sort of balance between kriging and sample mean as potential rainfall predictors:

Where $z\left(mean\right)$ is the mean value of all the observations.

Equation (9) defines three possibilities:

$G=0$ indicates that kriged values or sample mean could be used as rainfall predictors,

$G<0$ suggests that sample mean is a better predictor than kriging,

$G>0$ indicates that kriging is more appropriate than sample mean for making reliable predictions at ungauged sites.

It is useful to recall that due to earth curvature the use of geographical coordinates for large scale spatial analysis is inappropriate. Thus, in order to perform the spatial analysis, geographical coordinates were transformed into rectangular $\left(x,y\right)$ coordinates.

#### Standard statistics and Geostatistical computation

Classical
statistics (e.g. first and second order statistics and test of
normality) and linear regression analysis were conducted using the
Statistica^{TM} Software Package (StatSoft. Inc., 2011). All the geostatistical analyses were performed using the GS+ Geostatistical Software Package (Gamma Design Software, 2001). In particular, we set the minimum lag class distance to 247.25 km. Block kriging estimation was carried out using
$4x4$
local grids and 16 neighbors within a radius equal to the
range of the semivariogram. We selected a total of six map contour
levels.

### RESULTS AND DISCUSSION

March and September were the months with maximal and minimum rainfall values, respectively. Those months also showed the maximum and minimum standard deviation values (117 mm and 60 mm, respectively) (Table 1).

Table 1.

Descriptive statistics of monthly, six month and annual rainfall. KSL is Kolmogorov-Smirnov and Lilliefors test for normality. NS means that Rainfall distribution does not differ statistically from normality at $\text{p}<0.05$ .

The spatial distribution of rainfall based on upper thresholds is highly variable in the whole region (Figure 2A-E). The maximal historical annual rainfall (4253 mm) as represented by the upper threshold was located in Fonte Boa (Amazonian state, 2.35 ºS, 65.12 ºW) while the minimum upper threshold of annual rainfall (1980 mm) was recorded at Caceres (Mato Grosso state, 16.05 ºS, 57.68 ºW) (Figure 2E).