It is important to perform hydrological forecasts using finite hydrological time series. Most time series analysis approaches presume a data series to be ergodic without justifying this assumption. This paper presents a practical approach to analyzing the mean ergodic property of hydrological processes by means of autocorrelation function evaluation and the Augmented Dickey Fuller test, a radial basis function neural network, and the definition of mean ergodicity. The mean ergodicity of precipitation processes at the Lanzhou Rain Gauge Station in the Yellow River basin and the Ankang Rain Gauge Station in the Han River basin, both in China, and at Newberry, MI, USA, is analyzed using the proposed approach. The results indicate that the precipitation of March, July, and August in Lanzhou, and of May, June, and August in Ankang, has mean ergodicity, whereas the precipitation of any other calendar month at these two rain gauge stations does not. The precipitation of February, May, July, and December in Newberry shows the ergodic property, although the precipitation of each month shows a clear increasing or decreasing trend.
A hydrological process can usually be regarded as a stochastic process, and any observation is just a realization of a random variable representing the stochastic process. A realization of a stochastic process is defined as the outcome of an experiment in which the process is observed (Shahin et al., 1993). For example, a time series of observed precipitation data at a gauge station is a realization of the precipitation process over the area that the gauge station covers. The collection of all possible realizations of a stochastic process, i.e. the ensemble, is used to represent the process.
Given that the variations of a hydrological variable representing a certain
hydrological process are usually very complicated and affected by random
factors, statistical properties of the process, such as the phase mean
function of a data series
To date, only limited discussions about the application of time series ergodicity (Domowitz and El-Gamal, 2001; Morvai and Weiss, 2005) have been reported. Most studies of time series applications, such as in the fields of hydrology, hydrodynamics, and noise (Jiang and Zheng, 2005; Oliveira et al., 2006; Veneziano and Tabaei, 2004), discuss statistical characteristics simply by assuming that the time series is ergodic, without justifying this assumption with a rigorous approach. There have been a few discussions concerning ergodicity in the field of hydrological research. Liu (1998) assumed that ergodicity exists between the spatial distribution and the temporal propagation of hydrological factors of a water exchange system, i.e. these processes are restricted by ergodicity. Xia (2005) used power-weighted Markov chains to predict “plum rain” intensity (an East Asian rainy season usually lasting from June to July) and concluded that this process has ergodicity. In general, the ergodicity of time series refers to the ergodicity of stationary processes, which means that the process averaged over time behaves identically to the process averaged over space.
To date, dedicated studies on ergodic property analysis for hydrological processes have not been performed. However, the study of ergodicity itself is not only significant but also indispensable because it is a fundamental presumption for many time series problems (Ding and Deng, 1988; Fiori and Janković, 2005; Hsu, 2003; Liu, 1998; Mitosek, 2000; Wang et al., 2004). This study proposes a practical approach for mean ergodic property analysis using the autocorrelation function (ACF) or the Augmented Dickey Fuller (ADF) test and a radial basis function (RBF) neural network. The terms ergodic, ergodic property, and ergodicity are used mainly in mathematical physics, e.g. dynamics, and in the theory of stationary stochastic processes. This study focuses on ergodicity analysis for stationary stochastic processes, which are commonly applied in hydrology.
A process is said to be ergodic if its statistical properties (such as its
mean and variance) can be deduced from a single, sufficiently long sample
(realization) of the process. A stochastic process shows ergodicity when its
mean and covariance functions are ergodic, i.e. mean ergodicity and
covariance ergodicity. Since the ergodicity of the covariance function, which
usually relates to fourth-order moments of the process, is difficult to
verify, only the mean ergodic property of the process (or sequence) is
discussed in this paper. For a given stochastic sequence
It has been proved that only stationary processes can have ergodicity (Davis et al., 1994). Stationarity implies that the statistical parameters of the series computed from different samples do not change except due to sampling variations. A time series is said to be strictly stationary if its statistical properties do not vary with changes of time origin. A less strict type of stationarity is called weak stationarity or second-order stationarity, where the first- and second-order moments depend only on time differences (Chen and Rao, 2002). In nature, strictly stationary time series do not exist, and weakly stationary time series are in practice treated as stationary time series. In addition to stationarity, another necessary condition for ergodicity analysis is that the samples from the single realization should be taken over a sufficiently long period of time.
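The role of record length can be illustrated with a small numerical sketch. The code below is not the procedure of this paper; it only assumes a synthetic stationary series (independent Gaussian noise around a hypothetical mean of 50) and shows that the sequence of sample means settles down as more of the realization is used. The function names and the window sizes are illustrative choices.

```python
import random
import statistics


def running_means(series):
    """Return the sample means M_n = (x_1 + ... + x_n) / n for n = 1..N."""
    means, total = [], 0.0
    for n, x in enumerate(series, start=1):
        total += x
        means.append(total / n)
    return means


rng = random.Random(0)
# Synthetic stationary series (assumption): i.i.d. noise around a mean of 50.
series = [50.0 + rng.gauss(0.0, 10.0) for _ in range(5000)]
means = running_means(series)

# Spread of the sample means over an early and a late window of 100 values.
early_spread = statistics.pvariance(means[50:150])
late_spread = statistics.pvariance(means[-100:])

print(f"final sample mean: {means[-1]:.3f}")
print(f"variance of sample means, early window: {early_spread:.6f}")
print(f"variance of sample means, late window:  {late_spread:.6f}")
```

For a mean-ergodic series the late-window spread should be much smaller than the early-window spread, which is why a short record can give a misleading picture of convergence.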
Currently there are no statistical tests designed specifically for ergodic
property analysis; we therefore perform the mean ergodicity analysis based
on its definition and demonstrate a practical approach with a series of case
studies using monthly precipitation data series collected from two rain
gauge stations located in China and one in the US. Whereas the
definition of mean ergodicity is simple and straightforward, the practical
analysis of mean ergodicity can be complicated. As discussed in the previous
section, a stochastic process is not ergodic unless it is stationary;
a stationarity test for the data series representing a stochastic process is
therefore a necessary prerequisite for further ergodicity analysis. Another
challenge lies in the fact that the infinitely long data series required by
the definition of mean ergodicity cannot be achieved in reality. This
challenge can be overcome by extending the data series using approaches such
as a reliable artificial intelligence approach. We propose to solve this
challenge by predicting the
Given that the commonly used statistical inference is no longer valid for a non-stationary data series, it is necessary to examine the stationarity of a data series. The standard method for stationarity testing is a unit root test, i.e. a time series is non-stationary if it has a unit root. The stationarity of a stochastic process is determined by the roots of its characteristic equation. If all the characteristic roots lie outside the unit circle, the process is stationary, whereas the process is non-stationary if one or more roots lie on or within the unit circle. A characteristic root with a value of one is called a unit root. The Dickey Fuller (DF) test and the Augmented Dickey Fuller (ADF) test are two commonly used unit root tests. The ADF test, an extension of the DF test, eliminates the autocorrelation of the residuals by adding lagged terms of the variable to the test regression.
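As a rough illustration of the unit root idea, the following sketch computes the plain DF statistic (without the augmentation lags of the ADF test) from the regression of the first difference on the lagged level with a constant. It is a simplified stand-in, not the implementation used in this study; the function names are hypothetical, the test series are synthetic, and the critical value of -2.86 is the asymptotic 5% value for the regression with a constant, used here only as an approximate threshold.

```python
import math
import random


def dickey_fuller_t(series):
    """t-statistic of rho in the OLS fit  dx_t = alpha + rho * x_{t-1} + e_t."""
    x_lag = series[:-1]
    dx = [b - a for a, b in zip(series[:-1], series[1:])]
    n = len(dx)
    mean_x = sum(x_lag) / n
    mean_d = sum(dx) / n
    sxx = sum((x - mean_x) ** 2 for x in x_lag)
    sxy = sum((x - mean_x) * (d - mean_d) for x, d in zip(x_lag, dx))
    rho = sxy / sxx
    alpha = mean_d - rho * mean_x
    rss = sum((d - alpha - rho * x) ** 2 for x, d in zip(x_lag, dx))
    se_rho = math.sqrt(rss / (n - 2) / sxx)  # standard error of rho
    return rho / se_rho


# Approximate (asymptotic) 5% critical value, regression with a constant.
CRITICAL_5PCT = -2.86


def rejects_unit_root(series):
    """True if the DF statistic falls below the 5% critical value."""
    return dickey_fuller_t(series) < CRITICAL_5PCT


rng = random.Random(42)
white_noise = [rng.gauss(0.0, 1.0) for _ in range(500)]  # stationary

random_walk, level = [], 0.0  # has a unit root by construction
for _ in range(500):
    level += rng.gauss(0.0, 1.0)
    random_walk.append(level)

print("white noise  t =", round(dickey_fuller_t(white_noise), 2))
print("random walk  t =", round(dickey_fuller_t(random_walk), 2))
```

A strongly negative statistic, as expected for the white noise series, leads to rejecting the unit root; a statistic above the critical value means the unit root cannot be rejected. For real data, an established ADF implementation with appropriate lag selection would be preferable to this sketch.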
Without loss of generality, the DF test can be illustrated with a simple AR(1)
process. Consider a stochastic process,
Since the
The original data series
The following procedure is then proposed for
a practical analysis of the ergodicity of a data series: (i) perform the stationarity
analysis for the data series by evaluating its autocorrelation function.
A data series has no ergodicity unless it is stationary. (ii) Calculate the
sample mean value series
Ergodicity analysis is performed for the monthly precipitation data series of three sites to demonstrate the proposed approach: Lanzhou in Gansu Province and Ankang in Shaanxi Province, China, and Newberry, Michigan, USA.
A mean ergodicity analysis is performed for each individual monthly
precipitation data series of the 121-year precipitation data (1893–2013)
collected from NOAA Newberry Correctional Facility, MI, USA
(46.35
The statistics of each individual monthly precipitation series of Newberry
are given in Table 1. The ACF plots, as
shown in Fig. 1, and the ADF test indicate that
all the 12 individual monthly precipitation data series at Newberry are
stationary. The
Fifty years (1951–2000) of monthly precipitation data are collected from
Lanzhou Rain Gauge Station (103.70
Similarly, the ergodicity analysis is performed for the monthly mean
precipitation data series of each calendar month for the Ankang rain gauge
station. Seventy years (1929–1998) of precipitation data from Ankang Rain
Gauge Station (109.03
The coefficient of variation (CV) has been widely used to measure the dispersion of a data series. The more concentrated the distribution of a random variable is, the more obvious is its regularity, and vice versa. The coefficients of variation for each monthly precipitation data series of the Lanzhou rain gauge station, the Ankang rain gauge station, and Newberry are calculated, as shown in Table 1. As all the monthly data series are stationary in our study, we synthesize a set of stationary and non-stationary data series by clustering the monthly data series of the Ankang station into four classes, as shown in Table 2, in order to investigate stationarity and ergodicity simultaneously. The stationarity test and mean ergodicity analysis are then performed on the clustered data using the proposed methodology. The stationarity analysis, which evaluates the autocorrelation of each group of data series, shows that the (Ankang: January, February, March, November, December) series is non-stationary, whereas the other three groups of data series are stationary. The coefficients of variation of the clustered Ankang data series that are stationary are smaller than that of the non-stationary clustered series. This indicates that monthly precipitation with a small coefficient of variation has a greater tendency to be stationary, or more regular.
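For reference, the dispersion measure discussed above is the ratio of the standard deviation to the mean. The short sketch below assumes the sample standard deviation is used (the paper does not state which estimator was applied), and the precipitation values are hypothetical numbers chosen only for illustration.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (dimensionless)."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("CV is undefined for a zero mean")
    return statistics.stdev(values) / mean


# Hypothetical monthly precipitation totals in mm (not data from this study).
sample = [12.0, 30.5, 8.2, 21.7, 15.4, 40.1, 5.9, 18.3]
cv = coefficient_of_variation(sample)
print(f"CV = {cv:.3f}")
```

Because the CV is dimensionless, it allows series with different mean precipitation, such as different months or stations, to be compared on a common scale.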
Although a small coefficient of variation gives a data series a greater tendency to be stationary, it does not give a clear indication of the ergodicity of the data series. The coefficient of variation of a precipitation time series with ergodicity is not necessarily smaller than that of one without ergodicity. Among the monthly precipitation data series of Newberry, the September series has the smallest coefficient of variation, but it does not have the ergodic property. Moreover, although the coefficients of variation of Newberry (September), Lanzhou (September), and Ankang (April) are smaller than those of Newberry (July), Lanzhou (August), and Ankang (June), respectively, the latter have ergodicity while the former do not. Furthermore, an ergodic process is stationary, while the converse may not be true. The stationarity of a data series is a prerequisite of its ergodicity rather than a guarantee of it; there are stationary processes which are not ergodic. In other words, a process with ergodicity is necessarily stationary in the strict sense. Therefore, neither the coefficient of variation nor the stationarity test, which is commonly performed in time series analysis, can replace the ergodicity analysis when one needs to make sure that the statistics of a realization, such as the mean, can be safely used as those of its population.
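A textbook example of a stationary but non-ergodic process may make this distinction concrete. In the sketch below, which is an illustration and not part of the case studies, each realization is a random constant level plus noise. The level is drawn once per realization, so the process is stationary, yet the time average of any single realization converges to that realization's own level rather than to the ensemble mean of zero. All parameter values are arbitrary assumptions.

```python
import random
import statistics


def realization(rng, n, level_sd=5.0, noise_sd=1.0):
    """One realization of X_t = A + e_t, with A drawn once per realization."""
    level = rng.gauss(0.0, level_sd)
    return level, [level + rng.gauss(0.0, noise_sd) for _ in range(n)]


rng = random.Random(1)
results = [realization(rng, 2000) for _ in range(200)]

levels = [lvl for lvl, _ in results]
time_means = [statistics.fmean(xs) for _, xs in results]

# Each time average tracks its own level closely ...
max_gap = max(abs(m - l) for m, l in zip(time_means, levels))
# ... but the time averages differ widely between realizations.
spread = statistics.pstdev(time_means)

print(f"largest |time mean - level|: {max_gap:.3f}")
print(f"std. dev. of time means across realizations: {spread:.3f}")
```

The time averages do not agree across realizations, so no single record, however long, recovers the ensemble mean. A stationarity test alone would not reveal this.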
Comparing the ergodicity of the May and June precipitation of Ankang, as analyzed from the individual monthly data series and from the clustered data series, indicates that the ergodic property can change when new data are introduced. The analysis of May precipitation in Ankang shows that it has ergodicity, whereas the analysis using the clustered data series Ankang (May, June, September) indicates that this data series does not have the mean ergodic property. A possible explanation is that the Ankang (May) monthly precipitation may become non-ergodic when new data, for example Ankang (June, September), are introduced. These data can be considered as new observations of Ankang (May), and the variation between Ankang (May) and Ankang (June, September) could be owing to changes in the natural or man-made factors that affect precipitation processes.
A linear trend analysis is also performed following Vamos and Craciun (2012) for some of the ergodic monthly precipitation data series of the three rain gauge stations, as shown in Fig. 4. The August precipitation shows no obvious periodicity at either the Lanzhou or the Ankang station. The August precipitation at the Lanzhou rain gauge station shows an overall decreasing trend, while the August precipitation at the Ankang rain gauge station remains almost stable around its mean value. In Newberry, only the precipitation of May shows a relatively stable trend; the precipitation of February has had a clear decreasing trend since the 1970s, and the precipitation of both July and December has an increasing trend. However, according to the ergodicity analysis, the precipitation of the months having ergodicity at the three rain gauge stations will, in the long run, fluctuate around its mean value rather than keep varying as shown in Fig. 4.
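A linear trend of this kind can be estimated in several ways. The sketch below uses an ordinary least-squares fit against the time index as a simple stand-in; it is not claimed to reproduce the method of Vamos and Craciun (2012), and the demonstration series is synthetic.

```python
def linear_trend(values):
    """Least-squares slope and intercept of values against t = 0, 1, ..., n-1."""
    n = len(values)
    t_mean = (n - 1) / 2
    y_mean = sum(values) / n
    stt = sum((t - t_mean) ** 2 for t in range(n))
    sty = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    slope = sty / stt
    intercept = y_mean - slope * t_mean
    return slope, intercept


# Synthetic series with a known trend (assumption): y = 3 + 0.5 * t.
demo = [3.0 + 0.5 * t for t in range(10)]
slope, intercept = linear_trend(demo)
print(f"slope = {slope:.3f} per step, intercept = {intercept:.3f}")
```

A positive slope corresponds to an increasing trend and a negative slope to a decreasing one. A fitted slope over a finite record does not by itself contradict mean ergodicity, which concerns the long-run behavior of the sample mean.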
In this study, the 50
Ergodic property analysis for hydrological processes is difficult but worthy of discussion. One may argue that whether or not a data series representing a hydrological process is ergodic does not actually affect the practical analysis of this process, and that the test of ergodicity can therefore be neglected entirely. Some researchers (Duan and Goldys, 2001; Koutsoyiannis, 2005; Liu, 1998), however, have pointed out that hydrological processes may have ergodic properties, although no specific ergodicity analysis was performed in these works. This study presents a practical approach to analyzing the mean ergodicity of hydrological processes, which bridges the concept of ergodicity and its application in hydrological process analysis. The approach consists primarily of three parts: the stationarity test of the data series through its ACF or the ADF test, which avoids the difficulty of analyzing the stationarity of the data series directly from its definition; the extension of the length of the data series, via the RBF network in this study; and the ergodicity analysis based on the sample mean sequence and its variance series. Three case studies, the ergodicity analysis of the monthly precipitation of Lanzhou in the Yellow River basin of China, Ankang in the Han River basin of China, and Newberry, MI, USA, are conducted using the proposed approach.
Our research reveals that the precipitation of March, July, and August in Lanzhou, and of May, June, and August in Ankang, has ergodicity; therefore, stochastic and statistical analyses of the precipitation of these months based on the observations (sample) at these two stations are expected to be more reliable than analyses of the precipitation of any other calendar month at the two stations. The ergodicity analysis of the precipitation data series of each individual month in Newberry, MI, USA, which has a relatively long observation history, indicates that the precipitation of February, May, July, and December shows the ergodic property, although not all of these monthly precipitation series show a tendency to converge to their mean values.
This study focuses mainly on mean ergodicity analysis; approaches to the covariance ergodicity analysis of hydrological processes need to be developed in the future and would provide more useful information. In addition, as discussed, the application of ergodicity still seems controversial, although its concept and properties have been commonly applied in hydrology by presuming that hydrological processes are automatically ergodic. More discussion of, and methodologies for, ergodicity analysis would help bridge the gap between its concept and application.
The paper is supported by the National Science and Technology Support Projects (Grant No. 2006BAB04A08).
Statistics of monthly precipitation data of each case study site.
System clustering of the precipitation series at Ankang, China.
ACF of monthly precipitation data series of Newberry, MI, USA.
Extended
The precipitation data (dotted in figures) of some ergodic months and their trend (straight lines) in Lanzhou, Ankang, and Newberry.