A linear regression between a short and a long sequence of hydrologic events is used to lengthen the short sequence. The lengthened sequence consists of the original observations and regressed values plus noise, where the noise is a random variable with zero mean and variance proportional to the variance of the observations for the short sequence about the line of regression. Estimates of the mean and variance for the lengthened sequence are shown to be unbiased. If the correlation coefficient, which measures the strength of the linear regression, exceeds about 0.5, then the estimates of the mean and variance based on the lengthened sequence are better estimators of the population values of the mean and variance than the estimates based on the observations for the short sequence. If noise is not added to the regressed values, the correlation coefficient must exceed about 0.8 to obtain improvement in the estimates by use of correlated values.

INTRODUCTION

In statistical studies of hydrology, the assumption is made that a sequence of a finite number of observed events represents a random sample from an infinite population of such events, where the outcome of each event is governed by some probability distribution. Any change in the hydrologic regime with which a given sequence is associated is reflected in a change of the probability distribution. Although various parameters may be used to describe a probability distribution, the mean, variance, and skewness are three parameters that provide information about the most useful properties of any probability distribution. The mean is a measure of central tendency, a value about which the events tend to cluster, whereas the variance measures the dispersion or average spread of the events about the mean. The skewness is a measure of the asymmetry of the distribution of the events about the mean.
These three parameters cannot define a probability distribution uniquely, except in special cases; nevertheless, they do provide characteristics which may be used to describe and compare various hydrologic phenomena. The population values of these parameters are generally unknown, but the values may be estimated from a sequence of observations. How reliable these estimates are depends primarily upon the period or length of the sequence, that is, the total number of observations. If the estimates are unbiased, their reliability increases with an increase of the sequence length.

To increase a sequence length by making additional observations in time in order to obtain more reliable estimates of the mean, variance, and skewness is not always operationally or economically feasible. If not, recourse must be had to other procedures for increasing a sequence length. One such procedure, which is the topic of discussion in this paper, is to utilize the relations among hydrologic phenomena. A relation among the concurrent observations for a short and a long sequence can be used to obtain estimates of the nonobserved events for the short sequence which correspond to the observed events for the nonconcurrent portion of the long sequence. In this manner the short sequence is lengthened. The observed and estimated events for the lengthened sequence can be used to obtain estimates of the mean, variance, and skewness. Whether or not the reliability of these estimates is greater than that for the estimates of these parameters based only on the observations depends mainly upon the strength of the relation between the concurrent observations for the short and long sequences.
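The lengthening step described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming an ordinary least-squares line fitted to the concurrent observations and assuming the concurrent period is the first part of the long sequence; the function name `extend_by_regression` and the synthetic data are hypothetical and do not come from the paper.

```python
import numpy as np

def extend_by_regression(x_long, y_short):
    """Lengthen y_short with regression estimates based on x_long.

    Assumption (for illustration): the first len(y_short) values of
    x_long are concurrent with y_short.
    """
    x_long = np.asarray(x_long, dtype=float)
    y_short = np.asarray(y_short, dtype=float)
    n1 = y_short.size                      # length of the concurrent period
    x_conc = x_long[:n1]                   # concurrent part of the long sequence
    dx = x_conc - x_conc.mean()
    dy = y_short - y_short.mean()
    b = np.sum(dx * dy) / np.sum(dx ** 2)  # least-squares slope of y on x
    a = y_short.mean() - b * x_conc.mean() # least-squares intercept
    y_est = a + b * x_long[n1:]            # estimates for the nonconcurrent period
    return np.concatenate([y_short, y_est])

# Synthetic example: 40 values of x, only the first 15 values of y observed.
rng = np.random.default_rng(0)
x = rng.normal(10.0, 2.0, size=40)
y = 1.0 + 0.8 * x[:15] + rng.normal(0.0, 1.0, size=15)
y_long = extend_by_regression(x, y)
print(y_long.size, y_long.mean(), y_long.var(ddof=1))
```

The mean and variance printed at the end are the estimates based on the lengthened sequence; how they compare in reliability with the estimates from the 15 observed values alone depends on the correlation between the concurrent observations.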
A mathematical evaluation of this procedure has been made by several investigators under the assumptions that (1) the events are independently distributed in time, (2) the concurrent events for two sequences have a joint normal distribution, (3) the relation between the concurrent events is defined by a linear regression, and (4) no changes occur in the hydrologic regimes with which the sequences are associated. The strength of the linear regression is measured by the product-moment correlation coefficient. Under the assumption of normality, only the mean and variance need to be considered because these two parameters uniquely define a normal probability distribution. For this distribution, the skewness is zero.

H. A. Thomas, Jr. (written commun., 1956), showed that the lengthened sequence yields an unbiased estimate of the mean, and that if the product-moment correlation coefficient exceeds $1/\sqrt{N_1 - 2}$, where $N_1$ is the length of the concurrent period, the reliability of this estimate of the mean is greater than that based only on the observations for the short sequence. These results also were obtained by W. G. Cochran (1953) in an earlier and independent study of a double sampling problem. J. R. Rosenblatt (1959) showed that the lengthened sequence yields a biased estimate of the variance, and that the reliability of this estimate is greater than that based only on the observations for the short sequence if the product-moment correlation coefficient exceeds about 0.8. M. B. Fiering (1963) obtained somewhat similar results for the estimates of the mean and variance when more than one long sequence is related to a short sequence by a multiple linear regression.

In the studies cited above, the estimated events in the lengthened sequence were regression estimates, that is, estimates which correspond to values on the line of regression. These estimates tend to yield a smaller variance than would the real observations.
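Two of the statements above lend themselves to a quick numerical check. The sketch below assumes the critical correlation for the mean is read as 1/sqrt(N1 - 2), and it uses the standard least-squares identity that values on a fitted line have a sample variance equal to r squared times the sample variance of the observations over the same period. The function names `mean_threshold` and `fitted_variance_ratio` and the small data set are hypothetical, chosen only for illustration.

```python
import numpy as np

def mean_threshold(n1):
    """Critical correlation 1/sqrt(N1 - 2) for the mean; requires N1 > 2."""
    if n1 <= 2:
        raise ValueError("n1 must exceed 2")
    return 1.0 / np.sqrt(n1 - 2)

def fitted_variance_ratio(x, y):
    """Return (var(fitted)/var(y), r**2) for a least-squares line of y on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    b = np.sum(dx * dy) / np.sum(dx ** 2)  # least-squares slope
    fitted = y.mean() + b * dx             # values on the line of regression
    r = np.corrcoef(x, y)[0, 1]            # product-moment correlation
    return np.var(fitted) / np.var(y), r ** 2

# The threshold falls as the concurrent period lengthens.
for n1 in (10, 18, 27, 52):
    print(n1, round(mean_threshold(n1), 3))

# Values on the regression line are less variable than the observations.
ratio, r_squared = fitted_variance_ratio([1, 2, 3, 4, 5], [2, 1, 4, 3, 6])
print(ratio, r_squared)
```

Because the ratio equals r squared, it is below 1 whenever the correlation is imperfect, which is the sense in which regression estimates understate the variance of real observations.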
In order to "preserve" the variance inherent in the observations, a random component must be added to the regression estimates. This component, often referred to as noise, is normally distributed with zero mean and variance proportional to the variance of the observations for the short sequence about the line of regression. This paper reports the results of an investigation made to determine the reliability of estimates of the mean and variance computed from a lengthened sequence when noise is added to the regression values. Noise is shown to have no effect on the reliability of the estimate of the mean. However, the addition of noise is shown to lead to an unbiased estimate of the variance for the lengthened sequence. The reliability of this estimate is greater than that when no noise is added to the regression values.

STATISTICAL MODEL

The long and short sequences for a pair of hydrologic phenomena are denoted by x and y. In general, the two phenomena need not be the same. If, for example, y denotes streamflow, x may denote streamflow, precipitation, temperature, or a geochronologic phenomenon such as tree-ring widths. The observed events for the long and short sequences are represented as

$$x_1, x_2, \ldots, x_{N_1}, x_{N_1+1}, \ldots, x_{N_1+N_2}$$
$$y_1, y_2, \ldots, y_{N_1}$$

where $N_1$ is the length of the short sequence and $(N_1 + N_2)$ is the length of the long sequence. For this representation of the two sequences, $N_1$ also denotes the concurrent period of observation. In practice, the $N_1$ observations for the short sequence need not correspond to the first $N_1$ observations of the long sequence, nor need the $N_1$ concurrent observations of x and y occur consecutively. However, there is no loss of generality if the two sequences are represented as above. The concurrent observations x and y are assumed to have a joint normal probability distribution with parameters $\mu_x$, $\mu_y$, $\sigma_x^2$, $\sigma_y^2$, and $\rho = \beta$