Pacific Coastal and Marine Science Center
Bedform Sedimentology Site—ripples, dunes, and crossbedding
Forecasting Techniques, Underlying Physics, and Applications
5.4 Examples5.4A Lorenz system Before applying forecasting techniques to geophysical data, it is instructive to examine results for mathematical models such as the Lorenz system (Chapter 2) for which the underlying dynamics are known. Using a time series of a single variable (such as the temperature difference term), we can use forecasting to demonstrate that the system is nonlinear and deterministic, with 34 degrees of freedom. We begin with a time series of the temperaturedifference variable computed from the Lorenz equations. Spectral analysis of this time series demonstrates that it is nonperiodic (Fig. 5.7), which makes the time series a candidate for nonlinear forecasting. To perform the forecasting, approximately 3000 points are used for a learning set, and several hundred points are reserved to test the predictive ability of the relations learned in the learning set. We then evaluate model performance as a function of embedding dimension (the number of preceding points that are used to predict each next value in the time series), prediction time, and the number of nearest neighbors used to evaluate the coefficients in equation 5.5. Figure 5.7 Power spectral density for a 6000point time series of the Lorenz system; nonperiodicity is demonstrated by the lack of a peak. The solid line represents power spectral density, and the dotted lines represent the 95% confidence limits (calculated from the standard deviation of power at each frequency). To perform these calculations the time series was broken into 128point pieces. Results of exploratory modeling demonstrate that model performance is optimum if the embedding dimension has a value of 4 (Fig. 5.8). This value is larger than the actual number (3) of degrees of freedom of the Lorenz system, because modeling error and noise (even digitization error!) can preferentially degrade models using low embedding dimensions. Consequently, the embedding dimension of the most accurate model is an upper limit of a system's number of active degrees of freedom. Any stationary nonperiodic system must have at least 3 degrees of freedom; if a stationary system has only two dimensions, then it can not be nonperiodic because its attractor must be confined to a single plane. Trajectories can not cross each other, because this would imply that a system behaves differently for exactly the same conditions; the trajectories therefore must either be attracted to a fixed point, spiral outward indefinitely (nonstationary), or follow a single repeating (periodic) path. The Lorenz system does not follow any of these paths, so we conclude that the system has at least 3 active degrees of freedom. Thus, the number of degrees of freedom must be greater than or equal to 3 and less than or equal to 4. Figure 5.8 RMS forecasting error as a function of embedding dimension for the Lorenz system with and without 15% white (uncorrelated) noise. The noisefree time series is modeled most accurately using an embedding dimension of four, but the system actually has only three degrees of freedom. Forecasts for the noisy series are less accurate, and the series is modeled most accurately using a higher embedding dimension. Figure 5.9 Forecasting error as a function of number of neighbors (k) used to solve equation (5.5) for the Lorenz system with and without 15% white noise. Where k is small, forecasts are nonlinear and deterministic; where k is large, models are linear and/or stochastic (Casdagli, 1992). The relative performance of these deterministicversusstochastic models can be used to infer properties of the system dynamics. Where minimum prediction errors are for models with low k, the system is deterministic and nonlinear. The noisefree series is modeled most accurately using models near the nonlinear deterministic extreme, whereas the noisy series is modeled more accurately using nonlinear stochastic models. Results also demonstrate that the Lorenz system is most accurately modeled using a small number of nearest neighbors (Fig. 5.9), which indicates that the system is situated toward the deterministic end of the deterministicstochastic spectrum. The best model is one that learns from the behavior of only the nearest 16 neighbors to each predictee; the best model therefore ignores more than 99% of the learning set when making each prediction.Because real data are not as noisefree as computergenerated data, it is instructive to add noise to the Lorenz time series and repeat the forecasting computations. For this experiment we added uncorrelated noise having a standard deviation of 15% of that of the signal (the Lorenz time series). Noise alters the results in the following manner: forecasts are poorer, andmore significantly for characterizing the systemthe best model employs a greater number of nearest neighbors (because the Lorenzplusnoise system is more stochastic) and a larger embedding dimension (because of the noisereducing capability of the longer sequence of previous points). These results emphasize two of the points made above. First, the embedding dimension of the best model is merely an upper limit to the number of degrees of freedom in the system. Second, a natural time series is influenced by multiple processes: (1) behavior of the system of interest, (2) the effects of unsteady natural forcing exerted on the system of interest, and (3) artificial processes such as measurement error and instrument noise. Forecasting may not be able to distinguish these individual components. We might attempt to use forecasting to document the exponential divergence of nearby trajectories in the Lorenz attractor, but simple attempts would be unsuccessful. Individual trajectories stretch apart and are compressed with each orbit around the attractor, making it impossible to document a systematic exponential divergence from only a few trajectories. Alternatively, the error statistics computed from equation 5.6 are averaged in such a way as to obscure exponential divergence. These computational difficulties are in addition to the geometry of the attractor, which limits divergence by confining the trajectories to a small region of space. 5.4B Turbulent lateral separation eddies Lateral separation eddies are fixed eddies that occur in channel expansions (Fig. 5.10). The main channel flow separates from the bank at the upstream end of the eddy and reattaches to the bank at the downstream end. Where the flow reattaches, water piles up, and the flow within the eddy is driven upstream. Field and lab measurements demonstrate that these eddies pulsate erratically even when subject to constant forcing (constant mainchannel discharge); spectral analysis demonstrates that the pulsations are nonperiodic (Rubin and McDonald, 1995). Figure 5.10 Schematic diagram of a lateral separation eddy. At least three classes of systems can exhibit nonperiodic behavior: (1) highdimensional (many degrees of freedom) linear systems, (2) lowdimensional nonlinear systems (chaos), and (3) highdimensional nonlinear systems. Ideally, we would like to characterize nonperiodic eddy pulsations as one of these kinds of systems. In the first half of this century, Landau (1944) proposed that turbulence (nonperiodic flow) resulted from a large number of modes of excitation of a fluid, but in the last few decades, it has been shown theoretically and experimentally that some examples of nonperiodic flow are lowdimensional chaos (Lorenz, 1963; Ruelle and Takens, 1971; Gollub and Swinney, 1975; Gollub et al., 1980). The experimental studies that have documented lowdimensional chaos have focused on flows that are at the threshold of turbulence (transitional with laminar flow). Several alternate explanations that do not involve lowdimensional chaos (spinglass relaxation, spatial noise amplification, and transients) have been proposed recently and are noted by Crutchfield and Kaneko (1988). They argue that transient effects can dominate a system for long time intervals, and they therefore question the relevance of lowdimensional chaos to fully developed turbulence. Lowdimensional chaos (nonperiodicity with few degrees of freedom) such as the Lorenz system requires both lowdimensionality (by definition) and nonlinearity (to allow nonperiodicity in a lowdimensional system). Two forecasting techniques can be used to evaluate the hypothesis that nonperiodicity in recirculating eddies results from lowdimensional chaos. More precisely, the techniques are an attempt to falsify the null hypothesis that the observed flows are highdimensional systems as proposed by Landau, 1944. First, we can use the deterministicversusstochastic technique to look for nonlinear structure in the time series. The best models (those with the lowest error) for the data set are purely linear (Fig. 5.11), which does not contradict the hypothesis that this turbulence results from linear stochastic processes. Figure 5.11 Forecasting error as a function of number of neighbors for a time series from a pulsating lateral separation eddy in the Colorado River, Grand Canyon (Rubin and McDonald, 1995). This time series is modeled most accurately using linear stochastic models. The time series was recorded during a time when release from Glen Canyon Dam (approximately 220 km upstream) was nominally steady, and stagegauge data indicate that discharge 96 km upstream from the measurement site was 430 m3/s with a standard deviation of 2 m3/s. The second technique is an attempt to determine the number of degrees of freedom of the lateral separation eddy. The best forecasting models employ an embedding dimension approaching 100, too high to support the idea of lowdimensional chaos (and probably too high to be meaningful for the length of the time series). Thus, this result fails to contradict Landau's hypothesis that turbulence is highdimensional, and the deterministicversusstochastic technique fails to identify nonlinearity. In some situations, lowdimensional chaos can be masked by stochastic effects such as noise or measurement error. Stochastic effects resulting from as little as 10% measurement error can be sufficient to mask lowdimensional dynamics (Rubin, 1992). In the recirculating flows in the lab, the flow forcing and measurement error had a standard deviation of only 2% of the mean, suggesting that the nonperiodicity results from highdimensional flow processes (linear or nonlinear) rather than from unsteady forcing or measurement noise. 5.3C Dripping handrail spatial patterns Forecasting can be applied to spatial patterns as well as to time series (Rubin, 1992). For this purpose, equation 5.7 can be modified to (5.8) where z(_{x,y)} is the value of the pixel to be predicted, x and y are coordinates measured relative to the lower left corner of the image, h and w are the height and width of a plaquette (rectangular region) of pixels used to make each prediction, and X and Y are the coordinates of the lower left pixel in the plaquette (Fig. 5.11). For the examples presented here, each predicted pixel is centered lefttoright with respect to the plaquette (X=x(w1)/2). The distance from the pixel that is being predicted to the closest pixel in the plaquette can be equal to 1 pixel (Y=yh, as in Figure 12 and the computations for Figure 13) or can be greater (as in the case of the windripple computations discussed later). In a spatial pattern, the embedding dimension m is equal to wh (the number of pixels in each plaquette); the subscripts of a, (0,0)...(w,h), identify the wh+1 regression coefficients. Figure 5.12 Diagrammatic representation of the spatial forecasting procedure. An image is divided into two parts, one part for use as a learning set and one half to test predictions. A small area between these two regions is not used in either set, to avoid lateral correlation. The image in this example is 100 pixels square; predictee plaquettes are 10 pixels square (h=10, w=10), corresponding to an embedding dimension of 100. Each predictee plaquette is compared with all 10by10 plaquettes in the learning set, and nearest neighbors are evaluated by the least squared differences of corresponding pixels in the predictee and learning set plaquettes. Predictions are made using equation (5.8). The predictive process is illustrated using a twodimensional image, but the procedure is computationally identical to representing each plaquette as a point in 100dimensional space, finding nearest neighbors in this space, and making predictions by tracking the trajectories of those points to their future locations. The image being forecast is a grayscale image of a quasiperiodic pattern created by superimposing 3 trains of sine waves having differing orientations, wavelengths, and planform sinuosities; values of pixels range from zero (white) to 255 (black). Figure 5.13 Forecasting error as a function of embeddingdimension height h and width w for the dripping handrail image (Fig. 2.8). Onestepahead forecasts for this spatial pattern are most accurate using plaquettes 3 pixels wide and 1 pixel high, corresponding to the known dynamics of the equations used to create the image. By applying equation 5.8 to an image of the computergenerated dripping handrail image (Fig. 2.8), we can extract information about the processes used to create the image. For example, for a prediction distance of 1 pixel, forecasting error is a minimum for a plaquette having a height of 1 and a width of 3 (Fig. 5.13), precisely the dimensions of the algorithm used to create the image. For longer prediction distances, the best predictions are made using a larger plaquette, to account for water that flows along the handrail (in or out of the left and right sides of the plaquette). The nonlinearity of the handrail algorithm can also be documented using exploratory forecasting: deterministic nonlinear models outperform the linear models by approximately 5 orders of magnitude (Fig. 5.13). If uncorrelated noise is added to the dripping handrail image, forecasting accuracy decreases, particularly for nonlinear forecasts. Noise levels approaching or exceeding 10% preferentially degrade the nonlinear predictions to such an extent that nonlinear models become less accurate than linear models (Fig. 5.14). Evidently, sensitivity to initial conditions of the nonlinear model has become less an advantage than noisereducing capability of the linear model. Figure 5.14 Forecasting error for the dripping handrail, with varied amounts of additive uncorrelated noise. Forecasts (208 predictees) are for distances of 1 pixel and are based on plaquettes 3 pixels wide by 1 pixel high. For noise levels of 0% and 1%, the best models are nonlinear, employing few of the 19,656 neighbors in the learning set. For noise levels of 10% and 100%, errors are much greater, and the best models are at the linear stochastic extreme. 5.4D Windripple geometry Most studies of ripple and dune geometry have been directed at determining how mean geometry varies for differing flow conditions, but another question is equally interesting: what causes differences in geometry of sequential bedforms in a train created by a uniform, steady flow. Hypotheses to explain this variation include quasiperiodicity, randomness, or deterministic chaos resulting from modification of the flow or sandtransport field by the bedform immediately upstream (Rubin, 1992). Most previous models of ripples and dunes have treated such bedforms as periodic (Allen, 1978; Rubin, 1987) or random (Paola and Borgman, 1991), but several morphologic and behavioral characteristics suggest that the complexity is selforganized. Even where flow is uniform (when averaged on a scale that is large relative to individual ripples), geometric variation of bedforms is ubiquitous. A deterministic cause for this complexity is suggested by experiments and computational models demonstrating that ripples perturb the local boundary layer or sandtransport field, thereby modifying the conditions that shape the next ripple downstream (Raudkivi, 1963; Southard and Dingler, 1971; Anderson, 1990; Forrest and Haff, 1992). In wind, ballistic grain impacts are believed to be more important than fluid effects, and selforganization has been suggested to arise from a sorting process that causes adjacent ripples to attain similar sizes and migration speeds (Anderson, 1990; Forrest and Haff, 1992). In flowing water, downstream coupling occurs because the local flow near the bed is influenced directly by the bedform immediately upstream. In some flows, placing a single artificial ripple or obstacle on a flat bed can be sufficient to induce formation of a train of ripples downstream (Raudkivi, 1963; Southard and Dingler, 1971), even in a flow where ripples otherwise would not form (Fig. 5.15, left). Similar spatial patterns can be simulated using the dripping handrail model starting from initial conditions that are uniform except for a slight perturbation (Fig. 5.15, right). In the model, each horizontal row in the image represents a step through time and is computed from the conditions at the previous time. In the real ripple pattern, each row represents an increasing distance downstream and develops in response to upstream conditions. In both the real and computational examples, the patterns that develop illustrate a similar sensitivity to initial (previous or upstream) conditions. Figure 5.15 Development of instabilities beginning from a slight artificial perturbation. (Left) Planform view of an experiment to create ripples on a flat bed in flowing water. No sandtransport occurred until a mound of sand was placed on the bed. The mound disturbed the flow, producing another bedform, which in turn disturbed the flow, and so on downcurrent (from top to bottom). Photograph from Southard and Dingler (1971), digitally processed to remove perspective distortion. (Right) Computer simulation of spatial differences using the dripping handrail model, beginning with initial conditions (top of image) that were uniform except for a single pixel. Image is a recomputation of Crutchfield and Kaneko (1987, fig. 45). In both images, conditions at the top of the image determine the structure at lower locations. A complete treatment of ripple dynamics requires an additional dimension of complexity beyond that implied above. The discussion above implies that the downcurrent ripple geometry is a function of upcurrent geometryunchanging through time. Although this is true for some flows (as illustrated in Figure 5.14a), in most flows ripple dynamics is more properly considered as a twodimensional spatial system that evolves through time as individual ripples interact while migrating downcurrent. In some ripple fields, downcurrent coupling produces currentparallel lanes of ripples with abrupt discontinuities between lanes (Allen, 1968). Such structure implies that acrosscurrent coupling of the real ripples is weak relative to downcurrent coupling, an hypothesis that is supported by computational experiments with the dripping handrail model. In addition to these reported spatial variations in individual ripples in uniform flows, both field observations and laboratory experiments have found that the planform geometry of dunes becomes more complex as flow strength increases (Allen, 1968; Middleton and Southard, 1984). This increasing spatial complexity is analogous to the increasing complexity observed at increasing flow strengths in Couette cylinders (Gollub and Swinney, 1975) and convection cells (Gollub, Benson, and Steinman, 1980). A photograph of ripples formed by wind (Fig. 5.16) was digitized and analyzed using the spatial forecasting techniques discussed above; pixel intensity was used to represent ripple geometry. Unlike the synthetic images, in which intensity is proportional to elevation, intensity in the digitized photograph image is more nearly a function of slope. Figure 5.16 Photograph of ripples formed by wind blowing over a sand bed that is otherwise flat (Rubin, 1992). Wind direction is from top to bottom; ripple wavelength is approximately 10 cm (corresponding to 8.3 pixels after the photograph was digitized). The digital version of this image was used as input for the forecasts in Figures 5.16 and 5.17. As in the case of onedimensional time series, forecasting errors of spatial patterns vary with embedding dimension (Fig. 5.17). The best forecasts are obtained using a plaquette with a height of 9 pixels (slightly greater than the mean wavelength of 8.3 pixels) and a width of 5 pixels. This result is in good agreement with the hypothesis that the geometry of any one ripple depends on the geometry of the ripple immediately upstream. Figure 5.17 Predictability of wind ripples in Figure 5.15 as a function of embedding dimension. The best forecasts are based on plaquettes that extend approximately 1 wavelength downwind and onehalf wavelength acrosswind. Forecasting error was measured as a function of the number of nearest neighbors used to make the predictions (Fig. 5.17). The ripple pattern is forecast most accurately using a nonlinear stochastic model, which offers a 60% improvement over the best linear model. This property of the forecasting errors suggests that the ripple pattern contains a nonlinear component. Forecasting errors for the original image differ significantly from surrogate images that were created to have the same FFT magnitude but randomized phases. The difference is as great as 8 sigmas (Theiler et al., 1992). That is, the difference between the error for the original and the mean error for the surrogates is 8 times the standard deviation of the errors for the surrogates (Fig. 5.18). These results (most accurate predictions with a nonlinear model and significant differences in forecasting error between the real and a linear surrogate pattern) are inconsistent with the null hypothesis that the structure is linear and nondeterministic. Figure 5.18 Forecasting error as a function of number of nearest neighbors for wind ripples and 11 surrogates. Plaquette size is 9 pixels (downwind) by 5 pixels (acrosswind), as determined to be optimum in Figure 5.16. The best nonlinear model offers a 60% improvement over the best linear model, and the original differs considerably from the surrogates (values of sigma as large as 8). 5.4E Surfzone transport Despite considerable study, virtually no sediment transport models are capable of accurately predicting the instantaneous suspended sediment concentration in response to wavegenerated currents in the surf zone. Most models of unsteady sediment transport merely treat the unsteady flow as a timevarying river, and use relations developed for steady flow to predict sediment concentration; any such approach must result in a predicted concentration time series that mimics the associated velocity time series. Merely inspecting simultaneous time series of nearbed flow velocity and sediment concentration demonstrates that this is not the case (Fig. 5.19); concentration shows no visible relation to the flow. As might be expected for time series with such an appearance, the correlation coefficient between sediment concentration and flow velocity (or absolute value of velocity) is zero. One hypothesis to explain this lack of predictability is that the system is noisy. For example, sediment might become suspended somewhere upstream and then be advected to the measurement device. Although such unpredictable processes probably occur, the inputoutput techniques described above can be used to predict sediment concentration relatively accurately, which demonstrates that concentration is controlled by the flow but that the relation is more complicated than previously believed (Jaffe and Rubin, 1995). Figure 5.19 Results of inputoutput modeling of suspended sediment concentration in the surf zone (from Jaffe and Rubin, 1995). Concentration does not mimic the flow, but nevertheless can be predicted from the flow. The best models are nonlinear (small number of nearest neighbors k), 1.5 seconds between measurements, an embedding dimension m of 3 (during a time interval of 3.0 seconds), and a lag n of 1.5 seconds. The resulting correlation coefficient between predicted and observed concentration is 0.6. For comparison, the correlation between simultaneous values of flow velocity and concentration is zero. To make accurate predictions requires using three techniques described above: (1) embedding (relating concentration to a sequence of velocity measurements rather than to a single value), (2) nonlinear deterministic modeling (using a nonlinear relationshipa low value of kto relate forcing to response in equation 5.7), and (3) incorporating a lag time between the input (flow) and output (sediment concentration) measurements. This approach leads to models that can predict concentration relatively accuratelythe correlation coefficient between predicted and observed concentrations is 0.6. The models even predict pulses of suspended sediment at times when velocities are relatively weak, as illustrated in Figure 5.19. In retrospect, the benefits of all three modeling essentials can be appreciated intuitively. First, a sequence of velocity measurements provides considerably more information than a single value. For example, a single velocity measurement of 60 cm/second might occur at the peak of a small wave, or during either the accelerating or decelerating phase of a large wave. By using a sequence of three velocity measurements (spread over approximately 1/4 wave period), it is possible to distinguish all of these situations. Forecasting was also able to disprove the hypothesis (for this set of data) that the concentration during one wave was controlled by previous waves, because inclusion of such a long velocity history led to poorer forecasts. Second, a nonlinear model is able to use a different relation between flow and concentration for each of the three flow situations. Timing of the suspended sediment pulses suggests that they are caused by turbulence, which tends to be enhanced during decelerating flow and suppressed during accelerating flow (Jaffe and Rubin, 1995). Nonlinear models allow different relations to be used for accelerating and decelerating phases of a wave, as well as for waves with different peak velocities. The third modeling requirement is a lag time between the end of each observed flow sequence and the observed concentration. Intuition might suggest that lag time should vary with height above the bed, because more time is required to mix sediment upward greater distances. Forecasting can be used to document and quantify this effect. The lag times that lead to the most accurate forecasts range from 1.5 seconds (to mix sediment to an elevation of 13 cm) to 8.5 seconds (to mix sediment to an elevation of 60 cm). This example illustrates several uses of forecasting: (1) demonstrating that a time series has a deterministic structure (not merely due to stochastic or unpredictable processes), (2) developing a model that can make accurate forecasts, and (3) identifying the important processes that are at work, so that they can be studied, quantified, and incorporated in other models. 5.4F Global climate As discussed above, most geological systems are subject to unsteady forcing, which makes it difficult to separate the behavior of the system from the behavior of the forcing in an observed time series. Hunter and Theiler (1992) developed a technique for forecasting such inputoutput systems and used global climate to illustrate this technique; their replicated results are discussed below. In this example, the history of global climate (output) is represented by Imbrie et al.'s (1984) time series of d^{18}O measured in fossil foraminifera in marine cores, and solar forcing (input) is given by the astronomically calculated insolation at 65deg. N. latitude in July (Berger and Loutre, 1991). The approach is to formulate a variety of models employing different values of embedding dimension m and different number of nearest neighbors k. The results indicate that climate can be predicted from the insolation (Fig. 5.20), with a correlation coefficient of 0.64. To achieve this level of performance requires using a nonlinear model with an embedding dimension of 4 (during a time interval of 6000 years). The best linear model does not perform as well (correlation coefficient of 0.52). Because models based on other spectral analysis techniques are linear, they are not as accurate as the nonlinear models developed using equation 5.7. Figure 5.20 Predicted and observed climate (d18O) forecast from insolation. The best models are nonlinear and employ and embedding dimension of 4 (during a time interval of 6000 years). The correlation coefficient between predicted and observed d18O is 0.64. The correlation coefficient for the best linear model is 0.52. Insolation calcualtions are from (Berger and Loutre, 1991), and time is plotted relative to 1950 (negative in the past and positive in the future). One interesting aspect of this inputoutput modeling is that it allows climate to be predicted for long times in the future, with little loss in expected accuracy. Although the relation between insolation and climate is nonlinear, insolation, which is controlled by periodic (linear) astronomical processes, can be predicted accurately for long times into the future (Berger and Loutre, 1991). Nonlinear modeling using equation 5.7 can then be used to predict the climate from the predicted insolation. 5.4F Summary of uses and limitations of forecasting The discussion and examples in this chapter illustrate a variety of applications of forecasting. The two main uses are to predict the future and to characterize systems. Making predictions (such as global climate or sediment transport) has obvious practical value; the value of system characterization is less obvious but includes:
Forecasting has a number of serious requirements and limitations. Many of these requirements are difficult or impossible to meet in the earth sciences:
Despite the substantial limitations listed above, the concepts developed for the study of dynamical systems can help us understand geological systems. Even if an earth scientist never applies the techniques in this publication, understanding the multiple causes of complicated nonperiodic time series is essential to interpreting earth history. Continue to next section.
