Data Sets

A description of Fractional Gaussian Noise (FGN) is available on a separate page.

GRAPHICAL OUTPUT

These are time series plots of FGN for different values of the self-similarity (Hurst) parameter H.

H=0.5 (Gaussian white noise)
H=0.7
H=0.9

IMPLEMENTATION

Fractional Gaussian Noise series were simulated using a version of the Durbin-Levinson algorithm, implemented in S-Plus with C routines. The algorithm is described, for example, in Chapter 8.2 of Time Series: Theory and Methods, by P.J. Brockwell and R.A. Davis, 2nd edition, Springer-Verlag, New York, 1991. The source code for this algorithm is available here.
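
To make the idea concrete, here is a minimal Python sketch of Durbin-Levinson simulation of FGN. It is not the S-Plus/C source mentioned above. The function names fgn_autocov and simulate_gaussian_dl are hypothetical, and the sketch assumes unit-variance FGN with the standard autocovariance gamma(k) = 0.5 * (|k+1|^(2H) - 2|k|^(2H) + |k-1|^(2H)). Each new value is drawn from its conditional Gaussian distribution given the past, using the one-step prediction coefficients that the recursion updates.

```python
import numpy as np

def fgn_autocov(H, n_lags, sigma2=1.0):
    """Autocovariance of FGN at lags 0..n_lags-1 (assumed standard form)."""
    k = np.arange(n_lags, dtype=float)
    return 0.5 * sigma2 * (np.abs(k + 1) ** (2 * H)
                           - 2 * np.abs(k) ** (2 * H)
                           + np.abs(k - 1) ** (2 * H))

def simulate_gaussian_dl(gamma, rng):
    """Simulate a zero-mean stationary Gaussian series with autocovariance
    gamma[0..N-1] via the Durbin-Levinson recursion (O(N^2) time)."""
    N = len(gamma)
    x = np.empty(N)
    v = gamma[0]                      # one-step prediction error variance
    x[0] = np.sqrt(v) * rng.standard_normal()
    phi = np.zeros(0)                 # phi[j-1] holds phi_{n-1,j}
    for n in range(1, N):
        # Partial autocorrelation phi_{n,n}
        kappa = (gamma[n] - phi @ gamma[n - 1:0:-1]) / v
        # phi_{n,j} = phi_{n-1,j} - phi_{n,n} * phi_{n-1,n-j}, then append phi_{n,n}
        phi = np.append(phi - kappa * phi[::-1], kappa)
        v *= 1.0 - kappa ** 2
        # Conditional mean from the past values plus a fresh Gaussian innovation
        x[n] = phi @ x[n - 1::-1] + np.sqrt(v) * rng.standard_normal()
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    series = simulate_gaussian_dl(fgn_autocov(0.7, 1000), rng)
    print(series[:5])
```

With H=0.5 the autocovariance vanishes at every nonzero lag, so the recursion reduces to Gaussian white noise, consistent with the first plot above.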

FARIMA

FARIMA series are fractionally differenced auto-regressive moving average series. They are widely used in time series analysis. For reference see Brockwell and Davis, Time Series: Theory and Methods, Springer-Verlag, 1991. S-Plus has a function for simulating FARIMA series, arima.fracdiff.sim. This function uses a version of the Durbin-Levinson algorithm to produce the series. Since the algorithm relies on variances and covariances, it is inappropriate for series with stable innovations, whose variance is infinite when the index alpha is less than 2.
A more detailed description of FARIMA series is available on a separate page.

SAMPLE RUN of arima.fracdiff.sim

Text following # is a comment added after the session.

# In S-Plus:

> X11()                            # Enable graphics window.
> source("farima.generate")        # Read in the program.
> temp <- arima.fracdiff.sim(model=list(d=0.3), n=10000)
                                   # Generates a Gaussian FARIMA(0,d,0) series
                                   # with d=0.3, length=10000.
> tsplot(temp)                     # Time series plot of the simulated series.
> q()                              # Quit S-Plus.
Graphical output.

OTHER GRAPHICAL OUTPUTS

FARIMA(0,d,0) with stable innovations. d=0.3, alpha=1.5.
Close-Up of previous plot (from t=3580 to t=3600).
FARIMA(0,d,0) with Pareto innovations. d=0.3, alpha=1.5.

IMPLEMENTATION

FARIMA series can also be simulated in a different way, which is especially appropriate when stable innovations are involved. The algorithm (farima.generate) generates whatever innovations are needed and then passes them through a fractional differencing filter. Ideally this filter is an infinite sum; in practice, the sum is truncated after n terms. The function available here can currently generate only FARIMA(1,d,1) series with stable innovations, but it should be easy to modify for other innovations and parameters. To get a FARIMA(0,d,0) series, set theta=0 and phi=0 below.
The variables used in the calls to the functions are briefly described below.

n is the length of the summation in the filter routine.
N is the length of the wanted time series.
d is the differencing exponent.
alpha is the index of the stable innovations (it governs the exponent of their tails).
theta and phi are the moving average and auto-regressive coefficients.
sigma is the scale parameter of the innovations.
beta is the skewness coefficient in stable distributions (Default = 0).

Larger values of d call for larger values of n. For example, we have used n=2000 for d=0.4 and n=500 for d=0.3. The farima.generate.pareto function generates FARIMA(0,d,0) series with Pareto innovations. It uses the parameters n, N, d, alpha and sigma, and is part of the source code below.
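
The truncated-filter approach can be illustrated with a minimal Python sketch for the FARIMA(0,d,0) case (theta=0, phi=0) with symmetric stable innovations (beta=0). It is not the farima.generate source. The function names are hypothetical, the filter weights are assumed to be the usual expansion of (1 - B)^(-d), that is psi_0 = 1 and psi_j = psi_{j-1} * (j - 1 + d) / j, and the innovations are drawn with the Chambers-Mallows-Stuck method, which may differ from what the original routine does.

```python
import numpy as np

def fracdiff_weights(d, n):
    """First n weights of (1 - B)^(-d): psi_0 = 1, psi_j = psi_{j-1} * (j - 1 + d) / j."""
    j = np.arange(1, n)
    return np.concatenate(([1.0], np.cumprod((j - 1 + d) / j)))

def symmetric_stable(alpha, size, rng, sigma=1.0):
    """Symmetric (beta = 0) alpha-stable variates via Chambers-Mallows-Stuck."""
    u = rng.uniform(-np.pi / 2, np.pi / 2, size)
    w = rng.exponential(1.0, size)
    x = (np.sin(alpha * u) / np.cos(u) ** (1 / alpha)
         * (np.cos((1 - alpha) * u) / w) ** ((1 - alpha) / alpha))
    return sigma * x

def farima_0d0_sketch(N, d, alpha, n, sigma=1.0, seed=None):
    """Length-N FARIMA(0,d,0) series from an n-term truncated filter."""
    rng = np.random.default_rng(seed)
    # N + n - 1 innovations are needed so that every output value
    # is a full n-term weighted sum of past innovations.
    eps = symmetric_stable(alpha, N + n - 1, rng, sigma)
    psi = fracdiff_weights(d, n)
    return np.convolve(eps, psi, mode="valid")

if __name__ == "__main__":
    # Parameters taken from the text: d=0.3, alpha=1.5, n=500.
    series = farima_0d0_sketch(N=10000, d=0.3, alpha=1.5, n=500, seed=1)
    print(len(series))
```

Because the weights decay more slowly for larger d, the truncation discards more of the sum, which is one way to see why larger d calls for larger n.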

ETHERNET DATA

The Ethernet series used here are part of a data set collected at Bellcore in August of 1989. They correspond to one "normal" hour's worth of traffic, collected every 10 milliseconds, thus resulting in a length of 360,000. One data set measures the number of bytes per unit time, and one measures the number of packets per unit time. These data sets have been widely used.

Plot of byte data.
Plot of packet data.

They were first analyzed in W. E. Leland, M. S. Taqqu, W. Willinger and D. V. Wilson, "On the self-similar nature of Ethernet traffic (Extended version)", IEEE/ACM Transactions on Networking, 1994, 2, pp. 1-15.
Ethernet traffic data sets and related information can be obtained from the Internet Traffic Archive.

IMPLEMENTATION

The byte and packet data sets are available.