Summary Report on an Initial Correlation and Spectral Analysis of the Complete ABFM Merged File Set
Francis J. Merceret/NASA/YA-D
First Draft 12 June 2003
1. Introduction. This report presents a summary of the results of initial correlation and spectral analyses of all available ABFM “merged” files. For this initial analysis, the complete file was used in each case. No attempt was made to select anvil segments or in-cloud regions, or even to remove ferry flight, takeoff or landing segments. The goal was to get an overview of the correlation and spectral behavior of the various variables as guidance for possible future stratifications.
The variables considered were electric field magnitude and 2DC cloud particle concentration as measured from the Citation II and the following three radar variables measured by the Melbourne NEXRAD where available, otherwise by the Patrick AFB WSR-74C: reflectivity at the position of the aircraft, average reflectivity in an 11x11 Km box horizontally centered on the aircraft, and average reflectivity in a 21x21 Km box horizontally centered on the aircraft. The box averages are computed from 5 to 20 Km in the vertical regardless of aircraft altitude. Table 1 below shows the data set and which radar was used for the radar variables.
Date / Radar / Date / Radar
04 June 2000 / NEXRAD / 27 May 2001 / NEXRAD
07 June 2000 / 74-C / 28 May 2001 / NEXRAD
11 June 2000 / NEXRAD / 29 May 2001 / NEXRAD
12 June 2000 / NEXRAD / 02 June 2001 / NEXRAD
13 June 2000 / NEXRAD / 04 June 2001 / NEXRAD
14 June 2000 / 74-C / 05 June 2001 / NEXRAD
17 June 2000 / NEXRAD / 06 June 2001 / NEXRAD
20 June 2000 / NEXRAD / 07 June 2001 / NEXRAD
23 June 2000 Flt1 / NEXRAD / 10 June 2001 / NEXRAD
24 June 2000 Flt1 / 74-C / 15 June 2001 / NEXRAD
24 June 2000 Flt2 / 74-C / 18 June 2001 / NEXRAD
25 June 2000 / 74-C / 23 June 2001 / NEXRAD
28 June 2000 Flt1 / NEXRAD / 24 June 2001 / NEXRAD
28 June 2000 Flt2 / NEXRAD / 25 June 2001 / NEXRAD
22 May 2001 / NEXRAD / 27 June 2001 / NEXRAD
25 May 2001 / NEXRAD / 28 June 2001 / NEXRAD
Table 1. The data used for this work.
Correlation analysis was performed to assess the degree of independence of each ten-second record in the merged data files. This is necessary in order to determine the effective sample size for statistics generated from these data. It also provides information on the “time” scales over which the various variables are auto- or cross-correlated. Time is placed in quotation marks here because although the data are sequenced based on the time they were collected, the aircraft was moving at about 100 m/s and thus the correlations are actually being taken along a path in space-time. Hereafter, the independent variable will be referred to simply as time without the quotation marks, but this caveat needs to be kept in mind.
Spectral analysis was performed in order to assess the degree to which the variance and co-variance of the quantities of interest exhibit preferential scales. This could provide clues about the nature of the physical processes involved.
2. Correlation Analysis. Lagged auto- and cross-correlations were computed for lags from -20 to 20 records (-200 to 200 seconds) for each file, averaging over the entire file. Since the correlation coefficients are inherently normalized variables (ranging from -1 to 1), data from multiple files may be directly compared or used in an ensemble average regardless of differences in the variance or mean of the data from day to day. Since this is a summary report, only results from the ensemble analysis will be presented.
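The lagged correlation described above can be sketched as follows. This is a minimal illustration, not the procedure actually used for the report: the function name `lagged_correlation`, the use of NaN to mark flagged records, and the choice of a pairwise Pearson coefficient at each lag are all assumptions.

```python
import numpy as np

def lagged_correlation(x, y, max_lag=20):
    """Pearson correlation of x[t] with y[t + lag] for lag = -max_lag..max_lag.

    NaN marks a missing or flagged record (an assumption of this sketch);
    any pair containing a NaN is skipped. Passing the same series as x and y
    gives the autocorrelation.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    lags = np.arange(-max_lag, max_lag + 1)
    r = np.full(lags.shape, np.nan)
    for i, lag in enumerate(lags):
        if lag >= 0:
            a, b = x[:n - lag], y[lag:]      # y shifted later than x
        else:
            a, b = x[-lag:], y[:n + lag]     # y shifted earlier than x
        ok = ~(np.isnan(a) | np.isnan(b))
        if ok.sum() > 2:
            r[i] = np.corrcoef(a[ok], b[ok])[0, 1]
    return lags, r
```

With ten-second records, `max_lag=20` covers -200 to 200 seconds. Because each coefficient is normalized, the per-file curves returned by this function could be stacked directly to form ensemble statistics at each lag.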
2.1 Electric Field. The maximum, mean, median and minimum values of the autocorrelation of the E-field as a function of lag are shown in Figure 1.
Figure 1. E-Field autocorrelation.
The maximum at each lag is the largest value at that lag for any of the days in the ensemble. The minimum value is the smallest value for any day in the ensemble. The mean and median are the ensemble mean and median. Clearly the mean and median do not significantly differ, suggesting a symmetrical distribution about the mean.
The mean autocorrelation can be modeled closely by a first order autoregressive (AR1) process (Wilks, 1995, Chapter 8) with a coefficient ρ = 0.91 as seen in Figure 2.
Figure 2. Measured and modeled (AR1, ρ = 0.91) E-field autocorrelations.
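For reference, the autocorrelation of an AR1 process at lag k is the AR1 parameter raised to the power |k|. The sketch below evaluates that curve and picks the parameter by a simple least-squares grid search. The report does not state how the fit was made, so the fitting method, the candidate grid and the names `ar1_autocorrelation` and `fit_ar1` are assumptions for illustration only.

```python
import numpy as np

def ar1_autocorrelation(rho, lags):
    """Theoretical AR1 autocorrelation: rho ** |lag|."""
    return rho ** np.abs(np.asarray(lags))

def fit_ar1(lags, r_measured, candidates=np.arange(0.50, 0.9995, 0.001)):
    """Return the candidate rho whose AR1 curve has the smallest squared error.

    The grid search is an assumption of this sketch, not the report's method.
    """
    lags = np.asarray(lags)
    r_measured = np.asarray(r_measured, dtype=float)
    errors = [np.sum((ar1_autocorrelation(rho, lags) - r_measured) ** 2)
              for rho in candidates]
    return float(candidates[int(np.argmin(errors))])
```

Restricting `lags` to small or large values before calling `fit_ar1` is one way to see whether a single parameter fits the whole lag range.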
For an AR1 process with parameter ρ, the number of effective degrees of freedom N' for estimating the process mean from N samples is given by
$$N' = N\,\frac{1-\rho}{1+\rho} \qquad (1a)$$
and the number of degrees of freedom for estimating the standard deviation is given by
$$N' = N\,\frac{1-\rho^{2}}{1+\rho^{2}} \qquad (1b)$$
(Bayley and Hammersley, 1946). For the electric field, the effective sample size is thus an order of magnitude smaller than the actual sample size.
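As a worked illustration of equations (1), the sketch below assumes the standard AR1 forms given in Wilks (1995): N(1-ρ)/(1+ρ) for the mean and N(1-ρ²)/(1+ρ²) for the standard deviation. The function names and the sample size of 1000 records are hypothetical.

```python
def effective_n_mean(n, rho):
    """Effective sample size for estimating the mean of an AR1 process."""
    return n * (1.0 - rho) / (1.0 + rho)

def effective_n_std(n, rho):
    """Effective sample size for estimating the standard deviation."""
    return n * (1.0 - rho ** 2) / (1.0 + rho ** 2)

# Hypothetical file of 1000 ten-second records with the E-field value rho = 0.91
n_records = 1000
print(round(effective_n_mean(n_records, 0.91), 1))  # 47.1
print(round(effective_n_std(n_records, 0.91), 1))   # 94.0
```

Under these assumptions, 1000 records carry roughly the information of 47 independent samples for the mean, which is the order-of-magnitude reduction noted above.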
2.2 Cloud Concentration. The 2DC cloud particle concentration is slightly less well behaved than the electric field magnitude. Figure 3 shows the statistics of the ensemble in the same style used in Figure 1.
Figure 3. 2DC autocorrelation.
The AR1 model is harder to fit to the 2DC data. The value of ρ that matches the data best for small lags does not work well at the larger lags. The measured autocorrelation initially decays faster than predicted by an AR1 model that fits the autocorrelation for lags in the mid range or larger. To match the larger lags, ρ = 0.91, the same as that for the electric field, would be required. For the smaller lags, ρ = 0.88 works better.
Figure 4 shows the model with ρ = 0.88.
Figure 4. Measured and modeled (AR1, ρ = 0.88) 2DC concentration autocorrelations.
Fortunately, equations (1) are not overly sensitive to small differences in ρ. The effective number of degrees of freedom is similar for cloud concentration and e-field.
2.3 Radar Reflectivity. The major difference between radar reflectivity and the two previously discussed variables is that there are three separate radar variables and two of them are spatial averages. Spatial averages should be more highly autocorrelated the larger the averaging volume. The data are consistent with this expectation. The radar ensemble statistics are well behaved, and only the means and models will be discussed here. Figure 5 shows all three radar variables and their corresponding AR1 models.
Figure 5. Measured and modeled (AR1) radar reflectivity autocorrelations. See text for the AR1 parameter values.
For the reflectivity at the aircraft (averaging volume = 1 Km), ρ = 0.885. For the 11x11 Km box average, ρ = 0.965. For the 21x21 Km box average, ρ = 0.98. For the box averages, the reduction in the number of effective degrees of freedom approaches two orders of magnitude.
2.4 Cross Correlations. The cross correlations behave very much like the autocorrelations except that the value at zero lag is less than 0.5 rather than equal to 1. The decay is exponential with time scales very close to those found for the autocorrelations, suggesting the same degree of dependence and associated reduction in effective degrees of freedom. The ensemble statistics for the cross-correlation between the e-field and the radar at the aircraft position are shown in Figure 6 as an example.
Figure 6. Ensemble statistics for the e-field-radar cross-correlation.
3. Spectral Analysis. Spectral analysis was performed using fast Fourier transforms (FFT) of 32-record subsets of data. Given the ten-second sample spacing, this resulted in spectra covering frequencies up to 0.05 Hz with a resolution of 0.00313 Hz. The corresponding spatial scales at an aircraft speed of 100 m/s are 2 to 32 Km. Within each file, the resulting Fourier components were used to compute power, co and quad spectra for each subset. The subset power spectra were averaged to produce the average power spectra for the file. The subset co and quad spectra were also averaged. The averaged co, quad and power spectra were used to generate the coherence spectra for the file.
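The per-file procedure just described can be sketched as below. Several details are assumptions, since the report does not specify them: the block mean is removed before the FFT, no window is applied, the quadrature spectrum uses one of the two common sign conventions, and coherence is taken as the squared form (co² + quad²)/(Pxx·Pyy). The function name `file_spectra` is hypothetical.

```python
import numpy as np

def file_spectra(x_blocks, y_blocks, dt=10.0):
    """Average power, co, quad and coherence spectra over the blocks of one file.

    x_blocks, y_blocks: arrays of shape (n_blocks, 32) with no missing values.
    """
    x = np.asarray(x_blocks, dtype=float)
    y = np.asarray(y_blocks, dtype=float)
    n = x.shape[1]
    # Remove each block's mean (assumption), transform, and drop the zero frequency.
    X = np.fft.rfft(x - x.mean(axis=1, keepdims=True), axis=1)[:, 1:]
    Y = np.fft.rfft(y - y.mean(axis=1, keepdims=True), axis=1)[:, 1:]
    freqs = np.fft.rfftfreq(n, d=dt)[1:]         # 1/320 Hz up to 0.05 Hz
    pxx = np.mean(np.abs(X) ** 2, axis=0)        # averaged power spectra
    pyy = np.mean(np.abs(Y) ** 2, axis=0)
    cross = np.mean(np.conj(X) * Y, axis=0)      # averaged cross spectrum
    co, quad = cross.real, -cross.imag           # quad sign convention varies
    coherence = (co ** 2 + quad ** 2) / (pxx * pyy)
    return freqs, pxx, pyy, co, quad, coherence
```

Dividing the 100 m/s aircraft speed by `freqs` gives the 32 Km to 2 Km scale range quoted above. Note that with a single block this coherence estimate is identically 1 at every frequency, so it only becomes meaningful once several blocks are averaged.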
As with the correlation analysis presented in section 2 above, the daily average spectra were assembled into an ensemble and only the ensemble results will be presented in this summary. In order to make that possible, the daily power spectra were normalized by dividing each spectral estimate by the sum of the estimates over all frequencies. This is equivalent to normalizing by the signal variance. The coherence is an inherently normalized variable ranging from 0 to 1.
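A minimal sketch of the normalization step follows, together with a check of why the sum of the spectral estimates tracks the signal variance. The function names are hypothetical, and the block length is assumed to be even (32 here). By Parseval's relation the variance of a block equals the sum of its squared Fourier amplitudes with interior frequencies counted twice and the Nyquist frequency once, so the plain sum of one-sided estimates matches the variance up to that weighting and a constant factor.

```python
import numpy as np

def normalize_psd(psd):
    """Divide each spectral estimate by the sum over all frequencies."""
    psd = np.asarray(psd, dtype=float)
    return psd / psd.sum()

def variance_from_fft(x):
    """Variance of one block recovered from its Fourier components (Parseval)."""
    x = np.asarray(x, dtype=float)
    n = len(x)                                   # assumed even, e.g. 32
    power = np.abs(np.fft.rfft(x - x.mean())) ** 2
    # Interior bins stand for both +f and -f; the Nyquist bin appears once.
    return (2.0 * power[1:-1].sum() + power[-1]) / n ** 2
```

After normalization every daily spectrum sums to 1, so days with very different variance can be combined in one ensemble.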
In order to perform the FFTs, the 32 records in each subset had to be contiguous. There could be no QC flags affecting any of the variables of interest during a continuous 320-second period in order to have a usable subset. This resulted in a considerable reduction in the sample sizes used in this analysis compared with that used for generating the correlations, for which the data did not have to be grouped in contiguous blocks. Some of the daily averages contained only a single spectrum. The maximum number of spectra averaged in a single file was 25. Averages of fewer than 9 spectra were not used. This will be discussed further below as it relates to the confidence limits on the spectral estimates.
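The block selection can be illustrated with the sketch below. It assumes the records are evenly spaced in time and that a boolean mask `good` is True wherever none of the variables of interest is flagged. Taking non-overlapping blocks and restarting the search just after each flagged record are choices made for this example; the report does not say how the blocks were positioned.

```python
import numpy as np

def usable_blocks(good, block_len=32):
    """Start indices of non-overlapping runs of block_len unflagged records."""
    good = np.asarray(good, dtype=bool)
    starts = []
    i = 0
    while i + block_len <= len(good):
        window = good[i:i + block_len]
        if window.all():
            starts.append(i)
            i += block_len
        else:
            # Restart just past the first flagged record in this window.
            i += int(np.argmin(window)) + 1
    return starts

# 40 good records, one flagged record, then 70 good records
mask = [True] * 40 + [False] + [True] * 70
print(usable_blocks(mask))  # [0, 41, 73]
```

In this example a single flagged record costs eight otherwise good records (indices 32 to 39), which shows how scattered flags can sharply reduce the number of usable spectra.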
The sample size depended on which radar variable was being used since the QC flags for the three radar variables are not identical. The merged files treat reflectivities below the noise floor of the radar as missing data, and they are flagged. As a result, many records of the radar at the aircraft are flagged while the corresponding box averages are not because the aircraft is locally in low reflectivity or a scan gap that does not affect the box average. After discarding files in which fewer than 9 spectra contributed to the average, the ensemble for the radar at the aircraft contained only 8 files. The ensemble for the 11 Km box contained 22 and the ensemble for the 21 Km box contained 24. This will be discussed further below as it relates to the differences in spectra between the various data sets.
3.1 Power Spectra. All of the power spectra exhibited power law dependence with an exponent between -1.5 and -2. This is consistent with the spectral behavior of other atmospheric variables such as wind and temperature in the mid-troposphere (Merceret, 2000; Nastrom et al., 1997; Wilfong et al., 1997). Figure 7 shows the electric field as an example.
Figure 7. Power spectra of e-field with power law fits to aircraft position and 21Km box data sets.
Although the e-field power spectral density (PSD) was computed three times on three different data sets (in order to compute the coherence with each of the three radar variables), the results cluster very tightly. The 95% confidence limits for the spectral estimates are +/- 40 % for the AC data and +/- 10 % for the box averages. The cloud concentration PSDs look like those for the electric field and are not shown here.
The radar PSDs showed some difference between the aircraft position data and the box averages as seen in Figure 8. This is due to the low-pass filtering effect of the box averaging process. It increases the slope of the spectrum by damping the small scale fluctuations.
Figure 8. Power spectra of radar reflectivity with power law fits for the aircraft position and 21Km box data sets.
3.2 Coherence. Signals are conventionally considered "coherent" when their coherence exceeds 0.25, although 0.5 is also sometimes used (Merceret, 2000). With the exception of the coherence between the e-field and the radar at the aircraft position, and the e-field and cloud concentration (for the box average data sets only), there is no significant coherence between any of these variables. Cloud concentration does not cohere with any of the radar variables at any scale as shown in Figure 9.
Figure 9. Coherence between 2DC cloud concentration and radar reflectivity at the aircraft position (AC), 11 Km box average (11) and 21 Km box average (22).
The electric field coherences are shown in Figures 10 and 11.
Figure 10. Coherence between e-field and radar reflectivity at the aircraft position (AC), 11 Km box average (11) and 21 Km box average (22).
Figure 11. Coherence between e-field and 2DC cloud concentration for data sets using radar reflectivity at the aircraft position (AC), 11 Km box average (11) and 21 Km box average (22).
The 95% confidence limits for the coherence depend not only on the sample size, but also on the coherence value, and in a highly non-linear way (Otnes and Enochson, 1978). Table 2 presents the upper and lower 95% confidence limits for the three data sets (one for each radar variable) for three coherence values spanning the observed range.
Limit (coherence) / Aircraft Position / 11x11 Km Box / 21x21 Km Box
Upper (0.4) / 0.506 / 0.458 / 0.453
Lower (0.4) / 0.280 / 0.338 / 0.344
Upper (0.2) / 0.309 / 0.257 / 0.252
Lower (0.2) / 0.098 / 0.144 / 0.149
Upper (0.05) / 0.127 / 0.088 / 0.084
Lower (0.05) / 0.005 / 0.021 / 0.023
Table 2. 95% confidence limits for selected coherence values and data sets.
These limits range from 20% to over 100% of the coherence value, and in the case of the small data set using the radar at the aircraft position, the intervals for the different coherence values overlap. Accordingly, conclusions should be drawn from apparent differences or trends in the data only with great caution.
Subject to the caution just expressed, it does seem likely that at the larger scales the e-field is more coherent with the radar at the aircraft position than with the box averages. That is not surprising since the box averages act as a low-pass filter which introduces non-local effects and causes changes in both the amplitude and phase of the signal.
Subject to the same caution, it also seems likely that the e-field and cloud concentrations become coherent on the larger scales. The agreement between the 11 Km and 21 Km box average data sets is not accidental, since these two ensembles contained nearly the same files. Although the e-field-cloud coherence should not depend on which radar variable is examined (radar does not enter this coherence calculation), it does depend on the data set being used. The ensemble for the aircraft position contained only one third as many files as those for the box averages. Since all three curves exhibit the same trend, and since the confidence intervals for the smaller data set are so large, the lower coherence for the aircraft position data is not a cause for major concern.
4. Discussion. The power spectral analysis presented above shows that there is no dominant or preferred scale for cloud particle concentration, whether measured in situ or remotely by radar, or for electric field in the range from 2 to 32 Km. Indeed, as noted above the power spectra have the same characteristics as spectra of wind, temperature or humidity in random, turbulent flow. The spectral slopes are consistent with an AR1 model having the same parameter values as obtained in the correlation analysis (Wilks, 1995). The spectra support the use of the AR1 model to estimate the effective sample size for a given actual sample size. This further confirms that at the scales of interest here, we are dealing with scale-independent random red-noise processes.
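The consistency between the AR1 model and the observed spectral slopes can be checked with a short calculation. The sketch below evaluates the theoretical AR1 spectrum shape, 1/(1 + ρ² - 2ρ cos ω), at the 16 nonzero frequencies of a 32-record block and fits a straight line in log-log coordinates. The function names and the unweighted least-squares fit are assumptions made for illustration; the report does not describe how its power law fits were made.

```python
import numpy as np

def ar1_spectrum(rho, n=32):
    """Shape of the AR1 power spectrum at the nonzero FFT frequencies of an n-record block."""
    k = np.arange(1, n // 2 + 1)                 # frequency index 1..n/2
    omega = 2.0 * np.pi * k / n                  # radians per record
    return k, 1.0 / (1.0 + rho ** 2 - 2.0 * rho * np.cos(omega))

def loglog_slope(k, spectrum):
    """Least-squares power law exponent of spectrum versus frequency index."""
    slope, _intercept = np.polyfit(np.log(k), np.log(spectrum), 1)
    return float(slope)

k, s = ar1_spectrum(0.91)
print(loglog_slope(k, s))
```

For ρ = 0.91 the fitted exponent falls within the -1.5 to -2 range reported in section 3.1. The AR1 spectrum is not an exact power law, since it flattens toward the highest frequencies, so the fitted value depends somewhat on the frequency band and on how the fit is weighted.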
The coherence analysis provides considerably less guidance. The confidence limits are so large that only limited conclusions may be drawn. Certainly the results indicate that at scales smaller than 10 Km there is no significant coherence between any of the variables discussed. At scales approaching 30 Km, the electric field appears to become somewhat coherent with cloud particle concentration and with the radar reflectivity at the position of the aircraft. This occurs at the long wavelength end of the analysis range and the number of points showing this tendency is too small to attempt fitting any kind of model to the data.
Acknowledgments. The data for this study were downloaded from the exquisite ABFM website developed and maintained by the National Center for Atmospheric Research. Jennifer Ward of NASA/Kennedy Space Center processed most of the data and compiled it onto spreadsheets for analysis.
REFERENCES
Bayley, G.V. and J.M. Hammersley (1946): The "Effective" Number of Independent Observations in an Autocorrelated Time Series, J. Stat. Soc. London, 8, 184-197.
Merceret, F.J. (2000): The coherence time of midtropospheric wind features as a function of vertical scale from 300m to 2 km, J. Appl. Meteor., 39, 2409-2420.
Nastrom, G.D., T.E. Van Zandt and J.M. Warwick (1997): Vertical wavenumber spectra of wind and temperature from high-resolution balloon soundings over Illinois, J. Geophys. Res., 102, 1567-1575.
Otnes, R.K. and L. Enochson (1978): Applied Time Series Analysis, Vol. 1, John Wiley and Sons, NY, NY, 449 pp.
Wilfong, T.L., S.A. Smith and C.L. Crosiar (1997): Characteristics of high-resolution wind profiles derived from radar-tracked jimspheres and the rose processing program, J. Atmos. Oceanic Technol., 14, 318-325.
Wilks, D.S. (1995): Statistical Methods in the Atmospheric Sciences, Academic Press, NY, NY, 467 pp.