5.3. Weighting the Data for Realistic Uncertainties

As noted in Section 2.2, the dominant errors in GPS observations are not random, but rather have correlation times ranging from a few minutes (multipath, water vapor) to days or months (station motions due to monument instability and atmospheric, hydrological, and ocean loading; non-gravitational forces on the satellites). Ignoring these correlations will result in uncertainties that are too optimistic, mildly so for observations of a few days, more strongly so for continuous data. In this section we assume that you understand the theoretical and empirical basis for various approaches, and we describe how to implement them in GLOBK. (For some theoretical background see the “Error Analysis” presentation from a recent workshop under the “Workshops” menu item on the GAMIT/GLOBK web site.)

Studies of long-term GPS coordinate time series have shown that the noise over periods of a few days is nearly white but increases steadily at longer periods. With continuous data we can analyze the time series and, from the character of the noise over weeks to years, infer the noise at the long periods relevant to estimating the error in site velocities or in the long-term decay following an earthquake. A Kalman filter such as globk cannot, without significant complication, realize all possible error models (in particular the “flicker noise” model that best fits most GPS time series), but it can easily realize the two “end-member” models, white noise and a random walk, which together allow us to achieve meaningful short-term statistics for data editing and realistic uncertainties for estimated velocities.
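To see why these two end-member models behave so differently, the following sketch (not part of GLOBK; the amplitudes are arbitrary choices for illustration) simulates a daily coordinate series under each model and computes the scatter of multi-day averages. For white noise the scatter of n-day averages falls as the square root of n; for a random walk, averaging does not beat the noise down.

```python
import itertools
import random
import statistics

random.seed(0)
n_days = 4096

# Daily coordinate "time series" under the two end-member noise models
# (illustrative amplitudes in mm; both values are arbitrary choices).
white = [random.gauss(0.0, 1.0) for _ in range(n_days)]    # white noise
walk = list(itertools.accumulate(
    random.gauss(0.0, 0.05) for _ in range(n_days)))       # random walk

def rms_of_averages(series, n_avg):
    """Scatter (standard deviation) of non-overlapping n_avg-day averages."""
    n_bins = len(series) // n_avg
    means = [sum(series[i * n_avg:(i + 1) * n_avg]) / n_avg
             for i in range(n_bins)]
    return statistics.pstdev(means)

for n_avg in (1, 16, 256):
    print(f"{n_avg:3d}-day averages: "
          f"white {rms_of_averages(white, n_avg):.3f} mm, "
          f"walk {rms_of_averages(walk, n_avg):.3f} mm")
```

The white-noise column shrinks by roughly a factor of four at each step (sqrt(16), sqrt(256/16)), while the random-walk column does not decrease, which is the behavior that dominates velocity uncertainties at long periods.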

The \(\chi^2\) (chi-square) increments (chii) computed by globk when h-files are combined reflect the inconsistency between the coordinate estimates from the current solution and those from the new h-file, and can usually be seen in the time series. If the data from all sites are properly weighted for their short-term scatter, the nrms of the time series and the chi-square increments should be ~ 1.0. Chi-square increments less than 1 will occur when there is little redundancy or when you have down-weighted the data to allow for more realistic velocity estimates. Values greater than 1 will occur if the sigmas on the h-files are too small or if there are uncompensated outliers among the coordinate estimates; these must be found using the time series and corrected as described in Section 5.2.
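The quantity a chi-square increment measures can be illustrated for a single coordinate (a conceptual sketch only; globk's actual computation is multivariate, using the full covariance matrices):

```python
def chi2_increment(x_current, sig_current, x_new, sig_new):
    """Chi-square increment for a single coordinate: the squared difference
    between the current solution and the new h-file's estimate, normalized
    by their combined variance.  This scalar form only illustrates the idea;
    globk works with the full multivariate covariances."""
    return (x_current - x_new) ** 2 / (sig_current ** 2 + sig_new ** 2)

# A 2 mm difference with honest 1 mm sigmas gives an increment of order 1 ...
print(chi2_increment(10.0, 1.0, 12.0, 1.0))   # 2.0
# ... but with overly optimistic 0.2 mm sigmas the same difference blows up
print(chi2_increment(10.0, 0.2, 12.0, 0.2))   # ~50
```

This is why optimistic h-file sigmas show up directly as large chii values even when the coordinate scatter itself is unremarkable.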

We have noted earlier that the weighting of the phase data in GAMIT is such that the coordinate uncertainties in daily h-files are mildly pessimistic compared to the daily scatter, so you should expect chi-square values of 0.3 to 1.0 when you stack these files in globk. If you combine many days together, however, the h-file uncertainties will shrink by the square root of the number of days combined, and the chi-square values will increase linearly with the number of days. You can compensate for this by adding white noise to the coordinate estimates using the sig_neu command, with values determined from the short-term scatter in the time series. Adding white noise for a uniform network of continuous stations is, strictly speaking, not necessary, since the true uncertainty in velocity estimates will be dominated by the long-period noise added with the mar_neu (random-walk) command; but it can be useful in editing, by establishing a more realistic ratio between the uncertainties within an h-file that combines continuous and survey-mode data. For example, if some position estimates in the h-file are based on thirty 24-hr sessions and others on two 24-hr sessions, the position uncertainties on the h-file for the 30-day sites will be a factor of four smaller than those for the 2-day sites, even though their true uncertainties are nearly the same because of the dominance of correlated noise.

To take a numerical example, suppose the formal uncertainty and scatter in horizontal coordinates for all sites from a 24-hr session is 1.0 mm. In the combined h-file, the uncertainty for the 30-day sites will be 0.2 mm and for the 2-day sites 0.7 mm. If we add in quadrature 0.75 mm of white noise to the horizontal coordinates of all sites on the h-files (sig_neu all .00075 .00075 0), then the uncertainties used in the solution become 0.8 mm for the 30-day sites and 1.0 mm for the 2-day sites, more closely reflecting their scatter (and hence their contribution to \(\chi^2\)).
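The arithmetic of that example can be checked directly (a quick sketch; the quadrature sum is exactly what sig_neu applies to each coordinate variance):

```python
import math

daily_sigma = 1.0   # mm, formal uncertainty from one 24-hr session
added_wn = 0.75     # mm, white noise added via sig_neu (0.00075 m)

for n_days, label in ((30, "30-day site"), (2, "2-day site")):
    combined = daily_sigma / math.sqrt(n_days)    # formal sigma after stacking
    reweighted = math.hypot(combined, added_wn)   # quadrature sum with sig_neu
    print(f"{label}: {combined:.2f} mm -> {reweighted:.2f} mm")
```

This reproduces, to rounding, the 0.8 mm and 1.0 mm figures quoted above.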
You may also want to add white noise for individual stations that show a higher level of short-term scatter than is reflected in their uncertainties, that is, a higher nrms in the daily time series. Use of the sig_neu command for reweighting is preferred over the variance factor used with h-files in the .gdl list, since an elevated noise level is usually associated with only a subset of sites in the h-file. An exception is when you have h-files generated from GAMIT solutions using different sampling times, in which case the variance factor is useful for balancing their weight against the other h-files in your solution.

The primary way we account for the long-period noise that most influences velocity estimates is by adding a random walk to the error model used by globk. For sites with ~ 100 or more position estimates, the appropriate value of the random walk for each component can be determined using the First-Order Gauss-Markov Extrapolation (“FOGMEx”) or “realistic sigma” algorithm described by Herring [2003] and Floyd and Herring [2020], and in the “Error Analysis” presentations from the workshops. The algorithm uses the scatter in the time series for all possible averaging times (e.g. 1 day to 600 days for a 1200-day time series) to determine how the time-series statistics depart from white noise. If the noise is white, the rms should decrease as the square root of the averaging time, and the nrms should stay the same. In practice, however, for GPS time series the rms decreases much more slowly with averaging time, and the nrms increases (since the formal uncertainty decreases). By fitting nrms (actually \(\chi^2\)) versus averaging time to an exponential function (expected for a first-order Gauss-Markov process) and evaluating this function at infinite averaging time, the algorithm determines both the scale factor to be applied to the white-noise velocity sigma to get a more realistic sigma and the value of the random walk that will produce this sigma in the globk solution. To get the mar_neu commands, run

$ sh_gen_stats -ts SUM.tsfit

where SUM.tsfit is the output of an sh_plot_pos run invoking tsfit (or of running tsfit directly); or

$ sh_gen_stats -ir va[root].ens

where [root] is an identifier of your choosing and the input file has the name va[root].ens, obtained by renaming the VAL file from running sh_plotcrd (or ensum or enfit directly). You can also invoke the FOGMEx algorithm from within tsview by selecting “Realistic Sigma”. For sites with fewer than ~ 100 estimates, the values you use for the random walk will necessarily be less precise, so you might use, for example, the median value from sh_gen_stats for the network or an “eyeball” estimate of the long-period systematics in the time series.
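The extrapolation idea behind FOGMEx can be sketched numerically. The \(\chi^2\) values below are made up for illustration, and the crude grid-search fit stands in for the actual fitting done by tsfit and sh_gen_stats, which differs in detail; the point is only how an exponential fit to \(\chi^2\) versus averaging time yields an asymptote, and hence a scale factor for the white-noise sigma.

```python
import math

# Hypothetical chi^2 of the time-series fit at several averaging times
# (days).  For pure white noise chi^2 would stay flat at ~1; for GPS-like
# correlated noise it grows with averaging time and then levels off.
avg_time = [1, 2, 4, 8, 16, 32, 64, 128, 256]
chi2 = [1.0, 1.5, 2.2, 3.0, 3.9, 4.6, 5.1, 5.4, 5.6]

# Crude grid-search fit of chi2(t) = a * (1 - exp(-t / b)), the first-order
# Gauss-Markov form: for each candidate correlation time b, the amplitude a
# has a closed-form least-squares solution.
best = None
for b in [x * 0.5 for x in range(1, 400)]:
    basis = [1.0 - math.exp(-t / b) for t in avg_time]
    a = sum(c * f for c, f in zip(chi2, basis)) / sum(f * f for f in basis)
    rss = sum((c - a * f) ** 2 for c, f in zip(chi2, basis))
    if best is None or rss < best[0]:
        best = (rss, a, b)

_, a, b = best
scale = math.sqrt(a)   # factor applied to the white-noise velocity sigma
print(f"asymptotic chi^2 ~ {a:.2f}, correlation time ~ {b:.1f} days, "
      f"sigma scale factor ~ {scale:.2f}")
```

Evaluating the fitted curve at infinite averaging time gives the asymptote a; the corresponding sigma scale factor is its square root, and sh_gen_stats converts the equivalent noise level into the mar_neu random-walk values for the globk solution.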