** Example of processing a complex set of multiyear observations **
# Last edited 170620

# Data set: Three years of surveys in multiple parts of the eastern Mediterranean/Caucasus

# Strategy:
- combine daily sGPS GAMIT processing with cGPS h-files
- aggregate the processing by survey length, up to about 20 days
- use survey-combined H-files to get long-term repeatabilities and velocities

# Assumed directory structure:
/EMed
   /2004
   /2006
   /2008
   /tables
   /vsoln
   /emed_rinex

Under each year (experiment) directory is the usual structure for sh_gamit and sh_glred (e.g. /tables, /[day-number], /rinex, /gsoln, etc.). Under /emed_rinex are the survey-mode RINEX files organized by country and year, e.g., /azerbaijan/2004.

# Step 1: Determine the temporal and spatial distribution of the data

For example, if the data are stored in directories by country and year, within each RINEX directory you can use sh_get_times to obtain a list. In emed_rinex/azerbaijan/2004, type

  sh_get_times -sort -f *.04o > times.azer04.sorted

where the file times.azer04.sorted will have

shik2672.04o  2004 267 13:31  269 13:42  48:11   5784  3
khid2672.04o  2004 267 13:39  269 13:41  48:2    5766  3
kurd2731.04o  2004 273 05:06  275 05:13  48:7   11553  3
yevl2735.04o  2004 273 07:58  275 08:01  48:3   11536  3
ayaq2730.04o  2004 273 09:44  275 08:38  46:54   5630  3
kebe2821.04o  2004 282 09:36  284 07:59  46:23  11137  3
qurd2821.04o  2004 282 11:32  284 08:02  44:30  10681  3
shek2841.04o  2004 284 10:13  286 07:05  44:52  10771  3
kate2840.04o  2004 284 13:23  286 11:32  46:9   11046  3

and in emed_rinex/georgia/2004, type

  sh_get_times -sort -f *.04o > times.georgia.sorted

zinv2730.04o  2004 273 13:15  274 13:14  23:59  2878  2
zinv2740.04o  2004 274 14:23  275 14:19  23:56  2873  2
pasn2750.04o  2004 275 18:01  276 14:24  20:23  2447  2
pasn2760.04o  2004 276 14:25  277 14:19  23:54  2869  2
pasn2770.04o  2004 277 14:51  277 19:01   4:10   502  1
kazb2780.04o  2004 278 06:31  279 06:29  23:58  2876  2
kazb2790.04o  2004 279 06:30  279 18:04  11:34  1389  1
kazb2791.04o  2004 279 18:05  280 06:50  12:45  1531  2
kudi2810.04o  2004 281 10:11  282 06:29  20:18  2436  2
kudi2820.04o  2004 282 06:30  282 18:04  11:34  1389  1
kudi2830.04o  2004 282 18:05  283 10:22  16:17  1955  2
ingu2840.04o  2004 284 14:46  285 14:24  23:38  2836  2
ingu2850.04o  2004 285 14:25  286 14:19  23:54  2869  2

So we have data on days 267, 273-279, and 281-286. Note that all of the Azeri data span parts of three days, so the default setting 'rx_doy_minus 1' in process.defaults will not be sufficient to pick up the data beginning two days earlier. We can change this in process.defaults or make it explicit on the sh_gamit command line.

# Step 2: Set up the control files for the first year

Run sh_setup to put templates or actual files you'll need into EMed/2004/tables. From the /2004 level, type

  sh_setup -yr 2004

Edit process.defaults and sites.defaults for this year. Make the appropriate changes to the template process.defaults:

# Directory path for RINEX archives (search all levels)
set rnxfnd = "/emed_rinex"
# H-files archive
set hfnd = /net/everest/raid6/ftp/pub/MIT_GLL/H04
# Set compress (copts), delete (dopts) and archive (aopts) options.
# Possible d-, c-, and a- opts: D, H, ao, ac, as, b, c, d, e, g, h, i, j, k, l, m, o, p, q, t, x, ps, all
set copts = ( o q mkx )
set dopts = ( D ao b c g i j k m p ps t x y )

The copts/dopts settings shown here represent a space-saving approach, retaining only the files you are likely to view unless something goes wrong. We rarely use the option to put some files in an archive directory, preferring to save only the minimal files in the day directories themselves.
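As a cross-check on the session spans found in Step 1, the day-span arithmetic can be sketched in a few lines (a sketch only; the column layout assumed below is the one shown above: file name, year, start day-of-year and time, end day-of-year and time):

```python
# Sketch: from an sh_get_times listing, find files whose observations span
# more than two day-numbers, i.e. files that the default 'rx_doy_minus 1'
# would fail to associate with the earliest processing day.
def day_span(line):
    fields = line.split()
    name, doy_start, doy_end = fields[0], int(fields[2]), int(fields[4])
    return name, doy_end - doy_start + 1

listing = [
    "shik2672.04o 2004 267 13:31 269 13:42 48:11 5784 3",
    "zinv2730.04o 2004 273 13:15 274 13:14 23:59 2878 2",
]
for line in listing:
    name, ndays = day_span(line)
    if ndays > 2:
        # e.g. shik2672.04o spans 3 days, so rx_doy_minus must be >= 2
        print(name, "needs rx_doy_minus", ndays - 1)
```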
Make the appropriate changes to the template sites.defaults to automatically download and include in the processing (by being present in 2004/rinex) the ITRF08 sites you wish to use to tie the local h-files to the MIT global files (case-insensitive, 4- or 8-character names):

# Tie stations for MIT global h-files, used also for GAMIT constraints
# North
 bucu emed ftprnx
# West
 ankr emed ftprnx
# Central
 nssp emed ftprnx
# South
 tela emed ftprnx
# East
 kit3 emed ftprnx

Make the appropriate changes to the template sittbl. to provide moderate constraints in the GAMIT processing to help in resolving ambiguities:

SITE              FIX    --COORD.CONSTR.--
   << default for regional stations >>
ALL               NNN    100.  100.  100.
   << IGS core stations >>
BUCU BUCU_GPS     NNN    0.050 0.050 0.05
TELA TELA_GPS     NNN    0.050 0.050 0.05
KIT3 KIT3_GPS     NNN    0.050 0.050 0.05

Copy into 2004/gsoln the globk_comb.cmd and glorg_comb.cmd templates from gg/tables.

# Step 3: Create station.info

In 2004/tables, create the SOPAC/MIT version of station.info:

  sh_upd_stnfo -ref station.info -w station.info.mit -l sd

which will result in the output file having only the entries for sites listed with ftprnx in sites.defaults. Using the log sheets as a guide, in emed_rinex/azerbaijan/2004 and emed_rinex/georgia/2004, create survey-specific station.info entries, either manually (with an editor or using 'make_stnfo') or using 'sh_upd_stnfo -f *.04o' and then editing any entries that need to be changed. The starting file must be named station.info since it is continually concatenated with the information from each new RINEX file, and it must have a legitimate set of headers, though these do not need to match those in the files you wish to merge, such as the MIT or SOPAC station.info. Create this starting template from an existing station.info file by removing with an editor all entries except the headers. Rename these survey-specific files to, e.g., station.info.azer04, and copy them to 2004/tables.
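The headers-only starting template described above can also be produced programmatically; this sketch assumes, as in standard station.info files, that header and comment lines begin with '*' in column 1 (the excerpt below is invented for illustration):

```python
# Sketch: strip a station.info file down to a headers-only template.
# Assumes header/comment lines begin with '*' (the station.info convention)
# and data entries begin with a blank first column.
def headers_only(lines):
    return [line for line in lines if line.startswith("*")]

example = [
    "* Station information file (hypothetical excerpt)",
    "*SITE  Station Name      Session Start      Session Stop",
    " shik  Shikhlar          2004 267 13 31  0  2004 269 13 42  0",
]
template = headers_only(example)   # keeps only the two '*' lines
```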
Then merge the files:

  sh_upd_stnfo -ref station.info.mit -merge station.info.azer04 station.info.georgia04

which will create a merged file, station.info.new, which you should rename to station.info for processing.

# Step 4: Do the GAMIT processing

  sh_gamit -expt emed -d 2004 267 -pres ELEV -rx_doy_minus 2 >&! sh_gamit.log
  sh_gamit -expt emed -s 2004 273 279 -pres ELEV -rx_doy_minus 2 >&! sh_gamit.log
  sh_gamit -expt emed -s 2004 281 286 -pres ELEV -rx_doy_minus 2 >&! sh_gamit.log

Check the emailed summary files for sites included, nrms, ambiguities resolved, and failed editing (rms 0.). Fix any problems and rerun those days.

# Step 5: Generate the daily time series for editing

If it is not there already, edit /2004/gsoln/globk_comb.cmd to add a contingent ("option") line for saving an output h-file:

 COMB out_glb H------_EMED.GLX

This command will be executed only if you include '-globk_cmd_prefix COMB' on the sh_glred command line. Edit glorg_comb.cmd to specify the stabilization you want for the time series (apr_file, stab_site list, parameters to be estimated). Run sh_glred from the experiment (year) level (as with sh_gamit, and different from running glred directly):

  sh_glred -s 2004 267 2004 286 -expt emed -local -net MIT -opt H LC G T >&! sh_glred.log

Setting '-local' instructs sh_glred to use MIT global h-files only for days on which a day directory (i.e. field data) exists. The output of this command in the /gsoln directory will be a gdl file, log file, and org file for each day; a summary of time-series residuals (tsfit_2004-267_2004-286.sum); and two new directories, /res_2004-267_2004-286 and /plots_2004-267_2004-286, containing respectively the residuals and time-series plots for each site on each day.
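The day-selection logic implied by '-local' can be pictured with a short sketch (directory names below are invented for illustration; sh_glred itself does this internally):

```python
# Sketch of what '-local' implies: use an MIT global h-file only for days
# on which a local day directory (i.e. field data) exists.
day_dirs = {"267", "273", "274", "275", "276", "277", "278", "279"}
global_days = [str(d) for d in range(267, 287)]

used = [d for d in global_days if d in day_dirs]
skipped = [d for d in global_days if d not in day_dirs]
print("combine MIT h-files for days:", " ".join(used))
```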
Check the stabilization by typing in /gsoln

  grep 'POS STAT' *.org

The number of sites in the stabilization should be what you expected (4 or more for translation-only, 8 or more well-distributed sites for translation+rotation), the wrms values should be 1-3 mm, and the nrms values less than ~2. View all the plots ('gs *.ps') and note any outliers or noisy segments. Notes:

ardt 267-269 downweight (DW) by adding 6 mm; check long-term plots later
arga 282 U-30, DW 100
atht 269 N-5 E-10  308 E+5
boyl 270-272 DW 4
bucu 286 N+3  287 U-10
cagl 265-268 DW 5  291 E+7  293 U+15
madr 282-283 E+5  290 E+5
ozdm 267-269 N/E slope, DW 15 or DW 267 15; LT E favors 267, N favors 268-269
reun 257 U+20
torl 273-275 DW 5
triz 269 U+40
zeck 286 N+5  287 U-15

Experience has shown that with the standard data weighting in GAMIT, using elevation-dependent weighting based on the autcln statistics, the median nrms is about 0.7. Hence we might treat any single nrms value greater than 1.5 (twice the median) as a two-sigma (95% level) outlier and apply a downweight by adding white noise with sig_neu, either enough to make the nrms ~1 or enough to increase the error bar to a value greater than the value of the outlier. In the case of an outlier, adding only enough white noise to equal the residual is not as objective as adding enough noise to make the outlier of negligible weight, so you should pay attention to how much the reweighted outlier is biasing the position estimate. If your primary interest is horizontal deformation, you might be very generous in the noise added to vertical outliers so as to prevent their having any effect on the horizontals. Generate the sig_neu reweights conveniently using 'grw', which converts yr/doy to yy/mm/dd and mm to meters to match the globk convention. 'grw' will write the converted lines into a file 'tempout', from which you can paste them directly into globk_comb.cmd or include them by reference using the 'source' feature, e.g. 'source ../../tables/emed_comb.reweights'.
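The conversion that 'grw' performs can be pictured as follows (a hypothetical helper, not the actual grw code; check the GLOBK documentation for the exact sig_neu field order before using lines like this):

```python
from datetime import date, timedelta

# Sketch of the conversion 'grw' performs: yr/doy and mm -> calendar date
# and meters for a globk sig_neu line.  The output format below is
# illustrative only.
def doy_to_ymd(year, doy):
    d = date(year, 1, 1) + timedelta(days=doy - 1)
    return d.year, d.month, d.day

def sig_neu_line(site, year, doy1, doy2, n_mm, e_mm, u_mm):
    y1, m1, d1 = doy_to_ymd(year, doy1)
    y2, m2, d2 = doy_to_ymd(year, doy2)
    n, e, u = (v / 1000.0 for v in (n_mm, e_mm, u_mm))
    return (f" sig_neu {site} {n:.3f} {e:.3f} {u:.3f}"
            f" {y1} {m1:2d} {d1:2d} 0 0 {y2} {m2:2d} {d2:2d} 24 0")

# e.g. downweight ardt by 6 mm over days 267-269 of 2004
print(sig_neu_line("ardt", 2004, 267, 269, 6, 6, 6))
```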
# Step 6: Generate the combined h-file for the survey

  sh_glred -s 2004 267 2004 286 -ncomb 20 -globk_cmd_prefix COMB -expt emed -opt G >&! sh_glred.log

which will create H040924_EMED.GLX and globk_emed_04267-286.org, the latter to be checked for appropriate chi2 increments in the stacking. With the usual GAMIT weighting plus the downweighting (sig_neu) of outliers, the MIT GLX files will typically produce chi2 increments of 1 to 3, and regional networks < 1. We do not need to specify -net or -opt LC in the sh_glred command since the script will automatically use all the h-files present or linked in /glbf.

*** Steps 1-6 are to be repeated for each survey. ***

# Step 7: Generate the multi-year time series

Working in a directory parallel to the survey/year directories, e.g. EMed/vsoln, create the multi-year gdl file by, e.g.,

  ls ../????/gsoln/H*GLX > emed.gdl

Copy into /vsoln the template files globk_long.cmd and glorg_long.cmd from gg/tables and edit them for your analysis, generally needing to change only the use_site list (or 'source'd file) and the list of edits in the globk command file, and the stab_site list in the glorg command file. As a check on severe conflicts due to misnamed sites, and on the application of renames for instrumental changes and earthquakes, it's a good idea at this point to run 'glist', which reads the h-files, applies the renames, and compares the coordinates with those in the apr_file:

  glist emed.gdl emed.glist +1 ../tables/itrf08_comb.eq:../tables/emed_renames.eq ' ' ../tables/itrf08_comb.apr >! glist.out

The glist.out file will echo coordinate differences, and the emed.glist file will list all of the sites and their epochs. If you want to generate an explicit use_site list at this point, based on proximity and adequate epochs and time span, you can run 'glist2cmd' (run it without arguments to see the help). Now to generate the time series:

  rm globk_rep.prt globk_rep.org globk_rep.log
  glred 6 globk_rep.prt globk_rep.log emed.gdl globk_long.cmd >&! glred.out
  mv globk_rep.org globk_rep_130708a.org

where we've used the current date and a letter extent to identify the solution. The 'rm' step is to remove any leftover files, since the new ones would be concatenated onto the end of any old ones of the same name. Again check the stabilization quality with

  grep 'POS STAT' globk_rep_130708a.org

Then generate the plots:

  mkdir plots
  cd plots
  tssum . mit.final.itrf08 -R ../globk_rep_130708a.org
  sh_plot_pos -f *.pos -r -t tsfit.cmd -t1 2004_001 -t2 2009_001 -u w

Make note of any problems:

bgnt 04 E+8
blot 04(1) N-50 E-35 typo?
bucu 04 systematic, cf. later, DW 5
cnca N nrms 1.6, DW N only, 5
garb 06 N-10
gr02 N nrms 1.8, DW 5 both
hoss DW N 4

As noted in the lecture, the small number of epochs (even for the cGPS, after aggregation) means that you have to be careful to remove or down-weight any outliers or noisy segments. As with the daily time series, the typical long-term nrms median is about 0.7, so you can use the same "1.5 threshold" to indicate a 2-sigma outlier. The sig_neu commands applied to the long-span solution are best created manually and put into a separate 'source'd file, e.g. emed.vel.reweight. Be careful that the date range given in the sig_neu commands fully encompasses the dates of the H-file to which you wish to apply the downweights.

# Step 8: Perform a velocity solution

As with continuous data, we must consider the effect of temporally correlated noise on the estimates and uncertainties of the velocities. So although the small number of observations precludes using the First-Order Gauss-Markov Extrapolation (FOGMEx, "realistic sigma") algorithm to determine the appropriate level of random-walk noise for each site, we must nevertheless include some in the velocity solution. The value to use could be determined from prior experience, from the median of the values estimated for cGPS stations in your region, or iteratively from a validation of the velocities against a model.
A reasonable value to start with would be 1 mm/sqrt(yr) for the horizontals and 3 mm/sqrt(yr) for the verticals, e.g.

  mar_neu all .000001 .000001 .000009 0 0 0

where the units are m**2/yr. Note that whereas sig_neu noise is additive (site-specific added to "all"), a site-specific mar_neu value supersedes "mar_neu all". Do the solution:

  rm globk_vel.prt globk_vel.org globk_vel.log
  globk 6 globk_vel.prt globk_vel.log emed.gdl globk_long.cmd VEL >&! globk.out
  mv globk_vel.org globk_vel_130708a.org

where the VEL option refers to the VEL keywords in globk_long.cmd and glorg_long.cmd, which turn on velocity estimation. This scheme allows you to use the same command files for repeatabilities as for velocities without any explicit editing. With the org_opt setting in the template (CMDS GDLF PSUM VSUM FIXA RNRP), all of the information you need to evaluate both the stacking (globk) and the stabilized solution (glorg) will appear in the glorg print file, globk_vel.org. Things to check are the chi2 increments from the stacking (all should be of order ~1.0), the stabilization (number of sites retained in the iteration and the N E U rms and nrms values), the chi2 increments for equated parameters, and the uncertainties of the velocity estimates (1-2 mm/yr or less horizontal for a 3-year span of data using at least 12 hrs of data each year and a random walk of 1 mm/sqrt(yr)). To repeat the solution with changes to the equates, stabilization, or 'plate' definitions, run glorg separately:

  glorg globk_vel_130708b.org PSUM:VSUM:GDLF:CMDS:FIXA:RNRP glorg_vel.cmd vel.com

To get velocity fields (vel-files) in the reference frames of the plates listed in the glorg command file, use, e.g.:

  sh_org2vel -file globk_vel_130708b.org -plate eurasia nubia arabia

# Step 9: Generate the time series for the velocity solution

In principle, the statistics relevant to the velocity solution should flow from time series generated with a stabilization that includes all of the well-determined sites in the solution.
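As a unit check on the mar_neu values in Step 8 (a sketch; the arithmetic simply restates the definition of a random walk quoted in mm/sqrt(yr)):

```python
import math

# mar_neu values are variance rates in m**2/yr.  A random walk quoted as
# s mm/sqrt(yr) corresponds to (s/1000)**2 m**2/yr, and the position
# scatter it contributes after t years is s*sqrt(t) mm.
def mar_value(mm_per_sqrt_yr):
    return (mm_per_sqrt_yr / 1000.0) ** 2      # m**2/yr

def added_scatter_mm(mm_per_sqrt_yr, years):
    return mm_per_sqrt_yr * math.sqrt(years)   # mm

horizontal = mar_value(1)   # 1e-06, as in 'mar_neu all .000001 ...'
vertical = mar_value(3)     # ~9e-06, the .000009 above
# Over a 3-yr span, 1 mm/sqrt(yr) contributes ~1.7 mm of horizontal scatter
```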
If your solution has a strong IGS cGPS component over the full span, then the improvement from using the regional stations will be minor. On the other hand, if some or all of your epochs are weak in IGS coverage, using the regional stations in the stabilization is quite important to evaluate the actual scatter of each station's epoch-by-epoch positions without the confusion added by a weak stabilization at some epochs. To get the new time series, create an apr file from your velocity solution:

  sh_exglk -f globk_vel_130708a.org -apr vel_130708a.apr

Create a stab_site list with all of the sites that have reasonably good position and velocity uncertainties. A utility to help with this is 'vel2stab'. Then repeat Step 7 with the new apr-file (only) and the new stabilization list.
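The selection that 'vel2stab' automates can be pictured as a simple filter on velocity uncertainties (site names, sigmas, and the 1.5 mm/yr cutoff below are all invented for illustration; consult vel2stab itself for its actual criteria):

```python
# Sketch: choose stabilization candidates by horizontal velocity
# uncertainty.  The sigmas and threshold are hypothetical.
sites = {            # site: (sigma_vn, sigma_ve) in mm/yr
    "bucu": (0.4, 0.5),
    "ankr": (0.6, 0.7),
    "ayaq": (2.8, 3.1),   # short-span survey site: excluded
}
stab = sorted(s for s, (sn, se) in sites.items() if max(sn, se) <= 1.5)
print("stab_site", *stab)
```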