** Example of processing a complex set of multiyear observations ** # Last edited 130705 # Data set: Three years for surveys in multiple parts of the eastern-Mediterranean/Caucasus # Strategy: - combine daily sGPS GAMIT processing with cGPS h-files - aggregate the processing by survey length up to about 30 days - use survey-combined H-files to get long-term repeatabilities and velocities # Assumed directory structure: /EMed /2004 /2006 /2008 /tables /vsoln /emed_rinex Under each year (expermiment) directory is the usual structure for sh_gamit and sh_glred (e.g. /tables, /[day-number], /rinex, /gsoln, etc.. Under /emed_rinex are the survey-mode RINEX files organized by country and year, e.g., /azerbaijan/2004. # Step 1: Determine the temporal and spatial distribution of the data For example, if the data are stored in directories by country and year, within each RINEX directory you can use sh_get_times to obtain a list, e.g. in rinex_data/azerbaijan/2004, type 'sh_get_data -sort -f *.04o > times.azer04.sorted', where the file times.azer04.sorted will have shik2672.04o 2004 267 13:31 269 13:42 48:11 5784 3 khid2672.04o 2004 267 13:39 269 13:41 48:2 5766 3 kurd2731.04o 2004 273 05:06 275 05:13 48:7 11553 3 yevl2735.04o 2004 273 07:58 275 08:01 48:3 11536 3 ayaq2730.04o 2004 273 09:44 275 08:38 46:54 5630 3 kebe2821.04o 2004 282 09:36 284 07:59 46:23 11137 3 qurd2821.04o 2004 282 11:32 284 08:02 44:30 10681 3 shek2841.04o 2004 284 10:13 286 07:05 44:52 10771 3 kate2840.04o 2004 284 13:23 286 11:32 46:9 11046 3 saly2960.04o 2004 296 07:51 298 07:59 48:8 11554 3 bile2960.04o 2004 296 09:13 298 09:16 48:3 11535 3 lenk2960.04o 2004 296 11:57 298 06:02 42:5 5051 3 and in emedrinex/georgia/2004/rinex, type 'sh_get_times -sort -f *.04o > times.georgia.sorted' zinv2730.04o 2004 273 13:15 274 13:14 23:59 2878 2 zinv2740.04o 2004 274 14:23 275 14:19 23:56 2873 2 pasn2750.04o 2004 275 18:01 276 14:24 20:23 2447 2 pasn2760.04o 2004 276 14:25 277 14:19 23:54 2869 2 pasn2770.04o 2004 277 14:51 277 19:01 4:10 502 1 kazb2780.04o 2004 278 06:31 279 06:29 23:58 2876 2 kazb2790.04o 2004 279 06:30 279 18:04 11:34 1389 1 kazb2791.04o 2004 279 18:05 280 06:50 12:45 1531 2 kudi2810.04o 2004 281 10:11 282 06:29 20:18 2436 2 kudi2820.04o 2004 282 06:30 282 18:04 11:34 1389 1 kudi2830.04o 2004 282 18:05 283 10:22 16:17 1955 2 ingu2840.04o 2004 284 14:46 285 14:24 23:38 2836 2 ingu2850.04o 2004 285 14:25 286 14:19 23:54 2869 2 shua2870.04o 2004 287 10:31 288 10:14 23:43 5692 2 shua2880.04o 2004 288 10:15 289 06:29 20:14 2429 2 shua2881.04o 2004 289 06:30 289 10:21 3:51 464 1 So we have data 267-269, 273-289. Note that all of the Azeri data span parts of three days, so the default setting 'rx_doy_minus 1' in process.defaults will not be sufficient to pick up the data two days earlier. We can change this in process.defaults or make it explicit in the sh_gamit command line. # Step 2: Set up the control files for the first year Run sh_setup to put templates or actual files you'll need into EMed/2004/tables. From the /2004 level type 'sh_setup -yr 2004' Edit process.defaults and sites.defaults for this year. Make appropriate changes to the template process.defaults: # Directory path for RINEX archives (search all levels) set rnxfnd = "/emedrinex" # H-files archive set hfnd = /net/everest/raid6/ftp/pub/MIT_GLL/H04 # Set compress (copts), delete (dopts) and archive (aopts) options. # Possible d-, c-, and a- opts: D, H, ao, ac, as, b, c, d, e, g, h, i, j, k, l, m, o, p, q, t, x, ps, all" set copts = ( o q mkx ) set dopts = ( D ao b c g i j k m p ps t x y ) The copt/dopt settings shown here represent a space-saving approach, retaining only the files you are likely to view unless something goes wrong. We rarely use the option to put some files in an archive directory, preferring to save only the minimal files in the day directories themselves. Make appropriate changes to the template sites.defaults to automatically download and include in the processing (by being present in 2004/rinex) the ITRF08 sites you wish to use to tie the local h-files to the MIT global files: # Tie stations for MIT global h-files, used also for GAMIT constraints # North bucu_gps emed ftprnx # West ankr_gps emed ftprnx # Central nssp_gps emed ftprnx # South tela_gps emed ftprnx # East kit3_gps emed ftprnx Make the appropriate changes to the template sites.defaults to provide moderate ITRF08 constraints in the GAMIT processing to help in resolving ambiguities: SITE FIX --COORD.CONSTR.-- << default for regional stations >> ALL NNN 100. 100. 100. << IGS core stations >> BUCU BUCU_GPS NNN 0.050 0.050 0.05 TELA TELA GPS NNN 0.050 0.050 0.05 KIT3 KIT3_GPS NNN 0.050 0.050 0.05 # Step 3: Create station.info In 2004/tables, which will contain the SOPAC/MIT version of station.info sh_upd_stnfo -ref station.info -w station.info.mit -l sd which will result in the output file having only the entries for sites listed with ftprnx in sites.defaults. Using the log sheets as a guide, in emed_rinex/azerjaiban/2004/rinex and emedrinex/georgia/2004/rinex, create survey-specific station.info entries manually (editor or using 'make_stnfo') or using 'sh_upd_stnfo -f *.04o' (continually concatenated output file must be named 'station.info') then editing any entries that need to be changed. Rename these survey-specific files to , e.g. station.info.azer04, and copy them to 2004/tables. Then merge the files: sh_upd_stnfo -ref station.info.mit -merge station.info.azer04 station.info.georgia04 which will create a merged file, station.info.new, which you can rename to station.info for processing. # Step 4: Do the GAMIT processing sh_gamit -expt emed -s 2004 267 269 -pres ELEV -rx_doy_minus 2 >&! sh_gamit.log sh_gamit -expt emed -s 2004 273 289 -pres ELEV -rx_doy_minus 2 >&! sh_gamit.log Check the emailed summary files for sites included, nrms, ambiguities resolved, and failed editing (autcln rms 0.). Fix any problems and rerun those days. # Step 5: Generate the daily time series for editing Edit /2004/gsoln/globk_comb.cmd to add a "contingent/option" line for saving an output h-file: COMB out_glb H------_EMED.GLX This command will be executed only if you include '-globk_cmd_prefit COMB' in the sh_glred command line. Edit glorg_comb.cmd to specify the stabilization you want for the time series (apr_file, stab_site list, parameters to be estimated). sh_glred -s 2004 267 2004 289 -expt emed -ncomb 23 -opt H G E >&! sh_glred.log Setting '-local' instructs sh_glred to use MIT global h-files only for days on which a day directory exists. The output of this command will be a gdl file, log file, and org file for each day and the usual output of sh_plotcrd: VAL and SUM files and postscrit plots of time series for each station (psbase) and histogram (pshist) of nrms and wrms. Check the stabilization by typing in /gsoln grep 'POS STAT' *.org View all the plots ( 'gs psbase*' ) and note any outliers or noisy segments. (With a data set that has multiple surveys within a year, you might create sub-directories below /gsoln, e.g. /gsoln/plots_267-289, in which to store the plots.) Notes: ardt 267-269 downweight (DW) by adding 6 mm; check long-term plots later arga 282 U-30, DW 100 atht 269 N-5 E-10 308 E+5 boyl 270-272 DW 4 bucu 286 N+3 287 U-10 cagl 265-268 DW 5 291 E+7 293 U+15 madr 282-283 E+5 290 E+5 ozdm 267-269 N/E slope DW 15 or DW 267 15 LT E favors 267 N favors 268-269 reun 257 U+20 torl 273-275 DW 5 triz 269 U+40 zeck 286 N+5 287 U-15 Experience has shown that with the standard data weighting in GAMIT using the autcln rms produces short-term nrms with a median value of about 0.7 (see pshist_nrms.SUM.emed), I treat any single value or nrms that is greater than 1.5 (twice the median) as a two-sigma (95% level) outlier and apply a downweight by adding white noise with sig_neu, either enough to make the nrms about 1.0 or to increase the error bar to about the value of the outlier. In the case of an outlier, adding only enough white noise to equal the residual is not as objective as adding enough noise to make the outlier of negligible weight, so you should pay attention to how much the reweighted outlier is biasing the position estimate. If you primary interest is horizontal deformation, you might be very generous in the noise added to vertical outliers so as to prevent their having any effect on the horizontals. Generate sig_neu reweights using 'grw' and add these to globk_comb.cmd using, e.g. 'source ../../tables/emed_comb.reweights'. # Step 6: Generate the combined h-file for the survey sh_glred -s 2004 267 2004 289 -ncomb 32 -globk_cmd_prefix COMB -expt emed -opt G >&! sh_glred.log which will create H040923_EMED.GLX and globk_emed_04267-289.org, the latter to be checked for appropriate chi2 increments ( ~1.0 ) in the stacking. *** Steps 1-6 to be repeated for each survey. *** # Step 7: Generate the multi-year time series Working in a directory parallel to the survey/year directories, e.g. EMed/vsoln, create the multi-year gdl file by, e.g. ls ../????/gsoln/H*GLX > emed.gdl Copy into /vsoln the template files globk_vel.cmd and glorg_vel.cmd from ~?gg/tables and eid them for your analysis, generally needing to change only the inclusion of a a 'use_site' list and a list of edits in the globk command file,and the stab_list list in the glorg command file. As a check on severe conflicts due to misnamed sites, and on the application of renames for instrumental changes and earthquakes, it's a good idea at this point to run 'glist', which reads the h-files, applies renames, and compares the coordinates with those in the apr_file: glist emed.gdl emed.glist +1 ../tables/itrf08_comb.eq:../tables/emed_renames.eq ' ' ../tables/itrf08_comb.apr >! glist.out The glist.out file will echo coordinate differences, and the emed.glist file will list all of the sites and their epochs. If you want to generate an explicit use_site list at this point, based on proximity and adequate epochs and time span, you can run 'glist2cmd' (type to see help). Now to generate the time series, run glred: rm globk_rep.prt globk_rep.org globk_rep.log glred 6 globk_rep.prt globk_rep.log emed.gdl globk_long.cmd >&! glred.out mv globk_rep.org globk_rep_130708a.org sh_plotcrd -f ../globk_rep_130708a.org -mb -s long -res -o 1 -vert -minnum 3 -col 1 -x 1994.0 2013. where I've used the current date and a letter extent to identify the solution. The 'rm' step is remove any leftover files since the new ones will be concatenated onto the end of any old ones of the same name. Again check the stabilization quality with grep 'POS STAT' globk_rep_130708a.org Make note of any problems: bgnt 04 E+8 blot 04(1) N-50 E-35 typo? bucu 04 systematic cf later, DW 5 cnca N nrms 1.6 DW N only, 5 garb 06 N-10 gr02 N nrms 1.8 DW 5 both hoss DW N 4 As noted in the lecture, the small number of epochs (even for the cGPS, after aggregation) means that you have to be careful to remove or downweight any outliers or noisy segments. As with the daily time series, the typical long-term nrms median is about 0.7, so I use the same "1.5 threshold" to indicate a 2-sigma outlier. # Step 8: Perform a velocity solution Copy into /EMed/vsoln from ~/gg/tables, the template files globk_long.cmd and glorg_long.cmd and edit the files to point to your downweights, use_site list, and stab_site list. As with continuous data, we must consider the effect of temporally correlated noise in the estimates and uncertainties of the velocities. So although we cannot determine objectively, using the 'realistic sigma' algorithm, the appropriate level of random walk for sites that have only a few epochs, we must nevertheless include some in the velocity solution. The value to use could be determined from prior experience in other regions, from the median of the values estimated for cGPS stations in the region, or, interatively from a validation of the velocities against a model. A reasonable value to start would be 1 mm/sqrt(yr) for the horizontals and 3 mm/sqrt(yr) for the verticals, e.g. mar_neu all .000001 .000001 .000009 0 0 0 Do the solution: rm globk_vel.prt globk_vel.org globk_vel.log globk 6 globk_vel.prt globk_vel.log emed.gdl globk_long.cmd >&! glred.out mv globk_vel.org globk_vel_130708a.org With the org_opt setting in the template, all of the information you need to evaluate both the stacking (globk) and the stabilized solution (glorg) will appear in the glorg print file, globk_vel.org. Things to check are the chi2 increments from the stacking (all should be of order ~1.0), the stabilization (number of sites retained in the interation, and the N E U rms and nrms values), and the uncertainties of the velocity estimates (1-2 mm/yr or less horizontal for a 3-year span of data using at least 12 hrs of data each year and a random-walk of 1 mm/sqrt(yr). # Step 9: Generate the times series for the velocity solution In principle, the statistics relevant to the velocity solution should flow from time series generated with a stabilization that includes all of the sites in the solution. If your solution has a strong IGS cGPS component over the full span, then the improvement using all the regional stations will be minor. On the other hand, if some or all of your epochs are weak in IGS coverage, using the regional stations in the stabilization is quite important to evaluate the actual scatter of each station's epoch-by-epoch positions without confusion added by a weak stabilization at some epochs. To get the new times series, create an apr file from your velocity solution, sh_exglk -f globk_vel_120708a.org -apr vel_120708a.apr Create a site_site list with all of the sites that have reasonably good position and velocity uncertainties. A utility to help with this is 'vel2stab'. Then repeat Step 7 with the new apr-file (only) and the new stabilization list.