E.5 Detailed flagging

The calibration you have done to this point has been degraded by RFI which has not yet been flagged. However, those calibration steps were needed in order to bring all spectral channels and all antennas and sources onto the same flux scale. Automatic tasks may now be used, and they are needed for the large volumes of data produced by the EVLA.

In 31DEC17, the task FGSPW may be used to flag IFs (also known as "spectral windows") that have been rendered useless by RFI or other problems. It performs a scalar average of the amplitudes in all spectral channels on a per-IF, per-polarization, per-baseline, per-scan basis. Those IFs where amplitudes have overflowed the hardware due to RFI tend to stand out with this statistic. LISTR with OPTYPE='MATX' and DPARM=1,1 will also compute and print this statistic to let you determine which values are reasonable and which are not.
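The reason a scalar average makes overflowed IFs stand out can be illustrated with a short Python sketch. This is not FGSPW's code; the function names and the sample data are hypothetical and chosen only to contrast a scalar average of amplitudes with a vector average of the complex visibilities.

```python
def scalar_average(visibilities):
    """Mean of |V| over channels; phase is discarded before averaging."""
    return sum(abs(v) for v in visibilities) / len(visibilities)


def vector_average_amplitude(visibilities):
    """|mean of V| over channels; shown only for contrast."""
    return abs(sum(visibilities) / len(visibilities))


# Hypothetical channels for one IF, polarization, baseline, and scan.
clean = [complex(1.0, 0.0)] * 8
# Large amplitudes with alternating sign, as a stand-in for overflowed data.
overflowed = [complex(50.0, 0.0), complex(-50.0, 0.0)] * 4

clean_stat = scalar_average(clean)
bad_stat = scalar_average(overflowed)
bad_vector = vector_average_amplitude(overflowed)
```

In this made-up case the vector average of the overflowed data cancels to zero, while the scalar average stays at the full amplitude, so a simple cutoff on the scalar statistic separates the two IFs.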

A very promising, relatively new tool flags RFI on the assumption that it is quite variable either in time or in frequency. This task, called RFLAG, computes the rms over short time intervals in each spectral channel and IF individually and flags the interval whenever the rms exceeds a user-controlled threshold. Optionally, it will also apply a sliding median window of user-specified width over the spectral channels to the real and imaginary parts of the visibility separately. Any channel deviating from the median in either part by more than a user-specified amount will also be flagged. If DOPLOT > 0, RFLAG will make plots of normal and cumulative histograms and of the mean and rms of the time and spectral computations as a function of channel. These plots will suggest threshold parameters and allow you to choose values to use. When plots are made, a flag table is written only if requested (DOFLAG > 0). A flag table is made for any value of DOFLAG if no plots are requested (DOPLOT ≤ 0).
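The two tests described above can be pictured with a minimal Python sketch. It is an illustration under simplifying assumptions, not RFLAG's implementation: the function names, the non-overlapping time intervals, and the handling of window edges are all hypothetical choices.

```python
import math
from statistics import median


def rms_about_mean(values):
    """RMS scatter of complex samples about their own mean."""
    mean = sum(values) / len(values)
    return math.sqrt(sum(abs(v - mean) ** 2 for v in values) / len(values))


def flag_time_intervals(samples, interval, threshold):
    """Start indices of intervals (one channel) whose rms exceeds threshold."""
    flagged = []
    for start in range(0, len(samples) - interval + 1, interval):
        if rms_about_mean(samples[start:start + interval]) > threshold:
            flagged.append(start)
    return flagged


def flag_spectral_outliers(spectrum, width, max_dev):
    """Channels whose real or imaginary part deviates from the sliding
    median of that part by more than max_dev."""
    half = width // 2
    flagged = []
    for ch, vis in enumerate(spectrum):
        lo, hi = max(0, ch - half), min(len(spectrum), ch + half + 1)
        window = spectrum[lo:hi]
        med_re = median(w.real for w in window)
        med_im = median(w.imag for w in window)
        if abs(vis.real - med_re) > max_dev or abs(vis.imag - med_im) > max_dev:
            flagged.append(ch)
    return flagged


# Hypothetical time series for one channel: a burst in the middle interval.
series = [1 + 0j] * 3 + [1 + 0j, 9 + 0j, 1 + 0j] + [1 + 0j] * 3
# Hypothetical spectrum at one time: a spike in the imaginary part only.
spectrum = [1 + 0j, 1 + 0j, 1 + 5j, 1 + 0j, 1 + 0j]
```

Testing the real and imaginary parts separately means that a spike in only one part, as in the sample spectrum, is still caught.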

In detail, RFLAG is run using

> DEFAULT RFLAG ; INP  C R

to clear and review the adverbs.

> INDI n; GETN m  C R

to select the data set on disk n and catalog number m.

> SOURCES 'source_1', 'source_2',  C R

to select sources of similar flux level.

> DOCALIB 1 ; DOBAND 1  C R

to apply continuum and bandpass calibration.

> STOKES 'FULL'  C R

to examine all polarizations.

> DOPLOT 15 ; DOTV 1  C R

to examine all kinds of plots on the TV.

> FPARM 3 , x , -1, -1  C R

to examine spectral rms over 3 time intervals each a bit longer than x seconds. The -1’s cause the program to use other adverbs for the cutoffs and to do a spectral solution as well as the time one.

> FPARM(9) = 4.0 ; FPARM(10) = 4.  C R

to set the cutoff values as 4 times the median rms plus deviation found in the spectral plots as a function of IF. The default is 5.

> FUNCTYPE 'LG'  C R

to plot the histograms on a log scale.

> NBOXES 1000  C R

to use 1000 boxes in the histograms.

> INP  C R

to re-examine the inputs. VPARM will let you control aspects of the plotting.

> GO

to run the program.

This will produce plots and set cutoff levels in adverbs NOISE and SCUTOFF. Another run, with DOPLOT = 0, will apply these cutoffs and create a new flag table. Note that the flux cutoff levels may depend on the source flux, calling for different levels for strong calibrators, weak calibrators, and very weak target sources. Different cutoff levels for STOKES='RRLL' and STOKES='RLLR' may also be needed. A strong, resolved target source may require different levels for different UVRANGEs. If so, you will need to break up wide bandwidth data into separate files each containing only one IF so that UVRANGE is applied properly. VBGLU may be used later to put the IFs back together. RFLAG is a new task, so experiment a bit. Note that, if you set DOFLAG=1, the creation of a new flag table will happen after the plots in the same execution of RFLAG. If a channel is found bad at a time in any one polarization, all polarizations are flagged. If you have a significant spectral line signal in your data, use DCHANSEL to have the affected channels ignored throughout RFLAG.
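One way to picture how a cutoff might be derived from the rms values in one IF is the Python sketch below. It rests on assumptions that the text does not confirm: that the cutoff is the scale factor times the sum of the median and a deviation, and that the deviation is a scaled median absolute deviation. The function name and the sample values are hypothetical.

```python
from statistics import median


def suggest_cutoff(rms_values, scale=5.0):
    """Assumed form: scale * (median rms + robust deviation of the rms)."""
    med = median(rms_values)
    # Scaled median absolute deviation; an assumption, not RFLAG's definition.
    deviation = 1.4826 * median(abs(r - med) for r in rms_values)
    return scale * (med + deviation)


# Hypothetical per-channel rms values in one IF; one channel carries RFI.
rms_in_if = [0.9, 1.0, 1.1, 1.0, 20.0]
cutoff = suggest_cutoff(rms_in_if, scale=4.0)
over_cutoff = [r for r in rms_in_if if r > cutoff]
```

Because both the median and the deviation are robust, the single bad channel barely moves the suggested cutoff, and it is the only value that ends up above it.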

There are a lot of adverbs to RFLAG. FPARM(5) allows you to speed up the spectral part of the flagging by testing more than just the central channel in the sliding median filter. FPARM(6) allows you to expand all flags to adjacent channels. FPARM(7), (8), (11), and (12) control the extending of flags to additional channels, baselines, or antennas if too large a fraction of channels, baselines, or baselines to an antenna are flagged in the basic time and spectral operations. Similar adverbs also occur in the new task REFLG, whose job is to compress the enormous flag tables generated by RFLAG. REFLG does not handle flags generated by CLIP, TVFLG, and SPFLG since they vary with polarization. REFLG can extend a flag to all times if too large a fraction of time is flagged for a given channel, baseline, etc. REFLG may not reduce your flag table enough, although it is inexpensive to run and so worth the effort. Applying 10 million flag entries to a data set repeatedly is rather expensive. Copying the data, applying the flags once and for all, is the best solution. UVCOP has been the traditional method to do this. However, TYAPL, which needs to be run next and must make a new copy of the data, has been given the option of applying a large flag table to avoid having to copy the data set twice. Task FGCNT lets you see how much of your data is flagged by any particular flag table.
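The two kinds of flag extension mentioned above, expansion to adjacent channels and extension to all channels when too large a fraction is already flagged, can be sketched in Python as follows. The function names, arguments, and sample flags are hypothetical and do not correspond one-to-one with the FPARM values.

```python
def expand_to_adjacent(flags, n_adjacent):
    """Flag n_adjacent channels on each side of every flagged channel."""
    out = list(flags)
    for ch, bad in enumerate(flags):
        if bad:
            lo = max(0, ch - n_adjacent)
            hi = min(len(flags), ch + n_adjacent + 1)
            for k in range(lo, hi):
                out[k] = True
    return out


def extend_if_too_many(flags, max_fraction):
    """Flag every channel when the flagged fraction exceeds max_fraction."""
    if sum(flags) / len(flags) > max_fraction:
        return [True] * len(flags)
    return list(flags)


# Hypothetical channel flags for one baseline at one time.
one_bad = [False, False, True, False, False]
mostly_bad = [True, True, True, False]
```

The same fraction test applies equally to baselines or to the baselines of one antenna; only the axis along which the flags are counted changes.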

A new tool which may help identify bad data at this early stage is the task REWAY described in E.8. It checks the data for spectral windows with particularly low and particularly high rms levels. It must copy the selected data to a new file, so it is not particularly recommended at this point. Run it with no flagging of the output for bad values of the spectral rms. Then plot the weights with VPLOT or ANBPL to look for weights that are seriously abnormal (high or low). Those data may need to be flagged. High weights mean that the data are of abnormally low amplitude, whilst low weights mean that the data are very noisy. REWAY uses robust methods to find the rms and so a few channels of RFI may not cause very low weights, but lots of RFI or receiver failures will make the weights abnormally low. REWAY now displays statistical information to help you assess what weights are “high” and “low”.
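The link between rms and weight described above can be illustrated with a short Python sketch. It assumes the usual convention that a weight is the inverse square of the rms and uses a scaled median absolute deviation as the robust rms estimate; neither is confirmed by the text as REWAY's actual method, and the names and data are hypothetical.

```python
from statistics import median


def robust_rms(values):
    """Scaled median absolute deviation as a robust rms estimate (assumed)."""
    med = median(values)
    return 1.4826 * median(abs(v - med) for v in values)


def weight_from_rms(rms):
    """Assumed convention: weight = 1 / rms**2."""
    return 1.0 / rms ** 2


# Hypothetical channel values for one spectral window.
clean = [-2.0, -1.0, 0.0, 1.0, 2.0, -1.0, 1.0]
one_rfi_channel = [-2.0, -1.0, 0.0, 1.0, 100.0, -1.0, 1.0]
noisy = [10.0 * v for v in clean]

w_clean = weight_from_rms(robust_rms(clean))
w_rfi = weight_from_rms(robust_rms(one_rfi_channel))
w_noisy = weight_from_rms(robust_rms(noisy))
```

In this example a single RFI channel leaves the weight unchanged, whereas uniformly noisier data lower it by about a factor of 100, which matches the behavior described for a few channels of RFI versus a receiver failure.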

POSSM may be used again to see if serious RFI remains after RFLAG and it may be appropriate to run TVFLG to look at groups of a small number of spectral channels (or even every channel) on your calibration source. Task FLGIT (8.1) is an older task that attempts to flag RFI that is both channel- and time-dependent in a non-interactive fashion. SPFLG (10.2.2) is labor and time intensive but would be the most reliable method to deal with the problem. FTFLG is like SPFLG except that all baselines are added together in a single plot (per polarization). It probably is not good for much flagging, but will provide a quick and sensitive way to look for widely distributed RFI. CLIP will flag particularly high amplitudes and is capable of running in a normalized fashion dividing by source flux including spectral index (and curvature if desired). (UVPLT also has the DOSCALE option, letting you see the normalized visibilities prior to flagging with CLIP.) However, RFLAG should get the high fluxes in most realistic cases and even has a clip option (FPARM(13)).
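Clipping in a normalized fashion can be pictured with the Python sketch below. The flux model used here, a power law in frequency with an optional curvature term in log frequency, is an assumed form chosen for illustration; the functions and parameter names are hypothetical and are not CLIP's adverbs.

```python
import math


def model_flux(s0, alpha, curvature, freq, ref_freq):
    """Assumed model: S = s0 * 10**(alpha*x + curvature*x**2),
    with x = log10(freq / ref_freq)."""
    x = math.log10(freq / ref_freq)
    return s0 * 10.0 ** (alpha * x + curvature * x * x)


def clip_normalized(visibilities, freqs, s0, alpha, curvature, ref_freq, max_ratio):
    """Channels whose amplitude divided by the model flux exceeds max_ratio."""
    flagged = []
    for ch, (vis, freq) in enumerate(zip(visibilities, freqs)):
        ratio = abs(vis) / model_flux(s0, alpha, curvature, freq, ref_freq)
        if ratio > max_ratio:
            flagged.append(ch)
    return flagged


# Hypothetical source: 2 Jy at 1 GHz with spectral index -1, no curvature.
freqs = [1.0e9, 2.0e9]
vis = [2.2 + 0j, 2.2 + 0j]
bad = clip_normalized(vis, freqs, 2.0, -1.0, 0.0, 1.0e9, 1.5)
```

The same 2.2 Jy amplitude is acceptable at 1 GHz but is more than twice the model at 2 GHz, so only the second channel is clipped.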

The auto-editing task FLAGR is also of some use here. It averages the spectral channels to get an estimate of the mean and rms and uses those numbers evaluated over time and baseline in a variety of algorithms to further flag the data. RFI which is rather wide spectrally and long lived may be found in this way.