5.4 Self-calibration

The task CALIB was described in some detail in Chapter 4 as the tool to determine the instrumental gains on calibrator sources which were then interpolated in time and applied to your program sources. If you have sufficient signal to noise in the latter, you may now use CALIB to improve the dynamic range of your images. The assumption is made that your images have been degraded by antenna-based (complex) gain errors which vary too rapidly with time or direction to have been fully calibrated with the calibrator sources. CALIB compares the input uv data set with the predictions of a source model — a point-source initial guess or your current best set of Clean components — in order to compute a set of antenna-based amplitude and phase corrections as a function of time which would bring the data into better agreement with your current model. For an n-element array, there are (n - 1)2 times more observations than unknown antenna gains at any time, so the process is well-determined when n is reasonably large. Since this process uses the data to calibrate themselves, it is called self-calibration.

Do not use CALIB unless your data have enough signal-to-noise to warrant improvement. Ask yourself whether your externally-calibrated Clean images contain un-Cleanable artifacts well above the noise, and whether your source meets the criteria for self-calibration given by Tim Cornwell and Ed Fomalont in Lecture 9 of Synthesis Imaging in Radio Astronomy. Note that if your images are limited by receiver noise, self-calibration may produce erroneous results, including the fabrication of weak sources!

5.4.1 Self-calibration sequence and SCMAP or SCIMG

If you decide to use self-calibration, a good sequence of steps is:

  1. Use UVPLT to make a plot file showing the shape of the visibility function as a function of baseline length in the externally-calibrated data set. (See §6.3, especially §6.3.1, for information about plotting in AIPS.) To reduce the scatter, you may average the data over time and/or over spectral channel before plotting. UVPLT normally prepares the plot in memory before writing it to the TV or plot file. In this way, even very large data sets do not require large amounts of time. Use LWPLA to get hard copy of the plot file.
  2. If you can use a point-source model for the first iteration, i.e., if a range of baselines sufficient to calibrate all antennas is dominated by a single component (flat visibility function well above the noise), go to step 6 directly. This is frequently done with VLBI data, but is less common with arrays for which the initial calibrations are better, such as the VLA.
  3. If you must use a more complicated model, obtain a Clean-component representation of it by making and Cleaning an image of the externally-calibrated data using IMAGR. Leave the uv data in “TB” sort order for CALIB; IMAGR will sort them if it has to. Note that you may want to use a somewhat higher loop GAIN in a Clean to be used as an input model for an early iteration of self-calibration than you would for final deconvolution of a very extended structure. Task FACES can prepare an initial model for wide-field observations particularly at long wavelength.
  4. Consider running CCMRG to reduce the number of components in the model. This improves the speed of the calibration and makes the first negative component be a real negative rather than a minor correction to previous positive components. Remember that merging the components does alter the model which is used to compute the gains unless you were going to include all components anyway. Note that IMAGR often merges the components automatically.
  5. Use PRTCC or TAPLT (as in the example in §5.3.6) to help you decide how many components from this Clean to include in the CALIB model. CCFND is also helpful. When you have decided this, determine the appropriate uv-limits for the gain solution by referring to the hard copy of the visibility function you made at step 1.
  6. Plan your CALIB inputs using the information given in the following two sections. The first few iterations are usually used to correct only phases; amplitude is normally corrected only in the last one or two iterations.
  7. Use CALIB to calculate the gain corrections. If DOAPPLY 0, it will apply them to produce a new, (hopefully) improved data set, and will also catalog the gain corrections as an SN extension to the input uv data file. You may set DOFLAG to produce and use new data flags based on closure failures.
  8. Use SNPLT on the input data file with DOTV = TRUE to review the gain corrections before proceeding further. Use DOBLANK = 1 to plot failed solutions as well as good ones. To take hard copy for future reference, run SNPLT with DOTV = FALSE and then run LWPLA on the plot files (usually more than one) produced. To plot the extrema of the gains use OPTYPE = ’SUM’ in SNPLT.
  9. Ask whether the gain corrections were believable — were they smaller than at the previous iteration of CALIB, if any? If not, is there a good reason why not? Did you change input parameters such as the model, the type of solution, or the solution interval, in a way that may have forced larger corrections than before? Proceed only if you are reasonably sure you understand what is happening at this point — otherwise consult a local expert at your site.
  10. If the corrections were believable, run IMAGR to produce a new Clean image. Lower GAINs and higher NITER to produce deeper and more careful Cleans are appropriate as the self-calibration progresses.
  11. Go back to step 4 and repeat the whole process if your new Clean image is a significant improvement over the previous one (with comparable Cleaning parameters on both occasions). You may want to go back to step 1 and repeat the process from there if you have been using amplitude self-calibration and wish to check that your amplitude calibration has not drifted significantly. If the new Clean image differs little from the previous one, do not continue on with further iterations of steps 4 through 10 unless you feel you can make an informed change to the CALIB input parameters at step 6. Tasks UVDIF (§6.2.1) and UVADD may help you to decide whether there have been significant changes to your data due to the previous iteration of CALIB.
  12. For very high dynamic range imaging, it may be useful to check for baseline-dependent (non-closing) multiplicative complex gains. Tasks BLCAL and BLCHN perform this calibration which is normally run to get a single solution for the entire range of times in the data set. In 31DEC23, procedure BLPOLCAL extends this operation to the cross-hand visibilities. See §9.5.9 for additional discussion.
  13. Images may also be used as models (CMODEL=’IMAG). This works well (in 31DEC21) if the image is a “model image” rather than a Clean image. The Fourier transform of the latter does not agree with the visibility data due to the convolution with the Clean beam. One could experiment, in 31DEC21, with a Clean image in UVSUB with OPCODE=’MODL’ followed by DGAUS which (partially) corrects for the deconvolution. A UVADD with OPCODE=’DIV’ will construct an “already divided” data set which is understood by CALIB.

If you have used the spectral-index (§5.3.4.4) and/or primary beam (§5.3.5.1) options in IMAGR, you may continue to use those options in self-calibration. The procedure OOCAL will divide your multi-facet, channel-dependent model into the visibility data with task OOSUB. It then runs CALIB on that output and copies the resulting SN table back to the original input file.

The tasks SCMAP and SCIMG attempt to implement this sequence inside a single task. SCIMG contains almost all of IMAGR and all of CALIB. SCMAP is similar, but limited to a single field for simplicity. They attempt to make the decision about the number of merged components and the range of uv spacings to use in each self-calibration based on σ times the rms in the residual image of the current Clean, where you provide the σ. The process is somewhat less flexible, but also less painful, than running CALIB and IMAGR multiple times. They do not let you change imaging parameters while they are running, but they do provide automatic and interactive methods to change and save Clean boxes and to set a variety of Cleaning and self-calibration parameters including loop gain, solution interval, solution smoothing interval, and whether negative components are included or terminate the list. They let you switch from phase-only to amplitude and phase self-calibration or they will do it automatically when the phase only stops converging. Both tasks offer the full editing options of task EDITR (see §5.5.2) displaying the input and current residual uv data with a wide variety of data selection and editing options. The CALIB, IMAGR, and EDITR process is similar, but conceptually simpler, so it is the one described here.

5.4.2 Self-calibration with CALIB

CALIB is the heart of the AIPS calibration package. The inputs to CALIB are extensive and spread over several screen pages. This is because the routines in CALIB are used in many situations — general calibration, real-time interferometry and VLBI. The task solves for antenna-based complex gains i.e., “self-calibration,” whether the source being calibrated is a “calibrator” source (usually taken to be a point) or a “program” source (usually taken to be complex). The solutions that CALIB generates are stored in SN “solution” tables which are attached to the input data file. The SN tables can be plotted with SNPLT and listed with LISTR. They can be edited themselves with SNEDT or be used to edit the uv data with EDITA. If the corrections are usually small, the OPTYPE=’A&P’ option in SNFLG may perform the flagging you would do with EDITA, but with very little effort on your part.

The following input parameters are used by CALIB for self-calibration of a single-source uv data set:

> TASK CALIB’ ; INP  C R

to specify the task and review the inputs.

> INDI n1 ; GETN ctn1  C R

to select the ’TB’ sorted uv database.

> IN2D n2 ; GET2N ctn2  C R

to select the Clean model image(s) to use.

> INVERS m  C R

to specify the CC file version number to use from every model image; 0 means the highest.

> NMAPS q  C R

to specify the number of images with CC files to use for the model. If q > 1, the image class names are assumed to have the first three characters of IN2CLASS with the field number one given in the last three characters as is done by IMAGR.

> NCOMP = n1,n2,  C R

to cut off the model at the nith Clean component in the ith image.

> SMODEL S,x,y,m  C R

to specify a point-source (or Gaussian or uniform spherical) model rather than a Clean component model. CALIB uses a source model (type m) of S Jy located at x, y arc-sec with respect to the pointing center. For a point model m = 0; see the help for details of the other types.

> SUBARRAY s  C R

to select the appropriate sub-array — SUBARRAY = 0 implies all sub-arrays.

> UVRANGE = x1, x2  C R

to give full weight (in doing the gain solutions) only to data from projected baselines between x1 and x2 in kilo wavelengths.

> WTUV w  C R

to set the weight for projected baselines outside the range UVRANGE(1) UVRANGE(2). WTUV = 0 is interpreted as zero weight and should not be used.

> REFANT nr  C R

to select the reference antenna; for best results, choose one known to be good over most of the time range.

> SOLMODE ’A&P’  C R

to solve for amplitude and phase corrections simultaneously.

> SOLMODE ’P’  C R

to solve for phase weighted by amplitude, the default for single-source files.

> SOLMODE ’P!A’  C R

to solve for phase ignoring amplitude.

> SOLTYP    C R

to use a normal (non-linear) least squares solution.

> SOLTYP ’L1’  C R

to use an “L1” solution method in which a weighted sum of the moduli of the residuals is minimized. The computed gain solutions are less influenced by wild data points, but there is some loss of statistical efficiency and a modest increase in compute time. See F. R. Schwab, VLA Scientific Memo #136 for further details.

> SOLTYP ’GCON’ ; SOLMOD ’GCON’  C R

to solve for amplitude and phase using least squares with a gain constraint — this requires GAINERR and SOLCON as well; see the help file.

> ANTWT w1,w2,w3,  C R

to apply additional weights to each antenna (in order) in generating the solutions; 0 implies 1.

> APARM(1) = x5  C R

to reject solutions from fewer than x5 antennas; default is 6.

> APARM(2) = x6  C R

to tell CALIB whether the data have already been divided by a model (x6 > 0) or not.

> APARM(3) = x7  C R

to solve for RR and LL separately (x7 0) or to average RR and LL correlators before solving (x7 > 0).

> APARM(5) = x8  C R

to make separate solutions for each IF (x8 0) or to average all IFs to make a single solution (x8 > 0). It is better to do separate solutions unless you are desperate for signal to noise.

> APARM(6) = x9  C R

to set the level of diagnostic information as 0 (very little), 1 (some including time and closure error statistics), 2 (more including individual closure failures), 3 (even more including S/N ratio), or more (too much or much too much).

> APARM(7) = x10  C R

to discard solutions having S/N ratios < x10; default is 5.

> SOLINT = x11  C R

to set the length of the solution interval (in minutes); default is 10 seconds for single-source files.

> NORMALIZE = 1  C R

to scale the gain corrections by the mean modulus of all gains to keep the flux density scale from drifting; 0 lets the gains float free. NORMALIZE = 3 to normalize over antennas and polarizations, but not subarrays and IFs, is especially useful for wide bandwidth data.

> MINAMPER a1  C R

to set the level of amplitude closure error regarded as “excessive” to a1 per cent. If APARM(6)1, summaries of the number of excessive errors by antenna are printed and, if APARM(6) > 1, up to 1000 of the individual failures are printed. 0 means do not check or report amplitude closure errors of any sort. Note that amplitude closure errors are accumulated using logarithms so that gains of 0.5 and 2.0 are both errors of 100%. errors are reported only if thy are “significant” following CPARM(7).

> MINPHSER p1  C R

to set the level of phase closure errors regarded as “excessive.” APARM(6) controls the display as for MINAMPER.

> CPARM(3) = a2  C R

to display a line when the average absolute value of amplitude closure errors is > a2 % if a2 > 0 and APARM(6) 1.

> CPARM(4) = p2  C R

to display a line when the average absolute value of phase closure errors > p2 degrees if p2 > 0 and APARM(6)1.

> CPARM(5) = 1  C R

to form scalar averages of amplitudes before doing solutions. This is useful only if the phases are bad, but the amplitudes have high signal to noise.

> CPARM(7) = Nσ  C R

to display excessive amplitude and phase errors individually only if they exceed MINAMPER or MINPHSER) and if their significance exceeds Nσ times the uncertainty implied by the data weights.

> DOAPPLY -1  C R

to solve for a new SN table without writing a new corrected uv data set.

Other parameters are defaulted sensibly — type EXPLAIN CALIB  C R for further information. In general, the AIPS philosophy is such that if you don’t know what value to set for an adverb, leave it at the default — this will usually give you what you want, or at least something reasonable! There are two approaches to the self-cal loop. In one, a new uv file is written at each iteration and the next SN table should then be an incremental improvement over the previous. In the other, the initial single-source file is the input file in each iteration and the new SN table contains all of the corrections so far found. Both methods have things to recommend them; DOAPPLY controls the choice.

5.4.3 Considerations in setting CALIB inputs

In many cases, only a few input parameters to CALIB need be set, other than those selecting the uv data and the input model. The key parameters are NCOMP, UVRANGE, SOLINT and, if you are interested in polarization, REFANT.

It pays to be conservative when using NCOMP to select the number of Clean components which will comprise the input source model. Setting NCOMP too high will fossilize errors from the earlier calibrations in the model for the next one; after this, you are stuck with them as long as you continue feeding CALIB a model with as much Cleaned flux density. When calibrating Stokes I images, consider setting NCOMP in CALIB so that few negative Clean components are included. The first few iterations of CALIB should be phase-only calibration, since the tropospheric and ionospheric phase errors will almost always dominate amplitude errors due to the atmosphere or to system drifts. In these first iterations, it is prudent to be even more conservative, setting NCOMP so that the total Cleaned flux included in the model is between 50% and 80% of that at which the first negative Clean component appeared. CCFND will help you with this (§5.3.6). If your field is dominated by a few very strong, small-diameter regions, it is a good idea to make the first iterations of CALIB work on Clean components from these regions alone, restricting the range of baselines suitably by setting UVRANGE(1). Setting Clean windows in IMAGR or using CCEDT (§5.3.6) suitably will help you do this. Even later in the self-calibration cycle, it is probably still a good idea to eliminate weak, isolated Clean components. Try CCSEL for this.

It is always important to restrict the high-weight domain of the CALIB solution to the part of the uv plane that is described well by the model. In the early stages of self-calibration, the trustworthy part of your Clean model will almost always contain less flux density than was measured in the visibility function at the shorter baselines. Another way of putting this is that the large-scale structure of the source will be poorly represented by the model. You should therefore set UVRANGE(1) so that the total flux density in the input model (the sum of the Clean components up to the Clean iteration selected by NCOMP) exceeds the peak visibility amplitude in your data at a baseline of UVRANGE(1) kilo wavelengths (read this off a plot file output from UVPLT). It is also important to give some slight weight to the rest of the uv plane so that some solution may be found for most all antennas including those having no baselines in the high-weight region.

SOLINT sets the length of the time interval, in minutes, over which the model and the data are averaged when computing the gain corrections. This must be short enough that the gain corrections can track the fluctuations produced by the atmosphere over the longer baselines with sufficient accuracy. It must be long enough that the variances of the computed gain corrections (which depend on the signal-to-noise ratios in the data over the uv range in which the model is being compared with the data) are acceptably small. These constraints vary from source to source, frequency to frequency, and (because of the “weather”) from day to day. They may not in fact be reconcilable for weak sources, especially in the wider VLA configurations and/or at the higher frequencies. In many combinations of these circumstances, you may not be able to self-calibrate your data. See Lecture 9 in Synthesis Imaging in Radio Astronomy for details of how to make this assessment. In VLBI imaging, it may be helpful to use a point-source model and quite small SOLINT for the first iteration of self-calibration to remove the gross and rapid changes due to atmospheric fluctuations. With that problem removed, it may then be possible to use longer SOLINTs and more complicated models.

REFANT selects the number of the reference antenna for the gain solutions. For total intensity continuum calibration, the choice of this CALIB input is unimportant. It is always best, however, to choose a reference antenna that was stable and present in all data throughout the run, if only because this prevents propagation of noise or glitches in the reference antenna through the gain solutions (and plots of them) for the other antennas. For polarization work, it is important to select an antenna for which both polarizations were always present; otherwise any polarization calibration which preceded CALIB may be seriously compromised.

Note that CALIB should almost always be run with SOLMODE set to phase-only calibration for the first iteration or two. Consider turning on amplitude calibration by setting SOLMODE ’A&P’ only when either the phase adjustments being made are generally small (i.e., the worst cases being a few tens of degrees) or the new re-Cleaned image is clearly dominated by amplitude errors — which will give symmetric Y-shaped patterns around strong point sources for VLA observations. In general, you will want to set CPARM(2) = 1 when using SOLMODE ’A&P’, to prevent drifting of the flux-density scale during amplitude self-calibration.

CALIB has a number of new options to deal with difficult data. The adverb WEIGHTIT controls how the data are weighted when being processed by the gain-fitting routines. The default is w = 1∕σ2 which may cause too much contrast between the highest weighted points and the lowest. This problem is much worse when self-calibrating extended sources than when doing the primary calibration on point sources. If you encounter many failed solutions, try WEIGHTIT = 1 which uses w = 1∕σ. If you have trouble with bad data and failed solutions, consider trying the “robust” forms of SOLTYPE selected with ’R’, ’L1R’, and ’GCOR’. A robust solution is one in which a solution is found, outlier data are temporarily flagged, a new solution found, and the process repeated while gradually tightening the omission criteria. This should make for more reliable solutions when there are bad correlators or antennas and, as a side benefit, allows more permanent flagging of the data under control of adverb DOFLAG.

CALIB will also edit out bad data according to the following criteria:

  1. there are too few antennas (APARM(1)) to form a solution,
  2. the solution does not converge, or
  3. the signal-to-noise ratio for a given antenna (APARM(7)) is too low.

The signal-to-noise ratio is calculated from the post-fit scatter of the residuals from the gain model. Note that the scatter will contain contributions from thermal noise and unmodeled source structure. This is a good reason to restrict the uv range of the data. For further guidance and information on other CALIB inputs, type EXPLAIN CALIB  C R and/or read Lectures 9 and 16 in Synthesis Imaging in Radio Astronomy.

We note again the recommendation from Chapter 4. It is best to self-calibrate the parallel-hand data of the target source before applying the polarization corrections (D terms) to the cross-hand data. The correction to the cross-hand data is the product of the D terms found by PCAL and the parallel-hand data. The latter are not fully correct until after the self-calibration step. There is almost always adequate signal for self-calibration when there is measurable polarization.

5.4.4 Evaluating the quality of the imaging

If your imaging were perfect, then subtracting the Clean component model from the visibility data should produce visibilities differing from zero only by the noise. Similarly, dividing the data by the model should produce “gains” of (1,0), namely a 1 “Jy” point source at the coordinate origin. There is a task called EVAUV to help you find out how close you have come to these ideal results. EVAUV first produces two data sets by subtracting the model from, and dividing the model into, your visibility data. It then determines the mean and rms of the real and imaginary parts of both data sets using a robust algorithm. Then it reports on the number of “bad” samples more than 3 rms from the mean. In 31DEC22, if SOLINT is not zero, it will report rms closure phase and amplitude of the divided data set.

Various optional plots may be produced. They are: (1) The average and rms of the model subtracted data in bins of uv-plane radius. (2) The average and rms of the model divided - 1.0 data in bins of uv-plane radius. (3) The histogram of image pixels over a pixel value range selected by the user or the first of the model images. (4) The histogram of image pixels over a pixel range of plus/minus 5 times the rms of the first of the model images. (5) The real vs imaginary visibilities of the model subtracted data over a range of APARM(1) times the amplitude rms. The plot is a contour in logarithmic intervals of the 2-D histogram image controlled by APARM. Contours are drawn at 0.5, 1, 1.5, 2, 2.5, 3, etc in the log of the counts. (6) The real and imaginary gains minus (1,0) of the model divided data over a range of APARM(2) times the amplitude rms. Contours are again logarithmic and are displayed on the plot.

Closure phases and amplitudes are useful in evaluating the quality of a calibration source or a gain file produced by dividing a model into the visibility data with UVSUB, OOSUB, or EVAUV. Tasks CLPLT and CAPLT will plot closure phases and amplitudes, respectively, as functions of time for specified phase triangles and amplitude quadrangles. In 31DEC22, task EVACL may be used to determine the rms closure phase and amplitude as a function of IF (spectral window). A good point source (or gain) should have values near zero for both. Task CLOSE will plot the rms phase or amplitude closure as a function of spectral channel. This may be useful in determining which channels should be used in continuum observations and which should be omitted.

5.4.5 Experimental extension of multi-field self-calibration

High-quality, multi-field images often suffer from position-dependent calibration effects. At lower frequencies, the strongest sources may well lie outside the main portion of the primary beam and so be effected by different instrumental gains including pointing errors, differences in atmosphere or ionosphere, etc. An experimental RUN file PEELR compiles a procedure by that name. One “interfering” field at a time, it performs a self-cal to improve that facet. After a list of fields is processed, it restores the original multi-field model to the corrected residual uv data. See HELP PEELR for details. It seems almost magical, but it really does improve the final images. A slight increase in the image rms is the price one pays for removing larger, more systematic problems.