11.3 Solving problems in using AIPS

On all computer systems things go wrong due to user error, program error, or hardware failure. Unfortunately, AIPS is not immune to this. The section below reviews several general problem areas and their generalized solutions. Refer to §Z.1.5 for the details appropriate to NRAO’s computer systems. Some well-known possibilities follow.

11.3.1 “Terminal” problems

If your workstation window is alive, but AIPS has “disappeared”, you may have “suspended” it by typing CTRL Z. A suspended AIPS can be left in that state, placed into the “background” with bg, or returned to the “foreground” with fg, after which it will resume accepting terminal input. If your AIPS appears to be “suspended”, try typing jobs to see which jobs are attached to your window and then use fg %n to bring back job n, where n is the job number of the suspended AIPS. If no AIPS job is suspended from the current window, check all other windows you have running on the workstation for the missing simian before starting a new AIPS. Otherwise, you may run out of allowed AIPSes and/or encounter mysterious file locking problems.
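The search for a suspended AIPS in other windows can be roughly illustrated with the shell sketch below, which filters ps output for stopped processes (state T). The helper name list_stopped_aips, the four-column layout (as produced by ps -o pid,stat,tty,comm on Linux), and the idea that AIPS processes carry "AIPS" in their command name are assumptions made for this sketch, not something the text above specifies.

```shell
#!/bin/sh
# Hypothetical helper: read `ps` output with columns PID STAT TTY COMMAND
# on standard input and print "PID TTY COMMAND" for each stopped (state T)
# process whose command name contains "AIPS" (an assumed naming pattern).
list_stopped_aips() {
  awk 'NR > 1 && $2 ~ /^T/ && toupper($4) ~ /AIPS/ { print $1, $3, $4 }'
}

# Demonstration on made-up sample data. On a Linux system one might instead run
#   ps -u "$(id -un)" -o pid,stat,tty,comm | list_stopped_aips
list_stopped_aips <<'EOF'
  PID STAT TT       COMMAND
 4101 T    pts/1    AIPS1
 4102 S+   pts/2    AIPS2
 4170 T    pts/3    vi
EOF
```

Any process reported this way belongs to the window on the listed terminal; go to that window and use jobs and fg %n there.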

If your workstation window (or terminal on obsolete systems) is “dead”, i.e., refuses to show signs of talking to your computer, you have a problem. There are numerous possible causes. If typed characters are shown on the screen, but not executed, then

  1. Are you executing a long verb, e.g., REWIND, AVFILE, RESCALE? If so, be patient.
  2. Are you executing some interactive TV or Tek verb which is waiting for input from the cursor or buttons? If so, provide the input.
  3. Have you started a task with DOWAIT set to TRUE (+1.0)? If so, wait for the task to finish. Most tasks report their progress on the message monitor window (or your input window).
  4. Is AIPS waiting while a tape rewinds or skips files or is it waiting to open some disk file currently being used by one of your tasks? Be patient.

If typed characters do not appear, then

  1. Have you stopped output to your window accidentally by hitting the appropriate NO SCRL or other XOFF control sequence? If so, hit the XON control sequence. (These are CTRL S and CTRL Q, respectively.)
  2. Do other windows connected to the computer appear to be “alive”? If so, use one of them and inquire about the status of your AIPS program and tasks; on Linux and Berkeley Unix try ps aux  C R and on Linux, Solaris and other Bell Unix try ps -elf  C R. It might be necessary to stop your old AIPS session from your new window and then use that window to start a new AIPS.
  3. Can you abort AIPS at your window using the appropriate system commands (i.e., CTRL C on Unix machines)?
  4. If all windows appear dead, then your computer or its X-Windows server may have “crashed.” Try a remote login from another computer. If that works, check on your processes and try to kill the server and other tasks. This should return your computer to a login state. Otherwise, report the problem to your AIPS Manager or System Administrator. If you feel you must reboot the system, do so only after checking that all current users and the System Administrator (if available) agree that that action is required.
  5. If even a reboot fails, report the problem to the System Administrator or hardware experts and go do something else. UNDER NO CIRCUMSTANCES SHOULD YOU ATTEMPT TO REPAIR ANY HARDWARE DEVICES. Such repairs must be performed by trained personnel.

11.3.2 Disk data problems

If you encounter the message CATOPN: ACCESS DENIED TO DISK n FOR USER mmm, it means that user mmm has not been given access to write (or read) on limited-access disk n. The access rights for all disks can be checked by typing FREESPAC in the AIPS session. In the list of mounted disks, the Access column can say Alluser, Scratch (scratch files only), Resrved (limited access including you), and Not you (limited access not including you). If you feel that you should have access to that particular disk, resume using your correct user number or see your AIPS Manager about enabling your user number.

If your data set seems to have disappeared, consider

  1. Have you set INDISK et al. (especially INTYPE) correctly before running CAT? Type INP CAT  C R to check. Is USERID set to something other than 0 or your user number?
  2. Are you connected to the right AIPS computer, if your site has more than one?
  3. Are the desired disks mounted for your AIPS session? Type FREE  C R to see which disks are currently running and which numbers they are assigned in this session. When you attach disks from other computers (using the da= option of the aips command — §2.2.3), they are assigned numbers which depend on the list of computers and which may thus vary from session to session.
  4. Did you leave your file untouched for a “long” time on a public disk? System managers may have had to delete “old” files to make room for new ones. In this case your data are gone and we hope you made a backup on tape.

The message write failed, file system is full will appear when the search for scratch space encounters a disk or disks without enough space. (AIPS usually emits messages at this time as well.) This is only a problem when none of the disks available for scratch files has enough space, at which point the task will “die of unnatural causes.” Run the verb FREESPAC to see how much disk is available and then review the inputs to the task to make sure that OUTDISK and BADDISK are set properly. Change them to include disks with space. Check the other adverbs to make sure that you have not requested something silly, such as a 2000-channel cube 8096 on a side. Then try again.

If there simply is not enough space, try some of the things suggested in §3.6, such as SCRDEST to delete orphan scratch files, DISKU to find the disk hogs, and, if all else fails, ZAP to delete some of your own files. DISKU may be run with DOALL = n to list catalog entries that occupy more than n Megabytes. This will help identify those files which will yield the most new space when deleted. Your AIPS Manager may help you by removing non-AIPS files from the AIPS data disks. Do not do this yourself unless they are your files.
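Outside AIPS, a similar free-space check can be made at the monitor level with df. The sketch below flags file systems whose free space falls under a chosen threshold. The helper name low_space, the 500-Megabyte threshold, and the /DATA mount points in the sample are hypothetical, and FREESPAC remains the reference for what AIPS itself sees.

```shell
#!/bin/sh
# Hypothetical helper: read `df -Pk` output on standard input and print
# "<mount point> <free megabytes>" for every file system whose free space
# is below the threshold (in megabytes) given as the first argument.
# Assumes mount points contain no blanks.
low_space() {
  awk -v min_mb="$1" 'NR > 1 {
    free = int($4 / 1024)          # column 4 of df -Pk is free kilobytes
    if (free < min_mb) print $6, free
  }'
}

# Demonstration on made-up sample data. On a real system one might run
#   df -Pk /path/to/aips/data | low_space 500
# where the path is whatever directory holds your AIPS data area.
low_space 500 <<'EOF'
Filesystem     1024-blocks      Used Available Capacity Mounted on
/dev/sda3        103081248  98000000    204800      99% /DATA
/dev/sdb1        206162496  10000000 186000000       6% /DATA2
EOF
```

In the sample, only the first file system is reported, since it has about 200 Megabytes free.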

11.3.3 Printer problems

All AIPS print operations now function by writing the output to a disk text file, then queuing the file to a printer, and then, sometime later, deleting the file. After the job is queued, the AIPS task or verb will display information about the state of the queue. Read this carefully to be sure that the operation was successful and to find out the job number assigned to your printout. If you are concerned that your print job may be lengthy, or expect that you will only need a few numbers from the job, please consider using the DOCRT option to look at the display on your terminal or the OUTPRINT option to send the display to a file of your choosing without the automatic printing. See §Z.1.5.3 for information about printing such files later.

To find out what jobs are in the spooling queue for the relevant printer, type, at the monitor level:

$  lpq -Pppp  C R        to show printer ppp.
$  lpstat ppp  C R       to show printer ppp under Solaris, HP, SGI (Sys V systems).

where ppp is the name of the printer assigned to you when you began AIPS. If the file is still in the queue as job number nn, you can type simply

$  lprm -Pppp nn  C R    to remove the job.
$  cancel nn  C R        to remove the job under Sys V systems.

lprm and cancel will announce the names of any files that they remove and are silent if there are no jobs in the queue which match the request.
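As a small illustration of the two command families above, the following shell sketch prints (but does not run) the appropriate queue commands for a given printer and job number. The helper names queue_cmd and remove_cmd, the "bsd"/"sysv" labels, the job number 42, and the flavor guess based on whether lpq is installed are all assumptions for this sketch. Many current systems provide both command sets, so the guess may not reflect local practice.

```shell
#!/bin/sh
# Hypothetical helpers that print the queue commands quoted in the text.
queue_cmd() {   # usage: queue_cmd PRINTER FLAVOR
  case "$2" in
    bsd)  printf 'lpq -P%s\n' "$1" ;;
    sysv) printf 'lpstat %s\n' "$1" ;;
    *)    echo "unknown flavor: $2" >&2; return 1 ;;
  esac
}

remove_cmd() {  # usage: remove_cmd PRINTER JOB FLAVOR
  case "$3" in
    bsd)  printf 'lprm -P%s %s\n' "$1" "$2" ;;
    sysv) printf 'cancel %s\n' "$2" ;;
    *)    echo "unknown flavor: $3" >&2; return 1 ;;
  esac
}

# Guess the flavor from which command is installed (an assumption).
if command -v lpq >/dev/null 2>&1; then flavor=bsd; else flavor=sysv; fi

# ppp stands for the printer name assigned when AIPS started;
# 42 is a made-up job number.
queue_cmd ppp "$flavor"
remove_cmd ppp 42 "$flavor"
```

Printing the command first, rather than executing it, leaves a chance to confirm the printer name and job number before anything is removed from the queue.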

Since modern printers are capable of swallowing large amounts of input, your job may still be printing even though it is no longer visible in the queue. If you turn off the printer at this stage, you are likely to kill the remainder of your print job and quite possibly one or more other print jobs that followed yours. Use discretion. If the job is still in the queue, do not turn the printer back on until you have removed it with lprm or cancel; otherwise, most systems will start the print job over again when the power returns.

If your printout fails to appear, consider

  1. Did the print queuing actually work? Review the messages at the end of the verb or task.
  2. Did the printout go to a printer other than the one you expected? Was it diverted to a printer used for especially long print jobs or one used for color plots? The messages at the end of the verb or task should show this.
  3. Was the printer not working or backed up for so long that the file was deleted before it could be printed? The delay time for deletion is shown at the end of the verb or task. It can be changed by your AIPS Manager for future jobs.
  4. Was your print job, or that of a user in the queue ahead of you, a large plot? These can take a long time in some PostScript printers (usually indicated by a blinking green light), so be patient.

11.3.4 Tape problems

When AIPS does a software MOUNT of a magnetic tape, it actually reads the device on most systems. An error message along the lines of ZMOUN2: Couldn’t open tape device usually means that you have attempted the MOUNT before the device was ready. Wait for all whirring noises and blinking lights to subside and try again. Remote tape mounts are more fragile. If you get a message such as ZVTPO2 connect (INET): Connection refused, then the tape dæmon TPMON is probably not running on the remote host. EXIT and restart AIPS, specifying the remote host in the tp= option (see §2.2.3). If you are told AMOUNT: TAPE IS ALREADY MOUNTED BY TPMON, then there is a chance that you are trying to mount the wrong tape or that someone left the tape device in a mounted state. See §Z.1.5.7 for advice on curing this stand-off between AIPS, which knows that the tape is not mounted, and TPMON, which knows that it is.

If you are having problems reading and writing a tape, consider

  1. Did you actually mount the tape in software from the AIPS level with the MOUNT verb? A message like ZTPOPN: NO SUCH LOGICAL DEVICE = AMT0n: indicates that you have not.
  2. Have you specified the INTAPE or OUTTAPE number to correspond with the drive you mounted the tape on?
  3. Does your computer have access to tapes on the remote host? The message AIPS TAPE PERMISSION DENIED ON REMOTE HOST suggests not. See the AIPS Manager for the remote host.
  4. Is the tape correctly loaded in the drive and is the drive “on line” (check the ON LINE light)?
  5. Have you set the density correctly? Some drives need the density to be set by a switch, others have software control. Some try to read the tape and sense the density automatically. Be aware that some drives do not set the density until you actually read or write the tape. Under these circumstances, the density indication on the drive can be misleading. If in doubt, consult your local AIPS Manager about the meaning of the tape density indicator lights on the drive you are using.
  6. Are you using the correct program to read the tape? If you are unsure of the format of a tape, use the task PRTTP to diagnose it for you. It will recognize any format that AIPS is able to read.
  7. Are you writing to a completely blank tape? This fails sometimes. Or are you writing to an old tape which is new to you? In both cases, try specifying DOEOT FALSE  C R and then rerunning the tape-writing program.
  8. Has the drive been cleaned recently? Do not attempt to clean a drive yourself. Using the wrong cleaning fluid or cleaning the wrong parts of a drive can do serious damage. If you have any doubts, use another drive.
  9. Is your tape defective? Tapes can lose oxide or become stretched, creased, or dirty, all of which will cause problems. Try using another tape, if possible.