AIPS Managers Frequent Questions

This page updated on $Date: 2014/06/30 21:24:14 $ (UT) (but it probably needs more work)

The purpose of this page is to address some of the more common questions an AIPS installer is likely to have. It is not intended to be all-encompassing. However, suggestions for specific problems (especially if you've faced them more than once) are welcome (see email address at end of document).


Problems with binary installation via rsync

The INSTEP1 and FTPGET scripts are not used much and probably no longer work correctly. We retain them in the hopes that they are useful on occasion, but local installation with install.pl is now the only well-tested form of AIPS installation. These old scripts used ftp to transfer the AIPS files.

The modern perl script install.pl with option -n (for network) allows a binary installation of all of AIPS without the need for local compilers and a local compilation. Such binary installations may then run a binary version of the Midnight Job as well. These scripts use rsync to transfer the files. Some sites have trouble doing this due to a (misguided) local decision to block the rsync port. If install.pl -n times out, hangs, or otherwise fails, try

         rsync rsync://ftp.aoc.nrao.edu/

which should produce an output similar to

***************************************************************************
 National Radio Astronomy Observatory computing facilities are exclusively
  for the use of authorized personnel, who are expected to abide by the
     terms of the NRAO Computing Security and Computing Use Policies.
***************************************************************************


aips            AIPs OSX Binaries
31DEC08SUL      AIPS area
31DEC07SUL      AIPS area
31DEC06SUL      AIPS area
31DEC08LINUX    AIPS area
31DEC07LINUX    AIPS area
31DEC06LINUX    AIPS area
31DEC08         AIPS area
31DEC07         AIPS area
31DEC06         AIPS area
TEXT            AIPS text area
asg             Array Support Group CVS root
casadata        CASA data repository
casadata-core   CASA data repository
casadata-core2  CASA data repository
casadata-nraogbt        CASA data repository
casadata-nraovla        CASA data repository
casadata-vla2   CASA data repository
casadata-nrao   CASA data repository
casadata-atnf   CASA data repository
casadata-alma   CASA data repository
casadata-bima   CASA data repository
casadata-protopipe      CASA data repository
casadata-demo   CASA data repository
casadata-regression     CASA data repository

If it does not, then the rsync port (873) is probably blocked for your computer or site router.
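
The check above can be wrapped in a small shell function so that it gives up quickly rather than hanging. This is only a sketch: the function name check_rsync_port is invented here, and it relies on rsync's --contimeout option (a connection timeout for rsync daemons), which very old versions of rsync may not have.

```sh
# Hypothetical helper (not part of AIPS): try to list the modules on an
# rsync server, giving up after 15 seconds instead of hanging.
check_rsync_port() {
    server=$1
    if rsync --contimeout=15 "rsync://$server/" >/dev/null 2>&1; then
        echo "rsync to $server works"
    else
        echo "rsync to $server failed; port 873 may be blocked"
        return 1
    fi
}
```

Call it as "check_rsync_port ftp.aoc.nrao.edu". A failure does not prove that the port is blocked, since the server could also be down, but it is a strong hint.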

Problems starting message and TeK servers on binary installations

The message and TeK display servers run as programs in xterms forked from the parent process. These xterms must know the added $LD_LIBRARY_PATH needed for the binary versions of AIPS. Therefore all aips users will need to put some appropriate text in their .cshrc, .tcshrc, .bashrc, or other login files. The AIPS-referencing text must be executed for both interactive and non-interactive logins. The longer examples below source the AIPS logicals only on those machines mentioned in HOSTS.LIST. For machines in c shell environments running as LAPTOP=YES or in environments in which all machines you might use are AIPS machines, use simply

  setenv AIPS_ROOT the_local_aips_root_name
  source $AIPS_ROOT/LOGIN.CSH
For machines running aips under their own names in a complicated environment (in which some machines you might use do not run AIPS):
  #                                       AIPS setup
  #                                       Set variable HOST.  Strip off
  #                                       any domain name, use simple
  #                                       hostname.  Also uppercase it.
  setenv HOST `uname -n | tr '[a-z]' '[A-Z]' | awk -F. '{print $1}'`
  setenv AIPS_ROOT the_local_aips_root_name
  set xxx = `grep "^[-+]  $HOST" $AIPS_ROOT/HOSTS.LIST`
  if ( "$xxx" != "" ) then
     source $AIPS_ROOT/LOGIN.CSH
  endif
  unset xxx
For bash logins, the simple script is
  AIPS_ROOT=the_local_aips_root_name
  export AIPS_ROOT
  . $AIPS_ROOT/LOGIN.SH
Under bash, for machines running aips under their own names in a complicated environment (in which some machines you might use do not run AIPS), use in your .bashrc file
  #                                       AIPS setup
  #                                       Set variable HOST.  Strip off
  #                                       any domain name, use simple
  #                                       hostname.  Also uppercase it.
  HOST=`uname -n | tr '[a-z]' '[A-Z]' | awk -F. '{print $1}'`
  export HOST
  AIPS_ROOT=the_local_aips_root_name
  export AIPS_ROOT
  xxx=`grep "^[-+]  $HOST" $AIPS_ROOT/HOSTS.LIST`
  if [ "$xxx" != "" ] ; then
    . $AIPS_ROOT/LOGIN.SH
    $CDTST
  fi
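
To see ahead of time whether a given machine would pass the HOSTS.LIST test used in the fragments above, the same grep can be put in a function. This is a sketch only: the name aips_host_listed is invented here, and it assumes nothing about HOSTS.LIST beyond the pattern already used above ("+" or "-" in column 1, two spaces, then the uppercase host name).

```sh
# Hypothetical helper (not part of AIPS): succeed if HOST appears in a
# HOSTS.LIST-style file.  The host name is stripped of any domain and
# uppercased first, as in the login fragments.
aips_host_listed() {
    hosts_list=$1
    host=$(printf '%s\n' "$2" | awk -F. '{print $1}' | tr '[:lower:]' '[:upper:]')
    grep "^[-+]  $host" "$hosts_list" >/dev/null 2>&1
}
```

For example, aips_host_listed $AIPS_ROOT/HOSTS.LIST `uname -n` && echo listed. Like the original grep, this matches on a prefix, so a host named PRIMATE would also match an entry for PRIMATE2.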

Problems with Midnight Job and cvs

Most Linux systems come with the cvs (code versioning system) software installed, but many Macs do not. One of our users solved this problem following the advice at http://apple.stackexchange.com/questions/108240/cvs-on-os-x-mavericks This produced the following:
The easiest way for most programs is to install with a package
manager like Macports, Homebrew, or Fink.  The porters will have
worked out the issues about porting the code to OSX, so it should be
as simple as follows (for Macports)
   1. Install Macports: download the .pkg file and run it
   2. Install Xcode and its command line tools
   3. Add /opt/local/bin to your path - the installer attempts to do
      this for you so you probably just need to check
   4. Install the port with
           sudo port install cvs
   5. Run cvs either with the full path /opt/local/bin/cvs or use your
      $PATH
   6. For AIPS, edit the file $SYSLOCAL/UPDCONFIG to put this full
      path in for the cvscmd

Problems after upgrading your operating system

When you upgrade the operating system, some system files get replaced. These almost certainly include the /etc/services file which needs AIPS services to be added, the /etc/system or /etc/rc file which needs larger shared memory segments, and the X configuration files which need to specify 8-bit PseudoColor or 24- or 32-bit TrueColor rather than the Linux default 16-bit TrueColor. Each OS upgrade will almost certainly force you to repeat some of the system modifications described below.

Problems with the TV, message, TEK and tape servers

The instructions to change your /etc/services file are often overlooked. The Inet versions of XAS with its TVSERV lock daemon, MSGSRV, and TEKSRV all require that predictable node numbers be reserved for them. The remote tape services also require these and do not offer a UNIX (non-network socket) option. In both cases, if you need to communicate between two computers (or more), the following must be installed in your /etc/services (or YP services):

sssin           5000/tcp        SSSIN      # AIPS TV server
ssslock         5002/tcp        SSSLOCK    # AIPS TV Lock
msgserv         5008/tcp        MSGSERV    # AIPS Message Server
tekserv         5009/tcp        TEKSERV    # AIPS TekServer
aipsmt0         5010/tcp        AIPSMT0    # AIPS remote FITS disk access
aipsmt1         5011/tcp        AIPSMT1    # AIPS remote tape 1
aipsmt2         5012/tcp        AIPSMT2    # AIPS remote tape 2
aipsmt3         5013/tcp        AIPSMT3
aipsmt4         5014/tcp        AIPSMT4
aipsmt5         5015/tcp        AIPSMT5
aipsmt6         5016/tcp        AIPSMT6
aipsmt7         5017/tcp        AIPSMT7
You do not need to install all the tape services unless you have a large number of tape devices on some computer.
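
After an installation or an OS upgrade, a quick way to see which of these entries are missing is sketched below. The function name missing_aips_services is invented here, and the sketch only looks at a plain services file; it does not know about YP services.

```sh
# Hypothetical helper (not part of AIPS): print the AIPS service names
# that are missing from a services file (default /etc/services).  Only
# the TV, message, Tek and first three remote-tape entries are checked.
missing_aips_services() {
    services_file=${1:-/etc/services}
    for name in sssin ssslock msgserv tekserv aipsmt0 aipsmt1 aipsmt2; do
        # A service entry starts with the name followed by white space.
        if ! grep "^${name}[[:space:]]" "$services_file" >/dev/null 2>&1; then
            echo "$name"
        fi
    done
}
```

Running missing_aips_services with no argument prints nothing if all seven entries are present.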

Other problems include shared memory size problems on Macs and Solaris and Linux's insistence on 16-bit "TrueColor" as its default visual in XWindows. Both of these are addressed below.

RedHat Enterprise and some other systems have shown a problem characterized by TV cursor reading appearing sporadic. This issue almost certainly arises from internal Internet loopback problems in the operating system. We have found that the AIPS Inet TV works fine on these systems when actually talking to a different computer; it only goes bad when talking to itself. The aips command-line option tv=local (see the man page or help aips or the CookBook chapter 2 for details) uses a local Unix socket and does not seem to suffer from the bad performance problems.


Other common questions concerning problems

AIPS doesn't recognize any symbols

When a user exits AIPS his symbol table is saved in a SAVE/GET file called LASTEXIT. When that user starts AIPS again, the LASTEXIT file is read and becomes the user's vocabulary including adverb values and procedures. Unfortunately, this file seems to get damaged under some circumstances usually related to machine crashes. 31DEC04 AIPS tries to detect damaged LASTEXIT files and recovers with a default vocabulary (RESTORE 0). For older versions or if this does not work, delete the file in the first data area named SGDuuu001.uuu\; where uuu is the user number in the extended HEX (base 36) nomenclature.
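
Working out uuu by hand is error-prone, so here is a sketch that builds the file name from a decimal user number. The function names are invented here, and the digit order (0-9 followed by A-Z) is an assumption about the extended HEX nomenclature, so compare the result with the files actually present in the data area before deleting anything.

```sh
# Hypothetical helper: convert a decimal AIPS user number to three
# base-36 digits.  Assumes the digits run 0-9 and then A-Z.
to_ehex() {
    n=$1
    digits=0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ
    out=
    i=0
    while [ "$i" -lt 3 ]; do
        d=$(( $n % 36 ))
        # cut counts characters from 1, so digit d is character d+1
        c=$(printf '%s' "$digits" | cut -c"$(( $d + 1 ))")
        out=$c$out
        n=$(( $n / 36 ))
        i=$(( $i + 1 ))
    done
    printf '%s\n' "$out"
}

# Build the LASTEXIT file name described above for a given user number.
lastexit_name() {
    e=$(to_ehex "$1")
    printf 'SGD%s001.%s;\n' "$e" "$e"
}
```

For user number 100 this prints SGD02S001.02S; and the semicolon must be escaped or quoted when the name is typed in a shell.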

My TV works, but the Tek and Message Servers Don't

First check that /etc/services has the right services in it (sssin, msgserv, tekserv). Then make sure you don't have a file or subdirectory in the current working directory whose name matches the current hostname. The AIPS routines incorrectly interpret the presence of any file (including a directory) with the same name as the host as being a Unix domain socket. A fix is being worked on for 15APR97. Workaround: rename the file or directory, or cd somewhere else before starting AIPS.
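
A sketch of a check for such a clash follows. The function name host_name_clash is invented here, and since it is not stated which spelling of the host name AIPS looks for, the sketch tries the full name, the short name, and the uppercase short name; that choice is an assumption.

```sh
# Hypothetical helper (not part of AIPS): report a file or directory in
# DIR (default: current directory) that is named after the host.
# Usage: host_name_clash [DIR [HOSTNAME]]
host_name_clash() {
    dir=${1:-.}
    full=${2:-$(uname -n)}
    short=${full%%.*}
    upper=$(printf '%s' "$short" | tr '[:lower:]' '[:upper:]')
    for name in "$full" "$short" "$upper"; do
        if [ -e "$dir/$name" ]; then
            echo "$dir/$name"
            return 0
        fi
    done
    return 1
}
```

Run host_name_clash in the directory from which you start AIPS; if it prints a path, rename that file or start AIPS from somewhere else.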

AIPS works, tekserver works, but the TV doesn't come up

This is seen most often on Linux systems. Almost certainly your X Windows configuration is set to use a 16-bit display. The AIPS TV can only support 8 and 24 bit displays (32 and 24 should be equivalent). Type "xdpyinfo | more" and if you see this:

      
        default visual id:  0x20
        visual:
          visual id:    0x20
          class:    TrueColor
          depth:    16 planes
      
... then this is the problem. If it says 8 or 24, then the TV should work. If it says 16, then you should alter your X configuration to allow either 8 or 24 bit display. You should use the supplied tools, e.g. XConfigurator under Red Hat Linux, to do this; only edit the XF86Config file directly if you know exactly what you're doing!
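
The full xdpyinfo listing is long, so a sketch that pulls out just the depth of the default visual may help. The function names are invented here, and the parsing assumes the layout shown above (a "default visual id:" line followed later by "visual id:" and "depth:" lines for each visual).

```sh
# Hypothetical helper: read xdpyinfo output on standard input and print
# the depth (in planes) of the default visual.
default_visual_depth() {
    awk '
        /default visual id:/          { want = $NF }
        /visual id:/ && !/default/    { cur = $NF }
        /depth:/ && want != "" && cur == want { print $2; exit }
    '
}

# Succeed only for the depths the AIPS TV is said above to support.
tv_depth_supported() {
    case "$1" in
        8|24|32) return 0 ;;
        *)       return 1 ;;
    esac
}
```

Typical use would be depth=`xdpyinfo | default_visual_depth` followed by tv_depth_supported "$depth" || echo "AIPS TV will not work at depth $depth".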

My TV still won't come up

Check the messages that show the DISPLAY variable. It may not be what you want, especially on the latest "leopard" Mac operating system. If you just want to run locally, set this variable before starting aips:

            export DISPLAY=localhost:0
or
            setenv DISPLAY localhost:0
in the xterm from which you run aips.

Shared memory id failure on Macs: Invalid Argument

After you follow the instructions below appropriate to your release of the Mac operating system, you must re-boot the computer. The control file for shared memory is read at boot time only. Note that a re-boot is not simply logging the current user out and then back in. You must do a full restart.

The default Mac system limits shared memory pages to 4 Mbytes. When XAS starts it tells you that it is making a screen x pixels by y pixels. The memory you will need is at least 4 x y bytes, but this rounds upward rapidly. For the new large screens this is more than 8 Mbytes. On 10.3 and 10.4 systems, you can change this limit by changing (as root or admin) the rc file in /etc, adjusting the kern.sysv.shm* line to

         #Setting the shared memory to something a bit more reasonable.
            sysctl -w kern.sysv.shmmax=10485760
            sysctl -w kern.sysv.shmmin=1
            sysctl -w kern.sysv.shmmni=32
            sysctl -w kern.sysv.shmseg=8
            sysctl -w kern.sysv.shmall=4096
         
If you are really lucky and have a 30-inch screen (2550 by 1500 pixels) then you will have to make the shmmax line even larger
            sysctl -w kern.sysv.shmmax=16777216
         

Note that these are upper limits, so it does not hurt to set a value that might be larger than necessary for your system. The shmmax must be an integer multiple of the shmall, which must be a power of 2 >= 1024. A 3190 by 958 screen was found to require the larger value above. I think this arises because each row is padded to n times 1024 words (4096 bytes / 4 bytes per word), where n times 1024 has to be > 3190, leading to 4096 words per row. Then 958 * 4096 * 4 bytes = 15695872, or just a bit less than the 16777216.
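
The arithmetic for the 3190 by 958 screen can be reproduced with the sketch below. The function name xas_shm_bytes is invented here, and the padding rule is only the guess described above, not something taken from the XAS source, so treat the result as a rough lower bound for shmmax.

```sh
# Sketch of the estimate above: each row is assumed to be padded to a
# multiple of 1024 words (4 bytes each) that exceeds the screen width.
xas_shm_bytes() {
    width=$1
    height=$2
    words_per_row=$(( ($width / 1024 + 1) * 1024 ))
    echo $(( $words_per_row * $height * 4 ))
}

# xas_shm_bytes 3190 958    prints 15695872
```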

On the latest "leopard", "snow leopard", "lion", and "mountain lion" (X 10.5-10.8) systems, /etc/rc is gone and creating it will have no effect. You need to create an /etc/sysctl.conf file and put the values in it,

            kern.sysv.shmmax=10485760
            kern.sysv.shmmin=1
            kern.sysv.shmmni=32
            kern.sysv.shmseg=8
            kern.sysv.shmall=4096
         
You should use the values you had when you were running tiger. Those could be in /Previous\ System/etc/rc, assuming you have "Previous System". So three different OS upgrades and three different ways to adjust the default shared memory. Note: You will need to reboot the system for the change in shared memory to take place. You can check if the shared memory changes happened by typing "sysctl kern.sysv" in a terminal or xterm window. Look for the kern.sysv.shm* values. If the values have not changed, make sure you haven't inadvertently left in "sysctl -w" in the /etc/sysctl.conf file or mis-typed one of the values. If the /etc/sysctl.conf file is not properly formatted, or shmmax is not an integer multiple of shmall, the shared memory will not be adjusted after the reboot.
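
The formatting mistakes mentioned above can be caught before rebooting with a check along these lines. It is a sketch only: the function name check_shm_conf is invented here, and it tests just the rules stated on this page (no leftover "sysctl -w", shmall a power of 2 >= 1024, shmmax an integer multiple of shmall).

```sh
# Hypothetical helper: sanity-check a sysctl.conf-style file.  Prints a
# line for each problem found and returns non-zero if there is any.
check_shm_conf() {
    conf=${1:-/etc/sysctl.conf}
    rc=0
    if grep 'sysctl -w' "$conf" >/dev/null 2>&1; then
        echo "remove 'sysctl -w' from $conf"
        rc=1
    fi
    # Pick up the last plain "name=value" line for each parameter.
    shmmax=$(sed -n 's/^ *kern\.sysv\.shmmax *= *\([0-9][0-9]*\) *$/\1/p' "$conf" 2>/dev/null | tail -n 1)
    shmall=$(sed -n 's/^ *kern\.sysv\.shmall *= *\([0-9][0-9]*\) *$/\1/p' "$conf" 2>/dev/null | tail -n 1)
    if [ -z "$shmmax" ] || [ -z "$shmall" ]; then
        echo "no usable kern.sysv.shmmax or kern.sysv.shmall line in $conf"
        return 1
    fi
    if [ "$shmall" -lt 1024 ] || [ $(( $shmall & ($shmall - 1) )) -ne 0 ]; then
        echo "shmall=$shmall is not a power of 2 >= 1024"
        return 1
    fi
    if [ $(( $shmmax % $shmall )) -ne 0 ]; then
        echo "shmmax=$shmmax is not an integer multiple of shmall=$shmall"
        rc=1
    fi
    return $rc
}
```

Run check_shm_conf with no argument to test /etc/sysctl.conf; silence means the file passed these particular tests.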

On older Jaguar systems (X 10.2), you can change this limit by changing the SystemTuning file in

 
            /System/Library/StartupItems/SystemTuning
         
Look for the lines
            sysctl -w kern.sysv.shmmax=4194304
            sysctl -w kern.sysv.shmall=1024
         
Change the 4194304 to 10485760 (for 10 Mbytes) and change the 1024 to 4096 (allows 16 Megabytes). You must then re-boot the computer to have these changes take effect.

Shared memory id failure on Solaris: Invalid Argument

If you see this when the system is trying to fire up the AIPS TV (XAS) on a Solaris system, then your X11 display does not support more than the default of 1 Megabyte maximum for shared memory segment. If your monitor displays 1280x1024 or larger, the sizes of the shared memory segments XAS wants will exceed a Megabyte. Solution: have your sysadmin edit /etc/system and put this line somewhere near the end:

set shmsys:shminfo_shmmax=8388608

While there, you may want to also add these if you have more than 64 Mbytes of real memory:

set ufs:ufs_HW=6291456
set ufs:ufs_LW=4194304
set priority_paging=1

Only add the last one if you are running Solaris 7 or later. These three settings will boost your overall AIPS performance.

My Mac will not do TV displays from a compute server

Macs usually run a firewall that blocks outside access to most numbered ports. Traditionally AIPS uses ports 5000, 5002, and 5008 - 5010+NTAPE to let other AIPS computers talk to the local TV, message, TEK, and tape servers. To open these ports:

AIPS works but the XWindows tools can't open display

AIPS uses the display assignment hostname:0 to avoid all sorts of problems especially those related to ssh. To avoid the following problem, AIPS will now (on/after 20 April 2004) use a display of simply :0 when the host, tvhost, and tvdisp variables all point at the same computer. hostname:0 will still be used under other circumstances. This seems to cause problems with some machines, especially laptops. For this to work, the computer must be running networking including a "loopback" function that allows a computer to talk to itself on the internet. This is automatically present when connected to the internet, but is missing on some laptops when not connected. Try

                /sbin/ifconfig -a
which should show (among other sections) a loopback section:
lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:18279881 errors:0 dropped:0 overruns:0 frame:0
          TX packets:18279881 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:1598819254 (1524.7 Mb)  TX bytes:1598819254 (1524.7 Mb)
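
To test for this from a script rather than by eye, something like the following sketch can be used. The function name has_loopback is invented here, and it simply looks for an interface whose flags include UP, LOOPBACK and RUNNING in that order, as in the listing above; other versions of ifconfig may print the flags differently.

```sh
# Hypothetical helper: read "ifconfig -a" output on standard input and
# succeed only if a loopback interface is shown as up and running.
has_loopback() {
    grep 'UP.*LOOPBACK.*RUNNING' >/dev/null
}
```

For example: /sbin/ifconfig -a | has_loopback || echo "no working loopback interface"
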
Our system experts have written a script which they then have executed. The following may be of use:
This is a script we put in /etc/init.d

Then we use the following command to activate it...
  chkconfig --level 2345 loopback on

mind you this is redhat9 specific and also specific to how we do things.
Honestly I don't know why someone wouldn't have a loopback device *shrug*

#!/bin/sh
#
#File: /users/krowe/loopback 
#Author: K. Scott Rowe 
#Time-stamp: <09/22/2003 13:48:44 krowe@rastan.aoc.nrao.edu>
#
# chkconfig: 2 10 90
# description: Activates/Deactivates just the loopback interfaces

. /etc/init.d/functions

# See how we were called.
case "$1" in
  start)
    # bring up loopback interface
    action $"Bringing up loopback interface: " /sbin/ifup ifcfg-lo
    ;;
  stop)
    action $"Shutting down loopback interface: " /sbin/ifdown ifcfg-lo
    ;;
  *)
    echo "Usage: $0 {start|stop}"
    exit 1
    ;;
esac

Error message from ZLOCK claiming "no locks available"

File locking requires that certain daemon processes (services) be running in the operating system. Which processes these are depends on the OS. Linux RedHat 7.2 systems seem to require lockd, statd, and nfsd. More modern Linux distributions have services turned off unless the installer explicitly turns them on. These systems need nfs and nfslock services. Enter

          /etc/init.d/nfs status
          /etc/init.d/nfslock status
     
to see if they are running. The rpc.statd is often the rpc that has been overlooked.

My machine keeps asking for passwords

AIPS uses ssh to run the START_TVSERVERS and START_TPSERVERS scripts on whatever machines are to provide TV and tape services. This can include the machine on which you are currently running, the display server machine in front of which you are sitting, and other compute and tape servers as well. To avoid having to type passwords all the time, you must have ssh setup to allow this. To test this try

            ssh  host  echo "testing ssh"
         
where host is the machine which you want to use. If it asks for a password, you must set it up.

On your home machine, change to your ~/.ssh directory. Run the program

            ssh-keygen  -tdsa
         
entering as many carriage returns as it takes to make it finish. It will create a file called id_dsa.pub. Copy the contents of this file into the file authorized_keys2 in your ~/.ssh directory on the current machine and on any other machine to which you wish to ssh without passwords.

This may still not be enough. ssh frequently requires that you be the only account allowed to write in your home directory (~), to write in your ~/.ssh directory, and to write any of the files in that directory. Furthermore, authorized_keys2 must be readable only by your account. These rules apply to your accounts on all the machines between which you wish to talk.
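
A sketch that checks these permission rules is given below. The function names are invented here, the test reads the mode string printed by ls -ld, and only authorized_keys2 is examined rather than every file in ~/.ssh.

```sh
# Hypothetical helper: succeed if the given path is writable by group or
# others, judged from the "ls -ld" mode string (e.g. drwxr-xr-x).
writable_by_others() {
    mode=$(ls -ld "$1" | cut -c1-10)
    case "$mode" in
        ?????w????|????????w?) return 0 ;;
        *)                     return 1 ;;
    esac
}

# Print a line for each item that breaks the rules described above.
# Usage: check_ssh_perms [HOME_DIRECTORY]
check_ssh_perms() {
    home_dir=${1:-$HOME}
    keys=$home_dir/.ssh/authorized_keys2
    for item in "$home_dir" "$home_dir/.ssh" "$keys"; do
        if [ -e "$item" ] && writable_by_others "$item"; then
            echo "$item is writable by group or others"
        fi
    done
    if [ -e "$keys" ]; then
        mode=$(ls -ld "$keys" | cut -c1-10)
        case "$mode" in
            ????r?????|???????r??) echo "$keys is readable by group or others" ;;
        esac
    fi
}
```

If check_ssh_perms prints anything, commands such as chmod go-w ~ ~/.ssh and chmod 600 ~/.ssh/authorized_keys2 should put things right. Repeat the check on every machine involved.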


Common questions concerning configurations

How should I set up a new, multi-user configuration?

At the NRAO Array Operations Center we have over 100 workstations, mostly Linux with a few old Solaris machines, running AIPS. I will describe what we do:

We have a single AIPS source code area on a file server called /home/AIPS. Multiple AIPS versions and architectures (actually only SUL and LINUX at present) are below this directory point. Every computer that is to run AIPS mounts this directory. Because the 2 architectures we support do not have compatible binary formats, the HOSTS.LIST file in /home/AIPS (or $AIPS_ROOT) has 2 site names NRAOAOC and COAOARN (for Linux). Each computer that is to run AIPS must be listed there. In $NET0 (which is $AIPS_ROOT/DA00), we have a DADEVS.LIST file for Solaris boxes and a DADEVS.LIST.COAOARN file for the other "site". One NETSP file suffices for both. Each AIPS data area is listed in one of the DADEVS files and in the NETSP file.

The DA00 areas for each of the computers in HOSTS.LIST must appear in $NET0 as a directory, but we strongly recommend that the actual directory and files be on each particular host. Thus the files in $NET0 should be link files, e.g.

              cd $NET0
              ln -s /home/primate/AIPS/PRIMATE PRIMATE
for a machine called primate (mine). Note that we do this because Linux file locking over NFS has troubles when the file is not on the machine doing the locking.

At the NRAO AOC, we have a global AIPS data defining place. This must be maintained by the people with root privilege, so I do not recommend that. Instead, in $DATA (same as $AIPS_ROOT/DATA) I would, for the same file locking reasons, make link files to the desired data locations e.g.

             cd $DATA
             ln -s /home/primate/AIPS/PRIMATE_1
             ln -s /home/primate2/AIPS/PRIMATE_2
             ln -s /home/primate/AIPS/PRIMATE_3
             ln -s /home/primate2/AIPS/PRIMATE_4
             ln -s /home/primate3/AIPS/WEEMONKEY_1
Note that this defines 4 aips data areas for primate to use on 2 actual disks and a data area for a machine called weemonkey that does not have a large data disk. The performance on weemonkey will suffer since NFS file reading and writing is relatively slow and may suffer problems with file locking.

If you want there to be a compute server, used by multiple users, then there are several choices. The users log in to SERVER and run aips there. AIPS will attempt to start the TV, message, and TEK servers on the user's desktop if that desktop is listed in HOSTS.LIST. Alternatively, on SERVER the user could specify tv=local:n on the aips command line. This will open a remote window on the user's desktop but can be slow since any expose event forces re-writing the entire window across the LAN. At the NRAO, we have users share the data areas on SERVER, which requires that there be some agreement about who uses what AIPS number(s). Note that this does allow some access to a user's data by another user. Be sure to set the TIMDEST time limits to 365 days or more (in $NET0/NETSP) to avoid potential problems. AIPS does allow a user to set his own AIPS password which can protect his data to some degree.

A trickier method would be for users to have in their home directory a dadevs file named .dadevs which points to separate areas on the server, e.g.

          /home/server/AIPS/user1/SERVER_1     in user1's area
          /home/server/AIPS/user2/SERVER_1     in user2's area
          /home/server/AIPS/user3/SERVER_1     in user3's area
          /home/server/AIPS/user4/SERVER_1     in user4's area
You could vary permissions on these directories to assist in keeping user1 out of user2's data. To avoid warning messages and default TIMDEST limits, every possible data area will have to be listed in $NET0/NETSP.

Note that the host name running AIPS must appear in the directory name at least as seen in the list of names in the alternative dadevs files. What that directory links to can be anything, although that gets confusing in a hurry.

In Socorro, we make a $FITS directory in each host's first data area and the $AIPS_ROOT/AIPSASSN.*SH procedures have an if $SITE is NRAOAOC or COAOARN then change $FITS to point to this. You could add your site to this if or add an if of your own. There is a small chance that the MNJ might overwrite these files so keep a copy somewhere and watch the MNJ reports. In Charlottesville, they have a cron that deletes files from a public FITS area after they are more than n days old. Users are warned about that.

A better alternative for the public FITS problem is to have each user set aside his/her own data area and define e.g. $MYFITS in their login procs. Then run FITAB, FITTP, etc. with OUTFILE='MYFITS:filename which leaves them responsible for their own FITS files, PostScript plots, printouts, etc.

How do I configure a new AIPS host machine?

How do I configure a new AIPS architecture?

How do I configure new AIPS data areas?

How do I configure a new AIPS TV display?

If the system is a full-blown AIPS host, you don't need to if you have already set it up as outlined above. If it is going to be a system that only displays the AIPS TV (XAS) from another AIPS host, edit $AIPS_ROOT/HOSTS.LIST and make an entry for the system with an = sign in column 1.

How do I revise the configuration files for a new AIPS user disk?

Edit $NET0/DADEVS.LIST and $NET0/NETSP. If any host in your system has a host-specific DADEVS.LIST file (in $NET0/$HOST/), and you want the user disk (data area) to be accessible therefrom, edit that file too.

The AIPS distribution is HUGE! What files may I safely strip from the system (or at least gzip) after installation to conserve disk space?

Refer to the end of the AIPS Unix Installation Summary. Before doing anything, BACK UP THE SYSTEM! After that, here are some hints:

How do I set up a remote tape?

The easiest way is to make the remote host an AIPS system. Then make sure that the TPMON daemons are running on it (you can start them on the host by starting an AIPS session there, or by running the START_TPSERVERS script there). If and only if you cannot make it a full AIPS host, then try this:

  1. Make sure the remote system has mounted the AIPS_ROOT area from your original system via NFS.
  2. Make an entry in HOSTS.LIST for the remote system. BE CAREFUL! If the two systems are different endian flavours, you MUST put them in different "sites". Alpha and Intel systems are little endian; Sparc, SGI, IBM RS/6k, PowerPC, Mac OS/X, and HP/Risc systems are all big endian. Little and big endian systems have different byte order and can't interchange AIPS system files.
  3. Create a $AIPS_ROOT/DA00/$HOST area where $HOST expands to the uppercase host name for the machine with the tape drive. Populate it with the contents of the SUL area for big endian systems, and the LINUX area for little endian systems. Or if you already have a TEMPLATE area set up ($AIPS_VERSION/$ARCH/TEMPLATE/) and populated, copy the files from it.
  4. Create $AIPS_VERSION/$ARCH/LOAD where $ARCH is the architecture of the machine on which the tape drive sits, and put the appropriate version of TPMON.EXE there. Get it from our ftp site: use the SUL area for big endian systems, and the LINUX area for little endian systems. Make two hard links to it:

    cd $AIPS_VERSION/$ARCH/LOAD
    ln TPMON.EXE TPMON1
    ln TPMON.EXE TPMON2
    Warning: if you are doing this because you do not have a compiler for the tape-host machine, then the load modules may well be missing necessary run-time libraries as well.

    I think this is all that's needed. You then try it out by starting the daemons on the tape machine:

    	tapehost3% source /AIPS/LOGIN.CSH        (if you use csh/tcsh, or...)
    	wise3$ . /AIPS/LOGIN.SH		         (for bash, ksh, zsh, etc.)
         

    then regardless of shell:

            wise3% /AIPS/START_TPSERVERS -d
         

    You of course replace /AIPS with whatever your $AIPS_ROOT is in the above examples. The "-d" causes the script to be a lot more verbose (with debug messages) and is not required for normal use; the first time though you want to see these to make sure things are working.

Very old notes on specific architectures

This section is still incomplete and is now very very dated. It probably is no longer of use.


Created by Pat Murphy at the suggestion of Joe Mazz at Caltech/IPAC. Thanks, Joe!

Eric W. Greisen