AIT Alpha Farm: DQS job submittal


Steps for DQS job creation, submittal, monitoring, and, if necessary, deletion

   
For simple single commands requiring small memory, you can use fcmd. 
Otherwise, follow these steps:

1) add farm   
     To get access to DQS commands. This is only needed once per login.

2) Create a DQS script (assume it is called myjob)
     An easy way is to use DQSwriter and copy the text of the resulting web page to a file.

3) qsub myjob
     This command submits the DQS script in the file myjob. You may submit several
     jobs in succession if they use different output files, but only one will run at
     a time. Jobs are scheduled fairly: jobs from different users take turns executing.

4) qstat
    Prints a report of all farm jobs. If you don't see your job, it has completed
    execution. Look at the output file (DQSwriter names the output file FARM_OUTPUT).

5) qdel  jobid
    Deletes a job.  The jobid is the number under the request column for your job
    as returned by qstat.
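
    Steps 3 and 4 can be sketched as a small Bourne shell helper. This is only
    an illustration: the function name submit_and_report is hypothetical, it
    assumes "add farm" has already been run in the login session, and it relies
    only on the qsub and qstat commands described above.

```shell
#!/bin/sh
# Hypothetical helper (not part of DQS): submit a DQS script, then print
# the farm job report. Assumes "add farm" was already run in this login.
submit_and_report() {
    script=$1

    # The script file must exist and be readable before it is submitted.
    if [ ! -r "$script" ]; then
        echo "cannot read DQS script: $script" >&2
        return 1
    fi

    # qsub and qstat are only available after "add farm".
    if ! command -v qsub >/dev/null 2>&1; then
        echo "qsub not found; run 'add farm' first" >&2
        return 2
    fi

    qsub "$script" || return 3   # step 3: submit the job
    qstat                        # step 4: report all farm jobs
}
```

    A call such as "submit_and_report myjob" would submit the script and show
    the report. If the job then needs to be removed, read its jobid from the
    request column of the report and run qdel by hand, as in step 5.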


 * Restrictions and gotchas:
   1) The DQS script and all data used must be in AFS files. This means
      your home directory, the scratch disk, a rental AFS locker,
      a private AFS disk, or any subdirectory of these. Not included are
      /tmp, other local disk files, and any data on a private NFS disk.

   2) Your Kerberos tickets must remain valid until the job completes.
      Issuing relogin just before submitting works if the job will complete
      within 8 to 12 hours. Issuing relogin -l 2d will give you two-day
      tickets, which should be sufficient for all farm jobs.
      To see when your tickets will expire, use the Vincent command klist.

   3) Any AFS lockers used must be added. All private lockers, the sas locker, 
      etc. need to be added. Only your home directory gets attached 
      automatically. 
      The way that DQS authenticates to the AFS file system has caused problems 
      for the add command. If the command "add locker_name" fails, use 
        set aenv = `attach -c -n locker_name` ; eval $aenv
      instead. 
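
      Restriction 1 can be checked before submitting with a simple path test.
      The sketch below is an assumption-laden illustration, not a farm tool:
      the function names is_afs_path and check_job_files are hypothetical, and
      it assumes AFS is mounted at /afs. A file that reaches AFS only through
      a symbolic link will be reported as non-AFS by this test.

```shell
#!/bin/sh
# Hypothetical pre-submission check (not a farm command).
# Assumption: AFS files have absolute paths that begin with /afs/.
is_afs_path() {
    case $1 in
        /afs/*) return 0 ;;
        *)      return 1 ;;
    esac
}

# Report every named file whose path is not under /afs.
# Returns 0 if all files pass, 1 otherwise.
check_job_files() {
    status=0
    for f in "$@"; do
        # Turn a relative name into an absolute one using the current directory.
        case $f in
            /*) abs=$f ;;
            *)  abs=$(pwd)/$f ;;
        esac
        if ! is_afs_path "$abs"; then
            echo "not in AFS: $abs" >&2
            status=1
        fi
    done
    return $status
}
```

      For example, "check_job_files myjob input.dat" (input.dat being a
      made-up data file name) would flag either file if the current directory
      is /tmp or another local disk.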
      

Notes:

      $TMPDIR can be used for local temporary disk space. This space is deleted at
      the end of the job. Its size varies from 225 Mbytes to 30 Gbytes.
      See the Queues and Queue Limits page for disk configuration details.
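
      Because $TMPDIR is deleted when the job ends, anything worth keeping has
      to be copied to permanent storage before the script finishes. The Bourne
      shell sketch below shows one way a job body might do this. The function
      name run_in_tmpdir and the file name work.out are hypothetical, and the
      sketch assumes $TMPDIR is already set to an existing directory when the
      job runs.

```shell
#!/bin/sh
# Hypothetical job-body helper: do scratch work in $TMPDIR, keep results elsewhere.
# Usage: run_in_tmpdir RESULT_DIR COMMAND [ARGS...]
run_in_tmpdir() {
    result_dir=$1
    shift

    # Refuse to run unless $TMPDIR names an existing directory.
    if [ -z "${TMPDIR:-}" ] || [ ! -d "$TMPDIR" ]; then
        echo "TMPDIR is not set to a directory" >&2
        return 1
    fi

    # Run the command with $TMPDIR as the working directory and
    # capture its standard output in a scratch file there.
    ( cd "$TMPDIR" && "$@" > work.out ) || return 2

    # Copy the output to permanent storage (an AFS directory on the farm)
    # before the job ends, because $TMPDIR is deleted afterwards.
    cp "$TMPDIR/work.out" "$result_dir/" || return 3
}
```

      Inside a job this could be called as "run_in_tmpdir $HOME/results myprog",
      where both the results directory and myprog are placeholder names.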