WSA Interface Control Document
------------------------------
------------------------------

Document control table(s)

Abstract:

This Interface Control Document (ICD) for the WFCAM Science Archive (WSA)
describes the data flow subsystem interface between the data processing
centre (CASU at the IoA, Cambridge) and the archive centre (WFAU at the
IfA, Edinburgh). Details of the types and specifications of processed
WFCAM data to be transferred, along with the transfer protocols
(file naming, transfer method and procedure), are given. The details of
this ICD have been agreed between CASU and WFAU; the formalities are 
being overseen by the JAC and the VISTA Data Flow System (VDFS) project.

Table of contents


1.0 INTRODUCTION:

1.1 Scope

This Interface Control Document (ICD) is intended to be a formal interface
control agreement between the WFCAM data processing centre at the Cambridge 
Astronomy Survey Unit (CASU) and the archive centre at the Wide Field 
Astronomy Unit (WFAU) in Edinburgh. The processing centre/archive centre
interface is the final subsystem interface in the WFCAM data flow chain, and 
is subject to the rules laid out herein. The ICD concerns WFCAM data only;
all other data ingested into the WFCAM Science Archive (WSA) are outside the
scope of interface control (the WSA will also ingest publicly released data
products, eg. SDSS and 2MASS, from other non-CASU sources).

The ICD is meant to be a technical reference: its intended audience is
software engineers and scientists working on processing and archiving in
the data flow. It takes the form of a formal agreement between CASU and
WFAU, but must also satisfy other external bodies, namely JAC, the UKIDSS
survey science consortium and the VISTA Data Flow System. The ICD is,
therefore, a major component of the documentation for the WSA critical
design review.

1.2 Overview

This document is structured as follows. In Section 2, we describe the
fundamental rules that the interface will adhere to, including a
statement of the primary data format, FITS. Then, in Section 3, we
describe the top-level specifications for data that will be transferred
between Cambridge and Edinburgh, including a description of FITS
conventions, keywords, file naming conventions, units, systems of
physical quantities and consistent unified column descriptors. Section 4
goes on to describe in explicit detail the data structures that will be
transferred. Then, Section 5 describes the transfer methods and procedures
that will achieve the data flow from Cambridge to Edinburgh. Backups and
other security issues are dealt with in Section 6, and finally a summary
is presented in Section 7.

1.3 Reference docs

Generally, this document is modelled on the ESO Data Interface Control
Document [1], and with the exception of the ESO hierarchical FITS
keyword definition, follows as closely as possible the specifications
provided therein. A data flow system overview is provided in [2].
Fundamental metadata (ie. FITS frame headers and keywords) are described
in [3]. The JAC/CASU interface is defined and described in
[4]; CASU pipeline processing is described in [5]. Relevant documents
within the WSA project at WFAU, including the subset presented for the
purposes of the archive CDR, are available from [6].


2.0 FUNDAMENTALS

2.1 WFAU Ingest

The WSA at WFAU will ingest WFCAM data from CASU only; there will be no
transfer of WFCAM data between JAC and WFAU for example.

2.2 Data transfer method

The WSA will ingest data via the internet; tapes and/or "pluggable" disks
will not be employed. The implications for required network bandwidth are
discussed in [7].

2.3 Format

Data output from CASU will be provided in standard FITS format (as 
specified in [8]) only. Data will not be expressed in any "hierarchical"
system, eg. ESO hierarchical FITS, or the UK Starlink Hierarchical Data
Structure format (NDFs). The FITS standard is mature, universally accepted
and ideal for transporting both bulk pixel and catalogue data.

2.4 Content

Data transferred from CASU will consist of processed pixels (where the
processing steps are specified by the observing protocol used), confidence
maps, derived source catalogues and associated description data; no raw
pixel data will be transferred to (or held in) the WSA.  Where
irreversible stages such as stacking or mosaicing have been done as part
of the reduction procedure, the individual component images and catalogues
will also be transferred.


3.0 DATA SPECIFICATION

3.1 Preliminaries

Processed frames will be stored in FITS format, following the guidelines
set out in [1]:

  o  The images comprising a WFCAM processed frame will be stored in
     different image extensions of the same FITS container file (a
     multi-extension FITS, or MEF, file); data pixels belonging to one
     image will be stored in one image extension (guideline-2).

  o  The primary data array in the MEF file will be empty (guideline-3)

  o  Keywords describing the dataset in the MEF file as a whole will be
     written into the primary header, while keywords that are related to
     the data in a particular extension will be written into the HDU of
     that extension (guideline-5)

Derived source catalogues corresponding to each image extension will be
written as FITS binary tables in extensions of a single, separate
MEF file with a similarly empty primary array.  The headers for the 
catalogue MEF will contain all the information of image MEF headers plus 
ancillary processing keywords and values.


3.2 General FITS keywords

Keywords will follow the standards set out in [1] and [8] as described
(for WFCAM data) in [3]. All keywords and associated values written to
the HDS container files produced by the WFCAM DAS must be propagated
through the JAC/CASU interface, through the data processing pipeline
and into the WSA.

The first keyword in any extension HDU must be XTENSION, and its value
will take on only 'IMAGE   ' or 'BINTABLE'; the EXTNAME keyword will be
used to identify the extension with a particular device detector. Binary
tables will have every column described by keywords TTYPEn, TFORMn
and TUNITn (see later).
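
The layout rules of Section 3.1 and the XTENSION rule above lend themselves
to an automatic check at ingest. The following is a minimal sketch using
only the Python standard library, not a description of any existing CASU or
WFAU tool; the function names (card, header_block, read_header, data_bytes,
check_mef) are illustrative assumptions, and the card parser is simplified
(it assumes no "/" character inside string values).

```python
import io

BLOCK = 2880  # FITS files are built from 2880-byte blocks
CARD = 80     # each header card is 80 ASCII characters


def card(keyword, value):
    """Format one fixed-format header card (strings quoted, others right-justified)."""
    if isinstance(value, bool):
        text = ("T" if value else "F").rjust(20)
    elif isinstance(value, str):
        text = "'" + value.ljust(8) + "'"
    else:
        text = str(value).rjust(20)
    return (keyword.ljust(8) + "= " + text).ljust(CARD)


def header_block(cards):
    """Join cards, append END and pad with blanks to a whole number of blocks."""
    raw = "".join(cards) + "END".ljust(CARD)
    raw += " " * (-len(raw) % BLOCK)
    return raw.encode("ascii")


def read_header(stream):
    """Read one header; return an ordered list of (keyword, value string) pairs."""
    cards = []
    while True:
        block = stream.read(BLOCK)
        if len(block) < BLOCK:
            return None  # end of file
        for i in range(0, BLOCK, CARD):
            line = block[i:i + CARD].decode("ascii")
            key = line[:8].strip()
            if key == "END":
                return cards
            if line[8:10] == "= ":
                cards.append((key, line[10:].split("/")[0].strip()))


def data_bytes(hdr):
    """Size of the data unit that follows a header, padded to whole blocks."""
    h = dict(hdr)
    naxis = int(h["NAXIS"])
    if naxis == 0:
        return 0
    npix = 1
    for n in range(1, naxis + 1):
        npix *= int(h[f"NAXIS{n}"])
    nbytes = abs(int(h["BITPIX"])) // 8
    size = nbytes * int(h.get("GCOUNT", 1)) * (int(h.get("PCOUNT", 0)) + npix)
    return size + (-size % BLOCK)


def check_mef(stream):
    """Return a list of problems; an empty list means the layout rules hold."""
    problems = []
    primary = read_header(stream)
    if int(dict(primary)["NAXIS"]) != 0:
        problems.append("primary data array is not empty")
    stream.seek(data_bytes(primary), io.SEEK_CUR)
    index = 0
    while (hdr := read_header(stream)) is not None:
        index += 1
        first_key, first_val = hdr[0]
        if first_key != "XTENSION" or first_val not in ("'IMAGE   '", "'BINTABLE'"):
            problems.append(f"extension {index}: bad first card {first_key}")
        stream.seek(data_bytes(hdr), io.SEEK_CUR)
    return problems


# A tiny in-memory MEF: empty primary plus one 4x2 single-precision image.
primary = header_block([card("SIMPLE", True), card("BITPIX", 8),
                        card("NAXIS", 0), card("EXTEND", True)])
ext = header_block([card("XTENSION", "IMAGE"), card("BITPIX", -32),
                    card("NAXIS", 2), card("NAXIS1", 4), card("NAXIS2", 2),
                    card("PCOUNT", 0), card("GCOUNT", 1)])
pixels = bytes(4 * 2 * 4)
pixels += bytes(-len(pixels) % BLOCK)
print(check_mef(io.BytesIO(primary + ext + pixels)))
```

In practice a maintained FITS library would be preferable to a hand-written
parser; the sketch is only meant to make the rules concrete.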

World Co-ordinate System (WCS; ie. astrometric) information will be
propagated using a set of keywords described in the latest FITS WCS 
proposals [9,10] by Greisen and Calabretta. 

Error and statistics information will be expressed following the
convention described in [1], whereby the quantity in question has its
statistical auxiliary expressed via a keyword containing the first five
characters of the root name (or fewer if necessary) plus a three character
suffix ERR (for a unit standard deviation), MIN/MAX, RMS, AVG etc. 
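
As an illustration of this naming rule, the short sketch below builds the
statistics keyword from a root name and a suffix. The function name
stat_keyword is an assumption made for this example only; the root names
used are taken from the column names proposed in Section 4.2.2.

```python
def stat_keyword(root, suffix):
    """Build a statistics keyword: up to five characters of the root plus a suffix."""
    suffix = suffix.upper()
    if len(suffix) != 3:
        raise ValueError("suffix must have exactly three characters, eg. ERR, MIN, RMS")
    # Slicing copes with roots shorter than five characters ("or fewer if necessary").
    return root.upper()[:5] + suffix


for root in ("XCOORD", "PKHEIGHT", "XPSF"):
    print(root, "->", stat_keyword(root, "ERR"))
```

The result never exceeds the eight characters allowed for a FITS keyword,
since five root characters plus three suffix characters is the maximum.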


3.3 Physical units

Physical units will comply with SI units and their derivatives with a few
exceptions for astronomical convenience (see [1] Section 9, Table 14).

Celestial co-ordinates will be expressed in the reference system described
by primary HDU keyword RADECSYS; it is anticipated that this will have
value 'FK5' (equinox J2000.0, closely aligned with but formally distinct
from the Hipparcos/Tycho ICRS) over the lifetime of WFCAM, but this may of
course change for VISTA.

3.4 File naming conventions

[NB: this is a proposal from the WFAU end... let me know what you think
or if you'd like something else]

Files will be named according to the prescription given in [1]. In addition
to the requirements detailed there (Section 11.1.2), for processed WFCAM 
data it is necessary 

  o  to choose the timestamp for a superframe made up from the combination
     of several exposures with different start times;

  o  to distinguish between corrected superframes and object catalogues
     derived from them.
[NB: unless cats are written as extensions into the superframe MEF...]

FITS files will be named as follows: r.<TYPE>.YYYY-MM-DDThh.mm.ss.fits
where <TYPE> is UKWFCPIX or UKWFCOBJ and the time stamp is taken as being
the earliest start time from the set of interleave microstep exposures.
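
A minimal sketch of this naming rule follows. It assumes that the type code
(UKWFCPIX or UKWFCOBJ) sits between the leading "r." and the time stamp, and
that exposure start times are available as datetime objects; the function
name wsa_filename is hypothetical.

```python
from datetime import datetime

VALID_TYPES = ("UKWFCPIX", "UKWFCOBJ")  # pixel data and object catalogues


def wsa_filename(file_type, start_times):
    """Name a superframe product after the earliest component exposure."""
    if file_type not in VALID_TYPES:
        raise ValueError("file type must be one of " + ", ".join(VALID_TYPES))
    earliest = min(start_times)
    stamp = earliest.strftime("%Y-%m-%dT%H.%M.%S")
    return "r." + file_type + "." + stamp + ".fits"


# Two microstep exposures taken ten seconds apart (illustrative times).
starts = [datetime(2003, 3, 5, 12, 37, 41), datetime(2003, 3, 5, 12, 37, 31)]
print(wsa_filename("UKWFCPIX", starts))
print(wsa_filename("UKWFCOBJ", starts))
```

Because both names share the same time stamp, a superframe and the catalogue
derived from it can be paired by replacing the type code alone.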

[NB: this scheme does not allow for different filenames if/when (!) pipeline
and/or source extraction are rerun over the same data...!]

4.0 DETAILED DATA SPECIFICATION

4.1 Data obtained at the time of observation

Observations will be described via the keywords OBSERVER, USERID, OBSREF,
PROJECT, MSBID and OBJECT.

Instrumental characteristics, set-ups and parameters will be described by
keywords as detailed in [3], including instrument detector configuration
(eg. array used DETECTOR; number of integrations NINT), detector
controller information (eg. camera read mode READMODE; read-out application
CAPPLICN), optical configuration (eg. filter name FILTER; base focus
position FOC_MM) and observing conditions/environment (eg. air temperature
AIRTEMP; relative humidity HUMIDITY; opacity data CSOTAU).

All these FITS keys will be propagated through the data flow chain from the
DAS to the WSA.
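
The propagation requirement can be tested mechanically by comparing the
header written at the start of the chain with the header that arrives at
the archive. The sketch below models headers as plain Python dictionaries
(an assumption made for brevity) and uses a hypothetical function name,
lost_keywords; the keyword list is the observation set named above.

```python
OBSERVATION_KEYS = ("OBSERVER", "USERID", "OBSREF", "PROJECT", "MSBID", "OBJECT")


def lost_keywords(das_header, archive_header, required=OBSERVATION_KEYS):
    """Return (keyword, reason) pairs for keys dropped or altered along the chain."""
    problems = []
    for key in das_header:
        if key not in archive_header:
            problems.append((key, "missing"))
        elif archive_header[key] != das_header[key]:
            problems.append((key, "changed"))
    for key in required:
        if key not in das_header:
            problems.append((key, "absent at source"))
    return problems


# Illustrative values only.
das = {"OBSERVER": "abc", "USERID": "u1", "OBSREF": "r1",
       "PROJECT": "p1", "MSBID": "m1", "OBJECT": "field 1"}
archive = dict(das)
del archive["MSBID"]
print(lost_keywords(das, archive))
```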

4.2 Data products (ie. derived data)

4.2.1 Corrected pixel data

The CASU pipeline will instrumentally correct WFCAM pixels into an 
interleaved superframe product that is instrument-signature free. The
reduction steps involved in doing so, the derived astrometric and
(first-cut) photometric calibrations and resulting DQC information
generated will be propagated into the WSA using the FITS keys
described in Table 1:

[NB: let me know what you think of this lot... is it possible to get
all of this info into the FITS headers during pipeline processing?:]

Keyword   Example   Primary HDU (p)     Data    [units] and
          value     or extension (e)    type    description

Reduction step keys:

PIPEVERS    1.0          p              char    Pipeline software version no.
LINCOR       T           p             logical  Linearity correction done?
RESETCOR     T           p               "      Reset correction done?
DARKFRAM                 p              char    Library dark frame used
BADPMASK                 p              char    Library bad pixel mask used
FLTFIELD                 p              char    Library flatfield frame used
DEFRINGE                 p              char    Library defringe frame used
SKYFRAME   NONE          p              char    Master sky subtraction frame
SEXTRVER    2.3          p              real    Source extraction s/w version

Parameter keys associated with object extraction:

RCORE       1.2          p              real    Core radius used in flux meas.
.
.
.

Derived astrometric calibration keys:

ASTREDVN    1.5          p              real    Astrometric s/w version used
REFCAT      UCAC         p              char    Astrometric reference catalogue
RADECSYS    FK5          p              char    Ref. system of astrometric red.
EQUINOX     2000.0       p             double   [years] Equinox of co-ords
REFSTARS    57           e               int    No. of ref stars used 
CTYPE1      RA---ZPN     e              char    Projection type
CTYPE2      DEC--ZPN     e              char        "       "
CRPIX1      -1024.5      e             double   [pixel] Ref pixel for axis 1
CRPIX2      +1024.5      e             double   [pixel] Ref pixel for axis 2
CRVAL1      +60.0        e             double   [degree] RA at ref pixel
CRVAL2      +25.0        e             double   [degree] Dec at ref pixel
CD1_1                    e             double   Co-ordinate xformation matrix
CD1_2                    e             double   Co-ordinate xformation matrix
CD2_1                    e             double   Co-ordinate xformation matrix
CD2_2                    e             double   Co-ordinate xformation matrix
PV1_1                    e             double   First-order radial dist. coeff
PV1_3                    e             double   Third-order radial dist. coeff

Derived photometric calibration (first-cut) keys:

ZEROPNT      19.0        e              real    Photometric zeropoint
.
.
.

Our current names for these, yours or new ones
----------------------------------------------

Derived DQC parameter keys:

SKYLEVEL   10000.0       e    real    [ADU] Robust median sky level
SKYNOISE     100.0       e    real    [ADU] Robust sky noise level
THRESHOL     200.0       e    real    [ADU] Detection threshold used
SEEING         1.7       e    real    [pixels]  Average stellar FWHM
ELLIPTIC      0.05       e    real    Avg. point-source ellipticity
SATURATE   60000.0       e    real    [ADU] Saturation level

APCORnn      0.213       e    real    [mags] Stellar aperture corrections
STDCRMS      0.458       e    real    [Arcsec]  Astrometric fit error
NUMBRMS        210       e     int    No. of astrometric standards used
PERCORR      0.000       e    real    [mags] Sky calibration correction
EXTINCT      0.011       e    real    [mags] Extinction for unit airmass
MAGZPT       22.64       e    real    [mags] Photometric ZP not inc. extinct
                                             or inc. unit airmass extinct
                                             whatever you fancy.
MAGZRR        0.02       e    real    [mags] Photometric ZP error
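
To illustrate how the astrometric keys of Table 1 are used, the sketch below
applies the linear step of the FITS WCS formalism [9]: pixel offsets from the
reference pixel (CRPIXn) are multiplied by the CD matrix to give intermediate
world co-ordinates. The non-linear ZPN projection and the rotation to RA/Dec
that follow are not attempted here. The function name and the CD matrix
values are assumptions chosen for illustration, not WFCAM calibration values.

```python
def pixel_to_intermediate(p1, p2, crpix, cd):
    """Linear WCS step: offsets from the reference pixel times the CD matrix."""
    d1 = p1 - crpix[0]
    d2 = p2 - crpix[1]
    x = cd[0][0] * d1 + cd[0][1] * d2
    y = cd[1][0] * d1 + cd[1][1] * d2
    return x, y


# CRPIX values as in the Table 1 examples; CD matrix is hypothetical
# (degrees per pixel, no rotation, RA axis flipped).
crpix = (-1024.5, 1024.5)
cd = [[-1.0e-4, 0.0],
      [0.0, 1.0e-4]]
print(pixel_to_intermediate(0.5, 1024.5, crpix, cd))
```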


4.2.2 Source catalogue attributes

The standard set of CASU source detection parameters can be found in [5].
Table 2 lists the corresponding FITS binary table details for each
attribute:

As Jim says, we currently store all these as reals for simplicity, and
currently have all the TTYPE keywords set in the following way:

PCOUNT  =                    0 / size of special data area                   
GCOUNT  =                    1 / one data group (required keyword)            
TFIELDS =                   32 / number of fields in each row                 
TTYPE1  = 'No.     '           / label for field   1                          
TFORM1  = '1E      '           / data format of field: 4-byte REAL            
TTYPE2  = 'Isophotal_flux'     / label for field   2                          
TFORM2  = '1E      '           / data format of field: 4-byte REAL            
TUNIT2  = 'Counts  '           / physical unit of field                       
TTYPE3  = 'Total_flux'         / label for field   3                          
TFORM3  = '1E      '           / data format of field: 4-byte REAL            
TUNIT3  = 'Counts  '           / physical unit of field                       
TTYPE4  = 'Core_flux'          / Fitted flux within 1x core radius            
TFORM4  = '1E      '           / data format of field: 4-byte REAL            
TUNIT4  = 'Counts  '           / physical unit of field                       
TTYPE5  = 'X_coordinate'       / label for field   5                          
TFORM5  = '1E      '           / data format of field: 4-byte REAL            
TUNIT5  = 'Pixels  '           / physical unit of field                       
TTYPE6  = 'Y_coordinate'       / label for field   6                          
TFORM6  = '1E      '           / data format of field: 4-byte REAL            
TUNIT6  = 'Pixels  '           / physical unit of field                       
TTYPE7  = 'Gaussian_sigma'     / label for field   7                          
TFORM7  = '1E      '           / data format of field: 4-byte REAL            
TUNIT7  = 'Pixels  '           / physical unit of field                       
TTYPE8  = 'Ellipticity'        / label for field   8                          
TFORM8  = '1E      '           / data format of field: 4-byte REAL            
TTYPE9  = 'Position_angle'     / label for field   9                          
TFORM9  = '1E      '           / data format of field: 4-byte REAL            
TUNIT9  = 'Degrees '           / physical unit of field                       
TTYPE10 = 'Peak_height'        / label for field  10                          
TFORM10 = '1E      '           / data format of field: 4-byte REAL            
TUNIT10 = 'Counts  '           / physical unit of field                       
TTYPE11 = 'Areal_1_profile'    / label for field  11                          
TFORM11 = '1E      '           / data format of field: 4-byte REAL            
TUNIT11 = 'Pixels  '           / physical unit of field                       
TTYPE12 = 'Areal_2_profile'    / label for field  12                          
TFORM12 = '1E      '           / data format of field: 4-byte REAL            
TUNIT12 = 'Pixels  '           / physical unit of field                       
TTYPE13 = 'Areal_3_profile'    / label for field  13                          
TFORM13 = '1E      '           / data format of field: 4-byte REAL           
TUNIT13 = 'Pixels  '           / physical unit of field                       
TTYPE14 = 'Areal_4_profile'    / label for field  14                          
TFORM14 = '1E      '           / data format of field: 4-byte REAL            
TUNIT14 = 'Pixels  '           / physical unit of field                       
TTYPE15 = 'Areal_5_profile'    / label for field  15                          
TFORM15 = '1E      '           / data format of field: 4-byte REAL            
TUNIT15 = 'Pixels  '           / physical unit of field                      
TTYPE16 = 'Areal_6_profile'    / label for field  16                          
TFORM16 = '1E      '           / data format of field: 4-byte REAL            
TUNIT16 = 'Pixels  '           / physical unit of field                       
TTYPE17 = 'Areal_7_profile'    / label for field  17                          
TFORM17 = '1E      '           / data format of field: 4-byte REAL            
TUNIT17 = 'Pixels  '           / physical unit of field                       
TTYPE18 = 'Areal_8_profile'    / label for field  18                          
TFORM18 = '1E      '           / data format of field: 4-byte REAL            
TUNIT18 = 'Pixels  '           / physical unit of field                       
TTYPE19 = 'Core1_flux'         / Fitted flux within 1/2x core radius          
TFORM19 = '1E      '           / data format of field: 4-byte REAL            
TUNIT19 = 'Counts  '           / physical unit of field                       
TTYPE20 = 'Core2_flux'         / Fitted flux within sqrt(2)x core radius      
TFORM20 = '1E      '           / data format of field: 4-byte REAL       
TUNIT20 = 'Counts  '           / physical unit of field                       
TTYPE21 = 'Core3_flux'         / Fitted flux within 2x core radius            
TFORM21 = '1E      '           / data format of field: 4-byte REAL            
TUNIT21 = 'Counts  '           / physical unit of field                       
TTYPE22 = 'Core4_flux'         / Fitted flux within 2sqrt(2)x core radius     
TFORM22 = '1E      '           / data format of field: 4-byte REAL            
TUNIT22 = 'Counts  '           / physical unit of field                       
TTYPE23 = 'RA      '           / label for field  23                          
TFORM23 = '1E      '           / data format of field: 4-byte REAL            
TUNIT23 = 'RADIANS '           / physical unit of field                       
TTYPE24 = 'DEC     '           / label for field  24                          
TFORM24 = '1E      '           / data format of field: 4-byte REAL            
TUNIT24 = 'RADIANS '           / physical unit of field                       
TTYPE25 = 'Classification'     / label for field  25                          
TFORM25 = '1E      '           / data format of field: 4-byte REAL            
TUNIT25 = 'Flag    '           / physical unit of field                       
TTYPE26 = 'Statistic'          / label for field  26                          
TFORM26 = '1E      '           / data format of field: 4-byte REAL            
TUNIT26 = 'N-sigma '           / physical unit of field                        
TTYPE27 = 'Blank   '           / label for field  27                           
TFORM27 = '1E      '           / data format of field: 4-byte REAL            
TUNIT27 = 'Blank   '           / physical unit of field                       
TTYPE28 = 'Blank   '           / label for field  28                           
TFORM28 = '1E      '           / data format of field: 4-byte REAL             
TUNIT28 = 'Blank   '           / physical unit of field                       
TTYPE29 = 'Blank   '           / label for field  29                           
TFORM29 = '1E      '           / data format of field: 4-byte REAL             
TUNIT29 = 'Blank   '           / physical unit of field                        
TTYPE30 = 'Blank   '           / label for field  30                          
TFORM30 = '1E      '           / data format of field: 4-byte REAL            
TUNIT30 = 'Blank   '           / physical unit of field                       
TTYPE31 = 'Blank   '           / label for field  31                           
TFORM31 = '1E      '           / data format of field: 4-byte REAL            
TUNIT31 = 'Blank   '           / physical unit of field                       
TTYPE32 = 'Blank   '           / label for field  32                           
TFORM32 = '1E      '           / data format of field: 4-byte REAL             
TUNIT32 = 'Blank   '           / physical unit of field                       

cf. to your 

No.  Name                   TTYPE     TFORM  TUNIT  

 1   Seq. no.               SEQNUM      1J     -
 2   Isophotal flux         ISOPHFLX    1E   ADU
 3   X co-ordinate          XCOORD      1E   pixels
 4   Error in X             XCOORERR    1E   pixels
 5   Y co-ordinate          YCOORD      1E   pixels
 6   Error in Y             YCOORERR    1E   pixels
 7   Gaussian sigma         GAUSIGMA    1E   pixels
 8   Ellipticity            ELLIPTIC    1E   -
 9   Position angle         POSANGLE    1E   degrees
10   Areal profile  1       AREAPRO1    1E   pixels
.
.
.
17   Areal profile  8       AREAPRO8    1E   pixels
18   Peak height            PKHEIGHT    1E   ADU
19   Peak height error      PKHEIERR    1E   ADU
20   Core flux              COREFLUX    1E   ADU
21   Core flux error        COREFERR    1E   ADU
22   Core 1 flux            CFL01       1E   ADU
23   Core 1 flux error      CFLERR01    1E   ADU
.
.
.
42   Core 12 flux           CFL12       1E   ADU
43   Core 12 flux error     CFLERR12    1E   ADU
44   Petrosian radius       PETRORAD    1E   pixels
45   Kron radius            KRONRAD     1E   pixels
46   FWHM radius            FWHMRAD     1E   pixels
47   Petrosian flux         PETFLUX     1E   ADU
48   Petrosian flux error   PETFLERR    1E   ADU
49   Kron flux              KROFLUX     1E   ADU
50   Kron flux error        KROFLERR    1E   ADU
51   FWHM flux              FWHFLUX     1E   ADU
52   FWHM flux error        FWHFLERR    1E   ADU
53   Error bit flag         PROFLAGS    1J   
54   Sky level              SKYLEVEL    1E   ADU
55   Sky variance           SKYVAR      1E   ADU
56   Child/parent           BLENDING    1J   
57   Right Ascension        RA          1D   degrees
58   Declination            DEC         1D   degrees
59   Classification         ICLASS      1J 
60   Profile statistic      PROFSTAT    1E
61   PSF flux               PSFFLUX     1E   ADU
62   PSF flux error         PSFFLERR    1E   ADU
63   PSF fitted X           XPSF        1E   pixels
64   PSF fitted X error     XPSFERR     1E   pixels
65   PSF fitted Y           YPSF        1E   pixels
66   PSF fitted Y error     YPSFERR     1E   pixels   
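
For comparison of the two layouts, the sketch below derives the row length
in bytes from the TFORM codes and unpacks one row with the Python struct
module. It relies on the FITS conventions that binary data are big-endian
and that J, E and D are 4-byte integers, 4-byte reals and 8-byte reals
respectively. The rows elided from the proposed table (11-16 and 24-41) are
assumed to be 1E like their neighbours; the function name row_struct is
illustrative.

```python
import struct

# TFORM codes used in this section mapped to struct codes.
TFORM_TO_STRUCT = {"J": "i", "E": "f", "D": "d"}


def row_struct(tforms):
    """Build a struct.Struct describing one table row from its TFORM strings."""
    fmt = ">"  # FITS binary tables are big-endian, with no alignment padding
    for tform in tforms:
        tform = tform.strip()
        repeat, code = tform[:-1] or "1", tform[-1]
        fmt += repeat + TFORM_TO_STRUCT[code]
    return struct.Struct(fmt)


# Current layout: 32 columns, all single-precision reals.
current = ["1E"] * 32

# Proposed layout: 66 columns, integers at 1, 53, 56, 59 and doubles at 57, 58.
proposed = (["1J"] + ["1E"] * 51 + ["1J"] + ["1E"] * 2 + ["1J"]
            + ["1D"] * 2 + ["1J"] + ["1E"] * 7)

print(len(current), row_struct(current).size)    # 32 columns, 128 bytes per row
print(len(proposed), row_struct(proposed).size)  # 66 columns, 272 bytes per row

# Round trip of a three-column row (sequence number, flux, RA in degrees).
small = row_struct(["1J", "1E", "1D"])
packed = small.pack(1, 1234.5, 60.0)
print(small.unpack(packed))
```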



[NB: may need additional celestial PA as well as item 9 (position angle wrt 
X axis) for dumb overlay progs that can't understand WCS; do you agree that
57/58 (RA/Dec) need to be doubles?]
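
On the question of doubles for RA/Dec, a quick numerical check may help. A
4-byte real carries a 24-bit significand, so a value between 256 and 512
degrees is quantised in steps of 2^-15 degree (about 0.11 arcsec), giving a
rounding error of up to about 0.055 arcsec. The sketch below measures the
storage error for an arbitrary example RA; the function names are
illustrative assumptions.

```python
import struct

ARCSEC_PER_DEG = 3600.0


def roundtrip(value, code):
    """Store a value as a big-endian 'f' (TFORM 1E) or 'd' (TFORM 1D) and read it back."""
    return struct.unpack(">" + code, struct.pack(">" + code, value))[0]


def storage_error_arcsec(ra_deg, code):
    """Absolute error, in arcsec, introduced by storing an RA given in degrees."""
    return abs(roundtrip(ra_deg, code) - ra_deg) * ARCSEC_PER_DEG


ra = 359.123456789  # arbitrary example value in degrees
print("1E error [arcsec]:", storage_error_arcsec(ra, "f"))
print("1D error [arcsec]:", storage_error_arcsec(ra, "d"))
```

Whether an error of this size matters depends on the astrometric accuracy
being aimed for, but it is avoided entirely for this purpose by using 1D.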

4.2.3 Other data product conventions

  - checksums for data verification?

  - allowed/logged ranges for attributes, again for verification?

  - convention for null or n/a values?


5.0 TRANSFER METHODS & PROCEDURES

5.1 Methods

Transfer will be via the internet using standard methods.  The data to
be transferred will reside in Cambridge on specific RAID arrays attached
to a Linux PC cluster.  WFAU will have an account on this system.
Directories of processed nights' data will be set up as the pipeline is
running.  While the processing is still running, a directory lock file will
be used to denote the in-progress operations.  After completion the lock
file will be unset/removed, enabling a remotely controlled browser script
to automatically initiate data transfer to Edinburgh.  Tests between
different locations in the UK during the day give sustained data transfer
rates of 4 Mbyte/s and have been used to copy ~100 Gbytes of data between
sites in 5-6 hours.
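
The relation between data volume, sustained rate and elapsed time can be
checked with a few lines of arithmetic. The sketch below assumes decimal
units (1 Gbyte = 1000 Mbyte) and uses hypothetical function names. A
strictly sustained 4 Mbyte/s gives roughly 7 hours for 100 Gbytes, while
5-6 hours corresponds to about 4.6-5.6 Mbyte/s, so the figures quoted above
are best read as approximate.

```python
def transfer_hours(gbytes, mbytes_per_s):
    """Elapsed time in hours to move a volume at a sustained rate."""
    seconds = gbytes * 1000.0 / mbytes_per_s
    return seconds / 3600.0


def required_rate(gbytes, hours):
    """Sustained rate in Mbyte/s needed to move a volume in a given time."""
    return gbytes * 1000.0 / (hours * 3600.0)


print(round(transfer_hours(100, 4), 1))  # 6.9
print(round(required_rate(100, 6), 1))   # 4.6
print(round(required_rate(100, 5), 1))   # 5.6
```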

Alternative transfer methods we have tested include scp, grid-ftp and
sftp ... (ftp has been dropped since it is not secure).


5.2 Procedure

  - location of data is guaranteed by the pipeline and will be in an
    observation-date-driven directory structure to which WFAU will have
    secure direct access

  - "handshaking", eg. notification of readiness, will be achieved using a
    lockfile system as outlined above; verification of successful transfer
    by no. and size of files transferred (eg. scp verifies as it goes, so
    if the preceding two are OK everything is fine, n'est-ce pas?)
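
The handshaking and verification steps could be sketched as below. This is
an illustration under stated assumptions rather than the agreed procedure:
the lock-file name PROCESSING.lock and all function names are hypothetical,
and the check covers only the number and size of files, as proposed above.

```python
import os

LOCK_NAME = "PROCESSING.lock"  # hypothetical name; the ICD does not fix one


def night_ready(night_dir):
    """A night directory is ready once it exists and holds no lock file."""
    lock = os.path.join(night_dir, LOCK_NAME)
    return os.path.isdir(night_dir) and not os.path.exists(lock)


def manifest(directory):
    """Map each regular file name in a directory to its size in bytes."""
    return {entry.name: entry.stat().st_size
            for entry in os.scandir(directory) if entry.is_file()}


def verify_transfer(source_manifest, dest_dir):
    """Return names that are missing, unexpected or of the wrong size."""
    received = manifest(dest_dir)
    bad = {name for name, size in source_manifest.items()
           if received.get(name) != size}
    bad |= set(received) - set(source_manifest)
    return sorted(bad)
```

A transfer script would wait until night_ready() is true, record
manifest() at the Cambridge end, copy the files, and accept the night only
if verify_transfer() returns an empty list in Edinburgh.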


5.3 Updates

  - reruns in case of bug fixes, improvements in instrumental correction,
    improvements in source extraction: any additional interface issues
    resulting from this possibility/likelihood(!) ?


6.0 BACKUPS AND OTHER SECURITY ISSUES

  - raw data will be held online in Cambridge as the primary UK backup.
    Raw data will also be archived/stored at the JAC

  - security ......... secure transfer, restricted access to computers
    whatever....... firewalls......

7.0 SUMMARY


REFERENCES

[1] ESO Data Interface Control Document, GEN-SPE-ESO-19940-794/2.0
    http://archive.eso.org/DICB/dic-2.0/dic-2.0.4.pdf

[2] VDFS document...?

[3] ATC WFCAM HDS container and FITS headers, WFCAM project Document No. ?

[4] JAC-CASU Interface Control Document,
    http://www.jach.hawaii.edu/JACpublic/UKIRT/instruments/wfcam/ICD/

[5] WFCAM Pipeline Design
    http://www.ast.cam.ac.uk/~wfcam/docs/wfcampipedoc_v2.ps.gz

[6] WFCAM/VISTA Science Archive Development
    http://www.roe.ac.uk/~nch/wfcam/

[7] WFCAM Science Archive hardware design document, 
    http://www.roe.ac.uk/~nch/wfcam/...

[8] Definition of the Flexible Image Transport System (FITS), document 
    NOST 100-2.0
    http://fits.gsfc.nasa.gov/fits_home.html

[9] Representations of world co-ordinates in FITS
    Greisen EW, Calabretta MR, A&A, 395, 1061 (2002)

[10] Representations of celestial co-ordinates in FITS
     Calabretta MR, Greisen EW, A&A, 395, 1077 (2002)

GLOSSARY


APPENDICES


Last modified: Wed Mar 5 12:37:31 2003