WSA Interface Control Document
------------------------------

Document control table(s)

Abstract: This Interface Control Document (ICD) for the WFCAM Science Archive (WSA) describes the data flow subsystem interface between the data processing centre (CASU at the IoA, Cambridge) and the archive centre (WFAU at the IfA, Edinburgh). Details of the types and specifications of processed WFCAM data to be transferred, along with the transfer protocols (file naming, transfer method and procedure), are given. The details of this ICD have been agreed between CASU and WFAU; the formalities are being overseen by the JAC and the VISTA Data Flow System (VDFS) project.

Table of contents

1.0 INTRODUCTION

1.1 Scope

This Interface Control Document (ICD) is intended to be a formal interface control agreement between the WFCAM data processing centre at the Cambridge Astronomy Survey Unit (CASU) and the archive centre at the Wide Field Astronomy Unit (WFAU) in Edinburgh. The processing centre/archive centre interface is the final subsystem interface in the WFCAM data flow chain, and is subject to the rules laid out herein. The ICD concerns WFCAM data only; all other data ingested into the WFCAM Science Archive (WSA) are outside the scope of interface control (the WSA will also ingest publicly released data products, eg. SDSS and 2MASS, from other non-CASU sources).

The ICD is meant to be a technical reference: its intended audience is software engineers and scientists working on processing and archiving in the data flow. It takes the form of a formal agreement between CASU and WFAU, but must also satisfy other external bodies, namely JAC, the UKIDSS survey science consortium and the VISTA Data Flow System. The ICD is, therefore, a major component of the documentation for the WSA critical design review.

1.2 Overview

This document is structured as follows.
In Section 2, we describe the fundamental rules that the interface will adhere to, including a statement of the primary data format, FITS. Then, in Section 3, we describe the top-level specifications for data that will be transferred between Cambridge and Edinburgh, including a description of FITS conventions, keywords, file naming conventions, units, systems of physical quantities and consistent unified column descriptors. Section 4 goes on to describe in explicit detail the data structures that will be transferred. Then, Section 5 describes the transfer methods and procedures that will achieve the data flow from Cambridge to Edinburgh. Backups and other security issues are dealt with in Section 6, and finally a summary is presented in Section 7.

1.3 Reference docs

Generally, this document is modelled on the ESO Data Interface Control Document [1] and, with the exception of the ESO hierarchical FITS keyword definition, follows as closely as possible the specifications provided therein. A data flow system overview is provided in [2]. Fundamental "meta" data (ie. FITS frame headers and keywords) are described in [3]. The JAC/CASU interface is defined and described in [4]; CASU pipeline processing is described in [5]. Relevant documents within the WSA project at WFAU, including the subset presented for the purposes of the archive CDR, are available from [6].

2.0 FUNDAMENTALS

2.1 WFAU Ingest

The WSA at WFAU will ingest WFCAM data from CASU only; there will be no transfer of WFCAM data between, for example, JAC and WFAU.

2.2 Data transfer method

The WSA will ingest data via the internet; tapes and/or "pluggable" disks will not be employed. The implications for required network bandwidth are discussed in [7].

2.3 Format

Data output from CASU will be provided in standard FITS format (as specified in [8]) only. Data will not be expressed in any "hierarchical" system, eg. ESO hierarchical FITS, or the UK Starlink Hierarchical Data Structure format (NDFs).
The FITS standard is mature, universally accepted and ideal for transporting both bulk pixel and catalogue data.

2.4 Content

Data transferred from CASU will consist of processed pixels (where the processing steps are specified by the observing protocol used), confidence maps, derived source catalogues and associated description data; no raw pixel data will be transferred to (or held in) the WSA. Where irreversible stages such as stacking or mosaicing have been done as part of the reduction procedure, the individual component images and catalogues will also be transferred.

3.0 DATA SPECIFICATION

3.1 Preliminaries

Processed frames will be stored in FITS format, following the guidelines set out in [1]:

  o The images comprising a WFCAM processed frame will be stored in different image extensions of the same FITS container file (a multi-extension FITS, or MEF, file); data pixels belonging to one image will be stored in one image extension (guideline-2).
  o The primary data array in the MEF file will be empty (guideline-3).
  o Keywords describing the dataset in the MEF file as a whole will be written into the primary header, while keywords that are related to the data in a particular extension will be written into the header of that extension (guideline-5).

Derived source catalogues corresponding to each image extension will be written as FITS binary tables in extensions of a single, separate MEF file with a similarly empty primary array. The headers for the catalogue MEF will contain all the information of the image MEF headers plus ancillary processing keywords and values.

3.2 General FITS keywords

Keywords will follow the standards set out in [1] and [8] as described (for WFCAM data) in [3]. All keywords and associated values written to the HDS container files produced by the WFCAM DAS must be propagated through the JAC/CASU interface, through the data processing pipeline and into the WSA.
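The propagation requirement above lends itself to a simple automated check. The following is a minimal sketch only, assuming that the headers have already been read into plain keyword/value mappings by whatever FITS library is in use; the function name `unpropagated_keywords` and the example header values are hypothetical and not part of this agreement.

```python
def unpropagated_keywords(upstream, downstream):
    """Return upstream keywords that are missing or altered downstream.

    Both arguments are dicts mapping FITS keyword -> value.
    """
    problems = {}
    for key, value in upstream.items():
        if key not in downstream:
            problems[key] = "missing"
        elif downstream[key] != value:
            problems[key] = "changed"
    return problems


# Hypothetical header contents, for illustration only.
das_header = {"OBSERVER": "A N Other", "FILTER": "K", "NINT": 1}
wsa_header = {"OBSERVER": "A N Other", "FILTER": "H", "PIPEVERS": "1.0"}

print(unpropagated_keywords(das_header, wsa_header))
```

Keywords added downstream (such as PIPEVERS here) are not reported, since the requirement concerns only the survival of upstream keywords and values.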
The first keyword in any extension HDU must be XTENSION, and its value will be either 'IMAGE   ' or 'BINTABLE'; the EXTNAME keyword will be used to identify the extension with a particular device detector. Binary tables will have every column described by keywords TTYPEn, TFORMn and TUNITn (see later). World Co-ordinate System (WCS; ie. astrometric) information will be propagated using a set of keywords described in the latest FITS WCS proposals [9,10] by Greisen and Calabretta. Error and statistics information will be expressed following the convention described in [1], whereby the quantity in question has its statistical auxiliary expressed via a keyword containing the first five characters of the root name (or fewer if necessary) plus a three character suffix ERR (for a unit standard deviation), MIN/MAX, RMS, AVG etc.

3.3 Physical units

Physical units will comply with SI units and their derivatives with a few exceptions for astronomical convenience (see [1] Section 9, Table 14). Celestial co-ordinates will be expressed in a reference system described by primary HDU keyword RADECSYS; it is anticipated that this will have value 'FK5' (ie. Hipparcos/Tycho ICRS) over the lifetime of WFCAM, but this may of course change for VISTA.

3.4 File naming conventions

[NB: this is a proposal from the WFAU end... let me know what you think or if you'd like something else]

Files will be named according to the prescription given in [1]. In addition to the requirements detailed there (Section 11.1.2), for processed WFCAM data it is necessary

  o to choose the timestamp for a superframe made up from the combination of several exposures with different start times;
  o to distinguish between corrected superframes and object catalogues derived from them.

[NB: unless cats are written as extensions into the superframe MEF...]
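The time-stamp part of the file name can be sketched as follows. This is an illustration only: it assumes the exposure start times are available as Python `datetime` objects, and the helper name `superframe_timestamp` is hypothetical. It applies the rule given in this section (the earliest start time among the component exposures, written as YYYY-MM-DDThh.mm.ss); the remainder of the file name is deliberately not constructed here.

```python
from datetime import datetime


def superframe_timestamp(start_times):
    """Format the earliest exposure start time as YYYY-MM-DDThh.mm.ss."""
    if not start_times:
        raise ValueError("at least one exposure start time is required")
    earliest = min(start_times)
    # Dots, not colons, separate hours, minutes and seconds in the name.
    return earliest.strftime("%Y-%m-%dT%H.%M.%S")


# Hypothetical start times for a set of microstep exposures.
exposures = [
    datetime(2003, 3, 5, 12, 37, 41),
    datetime(2003, 3, 5, 12, 37, 31),
    datetime(2003, 3, 5, 12, 37, 51),
]
print(superframe_timestamp(exposures))  # 2003-03-05T12.37.31
```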
FITS files will be named as follows:

  r..YYYY-MM-DDThh.mm.ss.fits

where the field between the two dots is UKWFCPIX or UKWFCOBJ and the time stamp is taken as being the earliest start time from the set of interleave microstep exposures.

[NB: this scheme does not allow for different filenames if/when (!) pipeline and/or source extraction are rerun over the same data...!]

4.0 DETAILED DATA SPECIFICATION

4.1 Data obtained at the time of observation

Observations will be described via the keywords OBSERVER, USERID, OBSREF, PROJECT, MSBID and OBJECT. Instrumental characteristics, set-ups and parameters will be described by keywords as detailed in [3], including instrument detector configuration (eg. array used DETECTOR; number of integrations NINT), detector controller information (eg. camera read mode READMODE; read-out application CAPPLICN), optical configuration (eg. filter name FILTER; base focus position FOC_MM) and observing conditions/environment (eg. air temperature AIRTEMP; relative humidity HUMIDITY; opacity data CSOTAU). All these FITS keys will be propagated through the data flow chain from the DAS to the WSA.

4.2 Data products (ie. derived data)

4.2.1 Corrected pixel data

The CASU pipeline will instrumentally correct WFCAM pixels into an interleaved superframe product that is free of instrument signature. The reduction steps involved in doing so, the derived astrometric and (first-cut) photometric calibrations and the resulting DQC information generated will be propagated into the WSA using the FITS keys described in Table 1:

[NB: let me know what you think of this lot... is it possible to get all of this info into the FITS headers during pipeline processing?:]

Table 1 (HDU column: p = primary HDU, e = extension)

Keyword   Example   HDU  Data     [units] and
          value          type     description

Reduction step keys:
PIPEVERS  1.0       p    char     Pipeline software version no.
LINCOR    T         p    logical  Linearity correction done?
RESETCOR  T         p    "        Reset correction done?
DARKFRAM            p    char     Library dark frame used
BADPMASK            p    char     Library bad pixel mask used
FLTFIELD            p    char     Library flatfield frame used
DEFRINGE            p    char     Library defringe frame used
SKYFRAME  NONE      p    char     Master sky subtraction frame
SEXTRVER  2.3       p    real     Source extraction s/w version

Parameter keys associated with object extraction:
RCORE     1.2       p    real     Core radius used in flux meas.
. . .

Derived astrometric calibration keys:
ASTREDVN  1.5       p    real     Astrometric s/w version used
REFCAT    UCAC      p    char     Astrometric reference catalogue
RADECSYS  FK5       p    char     Ref. system of astrometric red.
EQUINOX   2000.0    p    double   [years] Equinox of co-ords
REFSTARS  57        e    int      No. of ref stars used
CTYPE1    RA---ZPN  e    char     Projection type
CTYPE2    DEC--ZPN  e    char     "  "
CRPIX1    -1024.5   e    double   [pixel] Ref pixel for axis 1
CRPIX2    +1024.5   e    double   [pixel] Ref pixel for axis 2
CRVAL1    +60.0     e    double   [degree] RA at ref pixel
CRVAL2    +25.0     e    double   [degree] Dec at ref pixel
CD1_1               e    double   Co-ordinate xformation matrix
CD1_2               e    double   Co-ordinate xformation matrix
CD2_1               e    double   Co-ordinate xformation matrix
CD2_2               e    double   Co-ordinate xformation matrix
PV1_1               e    double   First-order radial dist. coeff
PV1_3               e    double   Third-order radial dist. coeff

Derived photometric calibration (first-cut) keys:
ZEROPNT   19.0      e    real     Photometric zeropoint
. . .

Our current names for these, yours or new ones
----------------------------------------------

Derived DQC parameter keys:
SKYLEVEL  10000.0   e    real     [ADU] Robust median sky level
SKYNOISE  100.0     e    real     [ADU] Robust sky noise level
THRESHOL  200.0     e    real     [ADU] Detection threshold used
SEEING    1.7       e    real     [pixels] Average stellar FWHM
ELLIPTIC  0.05      e    real     Avg. point-source ellipticity
SATURATE  60000.0   e    real     [ADU] Saturation level
APCORnn   0.213     e    real     [mags] Stellar corrections
STDCRMS   0.458     e    real     [Arcsec] Astrometric fit error
NUMBRMS   210       j    int      [] No. of astrometric standards used
PERCORR   0.000     e    real     [mags] Sky calibration correction
EXTINCT   0.011     e    real     [mags] Extinction for unit airmass
MAGZPT    22.64     e    real     [mags] Photometric ZP not inc. extinct or inc.
                                  unit airmass extinct, whatever you fancy.
MAGZRR    0.02      e    real     [mags] Photometric ZP error

4.2.2 Source catalogue attributes

The standard set of CASU source detection parameters can be found in [5]. Table 2 lists the corresponding FITS binary table details for each attribute:

As Jim says, we currently store all these as reals for simplicity and also currently have all the ttype stuff set in the following way:

PCOUNT  =                    0 / size of special data area
GCOUNT  =                    1 / one data group (required keyword)
TFIELDS =                   32 / number of fields in each row
TTYPE1  = 'No.     '           / label for field 1
TFORM1  = '1E      '           / data format of field: 4-byte REAL
TTYPE2  = 'Isophotal_flux'     / label for field 2
TFORM2  = '1E      '           / data format of field: 4-byte REAL
TUNIT2  = 'Counts  '           / physical unit of field
TTYPE3  = 'Total_flux'         / label for field 3
TFORM3  = '1E      '           / data format of field: 4-byte REAL
TUNIT3  = 'Counts  '           / physical unit of field
TTYPE4  = 'Core_flux'          / Fitted flux within 1x core radius
TFORM4  = '1E      '           / data format of field: 4-byte REAL
TUNIT4  = 'Counts  '           / physical unit of field
TTYPE5  = 'X_coordinate'       / label for field 5
TFORM5  = '1E      '           / data format of field: 4-byte REAL
TUNIT5  = 'Pixels  '           / physical unit of field
TTYPE6  = 'Y_coordinate'       / label for field 6
TFORM6  = '1E      '           / data format of field: 4-byte REAL
TUNIT6  = 'Pixels  '           / physical unit of field
TTYPE7  = 'Gaussian_sigma'     / label for field 7
TFORM7  = '1E      '           / data format of field: 4-byte REAL
TUNIT7  = 'Pixels  '           / physical unit of field
TTYPE8  = 'Ellipticity'        / label for field 8
TFORM8  = '1E      '           / data format of field: 4-byte REAL
TTYPE9  = 'Position_angle'     / label for field 9
TFORM9  = '1E      '           / data format of field: 4-byte REAL
TUNIT9  = 'Degrees '           / physical unit of field
TTYPE10 = 'Peak_height'        / label for field 10
TFORM10 = '1E      '           / data format of field: 4-byte REAL
TUNIT10 = 'Counts  '           / physical unit of field
TTYPE11 = 'Areal_1_profile'    / label for field 11
TFORM11 = '1E      '           / data format of field: 4-byte REAL
TUNIT11 = 'Pixels  '           / physical unit of field
TTYPE12 = 'Areal_2_profile'    / label for field 12
TFORM12 = '1E      '           / data format of field: 4-byte REAL
TUNIT12 = 'Pixels  '           / physical unit of field
TTYPE13 = 'Areal_3_profile'    / label for field 13
TFORM13 = '1E      '           / data format of field: 4-byte REAL
TUNIT13 = 'Pixels  '           / physical unit of field
TTYPE14 = 'Areal_4_profile'    / label for field 14
TFORM14 = '1E      '           / data format of field: 4-byte REAL
TUNIT14 = 'Pixels  '           / physical unit of field
TTYPE15 = 'Areal_5_profile'    / label for field 15
TFORM15 = '1E      '           / data format of field: 4-byte REAL
TUNIT15 = 'Pixels  '           / physical unit of field
TTYPE16 = 'Areal_6_profile'    / label for field 16
TFORM16 = '1E      '           / data format of field: 4-byte REAL
TUNIT16 = 'Pixels  '           / physical unit of field
TTYPE17 = 'Areal_7_profile'    / label for field 17
TFORM17 = '1E      '           / data format of field: 4-byte REAL
TUNIT17 = 'Pixels  '           / physical unit of field
TTYPE18 = 'Areal_8_profile'    / label for field 18
TFORM18 = '1E      '           / data format of field: 4-byte REAL
TUNIT18 = 'Pixels  '           / physical unit of field
TTYPE19 = 'Core1_flux'         / Fitted flux within 1/2x core radius
TFORM19 = '1E      '           / data format of field: 4-byte REAL
TUNIT19 = 'Counts  '           / physical unit of field
TTYPE20 = 'Core2_flux'         / Fitted flux within sqrt(2)x core radius
TFORM20 = '1E      '           / data format of field: 4-byte REAL
TUNIT20 = 'Counts  '           / physical unit of field
TTYPE21 = 'Core3_flux'         / Fitted flux within 2x core radius
TFORM21 = '1E      '           / data format of field: 4-byte REAL
TUNIT21 = 'Counts  '           / physical unit of field
TTYPE22 = 'Core4_flux'         / Fitted flux within 2sqrt(2)x core radius
TFORM22 = '1E      '           / data format of field: 4-byte REAL
TUNIT22 = 'Counts  '           / physical unit of field
TTYPE23 = 'RA      '           / label for field 23
TFORM23 = '1E      '           / data format of field: 4-byte REAL
TUNIT23 = 'RADIANS '           / physical unit of field
TTYPE24 = 'DEC     '           / label for field 24
TFORM24 = '1E      '           / data format of field: 4-byte REAL
TUNIT24 = 'RADIANS '           / physical unit of field
TTYPE25 = 'Classification'     / label for field 25
TFORM25 = '1E      '           / data format of field: 4-byte REAL
TUNIT25 = 'Flag    '           / physical unit of field
TTYPE26 = 'Statistic'          / label for field 26
TFORM26 = '1E      '           / data format of field: 4-byte REAL
TUNIT26 = 'N-sigma '           / physical unit of field
TTYPE27 = 'Blank   '           / label for field 27
TFORM27 = '1E      '           / data format of field: 4-byte REAL
TUNIT27 = 'Blank   '           / physical unit of field
TTYPE28 = 'Blank   '           / label for field 28
TFORM28 = '1E      '           / data format of field: 4-byte REAL
TUNIT28 = 'Blank   '           / physical unit of field
TTYPE29 = 'Blank   '           / label for field 29
TFORM29 = '1E      '           / data format of field: 4-byte REAL
TUNIT29 = 'Blank   '           / physical unit of field
TTYPE30 = 'Blank   '           / label for field 30
TFORM30 = '1E      '           / data format of field: 4-byte REAL
TUNIT30 = 'Blank   '           / physical unit of field
TTYPE31 = 'Blank   '           / label for field 31
TFORM31 = '1E      '           / data format of field: 4-byte REAL
TUNIT31 = 'Blank   '           / physical unit of field
TTYPE32 = 'Blank   '           / label for field 32
TFORM32 = '1E      '           / data format of field: 4-byte REAL
TUNIT32 = 'Blank   '           / physical unit of field

cf. your:

Table 2

No.  Name                  TTYPE     TFORM  TUNIT
1    Seq. no.              SEQNUM    1J     -
2    Isophotal flux        ISOPHFLX  1E     ADU
3    X co-ordinate         XCOORD    1E     pixels
4    Error in X            XCOORERR  1E     pixels
5    Y co-ordinate         YCOORD    1E     pixels
6    Error in Y            YCOORERR  1E     pixels
7    Gaussian sigma        GAUSIGMA  1E     pixels
8    Ellipticity           ELLIPTIC  1E     pixels
9    Position angle        POSANGLE  1E     degrees
10   Areal profile 1       AREAPRO1  1E     pixels
. . .
17   Areal profile 8       AREAPRO8  1E     pixels
18   Peak height           PKHEIGHT  1E     ADU
19   Peak height error     PKHEIERR  1E     ADU
20   Core flux             COREFLUX  1E     ADU
21   Core flux error       COREFERR  1E     ADU
22   Core 1 flux           CFL01     1E     ADU
23   Core 1 flux error     CFLERR01  1E     ADU
. . .
42   Core 12 flux          CFL12     1E     ADU
43   Core 12 flux error    CFLERR12  1E     ADU
44   Petrosian radius      PETRORAD  1E     pixels
45   Kron radius           KRONRAD   1E     pixels
46   FWHM radius           FWHMRAD   1E     pixels
47   Petrosian flux        PETFLUX   1E     ADU
48   Petrosian flux error  PETFLERR  1E     ADU
49   Kron flux             KROFLUX   1E     ADU
50   Kron flux error       KROFLERR  1E     ADU
51   FWHM flux             FWHFLUX   1E     ADU
52   FWHM flux error       FWHFLERR  1E     ADU
53   Error bit flag        PROFLAGS  1J
54   Sky level             SKYLEVEL  1E     ADU
55   Sky variance          SKYVAR    1E     ADU
56   Child/parent          BLENDING  1J
57   Right Ascension       RA        1D     degrees
58   Declination           DEC       1D     degrees
59   Classification        ICLASS    1J
60   Profile statistic     PROFSTAT  1E
61   PSF flux              PSFFLUX   1E     ADU
62   PSF flux error        PSFFLERR  1E     ADU
63   PSF fitted X          XPSF      1E     pixels
64   PSF fitted X error    XPSFERR   1E     pixels
65   PSF fitted Y          YPSF      1E     pixels
66   PSF fitted Y error    YPSFERR   1E     pixels

[NB: may need additional celestial PA as well as item 9 (position angle wrt X axis) for dumb overlay progs that can't understand WCS; do you agree that 57/58 (RA/Dec) need to be doubles?]

4.2.3 Other data product conventions

- checksums for data verification?
- allowed/logged ranges for attributes, again for verification?
- convention for null or n/a values?

5.0 TRANSFER METHODS & PROCEDURES

5.1 Methods

Transfer will be via the internet using standard methods. The data to be transferred will reside in Cambridge on specific RAID arrays attached to a Linux PC cluster. WFAU will have an account on this system. Directories of processed nights' data will be set up as the pipeline is running. While the processing is still running, a directory lock file will be used to denote that operations are in progress. After completion the lock file will be unset/removed, enabling a remotely controlled browser script to automatically initiate data transfer to Edinburgh. Tests between different locations in the UK during the day give sustained data transfer rates of 4 Mbyte/s and have been used to copy ~100 Gbytes of data between sites in 5-6 hours.
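The lock-file handshake and the quoted transfer rate can be illustrated with a short sketch. It is not the agreed implementation: the lock-file name `.processing.lock`, the directory name and the function names are assumptions made purely for illustration. The first function reports whether a night's directory is ready (ie. exists and holds no lock file); the second estimates the transfer time for a given data volume at a sustained rate.

```python
import tempfile
from pathlib import Path

LOCK_NAME = ".processing.lock"  # hypothetical lock-file name


def ready_for_transfer(night_dir):
    """True if the directory exists and holds no pipeline lock file."""
    night_dir = Path(night_dir)
    return night_dir.is_dir() and not (night_dir / LOCK_NAME).exists()


def transfer_hours(volume_bytes, rate_bytes_per_s):
    """Estimated transfer time in hours at a sustained rate."""
    return volume_bytes / rate_bytes_per_s / 3600.0


# Demonstrate with a throwaway directory standing in for one night.
with tempfile.TemporaryDirectory() as tmp:
    night = Path(tmp) / "20030305"      # hypothetical date-driven name
    night.mkdir()
    lock = night / LOCK_NAME
    lock.touch()                        # pipeline still running
    print(ready_for_transfer(night))    # False
    lock.unlink()                       # pipeline finished
    print(ready_for_transfer(night))    # True

# 100 Gbyte at 4 Mbyte/s (decimal units assumed).
print(round(transfer_hours(100e9, 4e6), 1))  # 6.9
```

At exactly 4 Mbyte/s, 100 Gbyte would take just under 7 hours, so the 5-6 hours quoted above corresponds to a somewhat higher average rate (roughly 4.6-5.6 Mbyte/s) or a somewhat smaller data volume.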
Alternative transfer methods we have tested include scp, grid-ftp, sftp ........ (drop ftp since not secure)

5.2 Procedure

- location of data is guaranteed by the pipeline and will be in an observation-date-driven directory structure to which WFAU will have secure direct access
- "handshaking", eg. notification of readiness, will be achieved using a lockfile system as outlined above; verification of successful transfer by no. and size of files transferred (eg. scp verifies as it goes so if preceding two are ok everything is fine n'est-ce pas ???)

5.3 Updates

- reruns in case of bug fixes, improvements in instrumental correction, improvements in source extraction: any additional interface issues resulting from this possibility/likelihood(!) ?

6.0 BACKUPS AND OTHER SECURITY ISSUES

- raw data will be held online in Cambridge as the primary UK backup. Raw data will also be archived/stored at the JAC
- security ......... secure transfer, restricted access to computers whatever....... firewalls......

7.0 SUMMARY

REFERENCES

[1] ESO Data Interface Control Document, GEN-SPE-ESO-19940-794/2.0
    http://archive.eso.org/DICB/dic-2.0/dic-2.0.4.pdf
[2] VDFS document...?
[3] ATC WFCAM HDS container and FITS headers, WFCAM project Document No. ?
[4] JAC-CASU Interface Control Document,
    http://www.jach.hawaii.edu/JACpublic/UKIRT/instruments/wfcam/ICD/
[5] WFCAM Pipeline Design
    http://www.ast.cam.ac.uk/~wfcam/docs/wfcampipedoc_v2.ps.gz
[6] WFCAM/VISTA Science Archive Development
    http://www.roe.ac.uk/~nch/wfcam/
[7] WFCAM Science Archive hardware design document,
    http://www.roe.ac.uk/~nch/wfcam/...
[8] Definition of the Flexible Image Transport System (FITS), document NOST 100-2.0
    http://fits.gsfc.nasa.gov/fits_home.html
[9] Representations of world co-ordinates in FITS,
    Greisen EW, Calabretta MR, A&A, 395, 1061 (2002)
[10] Representations of celestial co-ordinates in FITS,
    Calabretta MR, Greisen EW, A&A, 395, 1077 (2002)

GLOSSARY

APPENDICES
Last modified: Wed Mar 5 12:37:31 2003