WSA Interface Control Document
------------------------------
Document control table(s)
Abstract:
This Interface Control Document (ICD) for the WFCAM Science Archive (WSA)
describes the data flow subsystem interface between the data processing
centre (CASU at the IoA, Cambridge) and the archive centre (WFAU at the
IfA, Edinburgh). Details of the types and specifications of processed
WFCAM data to be transferred, along with the transfer protocols
(file naming, transfer method and procedure), are given. The details of
this ICD have been agreed between CASU and WFAU; the formalities are
being overseen by the JAC and the VISTA Data Flow System (VDFS) project.
Table of contents
1.0 INTRODUCTION
1.1 Scope
This Interface Control Document (ICD) is intended to be a formal interface
control agreement between the WFCAM data processing centre at the Cambridge
Astronomy Survey Unit (CASU) and the archive centre at the Wide Field
Astronomy Unit (WFAU) in Edinburgh. The processing centre/archive centre
interface is the final subsystem interface in the WFCAM data flow chain, and
is subject to the rules laid out herein. The ICD concerns WFCAM data only;
all other data ingested into the WFCAM Science Archive (WSA) are outside the
scope of interface control (the WSA will also ingest publicly released data
products, eg. SDSS and 2MASS, from non-CASU sources).
The ICD is meant to be a technical reference: its intended audience is
software engineers and scientists working on processing and archiving in
the data flow. It takes the form of a formal agreement between CASU and
WFAU, but must also satisfy other external bodies, namely JAC, the UKIDSS
survey science consortium and the VISTA Data Flow System. The ICD is,
therefore, a major component of the documentation for the WSA critical
design review.
1.2 Overview
This document is structured as follows. In Section 2, we describe the
fundamental rules that the interface will adhere to, including a
statement of the primary data format, FITS. Then, in Section 3, we
describe the top-level specifications for data that will be transferred
between Cambridge and Edinburgh, including a description of FITS
conventions, keywords, file naming conventions, units, systems of
physical quantities and consistent unified column descriptors. Section 4
goes on to describe in explicit detail the data structures that will be
transferred. Then, Section 5 describes the transfer methods and procedures
that will achieve the data flow from Cambridge to Edinburgh. Backups and
other security issues are dealt with in Section 6, and finally a summary
is presented in Section 7.
1.3 Reference docs
Generally, this document is modelled on the ESO Data Interface Control
Document [1], and with the exception of the ESO hierarchical FITS
keyword definition, follows as closely as possible the specifications
provided therein. A data flow system overview is provided in [2].
Fundamental "meta" data descriptions (ie. FITS frame headers and keywords)
are given in [3]. The JAC/CASU interface is defined and described in
[4]; CASU pipeline processing is described in [5]. Relevant documents
within the WSA project at WFAU, including the subset presented for the
purposes of the archive CDR, are available from [6].
2.0 FUNDAMENTALS
2.1 WFAU Ingest
The WSA at WFAU will ingest WFCAM data from CASU only; there will be no
transfer of WFCAM data between JAC and WFAU for example.
2.2 Data transfer method
The WSA will ingest data via the internet; tapes and/or "pluggable" disks
will not be employed. The implications for required network bandwidth are
discussed in [7].
2.3 Format
Data output from CASU will be provided in standard FITS format (as
specified in [8]) only. Data will not be expressed in any "hierarchical"
system, eg. ESO hierarchical FITS, or the UK Starlink Hierarchical Data
Structure format (NDFs). The FITS standard is mature, universally accepted
and ideal for transporting both bulk pixel and catalogue data.
2.4 Content
Data transferred from CASU will consist of processed pixels (where the
processing steps are specified by the observing protocol used), confidence
maps, derived source catalogues and associated description data; no raw
pixel data will be transferred to (or held in) the WSA. Where
irreversible stages such as stacking or mosaicing have been applied as part
of the reduction procedure, the individual component images and catalogues
will also be transferred.
3.0 DATA SPECIFICATION
3.1 Preliminaries
Processed frames will be stored in FITS format, following the guidelines
set out in [1]:
o The images comprising a WFCAM processed frame will be stored in
different image extensions of the same FITS container file (a
multi-extension FITS, or MEF, file); data pixels belonging to one
image will be stored in one image extension (guideline-2).
o The primary data array in the MEF file will be empty (guideline-3)
o Keywords describing the dataset in the MEF file as a whole will be
written into the primary header, while keywords that are related to
the data in a particular extension will be written into the HDU of
that extension (guideline-5)
Derived source catalogues corresponding to each image extension will be
written as FITS binary tables in extensions of a single, separate
MEF file with a similarly empty primary array. The headers for the
catalogue MEF will contain all the information of image MEF headers plus
ancillary processing keywords and values.
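The layout rules above can be illustrated with a minimal Python sketch. It assumes the HDU headers have already been read into plain dictionaries by some FITS reader; the function name check_mef_layout and the dictionary representation are hypothetical and not part of this agreement. The only FITS facts relied on are that an empty primary array is signalled by NAXIS = 0 and that extension type is given by the XTENSION keyword.

```python
def check_mef_layout(headers, expected_xtension):
    """Return a list of problems found in a multi-extension FITS layout.

    headers           -- list of dicts, one per HDU, primary HDU first
                         (hypothetical in-memory form of the FITS headers)
    expected_xtension -- 'IMAGE' for pixel MEFs, 'BINTABLE' for catalogue MEFs
    """
    if not headers:
        return ["file contains no HDUs"]

    problems = []

    # The primary data array must be empty: in FITS this means NAXIS = 0.
    if headers[0].get("NAXIS") != 0:
        problems.append("primary HDU is not empty (NAXIS != 0)")

    if len(headers) < 2:
        problems.append("no extensions present")

    # Every extension must be of the type expected for this kind of file.
    for index, header in enumerate(headers[1:], start=1):
        xtension = str(header.get("XTENSION", "")).strip()
        if xtension != expected_xtension:
            problems.append(f"extension {index}: XTENSION is {xtension!r}")
    return problems


# Illustrative image MEF with an empty primary HDU and one image extension.
image_mef = [{"NAXIS": 0}, {"XTENSION": "IMAGE   ", "NAXIS": 2}]
print(check_mef_layout(image_mef, "IMAGE"))  # []
```

The same function could be called with 'BINTABLE' for the separate catalogue MEF.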
3.2 General FITS keywords
Keywords will follow the standards set out in [1] and [8] as described
(for WFCAM data) in [3]. All keywords and associated values written to
the HDS container files produced by the WFCAM DAS must be propagated
through the JAC/CASU interface, through the data processing pipeline
and into the WSA.
The first keyword in any extension HDU must be XTENSION, and its value
will be either 'IMAGE ' or 'BINTABLE'; the EXTNAME keyword will be
used to identify the extension with a particular device detector. Binary
tables will have every column described by keywords TTYPEn, TFORMn
and TUNITn (see later).
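As a hedged illustration of the extension header rules just stated, the following Python sketch checks an ordered list of (keyword, value) pairs. The function name check_extension_header, the list-of-pairs representation and the example EXTNAME value are assumptions made for this example only.

```python
def check_extension_header(cards):
    """Check an extension header given as an ordered list of (keyword, value).

    Returns a list of problems; an empty list means the header satisfies
    the rules sketched in this section.
    """
    if not cards or cards[0][0] != "XTENSION":
        return ["first keyword is not XTENSION"]

    problems = []
    xtension = str(cards[0][1]).strip()
    if xtension not in ("IMAGE", "BINTABLE"):
        problems.append(f"unexpected XTENSION value {xtension!r}")

    header = dict(cards)
    if "EXTNAME" not in header:
        problems.append("EXTNAME missing")

    if xtension == "BINTABLE":
        # TFIELDS gives the number of columns; each needs three descriptors.
        for n in range(1, int(header.get("TFIELDS", 0)) + 1):
            for root in ("TTYPE", "TFORM", "TUNIT"):
                if f"{root}{n}" not in header:
                    problems.append(f"{root}{n} missing")
    return problems


# Illustrative two-column table in which column 1 has no TUNIT keyword.
cards = [
    ("XTENSION", "BINTABLE"),
    ("EXTNAME", "det1"),          # hypothetical detector label
    ("TFIELDS", 2),
    ("TTYPE1", "No."), ("TFORM1", "1E"),
    ("TTYPE2", "Isophotal_flux"), ("TFORM2", "1E"), ("TUNIT2", "Counts"),
]
print(check_extension_header(cards))  # ['TUNIT1 missing']
```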
World Co-ordinate System (WCS; ie. astrometric) information will be
propagated using a set of keywords described in the latest FITS WCS
proposals [9,10] by Greisen and Calabretta.
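For orientation, the linear step of the FITS WCS formalism in [9] maps a pixel position p to intermediate world co-ordinates via x_i = sum_j CDi_j (p_j - CRPIXj). The Python sketch below shows that step only; the projection named in CTYPEn and any distortion terms are not applied. The function name and all numerical values (including the 0.4 arcsec/pixel scale) are illustrative assumptions, not values agreed in this document.

```python
def pixel_to_intermediate(px, py, crpix, cd):
    """Apply the linear WCS step x_i = sum_j CDi_j * (p_j - CRPIXj).

    crpix -- (CRPIX1, CRPIX2)
    cd    -- ((CD1_1, CD1_2), (CD2_1, CD2_2)), in degrees per pixel
    Returns intermediate world co-ordinates in degrees.
    """
    dx = px - crpix[0]
    dy = py - crpix[1]
    x1 = cd[0][0] * dx + cd[0][1] * dy
    x2 = cd[1][0] * dx + cd[1][1] * dy
    return x1, x2


# Illustrative values only: an unrotated frame with an assumed pixel scale.
scale = 0.4 / 3600.0                      # degrees per pixel (assumption)
cd = ((-scale, 0.0), (0.0, scale))        # axis 1 increasing to the east
crpix = (1024.5, 1024.5)

x1, x2 = pixel_to_intermediate(1034.5, 1024.5, crpix, cd)
print(x1 * 3600.0, x2 * 3600.0)           # roughly -4 arcsec and 0 arcsec
```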
Error and statistics information will be expressed following the
convention described in [1], whereby the quantity in question has its
statistical auxiliary expressed via a keyword containing the first five
characters of the root name (or fewer if necessary) plus a three-character
suffix ERR (for a unit standard deviation), MIN/MAX, RMS, AVG etc.
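The naming rule for statistical keywords can be written as a short Python sketch. The function name stat_keyword is hypothetical; the rule itself (up to five root characters plus a three-character suffix) is the one stated above, and the examples reproduce error-column names that appear later in this document.

```python
def stat_keyword(root, suffix):
    """Build an auxiliary keyword: up to five root characters plus a suffix."""
    suffix = suffix.upper()
    if len(suffix) != 3:
        raise ValueError("suffix must be three characters, e.g. ERR or RMS")
    # Slicing keeps shorter roots intact, covering the 'fewer if necessary' case.
    return root.upper()[:5] + suffix


print(stat_keyword("PKHEIGHT", "ERR"))  # PKHEIERR
print(stat_keyword("COREFLUX", "ERR"))  # COREFERR
print(stat_keyword("SKY", "RMS"))       # SKYRMS
```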
3.3 Physical units
Physical units will comply with SI units and their derivatives with a few
exceptions for astronomical convenience (see [1] Section 9, Table 14).
Celestial co-ordinates will be expressed in the reference system given by
the primary HDU keyword RADECSYS; it is anticipated that this will have
value 'FK5' (ie. Hipparcos/Tycho ICRS) over the lifetime of WFCAM, but
this may of course change for VISTA.
3.4 File naming conventions
[NB: this is a proposal from the WFAU end... let me know what you think
or if you'd like something else]
Files will be named according to the prescription given in [1]. In addition
to the requirements detailed there (Section 11.1.2), for processed WFCAM
data it is necessary
o to choose the timestamp for a superframe made up from the combination
of several exposures with different start times;
o to distinguish between corrected superframes and object catalogues
derived from them.
[NB: unless cats are written as extensions into the superframe MEF...]
FITS files will be named as follows: r.PREFIX.YYYY-MM-DDThh.mm.ss.fits
where PREFIX is UKWFCPIX or UKWFCOBJ and the time stamp is taken as being
the earliest start time from the set of interleave microstep exposures.
[NB: this scheme does not allow for different filenames if/when (!) pipeline
and/or source extraction are rerun over the same data...!]
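A minimal Python sketch of the proposed naming scheme follows. It assumes the pattern is "r." followed by UKWFCPIX or UKWFCOBJ, then the time stamp YYYY-MM-DDThh.mm.ss and the ".fits" suffix; the function name superframe_filename and the example times are hypothetical.

```python
from datetime import datetime

ALLOWED_PREFIXES = ("UKWFCPIX", "UKWFCOBJ")


def superframe_filename(prefix, exposure_starts):
    """Name a superframe (or its catalogue) after its earliest exposure start.

    prefix          -- 'UKWFCPIX' for pixel data, 'UKWFCOBJ' for catalogues
    exposure_starts -- start times (datetime) of the component exposures
    """
    if prefix not in ALLOWED_PREFIXES:
        raise ValueError(f"unknown prefix {prefix!r}")
    if not exposure_starts:
        raise ValueError("at least one exposure start time is required")
    earliest = min(exposure_starts)
    return f"r.{prefix}.{earliest.strftime('%Y-%m-%dT%H.%M.%S')}.fits"


# Illustrative start times for two microstep exposures.
starts = [datetime(2003, 3, 5, 12, 37, 31), datetime(2003, 3, 5, 12, 37, 1)]
print(superframe_filename("UKWFCPIX", starts))
# r.UKWFCPIX.2003-03-05T12.37.01.fits
```

As the note above points out, a name built this way is identical for every rerun over the same exposures.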
4.0 DETAILED DATA SPECIFICATION
4.1 Data obtained at the time of observation
Observations will be described via the keywords OBSERVER, USERID, OBSREF,
PROJECT, MSBID and OBJECT.
Instrumental characteristics, set-ups and parameters will be described by
keywords as detailed in [3], including instrument detector configuration
(eg. array used DETECTOR; number of integrations NINT), detector
controller information (eg. camera read mode READMODE; read-out application
CAPPLICN), optical configuration (eg. filter name FILTER; base focus
position FOC_MM) and observing conditions/environment (eg. air temperature
AIRTEMP; relative humidity HUMIDITY; opacity data CSOTAU).
All these FITS keys will be propagated through the data flow chain from the
DAS to the WSA.
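One way to check the propagation requirement is to compare a header as written at the start of the chain with the header of the corresponding archived file. The Python sketch below assumes both headers are available as dictionaries; the function name propagation_failures and the example values are hypothetical.

```python
def propagation_failures(das_header, wsa_header):
    """Report keywords that were lost or altered between DAS and WSA.

    Both arguments are dicts mapping FITS keyword to value. The result maps
    each offending keyword to 'missing' or 'value changed'.
    """
    failures = {}
    for keyword, value in das_header.items():
        if keyword not in wsa_header:
            failures[keyword] = "missing"
        elif wsa_header[keyword] != value:
            failures[keyword] = "value changed"
    return failures


# Illustrative headers: HUMIDITY is dropped and FILTER is altered downstream.
das = {"OBSERVER": "A. N. Other", "FILTER": "K", "HUMIDITY": 12.0}
wsa = {"OBSERVER": "A. N. Other", "FILTER": "H", "PIPEVERS": "1.0"}
print(propagation_failures(das, wsa))
# {'FILTER': 'value changed', 'HUMIDITY': 'missing'}
```

Keywords added by the pipeline (PIPEVERS in this example) are deliberately ignored, since only loss or alteration of upstream keys is of interest here.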
4.2 Data products (ie. derived data)
4.2.1 Corrected pixel data
The CASU pipeline will instrumentally correct WFCAM pixels into an
interleaved superframe product that is instrument-signature free. The
reduction steps involved in doing so, the derived astrometric and
(first-cut) photometric calibrations and resulting DQC information
generated will be propagated into the WSA using the FITS keys
described in Table 1:
[NB: let me know what you think of this lot... is it possible to get
all of this info into the FITS headers during pipeline processing?:]
Keyword   Example   Primary HDU (p)   Data     [units] and
          value     or extension (e)  type     description
Reduction step keys:
PIPEVERS  1.0       p                 char     Pipeline software version no.
LINCOR    T         p                 logical  Linearity correction done?
RESETCOR  T         p                 logical  Reset correction done?
DARKFRAM            p                 char     Library dark frame used
BADPMASK            p                 char     Library bad pixel mask used
FLTFIELD            p                 char     Library flatfield frame used
DEFRINGE            p                 char     Library defringe frame used
SKYFRAME  NONE      p                 char     Master sky subtraction frame
SEXTRVER  2.3       p                 real     Source extraction s/w version
Parameter keys associated with object extraction:
RCORE     1.2       p                 real     Core radius used in flux meas.
.
.
.
Derived astrometric calibration keys:
ASTREDVN  1.5       p                 real     Astrometric s/w version used
REFCAT    UCAC      p                 char     Astrometric reference catalogue
RADECSYS  FK5       p                 char     Ref. system of astrometric red.
EQUINOX   2000.0    p                 double   [years] Equinox of co-ords
REFSTARS  57        e                 int      No. of ref stars used
CTYPE1    RA---ZPN  e                 char     Projection type
CTYPE2    DEC--ZPN  e                 char     Projection type
CRPIX1    -1024.5   e                 double   [pixel] Ref pixel for axis 1
CRPIX2    +1024.5   e                 double   [pixel] Ref pixel for axis 2
CRVAL1    +60.0     e                 double   [degree] RA at ref pixel
CRVAL2    +25.0     e                 double   [degree] Dec at ref pixel
CD1_1               e                 double   Co-ordinate xformation matrix
CD1_2               e                 double   Co-ordinate xformation matrix
CD2_1               e                 double   Co-ordinate xformation matrix
CD2_2               e                 double   Co-ordinate xformation matrix
PV1_1               e                 double   First-order radial dist. coeff
PV1_3               e                 double   Third-order radial dist. coeff
Derived photometric calibration (first-cut) keys:
ZEROPNT   19.0      e                 real     Photometric zeropoint
.
.
.
Our current names for these, yours or new ones
----------------------------------------------
Derived DQC parameter keys:
SKYLEVEL  10000.0   e                 real     [ADU] Robust median sky level
SKYNOISE  100.0     e                 real     [ADU] Robust sky noise level
THRESHOL  200.0     e                 real     [ADU] Detection threshold used
SEEING    1.7       e                 real     [pixels] Average stellar FWHM
ELLIPTIC  0.05      e                 real     Avg. point-source ellipticity
SATURATE  60000.0   e                 real     [ADU] Saturation level
APCORnn   0.213     e                 real     [mags] Stellar corrections
STDCRMS   0.458     e                 real     [Arcsec] Astrometric fit error
NUMBRMS   210       j                 int      [] No. of astrometric standards used
PERCORR   0.000     e                 real     [mags] Sky calibration correction
EXTINCT   0.011     e                 real     [mags] Extinction for unit airmass
MAGZPT    22.64     e                 real     [mags] Photometric ZP not inc. extinct
                                               or inc. unit airmass extinct
                                               whatever you fancy.
MAGZRR    0.02      e                 real     [mags] Photometric ZP error
4.2.2 Source catalogue attributes
The standard set of CASU source detection parameters can be found in [5].
Table 2 lists the corresponding FITS binary table details for each
attribute:
As Jim says we currently store all these as reals for simplicity and also
currently have all the ttype stuff set in the following way
PCOUNT = 0 / size of special data area
GCOUNT = 1 / one data group (required keyword)
TFIELDS = 32 / number of fields in each row
TTYPE1 = 'No. ' / label for field 1
TFORM1 = '1E ' / data format of field: 4-byte REAL
TTYPE2 = 'Isophotal_flux' / label for field 2
TFORM2 = '1E ' / data format of field: 4-byte REAL
TUNIT2 = 'Counts ' / physical unit of field
TTYPE3 = 'Total_flux' / label for field 3
TFORM3 = '1E ' / data format of field: 4-byte REAL
TUNIT3 = 'Counts ' / physical unit of field
TTYPE4 = 'Core_flux' / Fitted flux within 1x core radius
TFORM4 = '1E ' / data format of field: 4-byte REAL
TUNIT4 = 'Counts ' / physical unit of field
TTYPE5 = 'X_coordinate' / label for field 5
TFORM5 = '1E ' / data format of field: 4-byte REAL
TUNIT5 = 'Pixels ' / physical unit of field
TTYPE6 = 'Y_coordinate' / label for field 6
TFORM6 = '1E ' / data format of field: 4-byte REAL
TUNIT6 = 'Pixels ' / physical unit of field
TTYPE7 = 'Gaussian_sigma' / label for field 7
TFORM7 = '1E ' / data format of field: 4-byte REAL
TUNIT7 = 'Pixels ' / physical unit of field
TTYPE8 = 'Ellipticity' / label for field 8
TFORM8 = '1E ' / data format of field: 4-byte REAL
TTYPE9 = 'Position_angle' / label for field 9
TFORM9 = '1E ' / data format of field: 4-byte REAL
TUNIT9 = 'Degrees ' / physical unit of field
TTYPE10 = 'Peak_height' / label for field 10
TFORM10 = '1E ' / data format of field: 4-byte REAL
TUNIT10 = 'Counts ' / physical unit of field
TTYPE11 = 'Areal_1_profile' / label for field 11
TFORM11 = '1E ' / data format of field: 4-byte REAL
TUNIT11 = 'Pixels ' / physical unit of field
TTYPE12 = 'Areal_2_profile' / label for field 12
TFORM12 = '1E ' / data format of field: 4-byte REAL
TUNIT12 = 'Pixels ' / physical unit of field
TTYPE13 = 'Areal_3_profile' / label for field 13
TFORM13 = '1E ' / data format of field: 4-byte REAL
TUNIT13 = 'Pixels ' / physical unit of field
TTYPE14 = 'Areal_4_profile' / label for field 14
TFORM14 = '1E ' / data format of field: 4-byte REAL
TUNIT14 = 'Pixels ' / physical unit of field
TTYPE15 = 'Areal_5_profile' / label for field 15
TFORM15 = '1E ' / data format of field: 4-byte REAL
TUNIT15 = 'Pixels ' / physical unit of field
TTYPE16 = 'Areal_6_profile' / label for field 16
TFORM16 = '1E ' / data format of field: 4-byte REAL
TUNIT16 = 'Pixels ' / physical unit of field
TTYPE17 = 'Areal_7_profile' / label for field 17
TFORM17 = '1E ' / data format of field: 4-byte REAL
TUNIT17 = 'Pixels ' / physical unit of field
TTYPE18 = 'Areal_8_profile' / label for field 18
TFORM18 = '1E ' / data format of field: 4-byte REAL
TUNIT18 = 'Pixels ' / physical unit of field
TTYPE19 = 'Core1_flux' / Fitted flux within 1/2x core radius
TFORM19 = '1E ' / data format of field: 4-byte REAL
TUNIT19 = 'Counts ' / physical unit of field
TTYPE20 = 'Core2_flux' / Fitted flux within sqrt(2)x core radius
TFORM20 = '1E ' / data format of field: 4-byte REAL
TUNIT20 = 'Counts ' / physical unit of field
TTYPE21 = 'Core3_flux' / Fitted flux within 2x core radius
TFORM21 = '1E ' / data format of field: 4-byte REAL
TUNIT21 = 'Counts ' / physical unit of field
TTYPE22 = 'Core4_flux' / Fitted flux within 2sqrt(2)x core radius
TFORM22 = '1E ' / data format of field: 4-byte REAL
TUNIT22 = 'Counts ' / physical unit of field
TTYPE23 = 'RA ' / label for field 23
TFORM23 = '1E ' / data format of field: 4-byte REAL
TUNIT23 = 'RADIANS ' / physical unit of field
TTYPE24 = 'DEC ' / label for field 24
TFORM24 = '1E ' / data format of field: 4-byte REAL
TUNIT24 = 'RADIANS ' / physical unit of field
TTYPE25 = 'Classification' / label for field 25
TFORM25 = '1E ' / data format of field: 4-byte REAL
TUNIT25 = 'Flag ' / physical unit of field
TTYPE26 = 'Statistic' / label for field 26
TFORM26 = '1E ' / data format of field: 4-byte REAL
TUNIT26 = 'N-sigma ' / physical unit of field
TTYPE27 = 'Blank ' / label for field 27
TFORM27 = '1E ' / data format of field: 4-byte REAL
TUNIT27 = 'Blank ' / physical unit of field
TTYPE28 = 'Blank ' / label for field 28
TFORM28 = '1E ' / data format of field: 4-byte REAL
TUNIT28 = 'Blank ' / physical unit of field
TTYPE29 = 'Blank ' / label for field 29
TFORM29 = '1E ' / data format of field: 4-byte REAL
TUNIT29 = 'Blank ' / physical unit of field
TTYPE30 = 'Blank ' / label for field 30
TFORM30 = '1E ' / data format of field: 4-byte REAL
TUNIT30 = 'Blank ' / physical unit of field
TTYPE31 = 'Blank ' / label for field 31
TFORM31 = '1E ' / data format of field: 4-byte REAL
TUNIT31 = 'Blank ' / physical unit of field
TTYPE32 = 'Blank ' / label for field 32
TFORM32 = '1E ' / data format of field: 4-byte REAL
TUNIT32 = 'Blank ' / physical unit of field
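A quick consistency check on a listing like the one above is to compute the table row width implied by the TFORMn values, which in a FITS binary table should equal the NAXIS1 value of the extension. The Python sketch below handles only the three type codes used in this document (E and J are 4 bytes, D is 8 bytes, per the FITS standard [8]); the function names are hypothetical.

```python
import re

# Bytes per element for the TFORM type codes used in this document.
BYTES_PER_CODE = {"E": 4, "D": 8, "J": 4}


def tform_bytes(tform):
    """Width in bytes of one binary table field, e.g. '1E' -> 4."""
    match = re.fullmatch(r"(\d*)([A-Z])", tform.strip())
    if match is None or match.group(2) not in BYTES_PER_CODE:
        raise ValueError(f"unsupported TFORM {tform!r}")
    # An omitted repeat count defaults to 1 in the FITS standard.
    repeat = int(match.group(1)) if match.group(1) else 1
    return repeat * BYTES_PER_CODE[match.group(2)]


def row_bytes(tforms):
    """Total width in bytes of one table row."""
    return sum(tform_bytes(tform) for tform in tforms)


# The current layout: 32 fields, all stored as 4-byte reals.
print(row_bytes(["1E "] * 32))  # 128
```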
cf. to your
No. Name                   TTYPE     TFORM  TUNIT
1   Seq. no.               SEQNUM    1J     -
2   Isophotal flux         ISOPHFLX  1E     ADU
3   X co-ordinate          XCOORD    1E     pixels
4   Error in X             XCOORERR  1E     pixels
5   Y co-ordinate          YCOORD    1E     pixels
6   Error in Y             YCOORERR  1E     pixels
7   Gaussian sigma         GAUSIGMA  1E     pixels
8   Ellipticity            ELLIPTIC  1E     pixels
9   Position angle         POSANGLE  1E     degrees
10  Areal profile 1        AREAPRO1  1E     pixels
.
.
.
17  Areal profile 8        AREAPRO8  1E     pixels
18  Peak height            PKHEIGHT  1E     ADU
19  Peak height error      PKHEIERR  1E     ADU
20  Core flux              COREFLUX  1E     ADU
21  Core flux error        COREFERR  1E     ADU
22  Core 1 flux            CFL01     1E     ADU
23  Core 1 flux error      CFLERR01  1E     ADU
.
.
.
42  Core 12 flux           CFL12     1E     ADU
43  Core 12 flux error     CFLERR12  1E     ADU
44  Petrosian radius       PETRORAD  1E     pixels
45  Kron radius            KRONRAD   1E     pixels
46  FWHM radius            FWHMRAD   1E     pixels
47  Petrosian flux         PETFLUX   1E     ADU
48  Petrosian flux error   PETFLERR  1E     ADU
49  Kron flux              KROFLUX   1E     ADU
50  Kron flux error        KROFLERR  1E     ADU
51  FWHM flux              FWHFLUX   1E     ADU
52  FWHM flux error        FWHFLERR  1E     ADU
53  Error bit flag         PROFLAGS  1J
54  Sky level              SKYLEVEL  1E     ADU
55  Sky variance           SKYVAR    1E     ADU
56  Child/parent           BLENDING  1J
57  Right Ascension        RA        1D     degrees
58  Declination            DEC       1D     degrees
59  Classification         ICLASS    1J
60  Profile statistic      PROFSTAT  1E
61  PSF flux               PSFFLUX   1E     ADU
62  PSF flux error         PSFFLERR  1E     ADU
63  PSF fitted X           XPSF      1E     pixels
64  PSF fitted X error     XPSFERR   1E     pixels
65  PSF fitted Y           YPSF      1E     pixels
66  PSF fitted Y error     YPSFERR   1E     pixels
[NB: may need additional celestial PA as well as item 9 (position angle wrt
X axis) for dumb overlay progs that can't understand WCS; do you agree that
57/58 (RA/Dec) need to be doubles?]
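On the question of doubles for RA/Dec, the following Python sketch estimates the quantisation that a 4-byte real imposes on an angle near the top of the RA range. It relies only on the IEEE 754 single-precision format (24 significant bits); the function names are hypothetical, and whether the resulting step matters depends on the astrometric accuracy required.

```python
import math
import struct


def as_float32(value):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack(">f", struct.pack(">f", value))[0]


def float32_spacing(value):
    """Gap between adjacent single-precision values around `value` (> 0)."""
    _, exponent = math.frexp(value)      # value = m * 2**exponent, 0.5 <= m < 1
    return 2.0 ** (exponent - 24)        # 24 significant bits in a 4-byte real


ARCSEC_PER_DEGREE = 3600.0
ARCSEC_PER_RADIAN = math.degrees(1.0) * 3600.0

# Step size for an RA near 359 degrees, stored in degrees and in radians.
step_deg = float32_spacing(359.0) * ARCSEC_PER_DEGREE
step_rad = float32_spacing(math.radians(359.0)) * ARCSEC_PER_RADIAN
print(f"{step_deg:.3f} arcsec, {step_rad:.3f} arcsec")  # about 0.110 and 0.098

# Round-trip error for one illustrative position stored as a 4-byte real.
ra = 359.123456789
print(abs(as_float32(ra) - ra) * ARCSEC_PER_DEGREE)
```

With 4-byte reals the stored positions therefore fall on a grid of roughly 0.1 arcsec at large RA, whichever unit is used, whereas an 8-byte double has a step many orders of magnitude smaller.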
4.2.3 Other data product conventions
- checksums for data verification?
- allowed/logged ranges for attributes, again for verification?
- convention for null or n/a values?
5.0 TRANSFER METHODS & PROCEDURES
5.1 Methods
Transfer will be via the internet using standard methods. The data to
be transferred will reside in Cambridge on specific RAID arrays attached
to a linux PC cluster. WFAU will have an account on this system.
Directories of processed nights' data will be set up as the pipeline runs.
While processing is still running, a directory lock file will be used to
denote the in-progress operations. After completion the lock file will be
removed, enabling a remotely controlled browser script to automatically
initiate data transfer to Edinburgh. Daytime tests between different
locations in the UK give sustained data transfer rates of 4 Mbyte/s and
have been used to copy ~100 Gbytes of data between sites in 5-6 hours.
Alternative transfer methods we have tested include scp, grid-ftp, sftp
etc. (ftp has been dropped since it is not secure).
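The lock file convention described above might be checked on the receiving side along the following lines. This is a sketch under stated assumptions: the lock file name PROCESSING.lock, the one-directory-per-night layout and the function names are all hypothetical, since none of them is fixed by this document.

```python
from pathlib import Path

LOCK_NAME = "PROCESSING.lock"  # hypothetical; the lock file name is not yet agreed


def ready_for_transfer(night_dir, lock_name=LOCK_NAME):
    """True if the night's directory exists and carries no lock file."""
    night_dir = Path(night_dir)
    return night_dir.is_dir() and not (night_dir / lock_name).exists()


def nights_ready(root, lock_name=LOCK_NAME):
    """Names of the night directories under `root` that are no longer locked."""
    return sorted(
        entry.name
        for entry in Path(root).iterdir()
        if ready_for_transfer(entry, lock_name)
    )
```

For this to be safe, the pipeline would need to create the lock file before writing any data into a directory, so that a directory without a lock is never partially filled.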
5.2 Procedure
- location of data is guaranteed by the pipeline and will be in an
observation-date-driven directory structure to which WFAU will have
secure direct access
- "handshaking", eg. notification of readiness, will be achieved using a
lockfile system as outlined above; successful transfer will be verified
by the number and size of files transferred (eg. scp verifies as it goes, so
if the preceding two are OK, is everything fine?)
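Verification by number and size of files could be sketched in Python as below, assuming a manifest (file name to size) can be produced at both ends and compared. The function names are hypothetical, and how the Cambridge manifest would reach Edinburgh is left open.

```python
from pathlib import Path


def manifest(directory):
    """Map each file's path (relative to `directory`) to its size in bytes."""
    directory = Path(directory)
    return {
        str(path.relative_to(directory)): path.stat().st_size
        for path in sorted(directory.rglob("*"))
        if path.is_file()
    }


def compare_manifests(source, destination):
    """Return (missing, size_mismatch): files absent from the destination and
    files whose sizes differ. Both lists empty means counts and sizes agree."""
    missing = sorted(name for name in source if name not in destination)
    mismatched = sorted(
        name
        for name in source
        if name in destination and source[name] != destination[name]
    )
    return missing, mismatched
```

Matching counts and sizes show that every file arrived complete in length, but they do not by themselves detect altered content; that would need a checksum of the kind raised in Section 4.2.3.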
5.3 Updates
- reruns in case of bug fixes, improvements in instrumental correction,
improvements in source extraction: any additional interface issues
resulting from this possibility/likelihood(!)?
6.0 BACKUPS AND OTHER SECURITY ISSUES
- raw data will be held online in Cambridge as the primary UK backup.
Raw data will also be archived/stored at the JAC
- security: secure transfer, restricted access to computers, firewalls etc.
7.0 SUMMARY
REFERENCES
[1] ESO Data Interface Control Document, GEN-SPE-ESO-19940-794/2.0
http://archive.eso.org/DICB/dic-2.0/dic-2.0.4.pdf
[2] VDFS document...?
[3] ATC WFCAM HDS container and FITS headers, WFCAM project Document No. ?
[4] JAC-CASU Interface Control Document,
http://www.jach.hawaii.edu/JACpublic/UKIRT/instruments/wfcam/ICD/
[5] WFCAM Pipeline Design
http://www.ast.cam.ac.uk/~wfcam/docs/wfcampipedoc_v2.ps.gz
[6] WFCAM/VISTA Science Archive Development
http://www.roe.ac.uk/~nch/wfcam/
[7] WFCAM Science Archive hardware design document,
http://www.roe.ac.uk/~nch/wfcam/...
[8] Definition of the Flexible Image Transport System (FITS), document
NOST 100-2.0
http://fits.gsfc.nasa.gov/fits_home.html
[9] Representations of world co-ordinates in FITS
Greisen EW, Calabretta MR, A&A, 395, 1061 (2002)
[10] Representations of celestial co-ordinates in FITS
Calabretta MR, Greisen EW, A&A, 395, 1077 (2002)
GLOSSARY
APPENDICES
Last modified: Wed Mar 5 12:37:31 2003