CASU/ORACDR Pipeline Algorithm Comparative Tests

(Document number VDF-TRE-IOA-00007-0001)

Jim Lewis
Draft 20040119

Introduction

Before the start of WFCAM commissioning and operations, there is a need to prove that the WFCAM pipeline algorithms work to specification. However, given that no real WFCAM data exist, this is clearly not possible at present. On the other hand, it is possible to use data from other IR imagers at least to prove that the software that does the data manipulation and correction works in the expected way.

Apart from getting the right answer, the WFCAM pipeline must also run at the summit with the data reduction environment that currently exists at UKIRT (ORACDR). Non-WFCAM data can also be used to test whether the implementation of the WFCAM software has been done correctly within ORACDR.

In this paper I will present the results of some of the tests we have done on the WFCAM pipeline software. As more tests are finished they will be added, hence the word 'draft' near the title above.

The CASU pipeline has already been installed and tested at JAC within ORACDR as part of the testing programme. The source has been made available to JAC from a CVS repository.

UFTI

The UKIRT imager UFTI already has a pipeline that runs within ORACDR. The routines that do the actual processing are chosen from the Starlink collection, hence files are in Starlink's internal NDF format. As a way of learning how the ORACDR environment works I wrote a second UFTI pipeline that reduces the data in FITS format and uses the CASU data reduction module collection (CIRDR) to do the number crunching. Not all of the UFTI data reduction recipes were reproduced, as this was only intended to be a 'learning exercise'.

As part of the testing procedure on both the pipeline modules and our ability to implement a pipeline in ORACDR, we decided to run a small amount of data through both pipelines. The tests consisted of reducing some jitter observations from a single night (20020426) with both pipelines and comparing them for speed and consistency of results. Given that both pipelines reduce the data in a very similar way, the results in terms of the sky background, image shape, photometry and positional accuracy should be roughly the same. The data were chosen so that they use a fairly simple recipe (JITTER_SELF_FLAT).

The tests were run on a Dell Inspiron 8100 laptop with 512 MB of internal memory and a 1.2 GHz Pentium III chip. In order to cut down on the amount of time spent plotting, the reductions were done without the ORACDR GUI and without any image display. For reference, the command used was:

%oracdr -ut 20020426 -loop list -list 1:4,36:60 -log s -nodisplay

In what follows, I will refer to the standard Starlink based UFTI pipeline as the 'Standard' pipeline. The second CIRDR based pipeline will be the 'CASU' pipeline.

Briefly, the recipe in the Standard pipeline does the following:

Subtract the dark frame
Do a two-pass self flat field calibration. That is, the target images are combined into a night-sky flat field image. The images are divided by this flat and objects are located and masked out. The original dark corrected target frames are then combined a second time, with the objects masked out. This second pass flat is then used to do the flat field correction for the target images.
Jitter offsets are calculated from the positions of objects on each image and using information contained in the headers
A stacked image is created by combining the target frames with the jitter offsets, using a sub-pixel resampling algorithm
Any remaining bad pixels in the image stack are interpolated out (NB: this is one step that the CASU pipeline does not do since it makes more extensive use of confidence maps. See below.)
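
The two-pass self flat step above can be sketched as follows. This is only an illustration under stated assumptions, not the code of either pipeline: the function name, the use of a median as the combination method, and the simple threshold used to flag objects are all assumptions made for the example.

```python
import numpy as np

def two_pass_self_flat(frames, nsigma=3.0):
    """Sketch of a two-pass night-sky flat from dark-subtracted frames.

    frames: array of shape (nframes, ny, nx). Median combination and the
    nsigma object threshold are assumptions made for this illustration.
    """
    frames = np.asarray(frames, dtype=float)
    # Scale each frame by its own median so that different sky levels combine sensibly
    norm = frames / np.median(frames, axis=(1, 2), keepdims=True)

    # Pass 1: combine the jittered frames into a first night-sky flat
    flat1 = np.median(norm, axis=0)
    flat1 /= np.median(flat1)

    # Divide by the first-pass flat and flag pixels well above the sky as objects
    flattened = norm / flat1
    med = np.median(flattened, axis=(1, 2), keepdims=True)
    mad = np.median(np.abs(flattened - med), axis=(1, 2), keepdims=True)
    objects = flattened > med + nsigma * 1.4826 * mad

    # Pass 2: recombine the original (unflattened) frames with the objects masked out
    masked = np.where(objects, np.nan, norm)
    flat2 = np.nanmedian(masked, axis=0)
    # Where every frame was masked, fall back to the first-pass value
    flat2 = np.where(np.isnan(flat2), flat1, flat2)
    return flat2 / np.median(flat2)
```

The target frames would then be divided by the returned flat. Because the jitter moves the objects between frames, each pixel sees sky in most of the frames, which is what makes the self flat possible.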

In addition to the previous steps, the CASU recipe does the following:

Before any processing the input file is translated from NDF to FITS.
During the flat field combination phase a confidence map is generated. This is essentially a combined exposure weight and bad pixel map that can be used in further processing of the data. Bad pixels are assigned a zero confidence and are ignored in further processing.
A temporary catalogue of bright objects is generated and a first pass WCS is fit to the positions using the 2MASS object catalogue as an astrometric standard grid. This is used to calculate a pointing error by comparing the predicted coordinate for the field centre using this WCS and the WCS implied by the raw data file header. The rms alignment error coupled with the number of stars used in the match provides a further diagnostic.
A final catalogue of objects is generated.
The objects in the catalogue are morphologically classified as stellar, non-stellar or noise.
A second pass WCS is fit using the objects in the final catalogue and 2MASS.
A photometric zero-point is calculated using the 2MASS objects found in the final catalogue.
A DQC index is created with measurements of astrometric accuracy, photometric accuracy, mean seeing, sky level, sky noise and mean object ellipticity
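
As an illustration of the confidence map idea described above, the following sketch combines an exposure weight image with a bad pixel mask and then uses the result to weight a stack. The function names, the scaling of the map (median of the good pixels set to 100) and the weighted mean itself are assumptions made for this example; the text only states that bad pixels get zero confidence and are ignored in later processing.

```python
import numpy as np

def confidence_map(weight, bad_mask, scale=100.0):
    """Combine an exposure-weight image and a bad-pixel mask into one map.

    Scaling the median of the good pixels to `scale` is an assumption
    made for this sketch. Bad pixels are given zero confidence.
    """
    weight = np.asarray(weight, dtype=float)
    bad = np.asarray(bad_mask, dtype=bool)
    conf = np.zeros_like(weight)
    good = ~bad & (weight > 0)
    if good.any():
        conf[good] = scale * weight[good] / np.median(weight[good])
    return conf

def weighted_stack(images, confs):
    """Confidence-weighted mean of aligned images (hypothetical helper).

    Pixels with zero confidence contribute nothing. Output pixels with
    no coverage at all are returned as NaN with zero total confidence.
    """
    images = np.asarray(images, dtype=float)
    confs = np.asarray(confs, dtype=float)
    total = confs.sum(axis=0)
    numerator = (images * confs).sum(axis=0)
    stacked = np.full(total.shape, np.nan)
    np.divide(numerator, total, out=stacked, where=total > 0)
    return stacked, total
```

The second return value is the summed confidence of the stack, which could serve as the confidence map for the stacked image in any later step.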

The 2MASS data used in the photometric and astrometric calibration in the CASU recipe are obtained from VizieR (either from CDS, CASU or JAC) and hence the CASU pipeline needs an internet connection to run to completion. The CASU pipeline can run from a local FITS table as well and this can be used to circumvent external dependencies.

Running these frames (essentially, two jitter sequences and a set of array tests) through both pipelines yielded the following timing results:

Pipeline	Run time (s)
Standard	354
CASU	174

Using this simple recipe the CASU pipeline runs about a factor of 2 faster. Whether this is the case generally is beyond the scope of this little test. It is, however, quite probable that a large amount of the difference is in the resampling that takes place in the Standard pipeline during the stacking phase.

In order to compare the photometric and positional consistency of the two reduction methods a catalogue was generated for each of the stacked images using the catalogue generation program from CIRDR (imcore). The catalogues were generated in the following way:

An aperture (Rcore) of radius 10 pixels (0.91 arcsec) was used. Objects covering an area of less than 40 pixels were ignored.
Because the Standard pipeline uses an interpolation algorithm for rebinning during stacking and the current CASU pipeline rebins using the 'nearest neighbour', the sky noise estimate for the Standard pipeline stacks will be significantly less. (Interpolation redistributes the sky noise off the diagonal elements of the covariance matrix and hence the noise is no longer uncorrelated from pixel to pixel. Smoothing has a similar effect.) To compensate for this, the detection thresholds (as expressed in units of the mean sky noise) were adjusted for the Standard pipeline images so that they matched those of the CASU catalogues (the latter used a detection threshold of 1.5 times the "mean" sky noise). Note that both the sky level and sky noise values are derived using robust estimators.
Because there were no confidence maps available for the Standard stack frames, the CASU catalogues were re-generated without a confidence map for the sake of consistency in the testing.
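
The effect of interpolation on the measured sky noise can be demonstrated with a small one-dimensional experiment. This is a sketch only: it uses linear interpolation at a half-pixel offset as a stand-in for whatever resampling the Standard pipeline actually applies, and the function names are made up for the example.

```python
import random
import statistics

def half_pixel_shift(values):
    """Linear interpolation at a half-pixel offset: the mean of each pair of neighbours."""
    return [0.5 * (a + b) for a, b in zip(values, values[1:])]

def lag1_correlation(values):
    """Sample correlation between each pixel and its right-hand neighbour."""
    mean = statistics.fmean(values)
    num = sum((a - mean) * (b - mean) for a, b in zip(values, values[1:]))
    den = sum((v - mean) ** 2 for v in values)
    return num / den

rng = random.Random(1)
sky = [rng.gauss(0.0, 1.0) for _ in range(100_000)]  # uncorrelated, unit-variance sky noise
resampled = half_pixel_shift(sky)

# Expected from theory for this case: noise ratio sqrt(0.5), lag-1 correlation 0.5
noise_ratio = statistics.pstdev(resampled) / statistics.pstdev(sky)
print(f"noise ratio after interpolation: {noise_ratio:.3f}")
print(f"lag-1 correlation before: {lag1_correlation(sky):.3f}")
print(f"lag-1 correlation after:  {lag1_correlation(resampled):.3f}")
```

The per-pixel noise drops while neighbouring pixels become correlated, so a threshold quoted as a multiple of the measured sky noise sits at a lower absolute level on an interpolated stack than on a nearest neighbour one. Nearest neighbour rebinning copies pixel values unchanged and so leaves the per-pixel noise as it was.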

A subset of the available catalogue parameters was used for comparison. These include the following:


Parameter	Description
X coordinate	The X coordinate of the object (pixels)
Y coordinate	The Y coordinate of the object (pixels)
Isophotal flux	The flux within the detection isophote
Total flux	The total flux of the object (Kron style)
Core flux	The flux through an aperture of radius Rcore
Core 4 flux	The flux through an aperture of radius 2sqrt(2)Rcore
Object Ellipticity	An estimate of the ellipticity for each object
Gaussian Width	The equivalent Gaussian width estimate for each object (pixels)

Note that imcore generates object brightness in terms of data numbers rather than magnitudes. Magnitudes are generated for these comparisons by 2.5*log10(counts), so that brighter objects have a higher magnitude. In the residuals below the sense is always CASU minus Standard. In what follows, for the sake of brevity, we restrict the analysis to objects on the first image stack that was reduced (group number 36). The other groups give similar results.
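
For concreteness, the magnitude conversion and the residual statistics can be written as below. The flux values are invented purely to show the calculation, the function names are hypothetical, and the objects are assumed to have already been matched between the two catalogues. Whether the quoted standard deviations are sample or population values is not stated, so the sample form used here is an assumption.

```python
import math
import statistics

def magnitude(counts):
    """Magnitude as defined in the text: 2.5*log10(counts).

    There is no minus sign, so brighter objects have larger values.
    """
    return 2.5 * math.log10(counts)

def residual_stats(casu, standard):
    """Mean and sample standard deviation of (CASU minus Standard) for matched pairs."""
    diffs = [c - s for c, s in zip(casu, standard)]
    return statistics.fmean(diffs), statistics.stdev(diffs)

# Hypothetical core fluxes (data numbers) for the same four objects in each catalogue
casu_flux = [1000.0, 5200.0, 820.0, 15000.0]
standard_flux = [990.0, 5100.0, 830.0, 14800.0]

casu_mag = [magnitude(f) for f in casu_flux]
standard_mag = [magnitude(f) for f in standard_flux]
mean_res, sd_res = residual_stats(casu_mag, standard_mag)
print(f"mean residual = {mean_res:+.3f} mag, standard deviation = {sd_res:.3f} mag")
```

With this sign convention a positive mean residual means that the CASU catalogue records the objects as slightly brighter than the Standard one.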

Parameter	Mean Residual	Standard Deviation
X coordinate	1.000	0.227
Y coordinate	1.070	0.146
Isophotal Magnitude	0.044	0.066
Total Magnitude	-0.032	0.134
Core Magnitude	0.014	0.033
Core 4 Magnitude	0.032	0.068
Gaussian Width	0.057	0.096
Object Ellipticity	0.001	0.013

There is a significant mean residual in the x,y coordinates. This is simply an artifact of the different stacking algorithms used and how the origin is assigned. The standard deviations in the coordinates are also a remnant of this, in that the Standard pipeline rebins using some form of interpolation, whereas the current CASU pipeline uses the nearest neighbour.

All of the magnitude estimates reproduce to a good level of consistency. The standard deviation of the larger aperture magnitudes increases due to the larger contribution from the (noisy) background. The large mean residual and standard deviation in the total magnitude estimate are caused almost entirely by two deviant points (two fuzzy images). If these are removed, the values drop to 0.016 and 0.037 respectively.

Finally, the two shape parameters we have included, the Gaussian width and the object ellipticity, both agree very well. The CASU objects appear to be slightly larger than the Standard pipeline objects, which is to be expected given the 'nearest neighbour' rebinning scheme used by the CASU pipeline. The mean Gaussian width for this particular tile was 2.20 pixels, hence this residual (0.057 pixels) corresponds to roughly 2.6%.

Below are comparison plots for the four magnitude estimates we've included.

A comparison of core magnitudes

A comparison of core4 magnitudes

A comparison of isophotal magnitudes

A comparison of total magnitudes

The background following algorithm in imcore divides the map into cells approximately 64x64 pixels in dimension. A robust iterative clipped median background value is calculated for each cell and the final background map is created by a bilinear interpolation between these cells for each pixel in the image map. As a test of the flatness of the backgrounds of the output image stacks we have created background maps for each image. Disregarding some small regions on the outside of each frame where the confidence is very low or zero (no coverage in the jitter pattern), the maximum background variation in the images is roughly 0.15%. Below is a display of the ratio of the two background maps. The range of values between the lower left-hand corner and the upper right is approximately 0.1%.
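
The background estimate described above can be sketched roughly as follows. This is not the imcore code: the clipping rule (a MAD-based sigma with a fixed number of iterations), the way cell boundaries are laid out and the treatment of the image edges (held at the nearest cell value) are all assumptions made for the illustration.

```python
import numpy as np

def clipped_median(values, nsigma=3.0, niter=5):
    """Iteratively clipped median using a MAD-based noise estimate (assumed clipping rule)."""
    v = np.asarray(values, dtype=float).ravel()
    for _ in range(niter):
        med = np.median(v)
        sigma = 1.4826 * np.median(np.abs(v - med))
        keep = np.abs(v - med) <= nsigma * sigma
        if sigma == 0 or keep.all():
            break
        v = v[keep]
    return np.median(v)

def background_map(image, cell=64):
    """Background map from per-cell clipped medians, bilinearly interpolated to every pixel."""
    image = np.asarray(image, dtype=float)
    ny, nx = image.shape
    ncy, ncx = max(ny // cell, 1), max(nx // cell, 1)
    # Spread the cell edges evenly so that cells are approximately cell x cell pixels
    yedges = np.linspace(0, ny, ncy + 1).astype(int)
    xedges = np.linspace(0, nx, ncx + 1).astype(int)

    grid = np.empty((ncy, ncx))
    for j in range(ncy):
        for i in range(ncx):
            grid[j, i] = clipped_median(
                image[yedges[j]:yedges[j + 1], xedges[i]:xedges[i + 1]])

    # Cell centres in pixel coordinates
    yc = 0.5 * (yedges[:-1] + yedges[1:] - 1)
    xc = 0.5 * (xedges[:-1] + xedges[1:] - 1)

    # Bilinear interpolation done separably: along x for each cell row, then along y.
    # np.interp holds the end values constant outside the outermost cell centres.
    along_x = np.array([np.interp(np.arange(nx), xc, row) for row in grid])
    return np.array([np.interp(np.arange(ny), yc, col) for col in along_x.T]).T
```

Given background maps for the two stacks, a ratio such as `background_map(casu_stack) / background_map(standard_stack)` (with hypothetical array names) gives the kind of image described above, and the spread of its values measures the relative flatness.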

Jim Lewis <jrl@ast.cam.ac.uk>

Last modified: Mon Feb 16 16:50:22 2004