CASU/ORACDR Pipeline Algorithm Comparative Tests
(Document number VDF-TRE-IOA-00007-0001)
Jim Lewis
Draft 20040119
Introduction
Before the start of WFCAM commissioning and operations, there is a need
to prove that the WFCAM pipeline algorithms work to specification. However,
given that no real WFCAM data exists, this is clearly not possible at present.
On the other hand it is possible to use data from other IR imagers at least
to prove that the software that does the data manipulation and correction
works in the expected way.
Apart from getting the right answer, the WFCAM pipeline must also run at the
summit with the data reduction environment that currently exists at UKIRT
(ORACDR). Non-WFCAM data can also be used to test whether the implementation
of the WFCAM software has been done correctly within ORACDR.
In this paper I will present the results of some of the tests we have
done on the WFCAM pipeline software. Further tests will be added as they
are finished, hence the word 'draft' near the title above.
The CASU pipeline has already been installed and tested at JAC within ORACDR
as part of the testing programme. The source has been made available to JAC
from a CVS repository.
UFTI
The UKIRT imager UFTI already has a pipeline that runs within ORACDR. The
routines that do the actual processing are chosen from the Starlink collection,
hence files are in Starlink's internal NDF format. As a way of learning
how the ORACDR environment works I wrote a second UFTI pipeline that reduces
the data in FITS format and uses the CASU data reduction module collection
(CIRDR) to do the
number crunching. Not all of the UFTI data reduction recipes were reproduced,
as this was only intended to be a 'learning exercise'.
As part of the testing procedure on both the pipeline modules and our
ability to implement a pipeline in ORACDR, we decided to run a small amount
of data through both pipelines. The tests consisted of reducing some jitter
observations from a single night (20020426) with both pipelines and comparing
them for speed and consistency of results. Since both pipelines
reduce the data in a very similar way, the results in terms of
sky background, image shape,
photometry and positional accuracy should be roughly the same. The
data were chosen so that they use a fairly simple recipe (JITTER_SELF_FLAT).
The tests were run on a Dell Inspiron 8100 laptop with 512 MB of internal
memory and a 1.2 GHz Pentium III chip. In order to cut down on the
amount of time spent plotting, the reductions were done without the ORACDR
GUI and without any image display. For reference, the command used
was:
%oracdr -ut 20020426 -loop list -list 1:4,36:60 -log s -nodisplay
In what follows, I will refer to the standard Starlink based UFTI pipeline
as the 'Standard' pipeline. The second CIRDR based pipeline will
be the 'CASU' pipeline.
Briefly, the recipe in the Standard pipeline does the following:
- Subtract the dark frame
- Do a two-pass self flat field calibration. That is, the target images
are combined into a night-sky flat field image. The images are divided
by this flat and objects are located and masked out. The original
dark corrected target frames are then combined a second time, with the
objects masked out. This second pass flat is then used to do the
flat field correction for the target images.
- Jitter offsets are calculated from the positions of objects on each image
and using information contained in the headers
- A stacked image is created by combining the target frames with the jitter
offsets, using a sub-pixel resampling algorithm
- Any remaining bad pixels in the image stack are interpolated out (NB: this
is one step that the CASU pipeline does not do since it makes more extensive
use of confidence maps. See below.)
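The two-pass self flat described in the list above can be illustrated with a short NumPy sketch. This is not the code used by either pipeline: the function names, the median combination, and the simple threshold used to find objects are all assumptions made for illustration.

```python
import numpy as np

def self_flat(frames, mask=None):
    """Median-combine sky-normalised frames into a night-sky flat.

    frames: array (nframes, ny, nx) of dark-subtracted images.
    mask:   optional boolean array of the same shape; True pixels are ignored.
    """
    data = np.array(frames, dtype=float)      # copy, so the input is untouched
    if mask is not None:
        data[mask] = np.nan
    # Normalise each frame by its own median sky level before combining
    norm = data / np.nanmedian(data, axis=(1, 2), keepdims=True)
    flat = np.nanmedian(norm, axis=0)
    return flat / np.nanmedian(flat)

def two_pass_flat(dark_subtracted, nsigma=3.0):
    """Hypothetical two-pass self flat: flat, mask objects, flat again."""
    frames = np.asarray(dark_subtracted, dtype=float)
    flat1 = self_flat(frames)                 # first pass, objects included
    flattened = frames / flat1
    # Crude object detection (an assumption): pixels more than nsigma robust
    # sigmas above the median of each flattened frame
    med = np.median(flattened, axis=(1, 2), keepdims=True)
    mad = np.median(np.abs(flattened - med), axis=(1, 2), keepdims=True)
    objmask = flattened > med + nsigma * 1.4826 * mad
    flat2 = self_flat(frames, objmask)        # second pass, objects masked
    return flat2, frames / flat2
```

The sketch relies on the jitter pattern moving objects to different pixels in each frame, so that no pixel is masked in every frame.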
In addition to the previous steps, the CASU recipe does the following:
- Before any processing the input file is translated from NDF to FITS.
- During the flat field combination phase a confidence map is generated.
This is essentially a combined exposure weight and bad pixel map that can be
used in further processing of the data. Bad pixels are assigned a zero
confidence and are ignored in further processing.
- A temporary catalogue of bright objects is generated and a first pass WCS
is fit to the positions using the 2MASS object catalogue as an astrometric
standard grid. This is used to calculate a pointing error by comparing
the predicted coordinate for the field centre using this WCS and the WCS
implied by the raw data file header. The rms alignment error coupled with the
number of stars used in the match provides a further diagnostic.
- A final catalogue of objects is generated.
- The objects in the catalogue are morphologically classified as stellar,
non-stellar or noise.
- A second pass WCS is fit using the objects in the final catalogue and 2MASS.
- A photometric zero-point is calculated using the 2MASS objects found in the
final catalogue.
- A DQC index is created with measurements of astrometric accuracy, photometric
accuracy, mean seeing, sky level, sky noise and mean object ellipticity.
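As an illustration of the confidence map idea in the list above, the following sketch builds a map from an exposure weight array and a bad pixel mask, then uses it to weight a stack. The function names and the scaling of the median good-pixel confidence to 100 are assumptions made for this example, not a description of the CIRDR code.

```python
import numpy as np

def confidence_map(weights, bad_pixels):
    """Combine an exposure weight map and a bad pixel mask into one map.

    Bad pixels get zero confidence. Good pixels are scaled so that their
    median confidence is 100 (the scale is an assumption for this sketch).
    """
    conf = np.array(weights, dtype=float)     # copy of the weight map
    conf[np.asarray(bad_pixels, dtype=bool)] = 0.0
    good = conf > 0
    if good.any():
        conf *= 100.0 / np.median(conf[good])
    return conf

def weighted_stack(frames, confs):
    """Confidence-weighted mean of aligned frames.

    Zero-confidence pixels contribute nothing. Returns the stacked image
    and the summed confidence, which is zero where no frame had good data.
    """
    frames = np.asarray(frames, dtype=float)
    confs = np.asarray(confs, dtype=float)
    total = confs.sum(axis=0)
    stacked = np.divide((frames * confs).sum(axis=0), total,
                        out=np.zeros_like(total), where=total > 0)
    return stacked, total
```

Because bad pixels carry zero weight, they drop out of the stack without any interpolation step, which is the reason given above for the CASU recipe omitting the bad pixel interpolation.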
The 2MASS data used in the photometric and astrometric calibration in the
CASU recipe are obtained from VizieR (either from CDS, CASU or JAC) and
hence the CASU pipeline needs an internet connection to run to completion.
The CASU pipeline can run from a local FITS table as well and this can be
used to circumvent external dependencies.
Running these frames (essentially, two jitter sequences and a set of
array tests) through both pipelines yielded the following timing results:
Pipeline | Run time (s) |
Standard | 354 |
CASU | 174 |
Using this simple recipe the CASU pipeline
runs about a factor of 2 faster. Whether this is the case generally
is beyond the scope of this little test. It is, however, quite probable that
a large part of the difference is due to the resampling that takes place in
the Standard pipeline during the stacking phase.
In order to compare the photometric and positional consistency of the
two reduction methods a catalogue was generated for each of the stacked
images using the catalogue generation program from CIRDR (imcore). The
catalogues were generated in the following way:
- An aperture (Rcore) with a radius of 10 pixels (0.91 arcsec) was used.
Objects covering an area of less than 40 pixels were ignored.
- Because the Standard pipeline uses an interpolation algorithm for rebinning
during stacking and the current CASU pipeline rebins using the
'nearest neighbour',
the sky noise estimate for the Standard pipeline stacks will be significantly
lower. (Interpolation redistributes the sky
noise off the diagonal elements of the covariance matrix and hence the noise
is no longer uncorrelated from pixel to pixel. Smoothing has a similar effect.)
To compensate for this, the detection thresholds (as expressed in
units of the mean sky noise) were adjusted for the Standard pipeline images
so that they matched those of the CASU catalogues (the latter used a detection
threshold of 1.5 times the "mean" sky noise). Note that both the sky level
and sky noise values are derived using robust estimators.
- Because there were no confidence maps available for the Standard stack
frames, the CASU catalogues were re-generated without a confidence map
for the sake of consistency in the testing.
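The effect of the rebinning scheme on the measured sky noise can be demonstrated with synthetic white noise. The sketch below is an illustration only: the shift functions are hypothetical stand-ins for the stacking code, and linear interpolation is assumed because the interpolation kernel used by the Standard pipeline is not stated here.

```python
import numpy as np

def shift_nearest(img, dx):
    """Shift along x by the nearest whole pixel (no interpolation)."""
    return np.roll(img, int(np.floor(dx + 0.5)), axis=1)

def shift_linear(img, dx):
    """Shift along x by a fractional amount using linear interpolation."""
    whole = int(np.floor(dx))
    frac = dx - whole
    return (1.0 - frac) * np.roll(img, whole, axis=1) \
        + frac * np.roll(img, whole + 1, axis=1)

def noise_after_shift(dx, seed=42, shape=(256, 256)):
    """Return (nearest-neighbour noise, interpolated noise, adjacent-pixel
    correlation of the interpolated image) for unit white noise."""
    rng = np.random.default_rng(seed)
    sky = rng.normal(0.0, 1.0, size=shape)
    near = shift_nearest(sky, dx)
    lin = shift_linear(sky, dx)
    # Correlation between each pixel and its right-hand neighbour
    corr = np.corrcoef(lin[:, :-1].ravel(), lin[:, 1:].ravel())[0, 1]
    return near.std(), lin.std(), corr
```

For a fractional shift f, linear interpolation reduces the per-pixel variance by a factor (1-f)^2 + f^2 and introduces a correlation between neighbouring pixels, while the nearest neighbour shift leaves the noise unchanged. A detection threshold quoted in units of the measured sky noise therefore means different things for the two kinds of stack.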
A subset of the available catalogue parameters were used for comparison.
These include the following:
Parameter | Description |
X coordinate | The X coordinate of the object (pixels) |
Y coordinate | The Y coordinate of the object (pixels) |
Isophotal flux | The flux within the detection isophote |
Total flux | The total flux of the object (Kron style) |
Core flux | The flux through an aperture of radius Rcore |
Core 4 flux | The flux through an aperture of radius 2*sqrt(2)*Rcore |
Object Ellipticity | An estimate of the ellipticity for each object |
Gaussian Width | The equivalent Gaussian width estimate for each object (pixels) |
Note that imcore generates object brightness in terms of data numbers
rather than magnitudes. Magnitudes are generated for these comparisons
by 2.5*log10(counts), so that brighter objects have a higher magnitude.
In the residuals below the sense is always
CASU minus Standard. In what follows, for the sake of brevity, we restrict
the analysis to objects on the first image stack that was reduced (group
number 36). The other groups give similar results.
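A minimal sketch of how the magnitudes and residuals could be computed is given below. It assumes the two catalogues have already been matched object by object, so that the input arrays are in the same order; the function names are hypothetical.

```python
import numpy as np

def instrumental_mag(counts):
    """Magnitude as defined in the text: 2.5*log10(counts).

    There is no minus sign, so brighter objects have higher values.
    """
    return 2.5 * np.log10(np.asarray(counts, dtype=float))

def residual_stats(casu_counts, standard_counts):
    """Mean and standard deviation of the residuals, CASU minus Standard."""
    resid = instrumental_mag(casu_counts) - instrumental_mag(standard_counts)
    return resid.mean(), resid.std(ddof=1)
```

With this sign convention a positive mean residual means the objects are brighter in the CASU catalogue than in the Standard one.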
Parameter | Mean Residual | Standard Deviation |
X coordinate | 1.000 | 0.227 |
Y coordinate | 1.070 | 0.146 |
Isophotal Magnitude | 0.044 | 0.066 |
Total Magnitude | -0.032 | 0.134 |
Core Magnitude | 0.014 | 0.033 |
Core 4 Magnitude | 0.032 | 0.068 |
Gaussian Width | 0.057 | 0.096 |
Object Ellipticity | 0.001 | 0.013 |
There is a significant mean residual in the x,y coordinates.
This is simply an artifact of the different stacking algorithms used
and how the origin is assigned. The standard deviations in the coordinates
are also a remnant of this, in that the Standard pipeline rebins using
some form of interpolation, whereas the current CASU pipeline uses the nearest
neighbour.
All of the magnitude estimates reproduce to a good level of consistency.
The standard deviation of the larger aperture magnitudes increases due
to the larger contribution from the (noisy) background. The large mean
residual and standard deviation in the total magnitude estimate are caused
almost entirely by two deviant points (two fuzzy images). If these are
removed then the mean residual and standard deviation drop to 0.016 and
0.037 respectively.
Finally, the two shape parameters we have included, the Gaussian width
and the object ellipticity, both agree very well. The CASU objects appear
to be slightly larger than the Standard pipeline objects, which is to be
expected given the 'nearest neighbour' rebinning scheme used by the CASU
pipeline. The mean Gaussian width for this particular tile was 2.20 pixels,
hence this residual corresponds to roughly 2.5%.
Below are comparison plots for the four magnitude estimates we've included.
A comparison of core magnitudes
A comparison of core4 magnitudes
A comparison of isophotal magnitudes
A comparison of total magnitudes
The background following algorithm in imcore
divides the map into cells approximately 64x64 pixels in dimension. A
robust iterative clipped
median background value is calculated for each cell and the final background
map is created by a bilinear interpolation between these cells for each
pixel in the image map. As a test of the flatness of the backgrounds
of the output image stacks we have created background maps for each image.
Disregarding some small regions on the outside of each frame where the
confidence is very low or zero (no coverage in the jitter pattern), the
maximum background variation in the images is roughly 0.15%. Below
is a display of the ratio of the two background maps. The range of values
between the lower left-hand corner and the upper right is approximately
0.1%.
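The background following scheme described above can be sketched as follows. This is an illustration under stated assumptions rather than the imcore implementation: the clipping parameters, the handling of image edges, and the function names are all choices made for this example.

```python
import numpy as np

def clipped_median(values, nsigma=3.0, niter=5):
    """Iteratively clipped median using a MAD-based robust sigma."""
    v = np.asarray(values, dtype=float).ravel()
    for _ in range(niter):
        med = np.median(v)
        sigma = 1.4826 * np.median(np.abs(v - med))
        keep = np.abs(v - med) <= nsigma * sigma
        if sigma == 0 or keep.all():
            break                              # nothing left to clip
        v = v[keep]
    return np.median(v)

def background_map(image, cell=64):
    """Background map from per-cell clipped medians, bilinearly interpolated."""
    image = np.asarray(image, dtype=float)
    ny, nx = image.shape
    ncy, ncx = max(ny // cell, 1), max(nx // cell, 1)
    yedges = np.linspace(0, ny, ncy + 1).astype(int)
    xedges = np.linspace(0, nx, ncx + 1).astype(int)
    grid = np.empty((ncy, ncx))
    for j in range(ncy):
        for i in range(ncx):
            grid[j, i] = clipped_median(
                image[yedges[j]:yedges[j + 1], xedges[i]:xedges[i + 1]])
    # Pixel coordinates of the cell centres
    yc = 0.5 * (yedges[:-1] + yedges[1:] - 1)
    xc = 0.5 * (xedges[:-1] + xedges[1:] - 1)
    # Bilinear interpolation: along x for each row of cells, then along y.
    # np.interp holds the end values constant outside the outermost centres.
    rows = np.array([np.interp(np.arange(nx), xc, g) for g in grid])
    bkg = np.array([np.interp(np.arange(ny), yc, rows[:, k])
                    for k in range(nx)]).T
    return bkg
```

Given two aligned stacks `a` and `b`, the flatness comparison amounts to forming `ratio = background_map(a) / background_map(b)` and looking at the spread of `ratio` over the region with good coverage.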
Jim Lewis <jrl@ast.cam.ac.uk>
Last modified: Mon Feb 16 16:50:22 2004