HSC Data Reduction Tutorial

This tutorial is based on an English translation of one for the HSC data reduction school held in July 20th and 21st, 2016. Here, you will reduce HSC data using the HSC pipeline, create pseudo-color images, and make a color–magnitude diagram. Note that some of the command line options are optimized for the school.
You can also see the HSC pipeline manual and a tutorial on the HSC pipeline binary distribution for the details. For an advanced user, this link will be helpful. It describes more details of the pipeline, though you need to be careful as it contains information on older version of the pipeline.
In this tutorial, we will use science frames of HSC consisting of 5 shots of central 6 CCDs taken in 3 bands (i, z, Y) and corresponding dome flat frames.
The version 4.0.5 of HSC pipeline is used throughout the tutorial.

Installation of the HSC pipeline and astrometry data

Please install the pipeline following the instruction in the links below.

Prepare raw data

Let’s start from downloading raw data.

# Create a working directory
# Set properly as the environment
mkdir /XXX/YYY/ZZZ
export HSCHOME=/XXX/YYY/ZZZ

# Download raw data to the local storage
wget http://hsc.mtk.nao.ac.jp/HSC_Training_download/hdfn-6CCD.tar.gz -P $HSCHOME

# Extract raw data
cd $HSCHOME
tar xvf hdfn-6CCD.tar.gz

# Check the raw data
cd hdfn-6CCD/
ls | wc -l
# It must return 180

# Check an image of your choice with a FITS viewer
ds9 xxxxx.fits

Next step is to make a directory for data analysis and registry.

# Set HSC pipeline commands (the directory should be set properly as the environment)
source /opt/hscpipe/4.0.5/bashrc

# Run a set-up command to call the HSC pipeline (need to execute at each login)
setup-hscpipe

# Create a data analysis directory (parent directory of the repository)
mkdir $HSCHOME/HSC

# Create a _mapper file (Initialize the directory)
echo 'lsst.obs.hsc.HscMapper' > $HSCHOME/HSC/_mapper

Check if a file named _mapper exists under $HSCHOME/HSC . If you find it, you finished the preparation of the anaysis directory.

Locate raw data in the anaysis directory and register them.

# Move to the directory where raw data are stored
cd $HSCHOME/hdfn-6CCD

# Make a registry under the analysis directory
hscIngestImages.py $HSCHOME/HSC  HSC*.fits --mode=link --create
You will find DOMEFLAT and HDFN directories under $HSCHOME/HSC as well as a file, registry.sqlite3 .
Now we are going to check information of the registered data. Information in registry.sqlite3 can be searched by using a database language called SQL. Details of SQL is out of the scope of this tutorial.
cd $HSCHOME/HSC
sqlite3 $HSCHOME/HSC/registry.sqlite3

sqlite> .header on
sqlite> SELECT visit, filter, field, count(visit)
        FROM raw
        GROUP BY visit, filter, field;

# As seen below, the right-most column (the number of CCDs) should be all 6.
visit|filter|field|count(visit)
212|HSC-Y|DOMEFLAT|6
214|HSC-Y|DOMEFLAT|6
216|HSC-Y|DOMEFLAT|6
218|HSC-Y|DOMEFLAT|6
220|HSC-Y|DOMEFLAT|6
710|HSC-Y|HDFN|6
712|HSC-Y|HDFN|6
714|HSC-Y|HDFN|6
716|HSC-Y|HDFN|6
718|HSC-Y|HDFN|6
950|HSC-I|HDFN|6
952|HSC-I|HDFN|6
954|HSC-I|HDFN|6
...

# quit
sqlite> .q

Now the raw data are prepared.

Making calibration data

Making flat data

Create flat data from dome flat frames. You need to do the same procedure for all bands.

reduceFlat.py $HSCHOME/HSC \
        --rerun=calib_tutorial \
        --id field=DOMEFLAT filter=HSC-Y \ # This is for Y-band. Run 3 times with different filter names.
        --detrendId calibVersion=calib_tutorial \
        --config isr.doBias=False \
                isr.doDark=False \
        --batch-type=smp --cores=4

# rerun is a concept used in the HSC pipeline. It is a process to generate a reduced image with a common parameter set.
# Put search keywords of input data following --id. In this example, field and filter are specified. You can also specify visit number (e.g., from even visit numbers from 212 to 220 by using visit=212..220:2).
# Note that you need to be careful to set field (and some other keywords) as a search keyword as there are cases having wrong FITS header keywords.
# We set --config isr.doBias=False isr.doDark=False as we do not use bias and dark data.
# --batch-type=smp --cores=4 means that process will be executed in the local computer using 6 threads.
# We will explain these parameters below.

After the process above, Y-band flat data (FLAT-[ccd].fits) can be found in $HSCHOME/[dateObs]/[filter]/[calibVersion](calibVersion = rerun in this tutorial). Let’s look at one of the resulting images with ds9. Then repeat the process above to make flat data in i- and z-bands by changing filter=HSC-Y to HSC-I and HSC-Z, respectively.

Register the flat data to the registry, or you will fail the reduction process below.

# Put in the registry
genCalibRegistry.py \
        --root=$HSCHOME/HSC/CALIB \
        --camera=HSC \
        --create

Now you finished making flat data.

Making fringe data

Fringe data are created by stacking object frames. Fringe patterns have to be removed in Y-band in this tutorial. The pipeline can create fringe data only for Y-band and NB0921.

# fringe data (Y-band only)
reduceFringe.py $HSCHOME/HSC \
        --rerun=calib_tutorial \
        --id field=HDFN filter=HSC-Y \
        --detrendId calibVersion=calib_tutorial \
        --config isr.doBias=False \
                isr.doDark=False \
        --batch-type=smp --cores=4

# Set field=HDFN to use object frames.

You can find fringe data (FRINGE-[ccd].fits) in $HSCHOME/HSC/CALIB/FRINGE/[dateObs]/[filter]/[calibVersion] . Once you have checked the fringe data are properly created, register them to the registry with the same command as before. You can also see the images with ds9.

# Register to the registry
genCalibRegistry.py \
        --root=$HSCHOME/HSC/CALIB \
        --camera=HSC \
        --create

Now you finished making fringe data.

Reduction of individual CCDs

In this section, each object frame is reduced, and rough estimates of flux zero point and astrometry are performed as a preparation for the next step. First, we need to set-up a catalog for the astrometric calibration.

# Need to run every login
setup astrometry_net_data sdss-dr9-fink-v5b

Then we reduce each CCD. This process includes fringe pattern removal, flat-fielding, sky subtraction, and source detection. Detected objects are matched to the catalog to derive photometric and astrometric solutions (reduceFrames.py does the last process twice). Final products of this process are reduced images with associated sky images and catalogs.

reduceFrames.py $HSCHOME/HSC \
        --rerun=tutorial \
        --id field=HDFN \
        --config processCcd.isr.doBias=False \
                processCcd.isr.doDark=False \
                doSolveTansip=False \
        --batch-type=smp --cores=4

# --rerun=tutorial means that we finished rerun to create calibration frames and started a new rerun for data reduction.
# --config doSolveTansip=False : This parameter determines whether the WCS solution is derived (True) or not (False). The default is False. In this tutorial, we skip it as the WCS solution is not like to be solved only with 6 CCDs.
# If you are going to do mosaicking, you can skip this process. However, if you are going to reduce all CCDs together, it is recommended not to skip it for better accuracy.
While reduceFrames is running, below we explain the command line options.

Go to this page -> Explanation of command line options

Finished reduceFrames? Let’s check the output images.
Output directories of processed images and thumbnail images are $HSCHOME/HSC/rerun/[rerun]/[pointing]/[filter]/corr and $HSCHOME/hsc/rerun/[rerun]/[pointing]/[filter]/thumbs, respectively. Here [rerun] is a rerun value specified by –rerun=tutorial, [pointing] is a pointing ID at the observation. In this example, the pointing IDs for Y-band is 00815 and for i- and z-bands is 00816.
Therefore, you can find Y-band images in $HSCHOME/HSC/rerun/tutorial/00815/HSC-Y/corr.
  • Reduced object images (sky-subtracted) : CORR-[visit]-[ccd].fits
  • Sky images used for sky subtraction : BKGD-[visit]-[ccd].fits
  • Thumbnails of overscan subtracted images : oss-[visit]-[ccd].png
  • Thumbnails of flat-fielded object images : flattened-[visit]-[ccd].png

The catalogs used for the rough estimates of photometric and astrometric calibrations can be found in $HSCHOME/HSC/rerun/[rerun]/[pointing]/[filter]/output .

  • Catalogs of relatively bright objects used for the first path of photometric and astrometric calibrations : ICSRC-[visit]-[ccd].fits
  • List of matched objects in the first path (i.e., objects matched with the astrometry catalog) : MATCH-[visit]-[ccd].fits
  • Extracted MATCH-[visit]-[ccd].fits catalogs : ML-[visit]-[ccd].fits
  • Catalogs of objects used for the second path of photometric and astrometric calibrations : SRC-[visit]-[ccd].fits
  • List of matched objects in the second path : SRCMATCH-[visit]-[ccd].fits
  • Extracted SRCMATCH-[visit]-[ccd].fits catalogs : SRCML-[visit]-[ccd].fits

Information in the MATCH-ML- files is basically identical. Object IDs used in MATCH- can only be referenced in the HSC pipeline, while ML- can be referenced from other applications. These catalogs are in FITS format, but you can not browse them with ds9 as they are not in an image format.

How to read the catalogs. —> Here

Now we finished reducing each CCD.

Image alignment and flux scaling (mosaicking)

Define tract

Define tract by using all object data. A region of the celestial sphere is called tract and a tract is projected into a plane. If data spans about one FoV of HSC, tract is defined to include all images.

makeDiscreteSkyMap.py $HSCHOME/HSC \
        --rerun=tutorial \
        --id field=HDFN

If you see the message, “tract 0 has corners (189.998, 61.807), (188.417, 61.807), (188.398, 62.553), (190.018, 62.553) (RA, Dec deg) and 4 x 4 patches”, it succeeded. Here, patch is a segment of the tract. By default, a truct contains 100 patches and a patch is a square with 4200 x 4200 pixels. Now tract 0 is defined.

Mosaicking

In the mosaicking process, more accurate image alignment and flux scaling of each [visit, ccd] are derived by considering multiple visits.

mosaic.py $HSCHOME/HSC \
        --rerun=tutorial \
        --id field=HDFN filter=HSC-Y tract=0 # Run 3 times by changing the filter parameter.
Mosaicking process produces flux scaling and WCS coordinate files. No image is generated here. Repeat the process for i- and z-band by specifying filter=HSC-I and HSC-Z.
Let’s check the output files. You will find under $HSCHOME/HSC/rerun/[rerun]/[pointing]/[filter]/corr/[tract] fcr-[visit]-[ccd].fits and wcs-[visit]-[ccd].fits. They are flux scaling files and WCS coordinate files, respectively. [tract] is tract ID (0000 in this example).

Stacking object data

Outputs from mosaicking is used to make a projection of celestial coordinates into plane coordinates (warp) and an integration of all shots (coadd).

stack.py $HSCHOME/HSC \
        --rerun=tutorial \
        --id filter=HSC-Y tract=0 \ # filter is HERE
        --selectId field=HDFN filter=HSC-Y \ # and HERE!!
        --batch-type=smp --cores=4
Rerun the process by changing filter to HSC-I and HSC-Z. **Note that there are two places to specify the filter name in this command.
Coadded image can be found in $HSCHOME/HSC/rerun/[rerun]/deepCoadd/[filter]/[tract]/[patch] as calexp-[filter]-[tract]-[patch].fits . Warp images (warp-[filter]-[tract]-[patch]-[visit].fits) are also stored in the same folder.
Object catalogs (det-[filter]-[tract]-[patch].fits) and sky subtraction patterns (bkgd-[filter]-[tract]-[patch].fits) can be found in $HSCHOME/HSC/rerun/[rerun]/deepCoadd-results/[filter]/[tract]/[patch] . The det- catalogs will be used in the multi-band analysis in the next step.
You can view the stacked images at once by the following command for i-band. (Note that it will use a lot of RAM if you try this with 104 CCDs)
cd $HSCHOME/HSC/rerun/tutorial/deepCoadd/HSC-I/0
ds9 -mosaic wcs */calexp-*.fits

While we used the central 6 CCDs, it looks like the following when all CCDs are used.

–> Here

Now we finished the stacking process.

Multi-band analysis

Object catalogs generated during the stacking are merged and a new catalog will be generated by multi-band photometry.

multiBand.py $HSCHOME/HSC \
        --rerun=tutorial \
        --id tract=0 filter=HSC-Z^HSC-I^HSC-Y \ # Select ALL filters to be used
        --batch-type=smp --cores=4
# Specify all filters to be used in the multi-band analysis in --id .
First, a merged catalog of object positions from all input catalogs (mergeDet). Deblending (a process to separate objects so close that they look overlapping with each other) is performed in this process. Based on the mergeDet catalog, photometry is performed for stacked images of each filter (meas). Then meas catalogs from each filter are used to generate a catalog of objects found in all filters (ref). The object positions in the ref catalog are used for photometry in each filter, which generates forced_src- catalogs.
Here is a summary of output catalogs.

$HSCHOME/HSC/rerun/[rerun]/deepCoadd-results/merged/[tract]/[patch] ;

  • Merged catalogs of object positions : mergeDet-[tract]-[patch].fits
  • Catalogs of object positions based on meas catalogs : ref-[tract]-[patch].fits

$HSCHOME/HSC/rerun/[rerun]/deepCoadd-results/[filter]/[tract]/[patch] ;

  • Object catalogs based on mergeDet : meas-[filter]-[tract]-[patch].fit
  • Final object catalogs : forced_src-[filter]-[tract]-[patch].fits

Photometry on each CCD referencing the stacked image

Using positions of objects in the stacked image (merged catalog of multiple filters), source detection and photometry are performed to re-create a catalog for each [visit, ccd].

forcedCcd.py $HSCHOME/HSC \
        --rerun=tutorial \
        --id field=HDFN tract=0 \
        --batch-type=smp --cores=4

After the process, FORCEDSRC-[visit]-[ccd].fits can be found in $HSCHOME/HSC/rerun/[rerun]/[pointing]/[filter]/tract[tract] .

Re-calibration of each CCD image and catalog

As a last step, apply magnitude zeropoints and WCS determined from the mosaicking to each CCD image and catalog. If you do not need WCS and zeropoint calibrated images/catalogs, you do not need to run this process, but let’s do it as it is a lecture.

# Re-calibrate image
calibrateExposure.py $HSCHOME/HSC \
        --rerun=tutorial \
        --id field=HDFN tract=0 \
        -j 4
# -j : Number of threads.  Use 4 threads when "-j 4" is set.


# Re-calibrate catalog
calibrateCatalog.py $HSCHOME/HSC \
        --rerun=tutorial \
        --id field=HDFN tract=0 \
        --config doApplyCalib=False \
        -j 4
# --config doApplyCalib=False : Do not convert flux to magnitude
After calibrateExposure.py finishes, CALEXP-[visit]-[ccd].fits are stored in $HSCHOME/HSC/rerun/[rerun]/[pointing]/[filter]/corr/[tract] .
Catalogs re-calibrated by calibrateCatalog.py can be found as CALSRC-[visit]-[ccd].fits in $HSCHOME/HSC/rerun/[rerun]/[pointing]/[filter]/output/[tract] .

Congratulations! All reduction processes have been finished.

Make pseudo-color image

Let’s make a pseudo-color image from the reduced images.

  1. Copy a script

    cd $HSCHOME/
    wget http://hsc.mtk.nao.ac.jp/HSC_Training_download/script/stitchPatches.py
    # Check inside the script.
    
    # When texts are not properly encoded, select "Terminal/Set Character Encoding/Unicode (UTF-8)" from the menu and execute "export LC_ALL=ja_JP.UTF-8" .
    
  2. Make one large image from patched images

    python stitchPatches.py -o HSC-Y.fits -x 5000 11000 -y 5000 11000 HSC/rerun/tutorial/deepCoadd/HSC-Y/0/*/calexp-*.fits
    python stitchPatches.py -o HSC-Z.fits -x 5000 11000 -y 5000 11000 HSC/rerun/tutorial/deepCoadd/HSC-Z/0/*/calexp-*.fits
    python stitchPatches.py -o HSC-I.fits -x 5000 11000 -y 5000 11000 HSC/rerun/tutorial/deepCoadd/HSC-I/0/*/calexp-*.fits
    
    # or
    
    for f in Y Z I ; do python stitchPatches.py -o HSC-$f.fits -x 5000 11000 -y 5000 11000 HSC/rerun/tutorial/deepCoadd/HSC-$f/0/*/calexp-*.fits ; done
    
  3. View with a color-mode of ds9

    ds9 -rgb -red HSC-Y.fits -green HSC-Z.fits -blue HSC-I.fits
    
  4. Adjust colors

  • Adjust colors in each filter as follows.
    1. From menu, select Frame/RGB... .
      • Select a channel from “R” “G” “B” .
    2. From menu, select Scale/Scale Parameters... .
      • Adjust minimum and maximum of the scaling.
    3. From menu, select Color/Colormap Parameters... .
      • Adjust Contrast and Bias.
  • (Recommended way)
    • For each filter
      1. Select Scale/Log from menu.

      2. Select Scale/Scale Parameters from menu.
        • Set minimum as 0 and maximum 100.
  1. Save as png format
    1. Select File/Export/png from menu.
    2. Save as HDFN.png .
  2. Check the saved image.

    eog HDFN.png
    
    # or
    
    firefox HDFN.png
    

Create color–magnitude diagram

Let’s make a color–magnitude diagram using the catalogs.
Here, we will use a Pipeline tool called butler which can be used to search or load various types of data such as images and catalogs. For the details of butler, please refer this page .
  1. Copy scripts.

    cd $HSCHOME
    wget http://hsc.mtk.nao.ac.jp/HSC_Training_download/script/catalog_all_{I,Y,Z}.py
    wget http://hsc.mtk.nao.ac.jp/HSC_Training_download/script/catalog.sh
    wget http://hsc.mtk.nao.ac.jp/HSC_Training_download/script/fig_izy.py
    
  2. Use butler to extract object IDs and PFS fluxes in each band and merge them.

    python catalog_all_I.py $HSCHOME/HSC/rerun/tutorial 0 HSC-I > HSC-I.txt
    python catalog_all_Z.py $HSCHOME/HSC/rerun/tutorial 0 HSC-Z > HSC-Z.txt
    python catalog_all_Y.py $HSCHOME/HSC/rerun/tutorial 0 HSC-Y > HSC-Y.txt
    sh catalog.sh
    
  3. Make color–magnitude diagram using the merged file.

    python fig_izy.py
    
  4. Save the figure.

    1. Click “save the figure”
    2. Click “save” after setting the name.
  5. Check the saved figure.

    eog hoge.png
    
    # or
    
    firefox hoge.png
    

Check inside the scripts.

That’s it. Thank you.