Machine environment for hscPipe7

We introduce several machines available for running hscPipe.

1. HSC data analysis machine for open use (hanaco)

2. The open-use computer system maintained by Astronomy Data Center (ADC)/NAOJ

3. Large-scale data analysis system (LSC)

4. Batch Processing other than PBS

1. HSC data analysis machine for open use (hanaco)

We have a machine for HSC open-use users to analyze HSC data. The basic specification is shown below:

Item     Spec
CPU      x86_64
Cores    32
Memory   256 GB
HDD      36 TB x 2, 30 TB x 1

Application

Submit the application via the hanaco application form. Your account information will be sent by e-mail within 3 working days. If not, please contact helpdesk@hsc-software.mtk.nao.ac.jp.

Login to hanaco

As described in the hanaco user registration e-mail, you can log in to hanaco as follows. If you access hanaco from outside NAOJ, you need to register for VPN first.

# Login to hanaco
ssh -X -A xxxx@hanaco.ana.nao.ac.jp
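
If you log in frequently, the same options can be kept in ~/.ssh/config so that a short host alias works; the entry below is only an example (replace xxxx with your account name).

# Example ~/.ssh/config entry (optional)
Host hanaco
    HostName hanaco.ana.nao.ac.jp
    User xxxx
    ForwardX11 yes
    ForwardAgent yes

With this entry, the login command is simply "ssh hanaco".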

Operating precautions

  • Basic Concept
    hanaco is provided for users who cannot set up their own analysis environment for HSC data. If you do not have such an environment, you can use hanaco for data reduction and image/catalog creation with hscPipe. After catalog creation, please move the output data to your own machine and analyze it there (see the copy example after this list). Basically you can use hanaco for 6 months; after this period, your data may be deleted (※1). Please prepare your own storage to back up your data. If you cannot do so, you can use 2. The open-use computer system maintained by Astronomy Data Center (ADC)/NAOJ.
    (※1 We will inform you a few weeks before the date of data deletion or disk cleanup activity.)
  • Expiration Date
    As mentioned above, a hanaco account is valid for 6 months. You may be able to use it longer if there is enough disk space on hanaco or if you have not yet completed your analysis. Please contact helpdesk@hsc-software.mtk.nao.ac.jp if you will not finish data reduction within 6 months.
  • Working Directory
    You have to perform all analysis with the HSC pipeline under /data, /data2, or /data3, not under your home directory.
    # Create your working directory in /data
    mkdir /data/[User name]
    # Please change the access rights of the directory if needed.
    
    You can check the available disk space using the “df -h” command.
    df -h
    # output
    Filesystem      Size  Used Avail Use% Mounted on
    /dev/sdc1        63G   40G   21G  66% /
    tmpfs           127G   24M  127G   1% /dev/shm
    /dev/sdc3        16G   44M   16G   1% /work
    /dev/sdd1        33T   13T   21T  37% /data
    /dev/sda1        28T   18T  9.4T  66% /data2
    /dev/sdb1        33T  1.8T   31T   6% /data3
    
  • Cores
    We do not limit the number of available cores per user. However, please monitor the output of the top command and the number of logged-in users while your jobs are running.
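
If you need to back up your reduction products before they are deleted, something like the following rsync command, run from your own machine, can be used; the account name, source path, and destination are placeholders and should be adjusted to your environment.

# Copy hscPipe outputs from hanaco to your own machine (a sketch; paths are examples)
rsync -av xxxx@hanaco.ana.nao.ac.jp:/data/[User name]/[output directory]/ /path/to/local/backup/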

Reminder for hscPipe on hanaco

hscPipe 3, 4, 5, and 6 are installed in /data/ana/hscpipe on hanaco. The latest one is 6.7.
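
If you want to see exactly which versions are installed, you can simply list the install directory:

# List the installed hscPipe versions on hanaco
ls /data/ana/hscpipe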

You can check which version is loaded by opening .bashrc in your home directory. The following example shows how to change the loaded hscPipe version from 4.0.5 to 6.7:

# .bashrc
...
# source /data/ana/hscpipe/4.0.5/bashrc <- comment out
source /data/ana/hscpipe/6.7/bashrc

Then load .bashrc and execute the setup command.

# setup hscPipe
setup-hscpipe
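
hscPipe is managed with EUPS, so after running setup-hscpipe you can usually confirm which version is active with the eups command; this is a sketch and the exact output depends on the installation.

# Check which hscPipe version is currently set up (example)
eups list -s hscPipe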

2. The open-use computer system maintained by Astronomy Data Center (ADC)/NAOJ

You can also use the open-use PCs of ADC/NAOJ, called the Multi-wavelength Data Analysis System (MDAS), for HSC data reduction. Some tips for analyzing data on these PCs are given below. Please read the Data Analysis System User’s Guide first.

Note

The following information is reprinted from the previous pipeline manual. The data analysis system has since been replaced with a new one, so the latest information will be updated after the replacement.

Preparation

You need an account for the open-use system of ADC/NAOJ. You can apply via the user application form.

Work Disk

We provide 16 TB work disks named /lfs[01-04]. Please make your own directory on one of these disks and perform data reduction in it. You can check the amount of available disk space and the system status on the Working (Operational) Status of Data Analysis System page.

# Example
mkdir /lfs01/[user name]

Warning

Analysis with the HSC pipeline requires a large amount of disk space. Please check the available disk space (Disk use status) before data reduction, and run your jobs on a disk with enough free space.
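
For example, the free space of the work disks can be checked directly from a login node; the disk names below follow the /lfs[01-04] naming above.

# Check free space on the work disks (example)
df -h /lfs01 /lfs02 /lfs03 /lfs04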

Installation of HSC pipeline

The open-use PCs run Red Hat Enterprise Linux 7, so the binary package of the HSC pipeline for Red Hat Enterprise Linux 7 (see hscPipe7 installation) should be installed.

We verified that the following binary works stably on the open-use computer system of NAOJ. https://hscdata.mtk.nao.ac.jp:4443/hsc_bin_dist/hscpipe/7.9.1/hscPipe-7.9.1-openblas-centos-7-x86_64.tar.xz

You can download the astrometry catalog from https://hscdata.mtk.nao.ac.jp/hsc_bin_dist/index-ja.html

However, you cannot access the Binary Distribution server from the open-use PCs, so please obtain the package and the astrometry catalog files in one of the following ways:

  1. Download to your own PC, then copy the files to the open-use PC with the scp command, or

  2. Download directly to the open-use PC through your own PC.

# For Case 2, you can use one of the following commands;
#
# Using wget
wget --no-check-certificate -O - https://[pipeline URL] | ssh [user name]@kaim01.ana.nao.ac.jp 'cat > /lfs01/hoge/pipe.tar.xz'

# Using curl
curl --insecure https://[pipeline URL] | ssh [user name]@kaim01.ana.nao.ac.jp 'cat > /lfs01/hoge/pipe.tar.xz'
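
For Case 1, after downloading the files on your own PC, you can copy them to the open-use PC with scp. The file name, account, and destination directory below are examples and should be adjusted to your environment.

# For Case 1, copy the downloaded package from your own PC;
scp hscPipe-7.9.1-openblas-centos-7-x86_64.tar.xz [user name]@kaim01.ana.nao.ac.jp:/lfs01/[user name]/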

On the open-use PCs, you need to unset LD_LIBRARY_PATH before setting up hscPipe. Without this, hscPipe sets up but the scripts do not run properly.

# set up hscpipe.
unset LD_LIBRARY_PATH
source [your_path_for_hscpipe]/bashrc
setup-hscpipe

Server and Queue information for HSC pipeline execution

On the open-use PCs, processing finishes in a relatively short time when you use the q16 queue and the /var/tmp/ area. /var/tmp/ is referenced from the analysis servers as /lfs[01-06] in read-only mode. Put your files on /var/tmp/, then run the batch process from your working directory with a PBS script that refers to the files under /var/tmp/. The detailed configuration and architecture are described here.
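
You can check which queues are defined on the system (including q16) with the standard PBS command below; the exact output depends on the site configuration.

# List the available PBS queues and their limits
qstat -q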

PBS Batch Processing

PBS batch processing is available on the open-use system. Using the q16 queue, which provides the largest number of cores, batch jobs are executed with 16 cores per node. You need to prepare a PBS batch script to run HSC pipeline batch processing.

In the HSC pipeline, batch processing is available for constructBias.py, constructDark.py, constructFlat.py, constructFringe.py, constructSky.py, singleFrameDriver.py, skyCorrection.py, coaddDriver.py, and multiBandDriver.py. The following example creates a PBS batch script for singleFrameDriver.py.

# Preparing batch script using dry-run.
singleFrameDriver.py /lfs01/[user name]/hsc --calib /lfs01/[user name]/hsc/CALIB --rerun test --id filter=HSC-I visit=902798..902808:2 --config processCcd.isr.doFringe=False --time 600 --nodes 1 --procs 16 --dry-run --clobber-config

# Options:
#   --dry-run        : Dry run; only create the PBS script without submitting the job.
#   --clobber-config : Back up and overwrite the configuration saved in the rerun, so the command can run even if settings differ.

When you add --dry-run to the command, the generated batch script is written under /var/tmp/. Copy this script to your own directory and edit it before use.

# Copy (or move) the --dry-run result to your working directory
cp /var/tmp/tmph0gE /lfs01/[user name]/work/


# Edit the tmph0gE file (the PBS batch script).
# The batch script contains some default comments.
# Please delete them all, then add the directives below.
:
:
#!/bin/bash
#PBS -m abe
#PBS -q q16
#PBS -N test
#PBS -l walltime=336:00:00
#PBS -o hoge.out
#PBS -e hoge.err
#PBS -M hoge@nao.ac.jp


# To make a log file for tracking the batch process,
# please add the following lines after the PBS directives above.
:
:
{

        (pipeline commands)

} &> /lfs01/[user name]/work/tmph0gE.log

Please refer to the Data Analysis System User’s Guide for detailed PBS options.

Note that:

  • Specify the queue as q16 with -q q16.

  • Set the maximum wall-clock time for a running job to the maximum with -l walltime=336:00:00.

After preparing the PBS batch script, run the following command to submit it.

# Run PBS batch script
qsub -V /lfs01/[user name]/work/tmph0gE
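
After submission, the state of the job can be checked with the standard PBS qstat command; the option below limits the listing to your own jobs.

# Check the status of your submitted jobs
qstat -u [user name]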

The progress of this script is logged in tmph0gE.log, and you can check it with the following command:

# Output appended data as the file grows
tail -f tmph0gE.log

3. Large-scale data analysis system (LSC)

LSC is an extension of the existing data analysis system (MDAS); it is developed by ADC and operated in cooperation with the Subaru Telescope at NAOJ. This system will become the main data analysis machine for HSC open use in the future.

Open-use PIs/CoIs have higher-priority access to LSC resources for a one-year term from the start of processing. After the priority period ends, users are automatically transitioned to general-user privileges with fewer available resources.

We started test operation in autumn 2019, and PIs of S19B may have received the user instructions.

Please see the information e-mail (sent only to PIs) and the link below.

https://www.adc.nao.ac.jp/LSC/users_guide_e.html

If you have any questions about “hscPipe” or HSC data reduction, please contact the following address: helpdesk [at-mark] hsc-software.mtk.nao.ac.jp ([at-mark] replaces “@”)

If you have any questions about the system other than “hscPipe”, please contact the following address: lsc-consult [at-mark] ana.nao.ac.jp ([at-mark] replaces “@”)

4. Batch Processing other than PBS

There are batch systems other than PBS. Although PBS is the default in the HSC pipeline, you can select another system with a command-line option. Please use the system that is available on your machine.

# In case of using SLURM.
singleFrameDriver.py ~/hsc --calib ~/hsc/CALIB --rerun test --id filter=HSC-I visit=902798..902808:2 --config processCcd.isr.doFringe=False --batch-type=slurm

# Option:
#   --batch-type : Specify the batch system. You can select from {slurm, pbs, smp}.
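
If your machine has no batch scheduler at all (for example a single multi-core workstation), smp may be an option. The command below is a sketch; it assumes the --cores option of the driver scripts, which sets the number of local cores for SLURM/SMP runs.

# In case of using SMP (single machine, no scheduler).
singleFrameDriver.py ~/hsc --calib ~/hsc/CALIB --rerun test --id filter=HSC-I visit=902798..902808:2 --config processCcd.isr.doFringe=False --batch-type=smp --cores 16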