hirs2nc
The Glue Code
After package delivery we can access package information from ipython using the delivered_software method of the flo.sw.lib.glutil module:
from flo.sw.lib.glutil import delivered_software
delivered_software.lookup('hirs2nc', delivery_id='20180410-1')
which gives the output
Delivery(id='20180410-1', name='hirs2nc', version='1.0.0', path='/mnt/deliveredcode/deliveries/hirs2nc/20180410-1')
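The returned Delivery record exposes its fields directly, so the delivery path and version can be picked up programmatically. A minimal sketch, assuming namedtuple-style attribute access on the object shown above:
from flo.sw.lib.glutil import delivered_software

delivery = delivered_software.lookup('hirs2nc', delivery_id='20180410-1')
print(delivery.version)  # '1.0.0'
print(delivery.path)     # '/mnt/deliveredcode/deliveries/hirs2nc/20180410-1'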
We also need to install the following packages:
pip install -i https://sips.ssec.wisc.edu/eggs -U flo rados
pip install -i https://sips.ssec.wisc.edu/eggs -U sipsprod glutil
pip install -i https://sips.ssec.wisc.edu/eggs -U timeutil
pip install -i https://sips.ssec.wisc.edu/eggs -U simple
Local Deployment of the Python Glue Code
The glue code for hirs2nc is created as the file source/flo/__init__.py in ~/code/PeateScience/packages/hirs2nc. It can then be linked into the local execution directory:
cd ~/code/PeateScience/local/dist/hirs2nc
ln -s ../../../packages/hirs2nc/source/flo ./
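A quick sanity check that the link points where we expect (the target comes from the ln command above):
cd ~/code/PeateScience/local/dist/hirs2nc
ls -l flo   # should show: flo -> ../../../packages/hirs2nc/source/flo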
Local Processing
# Local paths and processing parameters
repo=$HOME'/git'
work_dir='/data/'$USER'/HIRS_processing/Work/local_processing/hirs2nc'
satellite='metop-b'
# Common import preamble for the python -c invocations below
str_import_hirs2nc="import os; from timeutil import TimeInterval, datetime, timedelta; os.chdir('$repo/hirs2nc'); import example_local_prepare; os.chdir('$work_dir')"
hirs2nc_id='20180410-1'
# Prepare and execute a single granule (a zero-length interval selects one granule)
python -c "$str_import_hirs2nc; granule = datetime(2015, 2, 1, 0, 15); interval = TimeInterval(granule, granule+timedelta(seconds=0)); example_local_prepare.local_execute_example(interval, '$satellite', '$hirs2nc_id', skip_prepare=False, skip_execute=False, verbosity=2)"
# Print the contexts for the same interval without executing anything
python -c "$str_import_hirs2nc; granule = datetime(2015, 2, 1, 0, 15); interval = TimeInterval(granule, granule+timedelta(minutes=0)); example_local_prepare.print_contexts(interval, '$satellite', '$hirs2nc_id', verbosity=2)"
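To cover more than a single granule, the interval endpoint can simply be pushed out; a sketch using the same helpers (the one-hour span is illustrative):
python -c "$str_import_hirs2nc; start = datetime(2015, 2, 1, 0, 0); interval = TimeInterval(start, start+timedelta(hours=1)); example_local_prepare.print_contexts(interval, '$satellite', '$hirs2nc_id', verbosity=2)"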
Cluster Processing
Deploy the glue code to the user account cluster
We import the flo3 interface Python code for hirs2nc into the software tree /mnt/software/geoffc by running rsync:
sudo su - flo
cd /mnt/software/geoffc
mv hirs2nc hirs2nc_old
rsync -urLv /home/geoffc/code/PeateScience/local/dist/hirs2nc . --progress --exclude=.*.sw*
Deploy the glue code to the development (flo) account cluster
We import the flo3 interface Python code for hirs2nc into the software tree /mnt/software/flo by changing to the flo account and running rsync:
sudo su - flo
cd /mnt/software/flo/
mv hirs2nc hirs2nc_old
rsync -urLv /home/geoffc/code/PeateScience/local/dist/hirs2nc . --progress --exclude=.*.sw*
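For either deployment, rsync's dry-run flag (-n) previews what would be copied before the software tree is overwritten:
rsync -urLvn /home/geoffc/code/PeateScience/local/dist/hirs2nc . --progress --exclude=.*.sw*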
Commit glue code to PeateScience repo
The actual glue code was copied to /mnt/software
in the last step, but pushing the hirs2nc
python code to the PeateScience
repo will provide the submission scripts
example_local_prepare.py
and submit_hirs2nc.py
for use on condor.
cd ~/code/PeateScience
git pull
git add ~/code/PeateScience/packages/hirs2nc
git commit packages/hirs2nc -m "Initial commit of the hirs2nc package."
git push
Running the hirs2nc code on the cluster
We can now submit hirs2nc to the cluster from Condor, on the development (flo) account:
sudo su - flo
cd /home/geoffc/hirs2nc/work/
python /home/geoffc/code/PeateScience/packages/hirs2nc/submit_hirs2nc.py
(INFO):submit_hirs2nc.py:<module>:30: Submitting intervals...
(INFO):submit_hirs2nc.py:<module>:32: Submitting interval 2015-04-17 14:36:00 -> 2015-04-17 14:36:59
(INFO):submit_hirs2nc.py:<module>:36: There are 1 contexts in this interval
{'satellite': 'snpp', 'version': '1.0dev0', 'granule': datetime.datetime(2015, 4, 17, 14, 36)}
(INFO):submit_hirs2nc.py:<module>:42: First context: {'satellite': 'snpp', 'version': '1.0dev0', 'granule': datetime.datetime(2015, 4, 17, 14, 36)}
(INFO):submit_hirs2nc.py:<module>:43: Last context: {'satellite': 'snpp', 'version': '1.0dev0', 'granule': datetime.datetime(2015, 4, 17, 14, 36)}
(INFO):submit_hirs2nc.py:<module>:44: xrange(86694864, 86694865)
We can keep track of running jobs with various condor_q incantations:
sudo su - flo
# Count queued/running jobs per computation
condor_q -autoformat FloClusterComputations | sort | uniq -c
# Show the hirs2nc jobs owned by the flo user
condor_q -constraint 'FloClusterComputations=="flo.sw.hirs2nc:HIRS2NC"' -constraint 'Owner=="flo"'
# List computation, owner, and job ids
condor_q -autoformat FloClusterComputations Owner ClusterID ProcID
# Print job ids as ClusterId.ProcId
condor_q -format '%d' ClusterId -format '.%d\n' ProcId
# Print job ids for the hirs2nc computation only
condor_q -constraint 'FloClusterComputations=="flo.sw.hirs2nc:HIRS2NC"' -format '%d' ClusterId -format '.%d\n' ProcId
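Stuck or misbehaving hirs2nc jobs can be removed with the matching constraint (standard HTCondor condor_rm; double-check the constraint before running it):
condor_rm -constraint 'FloClusterComputations=="flo.sw.hirs2nc:HIRS2NC"' -constraint 'Owner=="flo"'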
To look at the log files of particular jobs, first load the helper functions into ipython:
run -e /home/geoffc/git/sips_utils/snippets.py
job_range = (86694864, 86694865)
job_file_branches = [job_number_to_dir('/scratch/flo/jobs', job) for job in range(*job_range)]
if len(job_file_branches) > 1:
    job_stdout_files = list(np.squeeze([glob(d + '-stdout') for d in job_file_branches]))
    job_stderr_files = list(np.squeeze([glob(d + '-stderr') for d in job_file_branches]))
else:
    job_stdout_files = list([glob(d + '-stdout') for d in job_file_branches][0])
    job_stderr_files = list([glob(d + '-stderr') for d in job_file_branches][0])
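With the file lists in hand, printing the tail of each stdout log is a quick way to spot failures; a minimal Python sketch using only the lists built above:
for stdout_file in job_stdout_files:
    with open(stdout_file) as f:
        lines = f.readlines()
    # Report the file and its final line (or note an empty log).
    print('{}: {}'.format(stdout_file, lines[-1].rstrip() if lines else '<empty>'))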
To check the database for the hirs2nc output:
flo_user="-d postgresql://flo3@ratchet.sips/flo3"
satellite='metop-b'
psql $flo_user -c "SELECT job,size,context,file_name from stored_products where computation='flo.sw.hirs2nc:HIRS2NC' and context->'satellite'='''$satellite''' and context->'hirs2nc_delivery_id'='''20180410-1''' order by file_name;"
job | size | context | file_name
-----------+---------+-----------------------------------------------------------------------------------------------------------------------+-----------------------------------------------
105198323 | 1834536 | "granule"=>"datetime.datetime(2013, 5, 20, 0, 29)", "satellite"=>"'metop-b'", "hirs2nc_delivery_id"=>"'20180410-1'" | NSS.HIRX.M1.D13140.S0029.E0127.B0347172.SV.nc
To group granules by day, month, etc.:
timeunit='days'
psql $flo_user -c "SELECT date_trunc('$timeunit',pydt(context->'granule')) as m,count(*) from stored_products where computation='flo.sw.hirs2nc:HIRS2NC' and context->'satellite'='''$satellite''' and context->'hirs2nc_delivery_id'='''20180410-1''' group by m order by m"
To select granules that match, or fall between, certain dates:
psql $flo_user -c "SELECT job,size,context,file_name from stored_products where computation='flo.sw.hirs2nc:HIRS2NC' and context->'satellite'='''$satellite''' and context->'hirs2nc_delivery_id'='''20180410-1''' and date_trunc('days',pydt(context->'granule'))='2015-01-01' order by file_name;"
psql $flo_user -c "SELECT job,size,context,file_name from stored_products where computation='flo.sw.hirs2nc:HIRS2NC' and context->'satellite'='''$satellite''' and context->'hirs2nc_delivery_id'='''20180410-1''' and date_trunc('days',pydt(context->'granule'))>'2015-01-01' and date_trunc('days',pydt(context->'granule'))<'2015-01-03' order by file_name;"
To remove old files:
psql $flo_user -c "SELECT job, size, context, file_name FROM stored_products WHERE computation='flo.sw.hirs2nc:HIRS2NC' and context->'satellite'='''$satellite''' and context->'hirs2nc_delivery_id'='''20180410-1''' order by file_name" | less
psql $flo_user -c "DELETE FROM stored_products WHERE computation='flo.sw.hirs2nc:HIRS2NC' and context->'satellite'='''$satellite''' and context->'hirs2nc_delivery_id'='''20180410-1'''"
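Before running the DELETE, a count over the same predicate confirms how many rows will be removed (plain SQL, same WHERE clause as above):
psql $flo_user -c "SELECT count(*) FROM stored_products WHERE computation='flo.sw.hirs2nc:HIRS2NC' and context->'satellite'='''$satellite''' and context->'hirs2nc_delivery_id'='''20180410-1''';"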
Other Database Queries
psql $flo_user -c "SELECT pydt(context->'granule') as d,count(*) FROM stored_products WHERE computation='flo.sw.hirs2nc:HIRS2NC' and context->'satellite'='''$satellite''' and context->'hirs2nc_delivery_id'='''20180410-1''' group by d order by d;" | less
psql $flo_user -c "SELECT pydt(context->'granule') as d,count(*) FROM stored_products WHERE computation='flo.sw.hirs2nc:HIRS2NC' and context->'satellite'='''$satellite''' and context->'hirs2nc_delivery_id'='''20180410-1''' and date_trunc('days',pydt(context->'granule'))='2014-01-01' group by d order by d;" | less
psql $flo_user -c "SELECT job,size,context,file_name from stored_products where computation='flo.sw.hirs2nc:HIRS2NC' and context->'satellite'='''$satellite''' and context->'hirs2nc_delivery_id'='''20180410-1''' order by file_name;" | less
psql $flo_user -c "SELECT job,size,context,file_name from stored_products where computation='flo.sw.hirs2nc:HIRS2NC' and context->'satellite'='''$satellite''' and context->'hirs2nc_delivery_id'='''20180410-1''' and date_trunc('days',pydt(context->'granule'))='2014-01-01' order by file_name;" | less
# List file keys
psql $flo_user -tA -c "SELECT format ('flo3/%s/%s',job,file_name) FROM stored_products WHERE computation='flo.sw.hirs2nc:HIRS2NC' and context->'satellite'='''$satellite''' and context->'hirs2nc_delivery_id'='''20180410-1''' order by file_name limit 5;"
# List file keys and status
psql $flo_user -tA -c "SELECT format ('flo3/%s/%s',job,file_name) FROM stored_products WHERE computation='flo.sw.hirs2nc:HIRS2NC' and context->'satellite'='''$satellite''' and context->'hirs2nc_delivery_id'='''20180410-1''' order by file_name limit 5;" | xargs -n1 -IXX rados -p dev --id flo stat XX
# List file key basenames
psql $flo_user -tA -c "SELECT format ('flo3/%s/%s',job,file_name) FROM stored_products WHERE computation='flo.sw.hirs2nc:HIRS2NC' and context->'satellite'='''$satellite''' and context->'hirs2nc_delivery_id'='''20180410-1''' order by file_name limit 5;" | xargs -n1 -IXX basename XX
# List the rados commands to download files using the database file keys.
psql $flo_user -tA -c "SELECT format ('flo3/%s/%s',job,file_name) FROM stored_products WHERE computation='flo.sw.hirs2nc:HIRS2NC' and context->'satellite'='''$satellite''' and context->'hirs2nc_delivery_id'='''20180410-1''' order by file_name limit 5;" | xargs -n1 -IXX sh -c 'echo rados -p dev --id flo get "$1" ~/hirs2nc/work/links/"$(basename "$1")"' _ XX
# rados commands
rados -p dev --id flo get flo3/91069111/VNP02FSN.A2015091.0000.001.2018025170339.nc VNP02FSN.A2015091.0000.001.2018025170339.nc
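The key listing and rados get can also be chained directly; a sketch that downloads each listed key into the current directory (same query as above):
psql $flo_user -tA -c "SELECT format ('flo3/%s/%s',job,file_name) FROM stored_products WHERE computation='flo.sw.hirs2nc:HIRS2NC' and context->'satellite'='''$satellite''' and context->'hirs2nc_delivery_id'='''20180410-1''' order by file_name limit 5;" | while read -r key; do rados -p dev --id flo get "$key" "$(basename "$key")"; done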
Comparing Cluster Results with the Test Database
Running in Forward Stream
Examining log files of failed jobs
Generate a list of job numbers for failed jobs:
psql $flo_user -c "SELECT job, context FROM failed_jobs WHERE head_computation='flo.sw.hirs2nc:HIRS2NC' and context->'version'='''1.0dev1''' and timestamp > '2018-01-30' order by context;" | grep granule | gawk '{print $1}' > hirs2nc_v1.0dev1_failed_granules.txt
with open('hirs2nc_v1.0dev1_failed_granules.txt', 'r') as file_obj:
    jobnums = [int(x) for x in file_obj.readlines()]
run -e /mnt/sdata/geoffc/git/sips_utils/snippets.py
job_file_branches = [job_number_to_dir('/scratch/flo/jobs', job) for job in jobnums]
job_stdout_files = list(np.squeeze([glob(d + '-stdout') for d in job_file_branches]))
job_stderr_files = list(np.squeeze([glob(d + '-stderr') for d in job_file_branches]))
# Flag stdout files that mention 'input sounder_0', and report any dateline granules.
for files in job_stdout_files:
    result = search_logfile_for_string(files, 'input sounder_0')
    if result != []:
        result = search_logfile_for_string(files, 'Dateline granule')
        if result != []:
            print(result[0].replace('\n', ''))
        else:
            print(files)
    else:
        pass
import os
import re
import traceback

# For each job with a non-empty stderr file, scan the matching stdout file
# for dateline granules.
for stdout_file, stderr_file in zip(job_stdout_files, job_stderr_files):
    try:
        if os.path.isfile(stderr_file) and (os.stat(stderr_file).st_size > 0):
            print('\n>>> stderr_file = {}'.format(stderr_file))
            file_obj = open(stdout_file, 'r')
            for line in file_obj.readlines():
                searchObj = re.search(r'Dateline granule', line, re.M)
                if searchObj:
                    line = line.replace('\n', '')
                    print('Checking {}: {}'.format(stdout_file, line))
                else:
                    print('Checking {}:'.format(stdout_file))
                    file_obj.seek(3)
                    line = file_obj.readline()
                    line = os.path.basename(line.replace('\n', '').split(' ')[-1])
                    print(line)
            file_obj.close()
        else:
            pass
            #print('stderr_file {} does not exist or has zero size.'.format(stderr_file))
    except Exception:
        file_obj.close()
        print('There was a problem with stderr_file {}'.format(stderr_file))
        print(traceback.format_exc())
        print('stdout_file = {}'.format(stdout_file))
# Print the last line of each stdout file in the given job range.
from subprocess import check_call
for stdout_file in [glob(d + '-stdout') for d in [job_number_to_dir('/scratch/flo/jobs', job) for job in range(77666696, 77667532)]]:
    check_call('tail -n 1 {}'.format(stdout_file[0]).split(' '))
# Write the geocat output to a log file, and parse it to determine the output
# HDF4 files.
logfile_obj = open(logpath, 'w')
hdf_files = []
for line in exe_out.splitlines():
    logfile_obj.write(line + "\n")
    searchObj = re.search(r'geocat[LR].*\.hdf', line, re.M)
    if searchObj:
        hdf_files.append(line.split(" ")[-1])
    else:
        pass
logfile_obj.close()
Downloading Results from the Cluster
export satellite="noaa-19"
flo_dbase='postgresql://flo3@ratchet.sips/flo3'
# Fetch every zonal_means output job for the chosen satellite with flo_fetch
satellite='noaa-17'; psql -d $flo_dbase -tA -c "select job from stored_products where context->'satellite'='''$satellite''' and computation='flo.sw.hirs_csrb_monthly:HIRS_CSRB_MONTHLY' and output='zonal_means' order by context" | xargs -n 1 flo_fetch -j