hirs2nc
The Glue Code
After package delivery we can access package information from ipython using the delivered_software method of the flo.sw.lib.glutil module:
from flo.sw.lib.glutil import delivered_software
delivered_software.lookup('hirs2nc', delivery_id='20180410-1')
which gives the output
Delivery(id='20180410-1', name='hirs2nc', version='1.0.0', path='/mnt/deliveredcode/deliveries/hirs2nc/20180410-1')
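The returned Delivery record exposes its fields directly, so the delivery path and version can be picked up programmatically. A minimal sketch, assuming namedtuple-style attribute access on the object shown above:
from flo.sw.lib.glutil import delivered_software

delivery = delivered_software.lookup('hirs2nc', delivery_id='20180410-1')
print(delivery.version)  # '1.0.0'
print(delivery.path)     # '/mnt/deliveredcode/deliveries/hirs2nc/20180410-1'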
We also need to install the following packages:
pip install -i https://sips.ssec.wisc.edu/eggs -U flo rados
pip install -i https://sips.ssec.wisc.edu/eggs -U sipsprod glutil
pip install -i https://sips.ssec.wisc.edu/eggs -U timeutil
pip install -i https://sips.ssec.wisc.edu/eggs -U simple
Local Deployment of the Python Glue Code
The glue code for hirs2nc is created as the file source/flo/__init__.py in ~/code/PeateScience/packages/hirs2nc. It can then be linked into the local execution directory:
cd ~/code/PeateScience/local/dist/hirs2nc
ln -s ../../../packages/hirs2nc/source/flo ./
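A quick sanity check that the link points where we expect (the target comes from the ln command above):
cd ~/code/PeateScience/local/dist/hirs2nc
ls -l flo   # should show: flo -> ../../../packages/hirs2nc/source/flo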
Local Processing
# Local paths and processing parameters
repo=$HOME'/git'
work_dir='/data/'$USER'/HIRS_processing/Work/local_processing/hirs2nc'
satellite='metop-b'
# Common import preamble for the python -c invocations below
str_import_hirs2nc="import os; from timeutil import TimeInterval, datetime, timedelta; os.chdir('$repo/hirs2nc'); import example_local_prepare; os.chdir('$work_dir')"
hirs2nc_id='20180410-1'
# Prepare and execute a single granule (a zero-length interval selects one granule)
python -c "$str_import_hirs2nc; granule = datetime(2015, 2, 1, 0, 15); interval = TimeInterval(granule, granule+timedelta(seconds=0)); example_local_prepare.local_execute_example(interval, '$satellite', '$hirs2nc_id', skip_prepare=False, skip_execute=False, verbosity=2)"
# Print the contexts for the same interval without executing anything
python -c "$str_import_hirs2nc; granule = datetime(2015, 2, 1, 0, 15); interval = TimeInterval(granule, granule+timedelta(minutes=0)); example_local_prepare.print_contexts(interval, '$satellite', '$hirs2nc_id', verbosity=2)"
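To cover more than a single granule, the interval endpoint can simply be pushed out; a sketch using the same helpers (the one-hour span is illustrative):
python -c "$str_import_hirs2nc; start = datetime(2015, 2, 1, 0, 0); interval = TimeInterval(start, start+timedelta(hours=1)); example_local_prepare.print_contexts(interval, '$satellite', '$hirs2nc_id', verbosity=2)"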
Cluster Processing
Deploy the glue code to the user account cluster
We import the flo3 interface Python code for hirs2nc into the software tree /mnt/software/geoffc by running rsync:
sudo su - flo
cd /mnt/software/geoffc
mv hirs2nc hirs2nc_old
rsync -urLv /home/geoffc/code/PeateScience/local/dist/hirs2nc . --progress --exclude=.*.sw*
Deploy the glue code to the development (flo) account cluster
We import the flo3 interface Python code for hirs2nc into the software tree /mnt/software/flo by changing to the flo account and running rsync:
sudo su - flo
cd /mnt/software/flo/
mv hirs2nc hirs2nc_old
rsync -urLv /home/geoffc/code/PeateScience/local/dist/hirs2nc . --progress --exclude=.*.sw*
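For either deployment, rsync's dry-run flag (-n) previews what would be copied before the software tree is overwritten:
rsync -urLvn /home/geoffc/code/PeateScience/local/dist/hirs2nc . --progress --exclude=.*.sw*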
Commit glue code to PeateScience repo
The actual glue code was copied to /mnt/software
in the last step, but pushing the hirs2nc
python code to the PeateScience
repo will provide the submission scripts
example_local_prepare.py
and submit_hirs2nc.py
for use on condor.
cd ~/code/PeateScience
git pull
git add ~/code/PeateScience/packages/hirs2nc
git commit packages/hirs2nc -m "Initial commit of the hirs2nc package."
git push
Running the hirs2nc code on the cluster
We can now submit hirs2nc to the cluster from Condor, on the development (flo) account:
sudo su - flo
cd /home/geoffc/hirs2nc/work/
python /home/geoffc/code/PeateScience/packages/hirs2nc/submit_hirs2nc.py
(INFO):submit_hirs2nc.py:<module>:30: Submitting intervals...
(INFO):submit_hirs2nc.py:<module>:32: Submitting interval 2015-04-17 14:36:00 -> 2015-04-17 14:36:59
(INFO):submit_hirs2nc.py:<module>:36: There are 1 contexts in this interval
{'satellite': 'snpp', 'version': '1.0dev0', 'granule': datetime.datetime(2015, 4, 17, 14, 36)}
(INFO):submit_hirs2nc.py:<module>:42: First context: {'satellite': 'snpp', 'version': '1.0dev0', 'granule': datetime.datetime(2015, 4, 17, 14, 36)}
(INFO):submit_hirs2nc.py:<module>:43: Last context: {'satellite': 'snpp', 'version': '1.0dev0', 'granule': datetime.datetime(2015, 4, 17, 14, 36)}
(INFO):submit_hirs2nc.py:<module>:44: xrange(86694864, 86694865)
We can keep track of running jobs with various condor_q incantations:
sudo su - flo
# Count queued/running jobs per computation
condor_q -autoformat FloClusterComputations | sort | uniq -c
# Show the hirs2nc jobs owned by the flo user
condor_q -constraint 'FloClusterComputations=="flo.sw.hirs2nc:HIRS2NC"' -constraint 'Owner=="flo"'
# List computation, owner, and job ids
condor_q -autoformat FloClusterComputations Owner ClusterID ProcID
# Print job ids as ClusterId.ProcId
condor_q -format '%d' ClusterId -format '.%d\n' ProcId
# Print job ids for the hirs2nc computation only
condor_q -constraint 'FloClusterComputations=="flo.sw.hirs2nc:HIRS2NC"' -format '%d' ClusterId -format '.%d\n' ProcId
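Stuck or misbehaving hirs2nc jobs can be removed with the matching constraint (standard HTCondor condor_rm; double-check the constraint before running it):
condor_rm -constraint 'FloClusterComputations=="flo.sw.hirs2nc:HIRS2NC"' -constraint 'Owner=="flo"'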
To look at the log files of particular jobs, first load the helper functions into ipython:
run -e /home/geoffc/git/sips_utils/snippets.py
job_range = (86694864, 86694865)
job_file_branches = [job_number_to_dir('/scratch/flo/jobs', job) for job in range(*job_range)]
if len(job_file_branches) > 1:
    job_stdout_files = list(np.squeeze([glob(d + '-stdout') for d in job_file_branches]))
    job_stderr_files = list(np.squeeze([glob(d + '-stderr') for d in job_file_branches]))
else:
    job_stdout_files = list([glob(d + '-stdout') for d in job_file_branches][0])
    job_stderr_files = list([glob(d + '-stderr') for d in job_file_branches][0])
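With the file lists in hand, printing the tail of each stdout log is a quick way to spot failures; a minimal Python sketch using only the lists built above:
for stdout_file in job_stdout_files:
    with open(stdout_file) as f:
        lines = f.readlines()
    # Report the file and its final line (or note an empty log).
    print('{}: {}'.format(stdout_file, lines[-1].rstrip() if lines else '<empty>'))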
To check the database for the hirs2nc output:
flo_user="-d postgresql://flo3@ratchet.sips/flo3"
satellite='metop-b'
psql $flo_user -c "SELECT job,size,context,file_name from stored_products where computation='flo.sw.hirs2nc:HIRS2NC' and context->'satellite'='''$satellite''' and context->'hirs2nc_delivery_id'='''20180410-1''' order by file_name;"
job | size | context | file_name
-----------+---------+-----------------------------------------------------------------------------------------------------------------------+-----------------------------------------------
105198323 | 1834536 | "granule"=>"datetime.datetime(2013, 5, 20, 0, 29)", "satellite"=>"'metop-b'", "hirs2nc_delivery_id"=>"'20180410-1'" | NSS.HIRX.M1.D13140.S0029.E0127.B0347172.SV.nc
To group granules by day, month, etc.:
timeunit='days'
psql $flo_user -c "SELECT date_trunc('$timeunit',pydt(context->'granule')) as m,count(*) from stored_products where computation='flo.sw.hirs2nc:HIRS2NC' and context->'satellite'='''$satellite''' and context->'hirs2nc_delivery_id'='''20180410-1''' group by m order by m"
To select granules that match, or fall between, certain dates:
psql $flo_user -c "SELECT job,size,context,file_name from stored_products where computation='flo.sw.hirs2nc:HIRS2NC' and context->'satellite'='''$satellite''' and context->'hirs2nc_delivery_id'='''20180410-1''' and date_trunc('days',pydt(context->'granule'))='2015-01-01' order by file_name;"
psql $flo_user -c "SELECT job,size,context,file_name from stored_products where computation='flo.sw.hirs2nc:HIRS2NC' and context->'satellite'='''$satellite''' and context->'hirs2nc_delivery_id'='''20180410-1''' and date_trunc('days',pydt(context->'granule'))>'2015-01-01' and date_trunc('days',pydt(context->'granule'))<'2015-01-03' order by file_name;"
To remove old files:
psql $flo_user -c "SELECT job, size, context, file_name FROM stored_products WHERE computation='flo.sw.hirs2nc:HIRS2NC' and context->'satellite'='''$satellite''' and context->'hirs2nc_delivery_id'='''20180410-1''' order by file_name" | less
psql $flo_user -c "DELETE FROM stored_products WHERE computation='flo.sw.hirs2nc:HIRS2NC' and context->'satellite'='''$satellite''' and context->'hirs2nc_delivery_id'='''20180410-1'''"
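Before running the DELETE, a count over the same predicate confirms how many rows will be removed (plain SQL, same WHERE clause as above):
psql $flo_user -c "SELECT count(*) FROM stored_products WHERE computation='flo.sw.hirs2nc:HIRS2NC' and context->'satellite'='''$satellite''' and context->'hirs2nc_delivery_id'='''20180410-1''';"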
Other Database Queries
psql $flo_user -c "SELECT pydt(context->'granule') as d,count(*) FROM stored_products WHERE computation='flo.sw.hirs2nc:HIRS2NC' and context->'satellite'='''$satellite''' and context->'hirs2nc_delivery_id'='''20180410-1''' group by d order by d;" | less
psql $flo_user -c "SELECT pydt(context->'granule') as d,count(*) FROM stored_products WHERE computation='flo.sw.hirs2nc:HIRS2NC' and context->'satellite'='''$satellite''' and context->'hirs2nc_delivery_id'='''20180410-1''' and date_trunc('days',pydt(context->'granule'))='2014-01-01' group by d order by d;" | less
psql $flo_user -c "SELECT job,size,context,file_name from stored_products where computation='flo.sw.hirs2nc:HIRS2NC' and context->'satellite'='''$satellite''' and context->'hirs2nc_delivery_id'='''20180410-1''' order by file_name;" | less
psql $flo_user -c "SELECT job,size,context,file_name from stored_products where computation='flo.sw.hirs2nc:HIRS2NC' and context->'satellite'='''$satellite''' and context->'hirs2nc_delivery_id'='''20180410-1''' and date_trunc('days',pydt(context->'granule'))='2014-01-01' order by file_name;" | less
# List file keys
psql $flo_user -tA -c "SELECT format ('flo3/%s/%s',job,file_name) FROM stored_products WHERE computation='flo.sw.hirs2nc:HIRS2NC' and context->'satellite'='''$satellite''' and context->'hirs2nc_delivery_id'='''20180410-1''' order by file_name limit 5;"
# List file keys and status
psql $flo_user -tA -c "SELECT format ('flo3/%s/%s',job,file_name) FROM stored_products WHERE computation='flo.sw.hirs2nc:HIRS2NC' and context->'satellite'='''$satellite''' and context->'hirs2nc_delivery_id'='''20180410-1''' order by file_name limit 5;" | xargs -n1 -IXX rados -p dev --id flo stat XX
# List file key basenames
psql $flo_user -tA -c "SELECT format ('flo3/%s/%s',job,file_name) FROM stored_products WHERE computation='flo.sw.hirs2nc:HIRS2NC' and context->'satellite'='''$satellite''' and context->'hirs2nc_delivery_id'='''20180410-1''' order by file_name limit 5;" | xargs -n1 -IXX basename XX
# List the rados commands to download files using the database file keys.
psql $flo_user -tA -c "SELECT format ('flo3/%s/%s',job,file_name) FROM stored_products WHERE computation='flo.sw.hirs2nc:HIRS2NC' and context->'satellite'='''$satellite''' and context->'hirs2nc_delivery_id'='''20180410-1''' order by file_name limit 5;" | xargs -n1 -IXX sh -c 'echo rados -p dev --id flo get "$1" ~/hirs2nc/work/links/"$(basename "$1")"' _ XX
# rados commands
rados -p dev --id flo get flo3/91069111/VNP02FSN.A2015091.0000.001.2018025170339.nc VNP02FSN.A2015091.0000.001.2018025170339.nc
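The key listing and rados get can also be chained directly; a sketch that downloads each listed key into the current directory (same query as above):
psql $flo_user -tA -c "SELECT format ('flo3/%s/%s',job,file_name) FROM stored_products WHERE computation='flo.sw.hirs2nc:HIRS2NC' and context->'satellite'='''$satellite''' and context->'hirs2nc_delivery_id'='''20180410-1''' order by file_name limit 5;" | while read -r key; do rados -p dev --id flo get "$key" "$(basename "$key")"; done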
Comparing Cluster Results with the Test Database
Running in Forward Stream
Examining log files of failed jobs
Generate a list of job numbers for failed jobs:
psql $flo_user -c "SELECT job, context FROM failed_jobs WHERE head_computation='flo.sw.hirs2nc:HIRS2NC' and context->'version'='''1.0dev1''' and timestamp > '2018-01-30' order by context;" | grep granule | gawk '{print $1}' > hirs2nc_v1.0dev1_failed_granules.txt
with open('hirs2nc_v1.0dev1_failed_granules.txt', 'r') as file_obj:
    jobnums = [int(x) for x in file_obj.readlines()]
run -e /mnt/sdata/geoffc/git/sips_utils/snippets.py
job_file_branches = [job_number_to_dir('/scratch/flo/jobs', job) for job in jobnums]
job_stdout_files = list(np.squeeze([glob(d + '-stdout') for d in job_file_branches]))
job_stderr_files = list(np.squeeze([glob(d + '-stderr') for d in job_file_branches]))
# Flag stdout files that mention 'input sounder_0', and report any dateline granules.
for files in job_stdout_files:
    result = search_logfile_for_string(files, 'input sounder_0')
    if result != []:
        result = search_logfile_for_string(files, 'Dateline granule')
        if result != []:
            print(result[0].replace('\n', ''))
        else:
            print(files)
    else:
        pass
import os
import re
import traceback

# For each job with a non-empty stderr file, scan the matching stdout file
# for dateline granules.
for stdout_file, stderr_file in zip(job_stdout_files, job_stderr_files):
    try:
        if os.path.isfile(stderr_file) and (os.stat(stderr_file).st_size > 0):
            print('\n>>> stderr_file = {}'.format(stderr_file))
            file_obj = open(stdout_file, 'r')
            for line in file_obj.readlines():
                searchObj = re.search(r'Dateline granule', line, re.M)
                if searchObj:
                    line = line.replace('\n', '')
                    print('Checking {}: {}'.format(stdout_file, line))
                else:
                    print('Checking {}:'.format(stdout_file))
                    file_obj.seek(3)
                    line = file_obj.readline()
                    line = os.path.basename(line.replace('\n', '').split(' ')[-1])
                    print(line)
            file_obj.close()
        else:
            pass
            #print('stderr_file {} does not exist or has zero size.'.format(stderr_file))
    except Exception:
        file_obj.close()
        print('There was a problem with stderr_file {}'.format(stderr_file))
        print(traceback.format_exc())
        print('stdout_file = {}'.format(stdout_file))
# Print the last line of each stdout file in the given job range.
from subprocess import check_call
for stdout_file in [glob(d + '-stdout') for d in [job_number_to_dir('/scratch/flo/jobs', job) for job in range(77666696, 77667532)]]:
    check_call('tail -n 1 {}'.format(stdout_file[0]).split(' '))
# Write the geocat output to a log file, and parse it to determine the output
# HDF4 files.
logfile_obj = open(logpath, 'w')
hdf_files = []
for line in exe_out.splitlines():
    logfile_obj.write(line + "\n")
    searchObj = re.search(r'geocat[LR].*\.hdf', line, re.M)
    if searchObj:
        hdf_files.append(line.split(" ")[-1])
    else:
        pass
logfile_obj.close()
Downloading Results from the Cluster
export satellite="noaa-19"
flo_dbase='postgresql://flo3@ratchet.sips/flo3'
# Fetch every zonal_means output job for the chosen satellite with flo_fetch
satellite='noaa-17'; psql -d $flo_dbase -tA -c "select job from stored_products where context->'satellite'='''$satellite''' and computation='flo.sw.hirs_csrb_monthly:HIRS_CSRB_MONTHLY' and output='zonal_means' order by context" | xargs -n 1 flo_fetch -j