Alan De Smet authored
Provides a summary of itself
ef1e7ca6
csppfetch
.gitignore
README.rst
test-csppfetch.py

csppfetch

csppfetch is a Python 3 module for downloading dynamic ancillary data. It supports downloading individual files needed for processing a given time step (and optionally maintaining a local cache as it does) as well as explicitly maintaining a local cache. It uses fcntl.lockf to allow multiple copies to safely run at the same time.
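As a rough illustration of the locking idea mentioned above, the following sketch wraps fcntl.lockf in a context manager so that only one process at a time works on a shared cache directory. The helper name exclusive_lock and the lock file path are hypothetical; this is not csppfetch's actual implementation, only a minimal example of the standard-library call it relies on (Unix only).

```python
import contextlib
import fcntl


@contextlib.contextmanager
def exclusive_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path while the block runs.

    Hypothetical helper for illustration; csppfetch's own locking code may
    be organized differently.
    """
    # Append mode gives a writable descriptor (needed for an exclusive
    # lockf lock) without truncating an existing lock file.
    with open(lock_path, "a") as lock_file:
        # Blocks until no other process holds the lock.
        fcntl.lockf(lock_file, fcntl.LOCK_EX)
        try:
            yield
        finally:
            fcntl.lockf(lock_file, fcntl.LOCK_UN)
```

A caller would wrap any cache update in `with exclusive_lock(path):`. Note that lockf locks are advisory: they only protect against other processes that take the same lock.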

Requires Python 3.6 or later.

Example

>>> import csppfetch
>>> import datetime
>>> sst_filename = "avhrr-only-v2.%Y%m%d_preliminary.nc"
>>> sst = csppfetch.Downloader(
...     name = "Sea Surface Temperature",
...     package_env_id = 'CSPP_GEO_AITF_',
...     url_base = "https://geodb.ssec.wisc.edu/ancillary/",
...     url_relative = "%Y_%m_%d_%j/"+sst_filename,
...     local = sst_filename,
...     period = datetime.timedelta(hours=24),
...     epoch_start = datetime.datetime(2010,1,1,3,0,0)
...     )
>>> import tempfile, os
>>> #
>>> # Download files needed to process a scan at 2019-5-27 12:00:00
>>> with tempfile.TemporaryDirectory() as example_dir:
...     sst.download_for_time(datetime.datetime(2019,5,27, 12,0,0), example_dir)
...     os.path.exists(example_dir+"/avhrr-only-v2.20190527_preliminary.nc")
True
>>> #
>>> # Download the last 7 days of files
>>> with tempfile.TemporaryDirectory() as example_dir:
...     sst.mirror(example_dir)

Usage

csppfetch.Downloader.download_for_time implements the CSPP Geo ancillary download behavior guidelines. By default it abandons a download after 30 seconds, retries a failed download up to 3 times, and waits 20 seconds between attempts. These defaults can be overridden with the environment variables <prefix>_TIMEOUT, <prefix>_RETRIES, and <prefix>_RETRY_WAIT, where <prefix> is whatever is specified by package_env_id in the csppfetch.Downloader constructor. package_env_id should probably end in "_ANCIL_" to best match the guidelines, so you might use package_env_id="CSPP_GEO_AITF_ANCIL_" instead of the package_env_id="CSPP_GEO_AITF_" used in the example above. Similarly, you can replace the default URL in url_base with a comma-separated list using the <prefix>_URL environment variable.
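The timeout and retry behavior described above can be sketched roughly as follows. This is an illustration, not csppfetch's code: read_settings and fetch_with_retries are hypothetical names, and the sketch assumes the environment variable names are formed by appending TIMEOUT, RETRIES, or RETRY_WAIT directly to a prefix that already ends in an underscore. The default values (30, 3, 20) come from the text above.

```python
import time

# Defaults stated in the README: 30 s timeout, 3 retries, 20 s between tries.
DEFAULTS = {"TIMEOUT": 30.0, "RETRIES": 3, "RETRY_WAIT": 20.0}


def read_settings(prefix, environ):
    """Return the settings, letting environment variables override defaults.

    Assumption: names are prefix + suffix, with the prefix ending in "_"
    (for example "CSPP_GEO_AITF_ANCIL_" + "TIMEOUT").
    """
    settings = {}
    for suffix, default in DEFAULTS.items():
        raw = environ.get(prefix + suffix)
        # Convert the string from the environment to the default's type.
        settings[suffix] = type(default)(raw) if raw is not None else default
    return settings


def fetch_with_retries(fetch, settings, sleep=time.sleep):
    """Call fetch(timeout) until it succeeds or the retries are used up."""
    attempts = 1 + settings["RETRIES"]  # the first try plus the retries
    for attempt in range(attempts):
        try:
            return fetch(settings["TIMEOUT"])
        except OSError:
            if attempt == attempts - 1:
                raise  # out of retries: let the caller see the failure
            sleep(settings["RETRY_WAIT"])
```

Here fetch stands in for whatever performs one download attempt (it receives the timeout in seconds), and sleep is injectable so the waiting can be skipped in tests. In real use, environ would be os.environ.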

By default, download_for_time seeks the nearest preceding time relative to the time passed in.
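One plausible way to compute "the nearest preceding time" is to step forward from epoch_start in whole periods and stop at the last step that is not later than the requested time. The sketch below does this with the values from the example above. The function name nearest_preceding is hypothetical, and the arithmetic is an assumption about how epoch_start and period are meant to interact, not a copy of csppfetch's implementation.

```python
import datetime


def nearest_preceding(when, epoch_start, period):
    """Return the latest epoch_start + n * period (integer n) that is <= when."""
    # timedelta // timedelta yields an integer, rounded toward negative infinity.
    steps = (when - epoch_start) // period
    return epoch_start + steps * period


epoch_start = datetime.datetime(2010, 1, 1, 3, 0, 0)
period = datetime.timedelta(hours=24)
scan_time = datetime.datetime(2019, 5, 27, 12, 0, 0)

step = nearest_preceding(scan_time, epoch_start, period)
print(step)  # 2019-05-27 03:00:00
print(step.strftime("avhrr-only-v2.%Y%m%d_preliminary.nc"))
# avhrr-only-v2.20190527_preliminary.nc
```

Under these assumptions, a scan at 12:00 on 2019-05-27 maps to the 03:00 step on the same day, which matches the file name checked in the example.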