csppfetch
Examples
Setup
>>> import csppfetch
>>> import datetime
>>> sst_filename = "avhrr-only-v2.%Y%m%d_preliminary.nc"
>>> sst = csppfetch.Downloader(
...     name = "Sea Surface Temperature",
...     package_env_id = 'CSPP_GEO_AITF_',
...     url_base = "https://geodb.ssec.wisc.edu/ancillary/",
...     url_relative = "%Y_%m_%d_%j/"+sst_filename,
...     local = sst_filename,
...     period = datetime.timedelta(hours=24),
...     epoch_start = datetime.datetime(2010,1,1,3,0,0)
...     )
(These imports are for these examples. They're not required to use the module!)
>>> from tempfile import TemporaryDirectory
>>> import os
Download the files needed to process a scan at 2019-05-27 12:00:00:
>>> scan_time = datetime.datetime(2019,5,27, 12,0,0)
>>> with TemporaryDirectory() as example_dir:
...     sst.download_for_time(scan_time, example_dir)
...     os.path.exists(example_dir+"/avhrr-only-v2.20190527_preliminary.nc")
True
Download the last 7 days of files:
>>> with TemporaryDirectory() as example_dir:
...     sst.mirror(example_dir)
...     len(os.listdir(example_dir)) > 3
True
(The number of files is compared to 3 because, due to the vagaries of file creation and mirroring, we are likely to get fewer, sometimes many fewer, than the ideal 6 or 7.)
Usage
See import csppfetch; help(csppfetch.Downloader.__init__) for details on the arguments to Downloader.
TODO: Expand on WHY you'd use various options.
csppfetch.Downloader.download_for_time implements the CSPP Geo ancillary download behavior guidelines. By default it abandons a download after 30 seconds, retries a failed download up to 3 times, and waits 20 seconds between retry attempts. These defaults can be overridden with the environment variables <prefix>_TIMEOUT, <prefix>_RETRIES, and <prefix>_RETRY_WAIT, where <prefix> is whatever is specified by package_env_id in the csppfetch.Downloader constructor. package_env_id should probably end in "_ANCIL_" to best match the guidelines, so you might use package_env_id="CSPP_GEO_AITF_ANCIL_" instead of package_env_id="CSPP_GEO_AITF". Similarly, you can replace the default URL in url_base with a comma-separated list using the <prefix>_URL environment variable.
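To make the override behavior concrete, here is a minimal sketch of how such settings could be resolved and applied. It is not csppfetch's actual implementation: read_settings and fetch_with_retries are hypothetical helpers, the fetch callable is a stand-in for a real download, and the sketch assumes that the variable name is the prefix followed directly by TIMEOUT, RETRIES, RETRY_WAIT, or URL (with the prefix already ending in an underscore) and that "3 retries" means one initial attempt plus three more. Check the csppfetch source for the exact rules.

```python
import os
import time

# Defaults stated in this README: 30 s timeout, 3 retries, 20 s between retries.
DEFAULTS = {"TIMEOUT": 30.0, "RETRIES": 3, "RETRY_WAIT": 20.0}


def read_settings(prefix, url_base, environ=None):
    """Hypothetical helper: resolve download settings from the environment."""
    environ = os.environ if environ is None else environ
    settings = {
        "timeout": float(environ.get(prefix + "TIMEOUT", DEFAULTS["TIMEOUT"])),
        "retries": int(environ.get(prefix + "RETRIES", DEFAULTS["RETRIES"])),
        "retry_wait": float(environ.get(prefix + "RETRY_WAIT", DEFAULTS["RETRY_WAIT"])),
    }
    # A comma-separated URL variable replaces the default url_base entirely.
    raw_urls = environ.get(prefix + "URL", url_base)
    settings["urls"] = [u.strip() for u in raw_urls.split(",") if u.strip()]
    return settings


def fetch_with_retries(fetch, settings, sleep=time.sleep):
    """Call fetch(url, timeout) until one call succeeds or retries run out."""
    last_error = None
    # Assumption: one initial attempt plus `retries` further attempts.
    for attempt in range(settings["retries"] + 1):
        if attempt > 0:
            sleep(settings["retry_wait"])  # pause before each retry round
        for url in settings["urls"]:
            try:
                return fetch(url, settings["timeout"])
            except OSError as error:
                last_error = error  # remember the failure, try the next URL
    if last_error is None:
        raise ValueError("no download attempt was made")
    raise last_error
```

Passing environ and sleep as arguments keeps the sketch testable without touching the real environment or actually waiting between attempts.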
By default, download_for_time seeks the nearest preceding time relative to the time passed in.
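As an illustration of what "nearest preceding time" could mean, the sketch below snaps a scan time down to a grid of expected file times defined by epoch_start and period, then expands the strftime-style url_relative pattern from the Setup example. The function nearest_preceding is hypothetical, and the idea that period and epoch_start form such a grid is an assumption based on the argument names, not a confirmed description of csppfetch's internals.

```python
import datetime


def nearest_preceding(when, epoch_start, period):
    """Hypothetical helper: latest grid time at or before `when`.

    Assumes expected file times are epoch_start + n * period for n >= 0.
    """
    if when < epoch_start:
        raise ValueError("time is earlier than epoch_start")
    # timedelta // timedelta yields the whole number of elapsed periods.
    steps = (when - epoch_start) // period
    return epoch_start + steps * period


# Values taken from the Setup example above.
epoch_start = datetime.datetime(2010, 1, 1, 3, 0, 0)
period = datetime.timedelta(hours=24)
scan_time = datetime.datetime(2019, 5, 27, 12, 0, 0)

file_time = nearest_preceding(scan_time, epoch_start, period)
# Expand the same pattern used for url_relative in the Setup example.
relative_url = file_time.strftime("%Y_%m_%d_%j/avhrr-only-v2.%Y%m%d_preliminary.nc")
print(file_time, relative_url)
```

Under these assumptions, a scan at 2019-05-27 12:00:00 maps to the 03:00:00 slot on the same day, which matches the file name checked in the download_for_time example.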