yori merge requests
https://gitlab.ssec.wisc.edu/pveglio/yori/-/merge_requests
2020-03-04T16:20:34Z

https://gitlab.ssec.wisc.edu/pveglio/yori/-/merge_requests/15
Reduce aggr mem usage
2020-03-04T16:20:34Z
Greg Quinn, greg.quinn@ssec.wisc.edu

Denis has been running some VIIRS CLDPROP aggregations in the SIPS that are getting up over 100GB in memory usage and take over 12 hours to run.
This MR includes two primary changes to aggregation:
1. Instead of handling all output groups at once it handles them in batches (of 2 groups at a time currently). This means we no longer have to allocate all output arrays simultaneously which was the primary reason for the large memory footprint.
2. Last time I investigated aggregation performance I put in an improvement that utilized `Pixel_Counts` to limit the read size on gridded granule files. This didn't help, however, when `Pixel_Counts` is not available which is the case when only histograms are requested. This MR adds a global attribute to gridded granule files (called `granule_edges`) that indicates the subset of the global grid that actually contains data.
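The batching idea in item 1 can be sketched roughly as follows. This is a minimal illustration, not the code in this MR: the `batches` helper and the group names are hypothetical, and the batch size of 2 is taken from the description above.

```python
def batches(items, size=2):
    """Yield consecutive slices of `items` holding at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Hypothetical output group names, for illustration only.
groups = ["group_a", "group_b", "group_c", "group_d", "group_e"]

for batch in batches(groups):
    # In an aggregator, the output arrays for only these groups would be
    # allocated, filled, written out, and released before the next batch.
    print(batch)
```

Peak memory then scales with the batch size rather than with the total number of output groups, at the cost of one pass over the inputs per batch.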
I tested this change on a daily aggregation and memory usage was down under 10GB and run time was about 1.5 hours.
Assignee: Paolo Veglio

https://gitlab.ssec.wisc.edu/pveglio/yori/-/merge_requests/14
WIP: Resolve "Specifying histogram without bin edges causes error in aggregate"
2020-03-04T16:21:14Z
Paolo Veglio

Closes #21

https://gitlab.ssec.wisc.edu/pveglio/yori/-/merge_requests/13
Lower exponent on fill values; fixes #24
2023-05-03T20:27:07Z
Ethan Nelson, ethan.nelson@ssec.wisc.edu

This could be one fix for #24

https://gitlab.ssec.wisc.edu/pveglio/yori/-/merge_requests/12
Always calculate histogram edges; fixes #21
2023-05-03T20:26:52Z
Ethan Nelson, ethan.nelson@ssec.wisc.edu

See pveglio/yori#21 for information on the basic background. When specifying a variable's histogram using `(start, stop, step)` in the gridding program, the `edges` parameter is not included in the metadata. This causes problems when the aggregate code wants to read in the steps for a histogram. Note that this behavior appears only when 1) a variable is specified to be histogrammed, 2) only the start, stop, and step are given instead of the edges, and 3) the aggregator code is run on the gridded files.
This MR solves the problem by 1) saving bin edges to the local settings dictionary after calculating them in the gridding code and 2) recalculating edges in the aggregate code.
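As a rough illustration of deriving edges from `(start, stop, step)`, here is a minimal sketch. The function name `edges_from_range` is hypothetical, and treating `stop` as the last edge is an assumption; the MR does not state how yori handles the endpoint.

```python
def edges_from_range(start, stop, step):
    """Return bin edges from start to stop (inclusive) in increments of step.

    Assumption: stop is itself the final edge and (stop - start) is a
    whole multiple of step.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    n_bins = round((stop - start) / step)
    return [start + i * step for i in range(n_bins + 1)]


edges = edges_from_range(0, 10, 2)
print(edges)
```

Calling one such function from both the gridding and the aggregation code would keep the two sets of edges identical without relying on what is stored in the file metadata.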
Just (1) does not solve the problem because the metadata written out to the gridded files is actually the YAML file contents that were read in as opposed to a dump of the settings dictionary.

https://gitlab.ssec.wisc.edu/pveglio/yori/-/merge_requests/11
Expand configuration options
2023-05-03T20:26:41Z
Ethan Nelson, ethan.nelson@ssec.wisc.edu

For your consideration....
I am presently importing yori's functions into a script as opposed to calling it from the command line. Though it is possible to write a config file when running the script, I thought defining a dictionary and passing it to the function would be easier--especially if I want to recursively test different configurations.
This MR expands the `ConfigReader` capability by allowing either a filename, dictionary, or YAML string to be passed as input. Depending on the input, the function either reads the file and parses the YAML, takes the dictionary as-is, or parses the YAML string. In any case, it then passes the resulting objects to the validation as before. Scripts that call the `ConfigReader` have been modified to accommodate these changes as well.
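The three input types could be dispatched along these lines. This is only a sketch: `resolve_config` is a hypothetical helper, not the actual `ConfigReader`, and the rule used to tell a filename from a YAML string (whether the string names an existing file) is an assumption. The parser is passed in so the example runs without a YAML library; `json.loads` stands in for a real YAML loader here.

```python
import json
import os


def resolve_config(source, parse_yaml):
    """Return settings from a dict, a config file path, or YAML text.

    `parse_yaml` is any callable that turns YAML text into Python objects.
    """
    if isinstance(source, dict):
        # Dictionary input is taken as-is.
        return source
    if isinstance(source, str):
        if os.path.isfile(source):
            # Assumption: a string naming an existing file is a config path.
            with open(source) as handle:
                return parse_yaml(handle.read())
        # Any other string is treated as YAML text.
        return parse_yaml(source)
    raise TypeError("expected a dict, a file path, or a YAML string")


# The JSON text below is also valid YAML, so json.loads is a workable
# stand-in parser for this illustration.
from_dict = resolve_config({"grid_settings": {"gridsize": 1}}, json.loads)
from_text = resolve_config('{"grid_settings": {"gridsize": 1}}', json.loads)
```

Whatever the input type, the result is a plain dictionary that can be handed to the same validation step.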
I'm not sure if this is desired in the codebase, and if it is, whether this strategy is preferred.
Regarding testing, I ran py.test locally and it worked. I also ran the scripts I was using Yori in and they continued to reproduce the same results both a) when reading a YAML file and b) when using a supplied dictionary.

https://gitlab.ssec.wisc.edu/pveglio/yori/-/merge_requests/10
Add indent to unit attr check block
2019-09-12T14:32:04Z
Ethan Nelson, ethan.nelson@ssec.wisc.edu

It looks like in https://gitlab.ssec.wisc.edu/pveglio/yori/commit/6da82ab05f5513cf72c74dd55d1deb7bce00ab72 the code block re-added to `yori/run_yori.py` was under-indented one level. This can lead to an error:
```
$ yori-grid sample_test_config.yml SAMPLE_INPUT_VNPCLDPROP.A2014033.0000.2017205215553.nc test
Traceback (most recent call last):
  File "/mnt/software/support/yori/1.3.9/bin/yori-grid", line 10, in <module>
    sys.exit(main())
  File "/mnt/software/support/yori/1.3.9/lib/python3.6/site-packages/yori/tools/grid.py", line 27, in main
    compression=args.compression)
  File "/mnt/software/support/yori/1.3.9/lib/python3.6/site-packages/yori/run_yori.py", line 186, in callYori
    Yori.runYori(debug, compression)
  File "/mnt/software/support/yori/1.3.9/lib/python3.6/site-packages/yori/run_yori.py", line 174, in runYori
    self.grid_size, self.ymlSet, self.fv, edges=myGrid.edges)
  File "/mnt/software/support/yori/1.3.9/lib/python3.6/site-packages/yori/run_yori.py", line 214, in write_output
    if (attr['name'] == 'units'):
UnboundLocalError: local variable 'attr' referenced before assignment
```
using the config file:
```
grid_settings:
  gridsize: 1
  projection: conformal
  lat_in: latitude
  lon_in: longitude
  lat_out: latitude
  lon_out: longitude
  fill_value: -9999
variable_settings:
  - name_in: Cloud_Effective_Radius
    name_out: EFFR_water
    masks:
      - Cloud_Mask_Water
      - CER_Mask
```

https://gitlab.ssec.wisc.edu/pveglio/yori/-/merge_requests/9
Switch prints to function call
2019-09-11T14:19:07Z
Ethan Nelson, ethan.nelson@ssec.wisc.edu

This MR converts a few print statements from keyword form to function calls for Python 3 compatibility.

https://gitlab.ssec.wisc.edu/pveglio/yori/-/merge_requests/8
WIP: Aggregation performance improvements
2019-02-14T21:55:21Z
Greg Quinn, greg.quinn@ssec.wisc.edu

For Denis' aggregation test case, the changes in this branch brought run time down from about 30 hours to under 40 minutes. Paolo should look over the changes and test this code against the original to make sure no unexpected differences have been introduced.
Assignee: Paolo Veglio

https://gitlab.ssec.wisc.edu/pveglio/yori/-/merge_requests/7
Initial CI config. Needs verification
2018-11-15T16:35:22Z
Bruce Flynn, bruce.flynn@ssec.wisc.edu

The `.gitlab-ci.yml` is the driver for the tests. It will run CI as long as there is a configured runner to run it on.
The `before_script` will ensure Miniconda is installed and will create a sub-environment to use for testing, deleting it if it already exists. It will install the packages required for testing, currently just `libnetcdf` for the `ncgen` binary.
You may need to tweak and futz a few things to get it working.
Let me know if you run into issues.
Assignee: Paolo Veglio

https://gitlab.ssec.wisc.edu/pveglio/yori/-/merge_requests/6
Fix v1.2.2 attr errors and missing var
2018-04-26T16:47:02Z
Bruce Flynn, bruce.flynn@ssec.wisc.edu

https://gitlab.ssec.wisc.edu/pveglio/yori/-/merge_requests/5
fix: missing units_var
2018-04-30T18:58:13Z
Bruce Flynn, bruce.flynn@ssec.wisc.edu

https://gitlab.ssec.wisc.edu/pveglio/yori/-/merge_requests/4
ensure grid vars do not have nan values
2018-04-19T14:33:26Z
Bruce Flynn, bruce.flynn@ssec.wisc.edu

This should fix issue #15. It is not a particularly elegant solution, but it does seem to work.

https://gitlab.ssec.wisc.edu/pveglio/yori/-/merge_requests/3
yori-grid fails with relative output file
2017-10-21T00:30:33Z
Bruce Flynn, bruce.flynn@ssec.wisc.edu

I get a permission error when using a relative path for the output filename.
```
yori-grid \
/mnt/deliveredcode/deliveries/viirstpw_preyori/20171019-1/dist/day_night_separation.yaml \
TMP.VNPWATVP_D3.A2014108.0006.001.2017209093446.nc \
VNPWATVP_D3.A2014108.0006.001.2017209093446.nc
```
results in the following traceback
```
Traceback (most recent call last):
  File "/home/brucef/.local/miniconda2/envs/flo/bin/yori-grid", line 11, in <module>
    load_entry_point('yori', 'console_scripts', 'yori-grid')()
  File "/home/brucef/code/yori/yori/tools/grid.py", line 22, in main
    debug=args.debug)
  File "/home/brucef/code/yori/yori/run_yori.py", line 171, in run_yori
    outInst.createFile()
  File "/home/brucef/code/yori/yori/ioutils.py", line 85, in createFile
    ncdf = nc.Dataset(self.path + self.name, mode='w', format='NETCDF4')
  File "netCDF4/_netCDF4.pyx", line 1859, in netCDF4._netCDF4.Dataset.__init__
  File "netCDF4/_netCDF4.pyx", line 1556, in netCDF4._netCDF4._ensure_nc_success
IOError: Permission denied
```
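The traceback shows the output filename being built as `self.path + self.name`. How those two parts were derived is not shown, so the following is only a sketch of splitting and rejoining a filename with `os.path` so that a bare relative name stays in the current directory. The helper `split_output_path` is hypothetical and is not taken from yori.

```python
import os


def split_output_path(filename):
    """Split an output filename into (directory, basename) using os.path."""
    directory, name = os.path.split(filename)
    # A bare relative name has an empty directory part; fall back to the
    # current directory instead of leaving it empty.
    return directory or os.curdir, name


directory, name = split_output_path("VNPWATVP_D3.A2014108.0006.001.2017209093446.nc")
# os.path.join inserts the separator itself, so no manual concatenation
# of path fragments is needed.
rebuilt = os.path.join(directory, name)
```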
This merge request fixes this issue by using the Python `os.path` utilities for path (de)construction.

https://gitlab.ssec.wisc.edu/pveglio/yori/-/merge_requests/2
fixes to installation process
2017-08-07T14:34:10Z
Bruce Flynn, bruce.flynn@ssec.wisc.edu

These are fixes I had to make to install.
1. Use standard `ruamel.yaml` package rather than `ruamel_yaml` to fix `DistributionNotFound` error
2. Use `python-hdf4` rather than `pyhdf`. I don't believe `pyhdf` is maintained.
These changes allowed me to easily install using the following:
```
git clone git@gitlab.ssec.wisc.edu:pveglio/yori.git
conda create -y -p $PWD/env "python<3" hdf4
./env/bin/pip install numpy
INCLUDE_DIRS=$PWD/env/include ./env/bin/pip install ./yori
```

https://gitlab.ssec.wisc.edu/pveglio/yori/-/merge_requests/1
Fix aggr
2017-06-27T19:59:33Z
Paolo Veglio

problems with the aggregator solved
Assignee: Paolo Veglio