yori issues
https://gitlab.ssec.wisc.edu/pveglio/yori/-/issues
2023-07-26T14:52:02Z

https://gitlab.ssec.wisc.edu/pveglio/yori/-/issues/33
Incorrect number of bins for Median Distribution
2023-07-26T14:52:02Z
Paolo Veglio

The Median Distribution computes its bins using the `range()` function. When the user defines `stop`, the last bin edge is actually `stop - interval`, because `range()` excludes its `stop` value. This needs to be corrected.
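The off-by-one can be reproduced with the built-in `range()`. The sketch below is not Yori's code: `bin_edges` is a hypothetical helper, and it assumes integer `start`, `stop` and `interval` with `stop - start` a multiple of `interval`.

```python
def bin_edges(start, stop, interval):
    """Return bin edges from start to stop, including stop itself.

    Hypothetical helper (not part of Yori); assumes integer inputs and
    that (stop - start) is a multiple of interval.
    """
    # range() excludes its stop argument, so extend it by one interval
    return list(range(start, stop + interval, interval))


# range() alone drops the user-defined stop: the last edge is stop - interval
edges_without_stop = list(range(0, 10, 2))
# extending the stop by one interval keeps the user-defined stop as last edge
edges_with_stop = bin_edges(0, 10, 2)
```

With floating-point intervals a different construction would be needed, since `range()` only accepts integers.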
This shouldn't affect the computation of the median.

https://gitlab.ssec.wisc.edu/pveglio/yori/-/issues/32
Name of attribute Median_Bins to be changed for consistency
2023-07-26T14:47:48Z
Paolo Veglio

The `Median_Bins` attribute in the `Median_Distribution` variable should be renamed to `Median_Bin_Boundaries` for consistency with the `Histogram_Bin_Boundaries` attribute in the histograms.

https://gitlab.ssec.wisc.edu/pveglio/yori/-/issues/31
Median units attribute is not defined correctly
2023-07-24T17:48:46Z
Paolo Veglio

From Paul Hubanks (check email from 7/20/2023):
> Hey Paolo, can you make a note to yourself, (Maybe stick it in your Calendar for a few weeks or more out) when you have a break in your schedule to fix this tiny bug in Yori. Just to keep things to your high standards!
>
> For the Median Statistic, specified in the YAML file through a keyword, can you copy the local attribute “units” from the Mean stat to the Median stat.
> Shown below is a dump of a Yori L3 file where I specified Median
> Note that the Mean shows units of “percent”
> But there are no units specified for Median.
>
> Everything else looks perfect
> I did visualize some Mean vs Median global stats and the numbers look right.
> So nice job coding Median.
>
> Just fix the one tiny bug (whenever you have time).
> This local attribute for units being specified for median helps our automated visualization codes produce the best, most useful images, which people can interpret properly.

https://gitlab.ssec.wisc.edu/pveglio/yori/-/issues/30
Production of aggregated files with user-defined sub-domains
2023-07-13T19:05:39Z
Paolo Veglio

From a conversation with Rajeev Jain (jain@anl.gov) at SciPy 2023.
Consider implementing an option that lets users define sub-domains. For example, if I'm only interested in a small region, I could define a bounding box and my gridded/aggregated L3 files would be confined to that box instead of spanning the whole -90/+90; -180/+180 map.

https://gitlab.ssec.wisc.edu/pveglio/yori/-/issues/29
min number of pixels with overlapping granules
2023-05-03T20:28:07Z
Paolo Veglio

The minimum number of pixels is applied to each granule separately. This causes a problem when multiple granules contribute to the same grid cell; see the example below.
I set my `min_pixel_count=6` and I have two granules `A` and `B` that contribute to the same grid cell.
`A` has 5 valid pixels; `B` has 4 valid pixels.
The aggregated granule should have that cell populated with a `Pixel_Count=9` but Yori discards each of the granules because they are below the threshold.
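The numbers above can be checked with a toy sketch. The function names are hypothetical and only the pixel counts are modeled, not Yori's actual data structures.

```python
MIN_PIXEL_COUNT = 6

# valid pixels that each granule contributes to the same grid cell
granule_counts = {"A": 5, "B": 4}


def filter_per_granule(counts, threshold):
    """Behavior described in the issue: drop each granule below the threshold, then sum."""
    return sum(n for n in counts.values() if n >= threshold)


def filter_after_aggregation(counts, threshold):
    """Alternative: sum first, then compare the total with the threshold."""
    total = sum(counts.values())
    return total if total >= threshold else 0


per_granule = filter_per_granule(granule_counts, MIN_PIXEL_COUNT)
after_aggregation = filter_after_aggregation(granule_counts, MIN_PIXEL_COUNT)
```

Filtering per granule leaves the cell empty, while filtering on the aggregated total keeps all 9 pixels.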
Solution:
The evaluation of `pixel_counts` used to filter out grid cells needs to move to the end of the aggregation stage.

Paolo Veglio

https://gitlab.ssec.wisc.edu/pveglio/yori/-/issues/28
"daily" global attribute in aggregated files can be confusing
2020-06-26T15:17:07Z
Paolo Veglio

The `daily` global attribute, which is True/False depending on whether the `--daily` flag was used to create the aggregated file, can be misunderstood. It could be a good idea to replace it with something like `DefinitionOfDay` to avoid any possible confusion.

https://gitlab.ssec.wisc.edu/pveglio/yori/-/issues/27
Minimum Valid Pixel/Days in D3/M3 for L3 Aerosol Products
2022-08-26T17:26:30Z
Paolo Veglio

# Request
A two-layer filtering is requested by the Aerosol Team to produce M3 products with Yori:
1. A minimum number of `Pixel_Counts` in a given 1x1 grid cell is required to populate any Aerosol-related group in the D3 product
2. A minimum number of "valid days" in a given 1x1 grid cell is required to populate any Aerosol-related group in the M3 product
**Note:**
the minimum value for both the `Pixel_Counts` and the "valid days" is subject to change and the user should be allowed to set it
# Implementation
The implementation of this change is split into two parts, in line with the two items listed above. Every change is opt-in: users must explicitly request these features, so functionality can be added to Yori while keeping the workflow unchanged for current users.
## Minimum `Pixel_Counts`
The minimum `Pixel_Counts` will be implemented in the gridding phase. As such, an additional, optional line will be added to the configuration file that will specify the minimum threshold.
The following example shows how this option can be added to the configuration file (names are not final):
```yaml
variable_settings:
  - name_in: Aerosol_Variable
    name_out: Aerosol_Variable_Out
    attributes:
      ...
    min_pixel_counts: 4
```
On the code side this should be a relatively easy feature to add. At the moment Yori does this filtering implicitly by populating grid cells with fill values whenever `Pixel_Counts = 0`. We can implement this functionality by passing `min_pixel_counts` to the `ComputeVariables` class in the `gridtools` module. This parameter will default to zero, so `yori-grid` won't change unless explicitly requested by the user.
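A minimal sketch of the thresholding idea, under stated assumptions: the fill value, the flat per-cell lists and the function name `apply_min_pixel_counts` are illustrative only and do not reflect the actual `ComputeVariables` interface.

```python
FILL_VALUE = -9999.0  # assumed fill value, for illustration only


def apply_min_pixel_counts(means, pixel_counts, min_pixel_counts=0, fill=FILL_VALUE):
    """Replace cells that are empty or below the threshold with the fill value."""
    return [
        fill if count == 0 or count < min_pixel_counts else value
        for value, count in zip(means, pixel_counts)
    ]


cell_means = [0.12, 0.30, 0.25, 0.0]
cell_counts = [10, 3, 4, 0]

# default threshold of zero: only empty cells are filled, as in the current behavior
default_result = apply_min_pixel_counts(cell_means, cell_counts)
# explicit threshold: cells with fewer than 4 pixels are filled as well
filtered_result = apply_min_pixel_counts(cell_means, cell_counts, min_pixel_counts=4)
```

With the default of zero the result matches the implicit filtering described above, which is why existing workflows would be unaffected.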
## Minimum "Valid Days"
*Valid Days* is defined as the number of D3 files with more than zero `Pixel_Counts` values over a month period for a given grid cell.
The hard requirement for making the M3 products filtered by minimum valid days is having a month's worth of D3 products ready to be passed to `yori-aggr`.
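The definition of *Valid Days* can be sketched as a per-cell counter over the D3 files. This is only an illustration: the flat lists stand in for gridded arrays, and `count_valid_days`, `cells_to_keep` and `min_valid_days` are hypothetical names.

```python
def count_valid_days(daily_pixel_counts):
    """Count, per grid cell, the days with more than zero Pixel_Counts.

    daily_pixel_counts holds one flat list of per-cell counts for each D3 file.
    """
    n_cells = len(daily_pixel_counts[0])
    valid_days = [0] * n_cells
    for day in daily_pixel_counts:
        for cell, count in enumerate(day):
            if count > 0:
                valid_days[cell] += 1
    return valid_days


def cells_to_keep(valid_days, min_valid_days):
    """Flag the cells that reach the user-defined minimum number of valid days."""
    return [n >= min_valid_days for n in valid_days]


# three hypothetical D3 files with four grid cells each
d3_counts = [
    [5, 0, 2, 0],
    [3, 0, 0, 0],
    [1, 4, 0, 0],
]
valid_days = count_valid_days(d3_counts)
keep = cells_to_keep(valid_days, min_valid_days=2)
```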
During the aggregation from D3 to M3, a temporary 360x180 `valid_days` array is initialized with zeros for each variable that requires the "minimum valid days" filtering. Iterating over the D3 files, one is added to every grid cell that is not empty. At the end of a 30-day aggregation, each grid cell in the `valid_days` array will have a value between 0 and 30. This array is then used to determine whether a given grid cell meets the criterion; cells that do not are removed from the final M3 file. This is basically what Paul also suggested, but I'm still thinking of a better alternative.

https://gitlab.ssec.wisc.edu/pveglio/yori/-/issues/26
QA-weighted Statistics
2022-08-26T17:26:53Z
Paolo Veglio

# Goal
Compute QA-weighted means and standard deviations and propagate them during the aggregation process. Ideally the end result should look something like the sample ncdump below:
```
group: Aerosol_Variable {
  variables:
    double Mean(longitude, latitude) ;
    double Standard_Deviation(longitude, latitude) ;
    double Sum(longitude, latitude) ;
    int Pixel_Counts(longitude, latitude) ;
    double Sum_Squares(longitude, latitude) ;
    double QA_Mean(longitude, latitude) ; # new variable
    double QA_Standard_Deviation(longitude, latitude) ; # new variable
    double Weighted_Sum(longitude, latitude) ; # new variable, required for aggregation
    int Sum_of_Weights(longitude, latitude) ; # new variable, required for aggregation
    double Weighted_Sum_Squares(longitude, latitude) ; # new variable, required for aggregation
    ...
  } // group Aerosol_Variable
```
**Changes to Configuration File**
- The configuration file will have an additional option to turn on the computation of the QA-weighted quantities (default off)
```yaml
variable_settings:
  - name_in: aerosol_variable
    name_out: aerosol_var_out_with_qa
    compute_QA: QA_weights
    attributes:
      ...
```
If `compute_QA` is not defined, the code runs as usual; if it is defined, it must give the name of the variable that contains the weights (e.g. `QA_weights`). Names are not final.
**Changes to the Code:**
In `yori-grid` the QA-weighted quantities are computed like the normal statistics. Weighted sum, weighted sum of squares and sum of weights have to be saved for the aggregation. An additional function to compute these quantities will be added in `gridtools.py` and called when requested by the configuration file.
In `yori-aggr` the QA-weighted quantities are aggregated if the weighted sum, weighted sum of squares and sum of weights are found. A function to propagate the QA-weighted quantities from the gridded files will be added to `yori_aggregate.py`. A conditional instruction will be added as well, to run the new function when the "QA quantities" are found in the gridded files. Depending on how the code will end up looking at the end of this process I might restructure the `yori_aggregate` module to be more readable (e.g. move utility functions in a separate module and leave only the aggregation).
# On the Computation of the QA-weighted quantities
## Recap of the Math Behind Yori
Let's define the **sum** of a given variable, that is, the sum of its values $`v_j`$ within a grid cell $`c`$, as:
```math
S_c = \sum_j v_j
```
and similarly the **sum of squares**:
```math
\textit{SS}_c = \sum_j v_j ^2
```
where $`j`$ is the $`j`$-th pixel and $`v_j`$ is the quantity under consideration for that pixel.
With these quantities we can compute the **mean** $`M`$ as:
```math
M_c = \frac{\sum_i S_{c,i}}{\sum_i N_{c,i}}
```
where $`c`$ and $`i`$ index the grid cell and the file, respectively, and $`N_{c,i}`$ is the number of points that file $`i`$ contributes to cell $`c`$. With the same indexing logic, the **standard deviation** is derived from:
```math
\textit{STD}_c = \left[\frac{\sum_i \textit{SS}_{c,i}}{\sum_i N_{c,i}} - \left(\frac{\sum_i S_{c,i}}{\sum_i N_{c,i}}\right)^2\right]^{1/2}
```
## Computation of the QA-weighted Quantities
For the QA-weighted quantities we can use the same approach as before, with some minor changes to account for the weights.
First let's define the general formulas for the weighted mean and the standard deviation:
##### Weighted Mean:
```math
M_w = \frac{\sum_j w_j v_j}{\sum_j w_j}
```
##### Weighted Standard Deviation:
```math
\textit{STD}_w = \left( \frac{\sum_j w_j (v_j - \bar{v})^2}{\sum_j w_j} \right)^{1/2}
```
where $`w_j`$ is the weight of the $`j`$-th pixel and $`\bar{v}`$ is the weighted mean $`M_w`$ of all $`v_j`$.
Now, in order to propagate the weighted mean and standard deviation we need three additional quantities in the same way we did for the unweighted mean and standard deviation. Let's define these three additional quantities as follows:
* **sum of weights**:
```math
W = \sum_j w_j
```
* **weighted sum**:
```math
S_w = \sum_j w_j v_j
```
* **weighted sum of squares**:
```math
\textit{SS}_w = \sum_j w_j v_j^2
```
Now we can use $`W`$, $`S_w`$ and $`\textit{SS}_w`$ to compute the weighted mean $`M_w`$ and standard deviation $`\textit{STD}_w`$:
```math
M_w = \frac{S_w}{W}
```
```math
\textit{STD}_w = \left[\frac{\sum_i \textit{SS}_{w,i}}{\sum_i W_i} - \left(\frac{\sum_i S_{w,i}}{\sum_i W_i}\right)^2\right]^{1/2}
```
Note that in the previous two equations the subscript $`c`$ has been dropped for readability, but these are still per-cell quantities; the index $`i`$ runs over the files being aggregated, as in the recap above.
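For completeness, the aggregated expression for $`\textit{STD}_w`$ follows from expanding the square in the general definition. This assumes that $`\bar{v}`$ is the weighted mean $`M_w`$ and that $`\textit{SS}_w = \sum_j w_j v_j^2`$, which is the quantity accumulated as `wss1` and `wss2` in the test snippet:

```math
\frac{\sum_j w_j (v_j - M_w)^2}{W} = \frac{\sum_j w_j v_j^2}{W} - 2 M_w \frac{\sum_j w_j v_j}{W} + M_w^2 \frac{\sum_j w_j}{W} = \frac{\textit{SS}_w}{W} - M_w^2 = \frac{\textit{SS}_w}{W} - \left(\frac{S_w}{W}\right)^2
```

Because $`W`$, $`S_w`$ and $`\textit{SS}_w`$ are plain sums over pixels, summing each of them over files before applying this identity gives the same result as computing it over all pixels at once.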
## Test
The following code snippet verifies the formulas above:
```python
import numpy as np


def main():
    # create the fake data
    scale = np.random.randint(100)
    data1 = np.random.random(500)*scale
    data2 = np.random.random(300)*scale

    # create the weight arrays (integer weights between 0 and 3)
    w1 = np.random.randint(4, size=len(data1))
    w2 = np.random.randint(4, size=len(data2))

    # merge the two arrays
    alldata = np.append(data1, data2)
    allW = np.append(w1, w2)

    # compute sum of weights, weighted sum and weighted sum of squares
    W1 = np.sum(w1)
    W2 = np.sum(w2)
    ws1 = np.sum(w1*data1)
    ws2 = np.sum(w2*data2)
    wss1 = np.sum(w1*data1**2)
    wss2 = np.sum(w2*data2**2)

    # reference values computed directly on the merged data
    reference_mean = np.sum(allW*alldata)/np.sum(allW)
    reference_std = np.sqrt(np.sum(allW*(alldata - reference_mean)**2)/np.sum(allW))

    # weighted mean and standard deviation with the formulas defined above
    qa_mean = (ws1 + ws2)/(W1 + W2)
    # both denominators use the scalar sums of weights W1 and W2
    qa_std = np.sqrt((wss1 + wss2)/(W1 + W2) - ((ws1 + ws2)/(W1 + W2))**2)

    # compare results
    print('Means are the same: {}'.format(np.isclose(reference_mean, qa_mean)))
    print('Stds are the same: {}'.format(np.isclose(reference_std, qa_std)))


if __name__ == '__main__':
    main()
```

https://gitlab.ssec.wisc.edu/pveglio/yori/-/issues/25
Provide flexibility for some vars missing in some files
2020-02-11T18:08:10Z
Ethan Nelson (ethan.nelson@ssec.wisc.edu)

Perhaps this isn't a desired functionality use case?...
Let's say a user wants to aggregate into one file a variety of files that have different variables--call the aggregated file a smorgasbord of variables. Groups of files that have different variables will have their own settings files to grid those respective variables. However, when aggregation is called, yori could be flexible about whether a variable needs to be in every gridded file or not.
Of course, this introduces some issues where the user may not be aware of what is happening and may set their robotic lawnmower to actually run over the hedges _and_ the grass instead of just the grass. So, if this type of feature were to be included, there would probably need to be 1) user opt-in through some keyword in the aggregate function (`require_all_vars=False` or something like that), and 2) some indication in the metadata of the aggregated file.
In terms of implementing the functionality (barring the above considerations), I think it would mostly require changing https://gitlab.ssec.wisc.edu/pveglio/yori/blob/master/yori/yori_aggregate.py#L204 (`var_in = var_tmp`) to a loop over keys that updates each entry instead of overwriting the full dictionary.
The user check I mentioned could then go somewhere near the top of the file loop (e.g. https://gitlab.ssec.wisc.edu/pveglio/yori/blob/master/yori/yori_aggregate.py#L147 after `group_list = iou.netcdfStructure(fin)`):
```
if require_all_vars and sorted(group_list) != sorted(var_in.keys()):  # may have to be smarter about this comparison depending on the types of those two variables
    raise Exception('variable list in file %s does not match prior: %s' % (fin, var_in.keys()))
```
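The two pieces, the per-key update and the opt-in check, could fit together roughly as follows. This is a toy sketch, not the code in `yori_aggregate.py`: `merge_variables` is a hypothetical helper, plain numbers stand in for gridded arrays, and summing is only a placeholder for the real aggregation.

```python
def merge_variables(var_in, var_tmp, require_all_vars=True, source="<file>"):
    """Merge one file's variables into the running dictionary, key by key."""
    if require_all_vars and var_in and sorted(var_tmp) != sorted(var_in):
        raise ValueError(
            "variable list in file %s does not match prior: %s"
            % (source, sorted(var_in))
        )
    for key, value in var_tmp.items():
        # update each entry instead of overwriting the whole dictionary
        var_in[key] = var_in.get(key, 0) + value
    return var_in


aggregated = {}
merge_variables(aggregated, {"SST": 1, "u-wind": 2, "v-wind": 3}, require_all_vars=False)
merge_variables(aggregated, {"SST": 10, "rain": 4}, require_all_vars=False)
```

With `require_all_vars=True` the second call would raise instead, which keeps the current strict behavior as the default.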
### Example
| file_1 vars | file_2 vars |
| ------ | ------ |
| SST | SST |
| u-wind | |
| v-wind | |
| | rain |
`settings1.yaml` would define gridding for SST, u-wind, and v-wind. `settings2.yaml` would define gridding for SST and rain.
The user calls gridding on each one:
`callYori('settings1.yaml', 'file_1', 'gridded_file_1.nc')`
`callYori('settings2.yaml', 'file_2', 'gridded_file_2.nc')`
The user then aggregates the files:
`aggregate(['gridded_file_1.nc', 'gridded_file_2.nc'], 'smorgasbord.nc', require_all_vars=False)`
Then `ncdump('smorgasbord.nc')` would include gridded variables of SST, u-wind, v-wind, and rain. u-wind and v-wind would only be contributed to by `file_1`, rain would only be contributed to by `file_2`, and SST would be contributed to by both `file_1` and `file_2`.
I can definitely work on adding this if it would make sense for this to be a feature.

https://gitlab.ssec.wisc.edu/pveglio/yori/-/issues/24
Default fill value possibly too big
2020-02-14T20:43:39Z
Ethan Nelson (ethan.nelson@ssec.wisc.edu)

I started binning a set of files on Yori but hadn't specified a fill value in the YAML:
```
grid_settings:
  gridsize: 1
  projection: conformal
  lat_in: Latitude
  lon_in: Longitude
  lat_out: Latitude
  lon_out: Longitude
variable_settings:
  - name_in: 'MODIS25'
    name_out: 'Grid_MODIS25'
  - name_in: 'MODIS35'
    name_out: 'Grid_MODIS35'
```
The result is this error:
```
Traceback (most recent call last):
----- truncated -----
File "/home/enelson/venv/lib/python3.6/site-packages/yori-1.3.11.dev0+g7e044fb.d20190927-py3.6.egg/yori/run_yori.py", line 187, in callYori
Yori.runYori(debug, compression)
File "/home/enelson/venv/lib/python3.6/site-packages/yori-1.3.11.dev0+g7e044fb.d20190927-py3.6.egg/yori/run_yori.py", line 175, in runYori
self.grid_size, self.ymlSet, self.fv, edges=myGrid.edges)
File "/home/enelson/venv/lib/python3.6/site-packages/yori-1.3.11.dev0+g7e044fb.d20190927-py3.6.egg/yori/run_yori.py", line 203, in write_output
outInst.saveNewVar(fout, gridvar, k, fillvalue=fv, vartype='i4')
File "/home/enelson/venv/lib/python3.6/site-packages/yori-1.3.11.dev0+g7e044fb.d20190927-py3.6.egg/yori/new_ioutils.py", line 236, in saveNewVar
complevel=self.clevel) #,
File "netCDF4/_netCDF4.pyx", line 2768, in netCDF4._netCDF4.Dataset.createVariable
File "netCDF4/_netCDF4.pyx", line 3896, in netCDF4._netCDF4.Variable.__init__
OverflowError: Python int too large to convert to C long
```
It seems the fill value (e.g. https://gitlab.ssec.wisc.edu/pveglio/yori/blob/master/yori/run_yori.py#L15) is a bit too large.

https://gitlab.ssec.wisc.edu/pveglio/yori/-/issues/23
Access histogram edges within netCDF groups
2020-03-06T16:01:30Z
Ethan Nelson (ethan.nelson@ssec.wisc.edu)

Right now, as far as I can tell, the only way to discern information about the histograms in files generated through yori is to parse the full YAML_config metadata. It would be useful to have the edges saved either as a group variable or as group metadata within each gridded variable's group so that histogram data can be easily obtained when reading individual groups.

https://gitlab.ssec.wisc.edu/pveglio/yori/-/issues/22
Greater than 2-D data throws error on histogram
2023-05-03T20:25:46Z
Ethan Nelson (ethan.nelson@ssec.wisc.edu)

When trying to histogram data with more than two dimensions, yori raises an error:
```
Traceback (most recent call last):
File "monitor.py", line 71, in yori_aggregate
callYori('%s-settings.yaml' % product, new_f, new_grid_f)
File "/home/enelson/venv/lib/python3.6/site-packages/yori-1.3.11.dev0+g7e044fb.d20190927-py3.6.egg/yori/run_yori.py", line 187, in callYori
Yori.runYori(debug, compression)
File "/home/enelson/venv/lib/python3.6/site-packages/yori-1.3.11.dev0+g7e044fb.d20190927-py3.6.egg/yori/run_yori.py", line 149, in runYori
tmp_gridvar = myGrid.compute_stats(masked_data_in, var_name_out)
File "/home/enelson/venv/lib/python3.6/site-packages/yori-1.3.11.dev0+g7e044fb.d20190927-py3.6.egg/yori/gridtools.py", line 100, in compute_stats
var = reformat_vector(var)
File "/home/enelson/venv/lib/python3.6/site-packages/yori-1.3.11.dev0+g7e044fb.d20190927-py3.6.egg/yori/gridtools.py", line 251, in reformat_vector
vec = np.reshape(vec, np.shape(vec)[0]*np.shape(vec)[1])
File "<__array_function__ internals>", line 6, in reshape
File "/home/enelson/venv/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 301, in reshape
return _wrapfunc(a, 'reshape', newshape, order=order)
File "/home/enelson/venv/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 61, in _wrapfunc
return bound(*args, **kwds)
ValueError: cannot reshape array of size 1131200 into shape (2828,)
```
This error stems from a hard-coded assumption of data shape as 2-D in the `reformat_vector` function. (It would have actually been raised sooner in execution, https://gitlab.ssec.wisc.edu/pveglio/yori/blob/master/yori/gridtools.py#L23, except I formatted my input lat/lon arrays incorrectly to be 2-D instead of 3-D.)
One path forward may be to change [this line](https://gitlab.ssec.wisc.edu/pveglio/yori/blob/master/yori/gridtools.py#L251) in the `reformat_vector` function to use [`numpy.ndarray.flatten`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.flatten.html) so that arrays of higher dimension are handled. The program will then make sure that lat/lon and data variables are of the same length after flattening (https://gitlab.ssec.wisc.edu/pveglio/yori/blob/master/yori/gridtools.py#L103).
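The mismatch in the error message is just arithmetic on the array shape. The sketch below uses a hypothetical 3-D shape, chosen only because it reproduces the two numbers in the traceback; the real input shape is not given in this issue.

```python
import math


def hardcoded_2d_length(shape):
    """Target length used by the reshape in the traceback: first two dimensions only."""
    return shape[0] * shape[1]


def flattened_length(shape):
    """Number of elements in an array of any dimensionality."""
    return math.prod(shape)


# hypothetical 3-D shape, for illustration only
shape_3d = (7, 404, 400)
target = hardcoded_2d_length(shape_3d)
size = flattened_length(shape_3d)
```

For 2-D input the two lengths coincide, so flattening over all dimensions would not change the current behavior in that case.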
Another may be to enforce a dimension limit in the program by checking early in the execution that all inputs are 2-D. I don't know how that aligns with the program philosophy; I can see a case of someone wanting to grid data with shape `(time, lat, lon)`.

https://gitlab.ssec.wisc.edu/pveglio/yori/-/issues/21
Specifying histogram without bin edges causes error in aggregate
2020-02-19T20:44:53Z
Ethan Nelson (ethan.nelson@ssec.wisc.edu)

When running yori to grid files with a histogram specified using `(start, stop, end)` as opposed to `(edges)`, an error is thrown when running the aggregate code.
This is because when the yori grid code sees a histogram specified without edges in the settings, it calculates the edges internally but never modifies the YAML string variable. That string variable is then written out in the gridded file metadata.
When the aggregate code reads the metadata, it cannot find edges and throws an error.

https://gitlab.ssec.wisc.edu/pveglio/yori/-/issues/20
daily aggregation vs C6?
2019-03-27T18:32:45Z
Steve Dutcher

I believe this is the current logic we use for daily aggregation in Yori:
```
180W to 90W: use 3z on target day to 3z on next day
90W to 0W: use 21z on previous day to 21z on target day
0E to 90E: use 3z on target day to 3z on next day
90E to 180E: use 21z on previous day to 21z on target day
```
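The four zones above can be written as a small lookup. This is a sketch, not Yori's implementation: `daily_window_hours` is a hypothetical name, and the treatment of the zone edges (which zone owns -90, 0 and 90) is an assumption, since the listing does not specify it.

```python
def daily_window_hours(lon):
    """Return (start, end) in hours relative to 00z of the target day.

    Assumption: zones are [-180, -90), [-90, 0), [0, 90) and [90, 180];
    the listing above does not say which zone owns a shared edge.
    """
    if not -180 <= lon <= 180:
        raise ValueError("longitude must be within [-180, 180]")
    if lon < -90 or 0 <= lon < 90:
        return (3, 27)    # 3z on target day to 3z on next day
    return (-3, 21)       # 21z on previous day to 21z on target day


windows = [daily_window_hours(lon) for lon in (-120, -45, 45, 120)]
```

Both windows are 24 hours long; only their offset from 00z changes with the zone.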
On the PLAT-33 Jira ticket, Paul Hubanks said this is the logic they use for C6:
```
For Aqua
Longitude Zone [-180 to -90]
Aqua Late Day Only: 24 hours starting at 03:00 GMT
Longitude Zone [-90 to -0]
Aqua Standard Day: 24 hours starting at 00:00 GMT
Longitude Zone [0 to 90]
Aqua Late Day Only: 24 hours starting at 03:00 GMT
Longitude Zone [90 to 180]
Aqua Standard Day: 24 hours starting at 00:00 GMT
For Terra
Longitude Zone [-180 to -90]
Terra Standard Day: 24 hours starting at 00:00 GMT
Longitude Zone [-90 to -0]
Terra Early Day Only: 24 hours ending at 21:00 GMT
Longitude Zone [0 to 90]
Terra Standard Day: 24 hours starting at 00:00 GMT
Longitude Zone [90 to 180]
Terra Early Day Only: 24 hours ending at 21:00 GMT
```
So for Aqua they use the same windows as we do for zones 1 and 3, and for Terra the same windows as we do for zones 2 and 4. The question is whether our aggregation produces a similar result. In other words, does using all 4 zones for both satellites cause differences in the product? I know there will be differences for granules near the poles, but for granules near the equator I suspect not.
What I would suggest is to make a test branch of yori with two new command-line arguments, something like aqua_daily and terra_daily, that use the logic above. Then create an L3 product for both Aqua and Terra using both our daily aggregation and the corresponding new aqua_daily or terra_daily option. You should be able to validate the result by gridding up observation time and then comparing the two datasets.
Paolo,
Do you have the free cycles to look into this?

https://gitlab.ssec.wisc.edu/pveglio/yori/-/issues/19
Version 1.3.1 crashes
2018-11-09T19:47:15Z
Greg Quinn (greg.quinn@ssec.wisc.edu)

I tried out the newly-installed Yori version 1.3.1 with the following config:
```yaml
grid_settings:
  gridsize: 0.25
  projection: conformal
  lat_in: MODIS_Latitude
  lon_in: MODIS_Longitude
variable_settings:
  - name_in: MODIS_SensorZenith
    name_out: modis_sat_zen
    attributes:
      - name: long_name
        value: MODIS satellite zenith angle
      - name: units
        value: degrees
```
Running Yori gave me the following:
```
[gregq@sipsdev yori]$ time /mnt/software/support/yori/1.3.1/bin/yori-grid -c 5 yori.yaml match_l1.modis_20180409T091500.viirs_20180409T090600.nc out.nc
Traceback (most recent call last):
File "/mnt/software/support/yori/1.3.1/bin/yori-grid", line 11, in <module>
sys.exit(main())
File "/mnt/software/support/yori/1.3.1/lib/python3.6/site-packages/yori/tools/grid.py", line 27, in main
compression=args.compression)
File "/mnt/software/support/yori/1.3.1/lib/python3.6/site-packages/yori/run_yori.py", line 186, in callYori
Yori = SetupYori(config_file, input_file, output_file)
File "/mnt/software/support/yori/1.3.1/lib/python3.6/site-packages/yori/run_yori.py", line 43, in __init__
if self.ymlSet.grid_settings['master_masks']:
File "/mnt/software/support/yori/1.3.1/lib/python3.6/site-packages/ruamel/yaml/comments.py", line 747, in __getitem__
return ordereddict.__getitem__(self, key)
KeyError: 'master_masks'
```
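The `KeyError` indicates that `grid_settings['master_masks']` is indexed even when the key is absent from the config. A minimal sketch of a tolerant lookup, assuming the settings object behaves like a dict (as the `ordereddict.__getitem__` frame in the traceback suggests); `get_master_masks` is a hypothetical helper, not part of Yori:

```python
def get_master_masks(grid_settings):
    """Return the configured master masks, or an empty list when none are set."""
    # .get() avoids the KeyError; "or []" also covers an explicit null value
    return grid_settings.get("master_masks") or []


# settings equivalent to the config above, which defines no master_masks
grid_settings = {
    "gridsize": 0.25,
    "projection": "conformal",
    "lat_in": "MODIS_Latitude",
    "lon_in": "MODIS_Longitude",
}
masks = get_master_masks(grid_settings)
```

This only addresses the missing-key case shown in the traceback.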
I also tried setting `master_masks` to an empty list in my config, but that resulted in a crash as well.
The input file I am using is `/mnt/dawg/products/intercal/modis-viirs/1.0dev6/2018/099/match_l1.modis_20180409T091500.viirs_20180409T090600.nc`.

https://gitlab.ssec.wisc.edu/pveglio/yori/-/issues/18
yori aggregation
2018-10-17T18:57:39Z
Steve Dutcher

Yori is attempting to run on Paul's files, which have been compressed. It's trying to run on 6 hours of data right now, so not even a full day, and this is what it looks like in top:
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
10528 flo 20 0 69.1g 54.4g 4144 D 22.9 86.6 636:35.33 yori-aggr
I can't imagine that it is going to finish, and even if it does, it's using 86.6% of the memory to run on 6 hours of data, so I don't see how we can make it work on a whole day. Can you review your code to see if there are any memory optimizations to make, e.g. not holding on to everything in memory, or writing out each variable as it's produced?
You can find sample data for testing here:
>[steved@sipsdev ~]$ du -sh /mnt/dawg/dev/viirs/snpp/VNPCLDPROP_G3/0.1dev4/2014/*
3.9G /mnt/dawg/dev/viirs/snpp/VNPCLDPROP_G3/0.1dev4/2014/032
31G /mnt/dawg/dev/viirs/snpp/VNPCLDPROP_G3/0.1dev4/2014/033
4.0G /mnt/dawg/dev/viirs/snpp/VNPCLDPROP_G3/0.1dev4/2014/034
That is one full day plus 3 hours before and 3 hours after.

https://gitlab.ssec.wisc.edu/pveglio/yori/-/issues/17
yori granule level output file size
2018-10-17T18:56:28Z
Steve Dutcher

So Paul's most recent attempt at using Yori is putting out 23 GB granule files. Yes, he is gridding a lot of stuff; by my count it's 103 variables and 122 JHistos:
`[steved@sipsdev 20180912-1]$ pwd
/mnt/deliveredcode/deliveries/cldprop_preyori/20180912-1
[steved@sipsdev 20180912-1]$ grep name_out dist/config_v0.0.3.yml | grep -v JHisto | wc -l
103
[steved@sipsdev 20180912-1]$ grep name_out dist/config_v0.0.3.yml | grep JHisto | wc -l
122`
He takes the original 447 MB L2 file, writes out a 167 MB filtered file, and when that runs through Yori it becomes a 23 GB file. If I apply compression to that Yori output file, it drops to 137 MB.
In my opinion an output size of 23 GB is a little big. I'm of the opinion that we should have compression and chunking turned on by default, so you might want to consider this.
Also, one other thing I noticed: the n_points variables are all doubles. Is there a reason these are not integers (possibly unsigned integers with a fill value)?

https://gitlab.ssec.wisc.edu/pveglio/yori/-/issues/16
Alexa - Warning Message
2018-07-25T19:12:32Z
Paolo Veglio

Alexa got this warning message while running `yori-grid`:
```
/data/pveglio/software/yori/1.2.4/lib/python2.7/site-packages/yori/run_yori.py:119: RuntimeWarning: invalid value encountered in greater
tmp_data_in = data_in[var_name_in][:]
```
The test files are on `globemaster:/data/aross/testyori/for_paolo/`

https://gitlab.ssec.wisc.edu/pveglio/yori/-/issues/15
grid vars nan values breaking aggr
2018-04-20T17:28:00Z
Bruce Flynn (bruce.flynn@ssec.wisc.edu)

At the gridding step the array math is resulting in NaN values that cause issues at the aggregation step. This may be due to a change in numpy, I'm not sure.
Evidence of this issue can be seen in a warning printed to stderr:
```
/data/brucef/code/yori/yori/gridtools.py:133: UserWarning: Warning: converting a masked element to nan.
varmean[sgrididx[grid_box_idx[i]]] = np.mean( var[sortidx[grid_box_idx[i]:grid_box_idx[i+1]]] )
```
`var` is a masked array and `np.mean` is returning a masked value. However, `varmean` is not a masked array, so when the masked result of `np.mean` is assigned into it, numpy converts it to `nan`, which triggers the warning.

https://gitlab.ssec.wisc.edu/pveglio/yori/-/issues/14
Multiple overpasses
2017-11-07T19:39:59Z
Paolo Veglio

Bryan asked if there's a way of dealing with multiple overpasses.
Specifically, how to solve the problem of having data from multiple orbits in the same grid cell.
The goal would be to have a grid cell filled with only one overpass per day, so that it isn't filled with data hours apart where the conditions might have changed.
Edit:
I just noticed that this is the same as issue #5, which Steve opened a while ago. I'll leave it here for now, so that we know more than one person is asking for this feature.