Provide flexibility for some vars missing in some files
Perhaps this isn't a desired use case?
Let's say a user wants to aggregate into one file a variety of files that have different variables--call the aggregated file a smorgasbord of variables. Groups of files that have different variables will have their own settings files to grid those respective variables. However, when aggregation is called, yori could be flexible about whether a variable needs to be in every gridded file or not.
Of course, this introduces some risk: the user may not be aware of what is happening and may set their robotic lawnmower to run over the hedges as well as the grass. So, if this type of feature were included, there would probably need to be 1) user opt-in through some keyword in the aggregate function (`require_all_vars=False` or something like that), and 2) some indication in the metadata of the aggregated file.
In terms of implementing the functionality (barring the above considerations), I think it would mostly require changing https://gitlab.ssec.wisc.edu/pveglio/yori/blob/master/yori/yori_aggregate.py#L204 (`var_in = var_tmp`) into a loop over keys that assigns each entry individually instead of overwriting the full dictionary.
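A minimal sketch of that key-by-key update, to show how it differs from overwriting. The real structures in `yori_aggregate.py` are not reproduced here: `update_by_key` is a hypothetical helper, and plain dicts of floats stand in for whatever `var_in` and `var_tmp` actually hold.

```python
def update_by_key(var_in, var_tmp):
    """Copy each entry of var_tmp into var_in instead of replacing var_in.

    Hypothetical helper: keys present only in var_in (variables the current
    file does not contain) are left untouched.
    """
    for key, value in var_tmp.items():
        var_in[key] = value
    return var_in


# Stand-in data: var_in holds the running aggregate, var_tmp the result
# computed for the variables found in the current file.
var_in = {'SST': 10.0, 'u-wind': 3.0}
var_tmp = {'SST': 25.0, 'rain': 1.5}

update_by_key(var_in, var_tmp)
print(sorted(var_in))  # ['SST', 'rain', 'u-wind']
```

With `var_in = var_tmp`, `u-wind` would be dropped as soon as a file without it is processed; the loop keeps it.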
The user check I mentioned could then go somewhere near the top of the file loop (e.g. https://gitlab.ssec.wisc.edu/pveglio/yori/blob/master/yori/yori_aggregate.py#L147, after `group_list = iou.netcdfStructure(fin)`):
if require_all_vars and sorted(group_list) != sorted(var_in.keys()):  # may have to be smarter about this comparison depending on the types of those two variables
    raise Exception('variable list in file %s does not match prior: %s' % (fin, var_in.keys()))
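One way the comparison above could be made "smarter" is to compare sets and report exactly which variables differ. This is only a sketch: `check_variable_list` is a hypothetical helper, and it assumes `group_list` is an iterable of variable names and `var_in` is a dict keyed by those names. Skipping the check while `var_in` is still empty (the first file) is also an assumption about how the loop is initialized.

```python
def check_variable_list(group_list, var_in, fin, require_all_vars=True):
    """Raise if the variables in file `fin` differ from those seen so far."""
    if not require_all_vars or not var_in:
        # User opted out, or nothing has been aggregated yet (assumed to
        # mean this is the first file), so there is nothing to compare.
        return

    current = set(group_list)
    prior = set(var_in)
    if current != prior:
        missing = sorted(prior - current)  # seen before, absent from fin
        extra = sorted(current - prior)    # in fin, never seen before
        raise ValueError(
            'variable list in file %s does not match prior files: '
            'missing %s, unexpected %s' % (fin, missing, extra)
        )
```

Using sets avoids depending on ordering or on whether either object is a list, a dict view, or something else iterable.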
Example
| file_1 vars | file_2 vars |
|---|---|
| SST | SST |
| u-wind | |
| v-wind | |
| | rain |
`settings1.yaml` would define gridding for SST, u-wind, and v-wind. `settings2.yaml` would define gridding for SST and rain.
The user calls gridding on each one:
callYori('settings1.yaml', 'file_1', 'gridded_file_1.nc')
callYori('settings2.yaml', 'file_2', 'gridded_file_2.nc')
The user then aggregates the files:
aggregate(['gridded_file_1.nc', 'gridded_file_2.nc'], 'smorgasbord.nc', require_all_vars=False)
Then `ncdump('smorgasbord.nc')` would include the gridded variables SST, u-wind, v-wind, and rain. u-wind and v-wind would be contributed only by `file_1`, rain only by `file_2`, and SST by both `file_1` and `file_2`.
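For the metadata indication suggested earlier, one option would be to record which input files contributed to each variable. The sketch below only builds that mapping from the example above; `contributors_by_variable` is a hypothetical helper, and how (or whether) the result would be written into the aggregated file is left open.

```python
def contributors_by_variable(file_vars):
    """Map each variable name to the list of files that contain it.

    `file_vars` is assumed to map a file name to the variable names found
    in that file, in the order the files are aggregated.
    """
    contributors = {}
    for fname, names in file_vars.items():
        for name in names:
            contributors.setdefault(name, []).append(fname)
    return contributors


# The two gridded files from the example.
file_vars = {
    'gridded_file_1.nc': ['SST', 'u-wind', 'v-wind'],
    'gridded_file_2.nc': ['SST', 'rain'],
}

for name, files in contributors_by_variable(file_vars).items():
    print(name, files)
```

Here SST lists both files, while u-wind, v-wind, and rain each list a single file, so a reader of the aggregated file could tell that not every input contributed to every variable.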
I can definitely work on adding this if it would make sense for this to be a feature.